Data Engineering
525 skills in Data & AI > Data Engineering
cicd-pipeline-architecture
Use when setting up CI/CD pipelines, experiencing deployment failures, slow feedback loops, or production incidents after deployment - provides deployment strategies, test gates, rollback mechanisms, and environment promotion patterns to prevent downtime and enable safe continuous delivery
container-hadolint
Dockerfile security linting and best practice validation using Hadolint with 100+ built-in rules aligned to CIS Docker Benchmark. Use when: (1) Analyzing Dockerfiles for security misconfigurations and anti-patterns, (2) Enforcing container image security best practices in CI/CD pipelines, (3) Detecting hardcoded secrets and credentials in container builds, (4) Validating compliance with CIS Docker Benchmark requirements, (5) Integrating shift-left container security into developer workflows, (6) Providing remediation guidance for insecure Dockerfile instructions.
dast-zap
Dynamic application security testing (DAST) using OWASP ZAP (Zed Attack Proxy) with passive and active scanning, API testing, and OWASP Top 10 vulnerability detection. Use when: (1) Performing runtime security testing of web applications and APIs, (2) Detecting vulnerabilities like XSS, SQL injection, and authentication flaws in deployed applications, (3) Automating security scans in CI/CD pipelines with Docker containers, (4) Conducting authenticated testing with session management, (5) Generating security reports with OWASP and CWE mappings for compliance.
dimensional-modeling
Design Kimball star/snowflake schemas for analytics and data warehousing with fact/dimension tables and SCD patterns.
stream-processing
Use when designing real-time data processing systems, choosing stream processing frameworks, or implementing event-driven architectures. Covers Kafka, Flink, and streaming patterns.
devsecops-expert
Expert DevSecOps engineer specializing in secure CI/CD pipelines, shift-left security, security automation, and compliance as code. Use when implementing security gates, container security, infrastructure scanning, secrets management, or building secure supply chains.
migration-planning
Plan ETL/ELT pipelines, data migrations, change data capture, and rollback strategies.
adw-design
Guide creation of AI Developer Workflows (ADWs) that combine deterministic orchestration code with non-deterministic agents. Use when building automated development pipelines, designing AFK agent systems, or implementing the PITER framework.
bff-patterns
Backend-for-Frontend architecture patterns for API aggregation, data transformation, and client-specific optimization. Activates when designing API layers between backends and frontends or implementing data transformation pipelines.
aps-doc-core
Core documentation generation patterns and framework for Treasure Data pipeline layers. Provides shared templates, quality validation, testing framework, and Confluence integration used by all layer-specific documentation skills.
cicd-pipeline-security-expert
Expert in CI/CD pipeline design with focus on secret management, code signing, artifact security, and supply chain protection for desktop application builds.
cicd-expert
Elite CI/CD pipeline engineer specializing in GitHub Actions, GitLab CI, Jenkins automation, secure deployment strategies, and supply chain security. Expert in building efficient, secure pipelines with proper testing gates, artifact management, and ArgoCD/GitOps patterns. Use when designing pipelines, implementing security gates, or troubleshooting CI/CD issues.
hook-event-architecture
Design hook-based event systems for ADW observability. Use when implementing real-time event broadcasting, creating hook pipelines, or building agent activity monitoring.
secrets-management
Comprehensive guidance for secure secrets management including storage solutions (Vault, AWS Secrets Manager, Azure Key Vault), environment variables, secret rotation, scanning tools, and CI/CD pipeline security. Use when implementing secrets storage, configuring secret rotation, preventing secret leaks, or reviewing credentials handling.
etl-elt-patterns
Use when designing data pipelines, choosing between ETL and ELT approaches, or implementing data transformation patterns. Covers modern data pipeline architecture.
langgraph-orchestration
LangGraph stateful multi-agent graphs with categorical coordination patterns and cyclic workflows. Use when building stateful AI agent systems, implementing multi-agent orchestration with conditional routing, creating cyclic workflows with persistence, or designing graph-based AI pipelines with checkpointing and human-in-the-loop patterns.
devops
DevOps practices for web development including Docker, CI/CD, deployment, monitoring, and infrastructure as code. Use when setting up deployment pipelines, containerizing applications, configuring servers, or implementing DevOps workflows.
data-transformation
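A skill that organizes the design, implementation, and validation of data transformation pipelines. Provides a practical workflow covering schema mapping, ETL design, and quality checks. Anchors: • Designing Data-Intensive Applications / Applies to: data modeling / Purpose: ensure transformation consistency • Designing Data-Intensive Applications / Applies to: schema design / Purpose: clarify mappings • Designing Data-Intensive Applications / Applies to: pipeline design / Purpose: ensure scalability and observability. Trigger: Use when designing data transformation pipelines, defining schema mappings, implementing ETL processes, or optimizing data flows. Keywords: data transformation, schema mapping, etl design, pipeline optimization, data modeling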
building-gitops-workflows
This skill enables Claude to construct GitOps workflows using ArgoCD and Flux. It is designed to generate production-ready configurations, implement best practices, and ensure a security-first approach for Kubernetes deployments. Use this skill when the user explicitly requests "GitOps workflow", "ArgoCD", "Flux", or asks for help with setting up a continuous delivery pipeline using GitOps principles. The skill will generate the necessary configuration files and setup code based on the user's specific requirements and infrastructure.
preprocessing-data-with-automated-pipelines
This skill empowers Claude to preprocess and clean data using automated pipelines. It is designed to streamline data preparation for machine learning tasks, implementing best practices for data validation, transformation, and error handling. Claude should use this skill when the user requests data preprocessing, data cleaning, ETL tasks, or mentions the need for automated pipelines for data preparation. Trigger terms include "preprocess data", "clean data", "ETL pipeline", "data transformation", and "data validation". The skill ensures data quality and prepares it for effective analysis and model training.