Data Engineering
525 skills in Data & AI > Data Engineering
building-gitops-workflows
This skill enables Claude to construct GitOps workflows using ArgoCD and Flux. It is designed to generate production-ready configurations, implement best practices, and ensure a security-first approach for Kubernetes deployments. Use this skill when the user explicitly requests "GitOps workflow", "ArgoCD", "Flux", or asks for help with setting up a continuous delivery pipeline using GitOps principles. The skill will generate the necessary configuration files and setup code based on the user's specific requirements and infrastructure.
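For illustration, a minimal sketch of the kind of ArgoCD Application manifest such a workflow revolves around, built as a Python dict and dumped as YAML; the application name, repository URL, and paths are placeholders, not output the skill is guaranteed to produce.

```python
# Minimal sketch: build an ArgoCD Application manifest in Python and dump it
# as YAML. The app name, repo URL, and paths below are placeholders.
import yaml  # pip install pyyaml

application = {
    "apiVersion": "argoproj.io/v1alpha1",
    "kind": "Application",
    "metadata": {"name": "example-app", "namespace": "argocd"},
    "spec": {
        "project": "default",
        "source": {
            "repoURL": "https://github.com/example/gitops-repo.git",  # placeholder
            "targetRevision": "main",
            "path": "apps/example-app/overlays/prod",
        },
        "destination": {
            "server": "https://kubernetes.default.svc",
            "namespace": "example-app",
        },
        # Automated sync with pruning and self-heal is a common GitOps default.
        "syncPolicy": {"automated": {"prune": True, "selfHeal": True}},
    },
}

print(yaml.safe_dump(application, sort_keys=False))
```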
instagram-pipeline-expert
Expert knowledge on Instagram search providers (Serper vs Apify), rate limiting, data normalization, and cost optimization. Use this skill when the user asks about "instagram search", "serper", "apify", "scraping instagram", "provider selection", "instagram pipeline", "instagram reels", or "normalize creators".
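As a hypothetical illustration of the provider-selection and rate-limiting concerns this skill covers, the sketch below wraps two placeholder provider calls in a sliding-window limiter with a fallback; the quotas and provider functions are assumptions, not the real Serper or Apify client APIs.

```python
# Hypothetical sketch of provider selection with simple rate limiting; the
# provider functions and per-provider quotas are illustrative placeholders.
import time
from collections import deque

class RateLimiter:
    """Allow at most `max_calls` within a sliding `period` (seconds)."""
    def __init__(self, max_calls: int, period: float):
        self.max_calls, self.period = max_calls, period
        self.calls: deque[float] = deque()

    def wait(self) -> None:
        now = time.monotonic()
        while self.calls and now - self.calls[0] > self.period:
            self.calls.popleft()
        if len(self.calls) >= self.max_calls:
            time.sleep(self.period - (now - self.calls[0]))
        self.calls.append(time.monotonic())

def search_serper(query: str) -> list[dict]:   # placeholder provider call
    raise NotImplementedError

def search_apify(query: str) -> list[dict]:    # placeholder provider call
    raise NotImplementedError

serper_limit = RateLimiter(max_calls=10, period=60)  # assumed quota
apify_limit = RateLimiter(max_calls=5, period=60)    # assumed quota

def search_instagram(query: str) -> list[dict]:
    """Prefer the primary provider, fall back to the other on failure."""
    try:
        serper_limit.wait()
        return search_serper(query)
    except Exception:
        apify_limit.wait()
        return search_apify(query)
```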
n8n-integration-patterns
Decide when to use n8n workflows versus Next.js server actions for backend logic. Use when implementing complex multi-step workflows, AI agent pipelines, or external service integrations. Provides patterns for both runtime webhook integration and development-time architectural decisions.
working-with-intervals
Work with Interval datasets (time-bounded data) using OPAL. Use when analyzing data with start and end timestamps like distributed traces, batch jobs, or CI/CD pipeline runs. Covers duration calculations, temporal filtering, and aggregating by time properties. Intervals are immutable completed activities with two timestamps, distinct from Events (single timestamp) and Resources (mutable state).
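The skill itself uses OPAL; as a language-neutral illustration of the same interval concepts, the Python sketch below derives durations from start/end timestamps, filters by a time window, and aggregates by a property (the sample records are made up).

```python
# Illustration of interval-style analysis: each record has a start and an end
# timestamp; we derive durations, filter by a window, and aggregate.
from datetime import datetime, timedelta

# Hypothetical CI/CD pipeline runs as intervals: completed, immutable records.
runs = [
    {"pipeline": "build", "start": datetime(2024, 5, 1, 9, 0), "end": datetime(2024, 5, 1, 9, 7)},
    {"pipeline": "build", "start": datetime(2024, 5, 1, 10, 0), "end": datetime(2024, 5, 1, 10, 5)},
    {"pipeline": "deploy", "start": datetime(2024, 5, 1, 9, 10), "end": datetime(2024, 5, 1, 9, 12)},
]

# Duration calculation: end minus start.
for run in runs:
    run["duration"] = run["end"] - run["start"]

# Temporal filtering: keep intervals that overlap a window of interest.
window_start, window_end = datetime(2024, 5, 1, 9, 0), datetime(2024, 5, 1, 9, 30)
in_window = [r for r in runs if r["start"] < window_end and r["end"] > window_start]

# Aggregation by a property: mean duration per pipeline.
by_pipeline: dict[str, list[timedelta]] = {}
for run in runs:
    by_pipeline.setdefault(run["pipeline"], []).append(run["duration"])
for name, durations in by_pipeline.items():
    print(name, sum(durations, timedelta()) / len(durations))
```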
spec-pipeline
Explains the required sequence and success signals for each stage.
stream-chain
Stream-JSON chaining for multi-agent pipelines, data transformation, and sequential workflows
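A minimal sketch of the underlying idea, assuming newline-delimited JSON on stdin/stdout: each stage parses one JSON object per line, transforms it, and emits one JSON object per line so stages can be piped together (the transform here is a placeholder).

```python
# Minimal NDJSON stage: read one JSON object per line from stdin, transform
# it, write one JSON object per line to stdout so stages can be chained.
import json
import sys

def transform(record: dict) -> dict:
    record["processed"] = True  # placeholder transformation
    return record

for line in sys.stdin:
    line = line.strip()
    if not line:
        continue
    record = json.loads(line)
    sys.stdout.write(json.dumps(transform(record)) + "\n")
    sys.stdout.flush()  # flush so downstream stages see records immediately
```

Stages then compose on the shell, e.g. `producer | python stage.py | consumer`.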
data-engineer
Data engineering agent for ETL pipelines, data warehousing, and analytics
devops
Deploy and manage cloud infrastructure on Cloudflare (Workers, R2, D1, KV, Pages, Durable Objects, Browser Rendering), Docker containers, and Google Cloud Platform (Compute Engine, GKE, Cloud Run, App Engine, Cloud Storage). Use when deploying serverless functions to the edge, configuring edge computing solutions, managing Docker containers and images, setting up CI/CD pipelines, optimizing cloud infrastructure costs, implementing global caching strategies, working with cloud databases, or building cloud-native applications. | Use when: deployment, Docker, Kubernetes, CI/CD, containers, server configuration.
workflow-orchestration
Coordinates multi-step CI/CD pipelines by chaining autonomous-ci, code-review, smart-commit, and jules-integration plugins. Use when executing validation-to-PR workflows or recovering from CI failures.
media-processing
Process multimedia files with FFmpeg (video/audio encoding, conversion, streaming, filtering, hardware acceleration) and ImageMagick (image manipulation, format conversion, batch processing, effects, composition). Use when converting media formats, encoding videos with specific codecs (H.264, H.265, VP9), resizing/cropping images, extracting audio from video, applying filters and effects, optimizing file sizes, creating streaming manifests (HLS/DASH), generating thumbnails, batch processing images, creating composite images, or implementing media processing pipelines. Supports 100+ formats, hardware acceleration (NVENC, QSV), and complex filtergraphs. | Use when: processing images, video, audio, FFmpeg, ImageMagick, media conversion.
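For example, a small sketch of driving FFmpeg from Python: re-encode a video to H.264 and grab a thumbnail frame. File names are placeholders and FFmpeg is assumed to be on PATH.

```python
# Sketch: invoke FFmpeg via subprocess to re-encode to H.264 and extract a
# thumbnail. Input/output file names are placeholders.
import subprocess

# Re-encode to H.264 with a quality-based rate factor (lower CRF = higher quality).
subprocess.run(
    ["ffmpeg", "-y", "-i", "input.mp4",
     "-c:v", "libx264", "-crf", "23", "-preset", "medium",
     "-c:a", "aac", "output.mp4"],
    check=True,
)

# Grab a single frame at the 5-second mark as a thumbnail.
subprocess.run(
    ["ffmpeg", "-y", "-ss", "5", "-i", "input.mp4",
     "-frames:v", "1", "thumbnail.jpg"],
    check=True,
)
```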
database-schema-extension
Extend the Supabase PostgreSQL database schema following this project's declarative schema patterns, migration workflow, and type generation pipeline. Use when adding tables, columns, enums, RLS policies, triggers, or database functions.
dataset-comparer
Compare two datasets to find differences: added/removed rows and changed values. Use for data validation, ETL verification, or tracking changes.
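A sketch of the comparison idea using pandas, assuming both datasets share a key column; the column names and sample frames are illustrative.

```python
# Sketch: compare two datasets on a key column and report rows added, rows
# removed, and changed values. Column names and sample data are placeholders.
import pandas as pd

def compare(old: pd.DataFrame, new: pd.DataFrame, key: str) -> dict:
    old_idx, new_idx = old.set_index(key), new.set_index(key)

    added = new_idx.loc[new_idx.index.difference(old_idx.index)]
    removed = old_idx.loc[old_idx.index.difference(new_idx.index)]

    # For rows present in both, find cells whose values differ.
    common = old_idx.index.intersection(new_idx.index)
    shared_cols = old_idx.columns.intersection(new_idx.columns)
    before = old_idx.loc[common, shared_cols]
    after = new_idx.loc[common, shared_cols]
    changed_mask = (before != after) & ~(before.isna() & after.isna())
    changed = after[changed_mask.any(axis=1)]

    return {"added": added, "removed": removed, "changed": changed}

old = pd.DataFrame({"id": [1, 2, 3], "qty": [10, 20, 30]})
new = pd.DataFrame({"id": [2, 3, 4], "qty": [20, 35, 40]})
for name, frame in compare(old, new, key="id").items():
    print(name, frame, sep="\n")
```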
github-workflow-automation
Advanced GitHub Actions workflow automation with AI swarm coordination, intelligent CI/CD pipelines, and comprehensive repository management
incremental-fetch
Build resilient data ingestion pipelines from APIs. Use when creating scripts that fetch paginated data from external APIs (Twitter, exchanges, any REST API) and need to track progress, avoid duplicates, handle rate limits, and support both incremental updates and historical backfills. Triggers: 'ingest data from API', 'pull tweets', 'fetch historical data', 'sync from X', 'build a data pipeline', 'fetch without re-downloading', 'resume the download', 'backfill older data'. NOT for: simple one-shot API calls, websocket/streaming connections, file downloads, or APIs without pagination.
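A sketch of such an ingestion loop under assumed conditions: a hypothetical paginated endpoint that takes a `since_id` cursor and returns `items` plus a `next_cursor`; the checkpoint file and endpoint shape are illustrative, not any particular API.

```python
# Sketch of an incremental fetch loop: cursor checkpointing, duplicate
# avoidance, and rate-limit backoff. The endpoint and its response shape
# (items / next_cursor / since_id) are hypothetical.
import json
import time
from pathlib import Path

import requests

API_URL = "https://api.example.com/items"   # placeholder endpoint
STATE_FILE = Path("fetch_state.json")       # remembers progress between runs

def load_cursor() -> str | None:
    if STATE_FILE.exists():
        return json.loads(STATE_FILE.read_text()).get("cursor")
    return None

def save_cursor(cursor: str) -> None:
    STATE_FILE.write_text(json.dumps({"cursor": cursor}))

def fetch_incremental() -> list[dict]:
    cursor, seen, items = load_cursor(), set(), []
    while True:
        params = {"limit": 100}
        if cursor:
            params["since_id"] = cursor
        resp = requests.get(API_URL, params=params, timeout=30)
        if resp.status_code == 429:            # rate limited: back off and retry
            time.sleep(int(resp.headers.get("Retry-After", "30")))
            continue
        resp.raise_for_status()
        page = resp.json()
        for item in page.get("items", []):
            if item["id"] not in seen:          # avoid duplicates within a run
                seen.add(item["id"])
                items.append(item)
        cursor = page.get("next_cursor")
        if not cursor:
            break
        save_cursor(cursor)                     # checkpoint so a crash can resume
    return items
```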
dabs-writer
Create and configure Databricks Asset Bundles (DABs) with best practices for multi-environment deployments. Use when working with: (1) Creating new DAB projects, (2) Adding resources (dashboards, pipelines, jobs, alerts), (3) Configuring multi-environment deployments, (4) Setting up permissions, (5) Deploying or running bundle resources
esphome-box3-builder
This skill should be used when the user asks to "configure esp32-s3-box-3", "set up box-3", "create box-3 voice assistant", "display lambda on box-3", "configure ili9xxx display", "set up gt911 touch", "configure i2s audio", "es7210 microphone", "es8311 speaker", "box-3 audio pipeline", or mentions error messages like "I2S DMA buffer error", "Touch not responding", "Display flicker", "Audio popping", "PSRAM not detected". Provides complete ESP32-S3-BOX-3 hardware templates, display lambda cookbook, touch patterns, and voice assistant configurations.
no-runtime-code
Guardrails to keep the pipeline pure: specs + SEA + generators only.
data-architecture
Design data architectures with modeling, pipelines, and governance
vsa-pattern-selector
Assists in selecting the appropriate pattern from the Blazor VSA pattern catalog. In contexts such as adding new features, CRUD operations, query implementation, state transitions, and boundary design, it proposes the optimal pattern based on the ai_decision_matrix in catalog/index.json. Selects a context-appropriate pattern from options such as Feature Slice, Pipeline Behavior, Domain Pattern, and Query Pattern.