🔧

Monitoring

153 skills in DevOps > Monitoring

server-management

Server management procedures including PM2, monitoring, and log management. CRITICAL for production operations.

xenitV1/claude-code-maestro
62
15
Updated 5d ago

monitoring-observability

Implement comprehensive monitoring, logging, metrics, tracing, and alerting for production applications to ensure reliability and quick incident response. Use when setting up application monitoring, implementing structured logging, creating metrics and dashboards, setting up alerts, implementing distributed tracing, monitoring performance, tracking errors, or building observability into applications.

korallis/Droidz
49
6
Updated 5d ago

sentry-performance-monitoring

Marketplace

Use when setting up performance monitoring, distributed tracing, or profiling with Sentry. Covers transactions, spans, and performance insights.

TheBushidoCollective/han
47
5
Updated 5d ago

sre-monitoring-and-observability

Marketplace

Use when building comprehensive monitoring and observability systems.

TheBushidoCollective/han
47
5
Updated 5d ago

aws-cost-operations

Marketplace

This skill provides AWS cost optimization, monitoring, and operational best practices with integrated MCP servers for billing analysis, cost estimation, observability, and security assessment.

zxkane/aws-skills
40
7
Updated 5d ago

prometheus-monitoring

Set up Prometheus monitoring for applications with custom metrics, scraping configurations, and service discovery. Use when implementing time-series metrics collection, monitoring applications, or building observability infrastructure.

aj-geddes/useful-ai-prompts
25
1
Updated 5d ago

correlation-tracing

Implement distributed tracing with correlation IDs, trace propagation, and span tracking across microservices. Use when debugging distributed systems, monitoring request flows, or implementing observability.

aj-geddes/useful-ai-prompts
25
1
Updated 5d ago

log-aggregation

Implement centralized logging with ELK Stack, Loki, or Splunk for log collection, parsing, storage, and analysis across infrastructure.

aj-geddes/useful-ai-prompts
25
1
Updated 5d ago

dev-sre

Marketplace

Gate 2 of the development cycle. VALIDATES that observability was correctly implemented by developers. Does NOT implement observability code; it only validates it.

LerianStudio/ring
25
1
Updated 5d ago

infrastructure-monitoring

Set up comprehensive infrastructure monitoring with Prometheus, Grafana, and alerting systems for metrics, health checks, and performance tracking.

aj-geddes/useful-ai-prompts
25
1
Updated 5d ago

qa-observability

Production observability and performance engineering with OpenTelemetry, distributed tracing, metrics, logging, SLO/SLI design, capacity planning, performance profiling, APM integration, and observability maturity progression for modern cloud-native systems.

vasilyu1983/AI-Agents-public
21
6
Updated 5d ago

performance-monitor

Expert performance monitor specializing in system-wide metrics collection, analysis, and optimization. Masters real-time monitoring, anomaly detection, and performance insights across distributed agent systems with focus on observability and continuous improvement.

zenobi-us/dotfiles
21
4
Updated 5d ago

site-reliability-engineer

Production monitoring, observability, SLO/SLI management, and incident response.
Trigger terms: monitoring, observability, SRE, site reliability, alerting, incident response, SLO, SLI, error budget, Prometheus, Grafana, Datadog, New Relic, ELK stack, logs, metrics, traces, on-call, production monitoring, health checks, uptime, availability, dashboards, post-mortem, incident management, runbook.
Completes SDD Stage 8 (Monitoring) with comprehensive production observability:
- SLI/SLO definitions and tracking
- Monitoring stack setup (Prometheus, Grafana, ELK, Datadog, etc.)
- Alert rules and notification channels
- Incident response runbooks
- Observability dashboards (logs, metrics, traces)
- Post-mortem templates and analysis
- Health check endpoints
- Error budget tracking
Use when: user needs production monitoring, observability platform, alerting, SLOs, incident response, or post-deployment health tracking.

nahisaho/MUSUBI
19
2
Updated 5d ago

grey-haven-observability-engineering

Marketplace

Production-ready monitoring, logging, and tracing using Prometheus, Grafana, OpenTelemetry, Datadog, and Sentry. Use when setting up production monitoring, implementing SLOs, distributed tracing, or performance tracking.

greyhaven-ai/claude-code-config
15
2
Updated 5d ago

observability-instrumentation

Marketplace

Comprehensive observability methodology implementing the three pillars (logs, metrics, traces) with structured logging using Go slog, Prometheus-style metrics, and distributed tracing patterns. Use when adding observability from scratch, when logs are unstructured or inadequate, when no metrics are collected, when production issues are hard to debug, or when performance monitoring is needed. Provides structured logging patterns (contextual logging, log levels DEBUG/INFO/WARN/ERROR, request ID propagation), metrics instrumentation (counter/gauge/histogram patterns, Prometheus exposition), tracing setup (span creation, context propagation, sampling strategies), and Go slog best practices (JSON formatting, attribute management, handler configuration). Validated in meta-cc with 23-46x speedup vs ad-hoc logging and 90-95% transferability across languages (slog is specific to Go, but the patterns are universal).

yaleh/meta-cc
15
1
Updated 5d ago

ln-367-observability-auditor

Marketplace

Observability audit worker (L3). Checks structured logging, health check endpoints, metrics collection, request tracing, log levels. Returns findings with severity, location, effort, recommendations.

levnikolaevich/claude-code-skills
13
1
Updated 5d ago

monitoring-expert

Marketplace

Use when setting up monitoring systems, logging, metrics, tracing, or alerting. Invoke for dashboards, Prometheus/Grafana, load testing, profiling, capacity planning. Keywords: monitoring, observability, logging, metrics, tracing, alerting, Prometheus, Grafana.

Jeffallan/claude-skills
12
1
Updated 5d ago

monitoring

Monitoring standards for DevOps environments. Covers best

williamzujkowski/standards
11
0
Updated 5d ago

agent-performance-monitor

Expert performance monitor specializing in system-wide metrics collection, analysis, and optimization. Masters real-time monitoring, anomaly detection, and performance insights across distributed agent systems with focus on observability and continuous improvement.

Tony363/SuperClaude
10
0
Updated 5d ago