depth-estimation
SharpAI/DeepCamera
SKILL.md
name: depth-estimation
description: "Real-time depth map privacy transforms using Depth Anything v2 (CoreML + PyTorch)"
version: 1.2.0
category: privacy

parameters:
  - name: model
    label: "Depth Model"
    type: select
    options: ["depth-anything-v2-small", "depth-anything-v2-base", "depth-anything-v2-large"]
    default: "depth-anything-v2-small"
    group: Model
  - name: variant
    label: "CoreML Variant (macOS)"
    type: select
    options: ["DepthAnythingV2SmallF16", "DepthAnythingV2SmallF16INT8", "DepthAnythingV2SmallF32"]
    default: "DepthAnythingV2SmallF16"
    group: Model
  - name: blend_mode
    label: "Display Mode"
    type: select
    options: ["depth_only", "overlay", "side_by_side"]
    default: "depth_only"
    group: Display
  - name: opacity
    label: "Overlay Opacity"
    type: number
    min: 0.0
    max: 1.0
    default: 0.5
    group: Display
  - name: colormap
    label: "Depth Colormap"
    type: select
    options: ["inferno", "viridis", "plasma", "magma", "jet", "turbo", "hot", "cool"]
    default: "viridis"
    group: Display
  - name: device
    label: "Device"
    type: select
    options: ["auto", "cpu", "cuda", "mps"]
    default: "auto"
    group: Performance

capabilities:
  live_transform:
    script: scripts/transform.py
    description: "Real-time depth estimation overlay on live feed"
Depth Estimation (Privacy)
Real-time monocular depth estimation using Depth Anything v2. Camera frames are transformed into colorized depth maps in which near objects appear warm and far objects appear cool.
In privacy mode, the depth_only blend mode replaces the camera image with the depth map alone. This hides visual identity while preserving spatial layout and activity, so a scene can be monitored for security without revealing who is in it.
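The three `blend_mode` options and the `opacity` parameter can be illustrated with a small NumPy sketch. It is not the skill's actual implementation: the function names `colorize_depth` and `blend` are hypothetical, the red-to-blue ramp stands in for the real colormaps (a production version would more likely use a library colormap such as those in OpenCV), and it assumes larger depth values mean "nearer".

```python
import numpy as np


def colorize_depth(depth: np.ndarray) -> np.ndarray:
    """Map a 2-D depth array to a BGR uint8 image.

    Assumption: larger values are nearer. Near pixels become red (warm),
    far pixels become blue (cool). This is a stand-in for a real colormap.
    """
    lo, hi = float(depth.min()), float(depth.max())
    if hi > lo:
        norm = (depth - lo) / (hi - lo)
    else:
        norm = np.zeros_like(depth, dtype=np.float64)  # flat depth map
    near = (norm * 255).astype(np.uint8)
    out = np.zeros(depth.shape + (3,), dtype=np.uint8)
    out[..., 2] = near        # red channel (BGR order) grows with nearness
    out[..., 0] = 255 - near  # blue channel grows with distance
    return out


def blend(frame: np.ndarray, depth_bgr: np.ndarray,
          mode: str = "depth_only", opacity: float = 0.5) -> np.ndarray:
    """Combine the camera frame and the colorized depth map."""
    if mode == "depth_only":
        # No pixel of the original frame reaches the output.
        return depth_bgr
    if mode == "overlay":
        mixed = ((1.0 - opacity) * frame.astype(np.float32)
                 + opacity * depth_bgr.astype(np.float32))
        return np.clip(mixed, 0, 255).astype(np.uint8)
    if mode == "side_by_side":
        return np.hstack([frame, depth_bgr])
    raise ValueError(f"unknown blend_mode: {mode!r}")
```

Only `depth_only` discards the camera image entirely; `overlay` and `side_by_side` still expose the original pixels, so they are display modes rather than privacy modes.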
Hardware Backends
| Platform | Backend | Runtime | Model |
|---|---|---|---|
| macOS | CoreML | Apple Neural Engine | apple/coreml-depth-anything-v2-small (.mlpackage) |
| Linux/Windows | PyTorch | CUDA / CPU | depth-anything/Depth-Anything-V2-Small (.pth) |
On macOS, CoreML runs on the Neural Engine, leaving the GPU free for other tasks. The model is downloaded automatically from Hugging Face and stored at ~/.aegis-ai/models/feature-extraction/.
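The platform split in the table could be expressed roughly as follows. This is a sketch under assumptions: `pick_backend` is a hypothetical helper, the backend label `"pytorch"` is a guess (only `"coreml"` and `"neural_engine"` appear in the protocol examples), and the source does not say how an explicit `device` value such as `mps` interacts with the CoreML path, so the sketch simply follows the table.

```python
import platform
from pathlib import Path
from typing import Optional

# Cache location stated in the text above.
MODEL_DIR = Path.home() / ".aegis-ai" / "models" / "feature-extraction"


def pick_backend(device: str = "auto",
                 system: Optional[str] = None,
                 cuda_available: bool = False) -> dict:
    """Choose a backend and device following the Hardware Backends table.

    `system` defaults to platform.system(); it is a parameter so the
    decision can be exercised without running on each OS.
    """
    system = system or platform.system()
    if system == "Darwin":
        # macOS: CoreML on the Apple Neural Engine.
        return {"backend": "coreml", "device": "neural_engine"}
    # Linux / Windows: PyTorch on CUDA when available, otherwise CPU.
    if device == "auto":
        device = "cuda" if cuda_available else "cpu"
    return {"backend": "pytorch", "device": device}
```

In a real skill, `cuda_available` would typically come from `torch.cuda.is_available()`; it is passed in here so the sketch has no PyTorch dependency.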
What You Get
- Privacy anonymization — depth-only mode hides all visual identity
- Depth overlays on live camera feeds
- 3D scene understanding — spatial layout of the scene
- CoreML acceleration — Neural Engine on Apple Silicon (3-5x faster than MPS)
Interface: TransformSkillBase
This skill implements the TransformSkillBase interface. Any new privacy skill can be created by subclassing TransformSkillBase and implementing two methods:
from transform_base import TransformSkillBase


class MyPrivacySkill(TransformSkillBase):
    def load_model(self, config):
        # Load your model, return {"model": "...", "device": "..."}
        ...

    def transform_frame(self, image, metadata):
        # Transform BGR image, return BGR image
        ...
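As a concrete illustration of the two-method contract, here is a toy pixelation skill. It is only a sketch: the `TransformSkillBase` class below is a local stand-in so the example runs outside the repository (the real base class lives in `transform_base` and may do more), and the `block` config key is hypothetical.

```python
import numpy as np


class TransformSkillBase:
    """Stand-in for transform_base.TransformSkillBase (assumption)."""

    def load_model(self, config):
        raise NotImplementedError

    def transform_frame(self, image, metadata):
        raise NotImplementedError


class PixelateSkill(TransformSkillBase):
    def load_model(self, config):
        # No real model here; just read the (hypothetical) block size.
        self.block = int(config.get("block", 8))
        return {"model": "pixelate", "device": "cpu"}

    def transform_frame(self, image, metadata):
        # Keep one pixel per block, then repeat it to fill the block.
        b = self.block
        small = image[::b, ::b]
        big = np.repeat(np.repeat(small, b, axis=0), b, axis=1)
        # Crop back in case the frame size is not a multiple of b.
        return big[: image.shape[0], : image.shape[1]]
```

`load_model` returns the same `{"model": ..., "device": ...}` shape described in the comment of the template above, and `transform_frame` takes and returns a BGR array of the same size.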
Protocol
Aegis → Skill (stdin)
{"event": "frame", "frame_id": "cam1_1710001", "camera_id": "front_door", "frame_path": "/tmp/frame.jpg", "timestamp": "..."}
{"command": "config-update", "config": {"opacity": 0.8, "blend_mode": "overlay"}}
{"command": "stop"}
Skill → Aegis (stdout)
{"event": "ready", "model": "coreml-DepthAnythingV2SmallF16", "device": "neural_engine", "backend": "coreml"}
{"event": "transform", "frame_id": "cam1_1710001", "camera_id": "front_door", "transform_data": "<base64 JPEG>"}
{"event": "perf_stats", "total_frames": 50, "timings_ms": {"transform": {"avg": 12.5, ...}}}
Setup
python3 -m venv .venv && source .venv/bin/activate
pip install -r requirements.txt