Industrial‑first labeling, tied to your SOPs
Generic labels miss context. We encode your definitions of defects, safe states, and pass/fail criteria, then bind expert feedback to exact clips and timestamps.
SOP‑aligned ontology
Shared, version‑controlled taxonomy for parts, tools, defects, and outcomes.
Expert‑in‑the‑loop
Supervisors can annotate live or review passively; disagreements are resolved via the playbook.
Export‑ready
COCO / YOLO / Pascal VOC, CSV, JSONL; video event spans and transcripts included.
Label types we support
Bounding boxes
Detect parts, tools, and PPE states in images/video.
Polygons / masks
Instance/semantic segmentation for surface defects and zones.
Keypoints
Pose/landmarks (e.g., hand placement, alignment markers).
Temporal events
Start/stop, interventions, pass/fail, near‑miss windows.
Sequences
Step ordering and compliance with SOP checklists.
3D / point clouds
Annotations for depth/LiDAR where applicable.
OCR & transcripts
Read gauges and screens; align transcribed audio with frames.
Multimodal
Join video, sensor logs, and operator comments.
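As an illustration of the "Sequences" label type, a compliance check can be sketched as comparing the labeled step order against an SOP checklist. The step names and the `check_sequence` helper are hypothetical, and this minimal sketch assumes a strictly ordered SOP with no optional or parallel steps.

```python
from typing import Dict, List


def check_sequence(sop_steps: List[str], observed: List[str]) -> Dict[str, List[str]]:
    """Compare labeled step events against a strictly ordered SOP checklist.

    Returns the SOP steps that never appear, and the steps first observed
    only after a later SOP step had already been seen.
    """
    position = {step: i for i, step in enumerate(sop_steps)}
    seen = set()
    out_of_order = []
    highest = -1  # highest SOP index observed so far
    for step in observed:
        if step not in position:
            continue  # ignore labels that are not SOP steps
        idx = position[step]
        if idx < highest and step not in seen:
            out_of_order.append(step)
        highest = max(highest, idx)
        seen.add(step)
    missing = [s for s in sop_steps if s not in seen]
    return {"missing": missing, "out_of_order": out_of_order}


# Hypothetical SOP and labeled sequence for one clip.
sop = ["lockout", "remove_guard", "replace_blade", "refit_guard", "unlock"]
labeled = ["lockout", "replace_blade", "remove_guard", "unlock"]
result = check_sequence(sop, labeled)
print(result)
```

Here `refit_guard` is reported as missing and `remove_guard` as out of order, which is the kind of signal a reviewer could attach to the clip.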
Our process
From raw footage to a reliable dataset, fast.
1) Intake & ontology
Collect SOPs, defect catalogs, and prior examples; define label schema and classes.
2) Guideline authoring
Create visual playbooks with positive/negative examples and edge‑case rules.
3) Pilot & calibration
Label a small slice, measure agreement, refine definitions until stable.
Production labeling
Trained annotators label at scale; disagreements are routed to experts.
5) QA & gold tests
Inject gold items, run double‑blind checks, and compute agreement metrics.
6) Handoff to training
Export datasets + docs; support model training and error analysis.
Quality assurance
Multi‑pass review
1st pass label → 2nd pass review → lead auditor sign‑off on edge cases.
Agreement metrics
Inter‑annotator agreement and error heatmaps guide retraining.
Gold‑set governance
Seeded checks with precision/recall scored per labeler and class.
We maintain versioned guidelines and changelogs so labels remain consistent over time.
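To make the gold-set scoring concrete, here is a minimal sketch of precision and recall computed per labeler and class. It assumes each gold item carries exactly one class and each labeler submits exactly one class per item; the record layout, labeler IDs, and class names are made up for illustration.

```python
from collections import defaultdict
from typing import Dict, List, Tuple

# Each record: (labeler_id, gold_class, submitted_class). Values are made up.
GoldRecord = Tuple[str, str, str]


def score_gold(records: List[GoldRecord]) -> Dict[Tuple[str, str], Dict[str, float]]:
    """Precision and recall per (labeler, class) on seeded gold items."""
    tp = defaultdict(int)  # submitted class matches the gold class
    fp = defaultdict(int)  # class submitted, but gold says otherwise
    fn = defaultdict(int)  # gold class missed by the labeler
    for labeler, gold, submitted in records:
        if gold == submitted:
            tp[(labeler, gold)] += 1
        else:
            fp[(labeler, submitted)] += 1
            fn[(labeler, gold)] += 1
    scores = {}
    for key in set(tp) | set(fp) | set(fn):
        predicted = tp[key] + fp[key]
        actual = tp[key] + fn[key]
        scores[key] = {
            "precision": tp[key] / predicted if predicted else 0.0,
            "recall": tp[key] / actual if actual else 0.0,
        }
    return scores


records = [
    ("ann_a", "scratch", "scratch"),
    ("ann_a", "scratch", "dent"),
    ("ann_a", "dent", "dent"),
    ("ann_a", "ok", "ok"),
]
scores = score_gold(records)
```

For box or mask labels, the match test would typically be an overlap threshold rather than class equality; that choice is outside this sketch.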
What you get
Labeling guidelines
Illustrated rules with examples, edge cases, and decision trees.
Ontology & schema
Version‑controlled classes, attributes, and relationships.
Labeled dataset slice
Images/video/audio with annotations + timestamps; transcripts when relevant.
QA report
Agreement, precision/recall on gold items, and error analysis.
Exports & tooling files
COCO, YOLO, Pascal VOC, CSV, JSONL; export scripts if needed.
Data room setup
Folder structure, permissions, and retention policy aligned to your rules.
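The export formats listed above differ mainly in how boxes are encoded. As a small sketch of what an export script might do: COCO stores a box as absolute `[x_min, y_min, width, height]` in pixels, while YOLO text labels use a normalized `x_center y_center width height`. The function names and sample values below are illustrative only.

```python
from typing import List, Tuple


def coco_to_yolo(bbox: List[float], img_w: int, img_h: int) -> Tuple[float, float, float, float]:
    """Convert a COCO [x_min, y_min, w, h] pixel box to normalized YOLO values."""
    x_min, y_min, w, h = bbox
    return (
        (x_min + w / 2) / img_w,  # x_center, as a fraction of image width
        (y_min + h / 2) / img_h,  # y_center, as a fraction of image height
        w / img_w,
        h / img_h,
    )


def yolo_line(class_id: int, box: Tuple[float, float, float, float]) -> str:
    """Format one line of a YOLO label file: class id, then the four values."""
    return f"{class_id} " + " ".join(f"{v:.6f}" for v in box)


# Made-up example: a 50x80 px box at (100, 200) in a 1000x800 image.
yolo_box = coco_to_yolo([100, 200, 50, 80], img_w=1000, img_h=800)
line = yolo_line(0, yolo_box)
print(line)
```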
How we measure
Throughput
Frames/hour or events/hour by labeler, normalized by class difficulty.
Agreement
Inter‑annotator agreement tracked over a rolling window; it should trend to the agreed target before scale‑up.
Quality
Precision/recall on gold items and auditor spot‑checks.
Latency
Capture‑to‑label and label‑to‑train cycle times.
Coverage
Edge‑case representation and class balance over time.
Cost per unit
Transparent pricing per asset or per hour, with QA overhead visible.
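One way to track the rolling agreement metric above is Cohen's kappa between two annotators, computed over a sliding window of items. The choice of kappa, the window size, and the label values are assumptions for this sketch; other agreement statistics may suit multi-annotator or spatial labels better.

```python
from collections import Counter
from typing import List, Sequence


def cohen_kappa(a: Sequence[str], b: Sequence[str]) -> float:
    """Cohen's kappa for two annotators labeling the same items."""
    if len(a) != len(b) or len(a) == 0:
        raise ValueError("need two equal-length, non-empty label lists")
    n = len(a)
    observed = sum(x == y for x, y in zip(a, b)) / n
    count_a, count_b = Counter(a), Counter(b)
    # Chance agreement: sum over classes of the product of marginal rates.
    expected = sum(count_a[c] * count_b[c] for c in count_a) / (n * n)
    if expected == 1.0:
        return 1.0  # both annotators used one identical class throughout
    return (observed - expected) / (1 - expected)


def rolling_kappa(a: Sequence[str], b: Sequence[str], window: int) -> List[float]:
    """Kappa over each consecutive window of `window` items."""
    return [
        cohen_kappa(a[i:i + window], b[i:i + window])
        for i in range(len(a) - window + 1)
    ]


# Made-up pass/fail labels from two annotators on the same five clips.
ann_1 = ["pass", "pass", "fail", "fail", "pass"]
ann_2 = ["pass", "fail", "fail", "fail", "pass"]
trend = rolling_kappa(ann_1, ann_2, window=4)
```

In practice the window would be much larger than four items; small windows make kappa unstable.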
Security & IP
- Deployment: on‑prem or your private VPC; data never leaves your control.
- Access control & audit: least‑privilege roles, audit logs, and SSO.
- Privacy: redaction zones, blur for faces/badges/screens; configurable retention.
- Ownership: labels and resulting models are your IP; no cross‑customer training.
We align to your compliance requirements and document controls during onboarding.
Tooling & integrations
Use your stack or ours
We can label in your environment or provide a managed stack with dashboards and exports.
Formats & pipelines
COCO / YOLO / Pascal VOC / CSV / JSONL; simple scripts to push to your training jobs.
We also support linking operator comments and transcripts directly to labeled moments for RLHF (reinforcement learning from human feedback) workflows.
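Linking transcripts to labeled moments can be sketched as a time-overlap join between event spans and transcript segments. The `Span` type, IDs, and timestamps below are hypothetical; a real pipeline would also need to handle clock offsets between the audio and video sources.

```python
from dataclasses import dataclass
from typing import Dict, List


@dataclass
class Span:
    id: str
    start_s: float  # start time in seconds from the beginning of the clip
    end_s: float    # end time in seconds


def link_by_overlap(events: List[Span], segments: List[Span]) -> Dict[str, List[str]]:
    """Map each event id to the transcript segment ids that overlap it in time."""
    links = {}
    for ev in events:
        links[ev.id] = [
            seg.id
            for seg in segments
            # Two spans overlap if each one starts before the other ends.
            if seg.start_s < ev.end_s and seg.end_s > ev.start_s
        ]
    return links


# Made-up labeled events and transcript segments for one clip.
events = [Span("near_miss_01", 12.0, 15.5), Span("intervention_02", 40.0, 42.0)]
segments = [Span("t1", 10.0, 12.5), Span("t2", 15.5, 18.0), Span("t3", 41.0, 44.0)]
links = link_by_overlap(events, segments)
print(links)
```

Segments that merely touch an event boundary (such as `t2` here) are not linked; whether to include them is a design choice to settle in the guidelines.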
FAQ
Minimum dataset size?
We can start with a small pilot slice (e.g., a few hours of video or a few thousand frames) to stabilize guidelines before scaling.
Who owns the labels?
You do. All annotations, guidelines, and derived datasets are your IP.
Can you label audio and logs?
Yes—transcripts can be aligned to frames; machine logs can be joined for multimodal labels.
Pricing?
Transparent per‑asset or per‑hour rates with QA tiers; we’ll scope during the free consult.