Real-world video data for embodied AI

Your dataset,
recorded by hand.

A global contributor network films the exact clips your model needs — first-person or third-person, verified, delivered in weeks. Not scraped. Not synthetic. Real.

Book a discovery call

See how it works

30K

Clip capacity per task

100+

Countries sourced

2 wks

Typical turnaround

01 · The gap we close

There are four ways to get training data.
Only one of them learns.

Scraped, synthetic, staged, or sourced. Each has a place. Only sourced data gives your embodied AI the real hands, real homes, and real edge cases it actually needs.

Option 01 — Scraped

YouTube rips, web crawls

No license. No angle control. No task specificity. Drift between what’s online and what your robot needs to learn.

Option 02 — Synthetic

Simulation, generated

The sim-to-real gap is real. Simulation misses the failure modes that only happen when a human is tired, distracted, or improvising.

Option 03 — Staged

In-house lab footage

Expensive. Homogeneous. Five people in one city filming the same kitchen. Your model will overfit to whoever you hired.

Option 04 — Sourced

RoboReels network

Real humans. Real kitchens. Real hands. 100+ countries. Licensed, verified, delivered to your spec.

02 · What we deliver

Every clip comes with the metadata you’d build yourself.

Every submission is voice-verified against a unique per-task phrase. Every approved clip ships with a complete metadata record — ready to drop into your training pipeline.

Task types

Household manipulation, kitchen tasks, cleaning, folding, cooking — expanding weekly. Custom tasks designed to your brief.

Camera angles

First-person POV (head-mounted, GoPro-style) and third-person fixed. Mix the ratio per dataset — 100% POV for imitation learning, blended for general VLA training.

Metadata per clip

Contributor country, timestamp, clip duration, camera angle, contributor rank, voice-code transcript, task ID. Delivered as JSON alongside the MP4.

Quality gate

Automated voice-code verification (via Whisper) + human admin review on every clip. Rejected clips never touch your dataset.

Licensing

Contributors grant training rights at submission. No residual claims. No scrape-risk lawsuits. Enterprise contract available for production volumes.

Delivery format

MP4 per clip, JSON metadata file, optional S3 bucket handoff or direct download. Custom formats on request.
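For a sense of how the per-clip metadata could slot into a training pipeline, here is a minimal Python sketch. It assumes one JSON file sits next to each MP4 with the same base name, and the key names (`task_id`, `contributor_country`, and so on) are illustrative stand-ins for the fields listed above, not a published schema.

```python
import json
from dataclasses import dataclass
from pathlib import Path


@dataclass(frozen=True)
class ClipRecord:
    # Key names are assumptions for illustration; the delivered JSON may differ.
    task_id: str
    contributor_country: str
    timestamp: str
    clip_duration_s: float
    camera_angle: str
    contributor_rank: str
    voice_code_transcript: str


def parse_record(raw: str) -> ClipRecord:
    """Parse one JSON metadata document into a ClipRecord.

    The dataclass constructor raises TypeError on missing or unexpected
    keys, so a schema mismatch fails loudly instead of silently.
    """
    return ClipRecord(**json.loads(raw))


def pair_clips(folder: Path) -> list[tuple[Path, ClipRecord]]:
    """Pair each MP4 with its same-named .json file; skip clips without one."""
    pairs = []
    for mp4 in sorted(folder.glob("*.mp4")):
        meta = mp4.with_suffix(".json")
        if meta.exists():
            pairs.append((mp4, parse_record(meta.read_text(encoding="utf-8"))))
    return pairs
```

Calling `pair_clips(Path("dataset/"))` would return a list of (video path, metadata) pairs that a data loader can filter by country, camera angle, or rank before training.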

03 · How it works

Brief to delivered in four steps.

You don’t manage contributors. You don’t QA clips. You tell us what you need — we handle the rest.

Brief

You describe the task, camera angle, minimum clip length, and target volume. We scope it and quote it — usually in a day.

Deploy

The task goes live across our contributor network within 24 hours. Voice-code verification and quality gates are configured per-brief.

Collect

Clips flow in. Each one gets voice-verified, human-reviewed, and geo-tagged. You watch the counter tick up in your dashboard.

Deliver

You get the full dataset — MP4s and metadata — via S3 handoff or direct download. Typical turnaround: 2 weeks for a 1,000-clip dataset.

04 · Quality mechanics

The data doesn’t drift because the incentives don’t.

Most crowdsourced data platforms reward volume. Ours rewards quality: contributors earn more by submitting better clips, and they lose rank by submitting worse ones. The system corrects itself.

Voice-code verification

Every task generates a unique 3-word phrase that the contributor must speak at the start of the video, and Whisper transcription validates it. This guards against re-uploads, AI-generated footage, and old recordings passed off as new.
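The matching step can be pictured with a short Python sketch. It takes a transcript that has already been produced (by Whisper or any other transcriber) and checks that the task phrase appears near the start. The function names, the punctuation handling, and the 10-word window are assumptions for illustration, not the production logic.

```python
import re


def normalize(text: str) -> list[str]:
    """Lowercase the text and split it into words, dropping punctuation."""
    return re.findall(r"[a-z0-9']+", text.lower())


def voice_code_matches(transcript: str, phrase: str, window: int = 10) -> bool:
    """Return True if the phrase's words appear contiguously and in order
    within the first `window` words of the transcript.

    `window` is a hypothetical tolerance for a short greeting or filler
    before the contributor speaks the phrase.
    """
    words = normalize(transcript)[:window]
    target = normalize(phrase)
    if not target:
        return False
    n = len(target)
    return any(words[i:i + n] == target for i in range(len(words) - n + 1))
```

Because the phrase is unique per task, a clip recorded for a different task, or before the task existed, would fail this check even if the footage itself looks right.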

4-rank contributor ladder

Bronze → Silver → Gold → Platinum. Promotion requires both volume and a minimum approval rate, set between 80% and 90%. Bad actors plateau; quality contributors earn multipliers on every future payout.
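As a rough Python sketch of how a ladder like this could be evaluated: the four rank names come from the text above, but every number in `REQUIREMENTS` (clip counts and per-rank approval thresholds) is a hypothetical placeholder chosen only to fall in the stated 80–90% band.

```python
RANKS = ["Bronze", "Silver", "Gold", "Platinum"]

# Hypothetical requirements per rank: (minimum approved clips, approval rate to exceed).
# These numbers are placeholders for illustration, not the real thresholds.
REQUIREMENTS = {
    "Silver": (20, 0.80),
    "Gold": (100, 0.85),
    "Platinum": (500, 0.90),
}


def rank_for(approved: int, submitted: int) -> str:
    """Return the highest rank whose volume and approval-rate rules are all met.

    Rank is recomputed from current totals, so a falling approval rate
    lowers the rank as well as blocking promotion.
    """
    if submitted <= 0:
        return "Bronze"
    rate = approved / submitted
    rank = "Bronze"
    for name in RANKS[1:]:
        min_clips, min_rate = REQUIREMENTS[name]
        if approved >= min_clips and rate > min_rate:
            rank = name
        else:
            break  # ranks are sequential; stop at the first unmet requirement
    return rank
```

Under these placeholder numbers, a contributor with 600 approved clips out of 1,000 submitted stays at Bronze: volume alone does not move them up.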

Geographic diversity by default

The first contributor from a new country gets a 1.5× bonus. The network self-diversifies without us handpicking contributors, so your dataset represents real global variance, not a single city’s kitchen.

Human admin review on every clip

Voice-code match is necessary but not sufficient. A reviewer watches every clip before it enters your dataset. Rejected clips never ship. Three rejections lock a contributor out of that task permanently.

Crypto rail, global reach

Contributors are paid in TON, instantly. No banking friction in 150+ countries. You pay one invoice; we handle every contributor worldwide. No contractor agreements, no FX, no net-30.

Loyalty compounds

Contributors who stick around earn progressively larger flat bonuses per clip (up to +75 pts = +0.75 TON). Veterans are the highest-quality segment of the network, and the economics keep them there.
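The arithmetic behind the loyalty bonus can be sketched in a few lines of Python. The rate of 100 points per TON is inferred from the "+75 pts = +0.75 TON" figure above, and the function name and capping behavior are assumptions for illustration.

```python
from decimal import Decimal

POINTS_PER_TON = Decimal(100)  # inferred from "+75 pts = +0.75 TON"
MAX_LOYALTY_POINTS = 75        # the stated ceiling on the per-clip bonus


def loyalty_bonus_ton(points: int) -> Decimal:
    """Convert loyalty points into a flat per-clip bonus in TON.

    Points are clamped to the range 0..MAX_LOYALTY_POINTS before conversion.
    Decimal avoids floating-point rounding in payout amounts.
    """
    capped = max(0, min(points, MAX_LOYALTY_POINTS))
    return Decimal(capped) / POINTS_PER_TON
```

The bonus is flat per clip, so in this sketch it is added on top of whatever the clip pays rather than multiplied into it.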

05 · Pricing

No tier table. Your task is the pricing model.

Every dataset is quoted against its actual spec — complexity, clip length, angle mix, volume, geography, turnaround. Ballpark ranges come back within a day of your brief.

Complexity

Fixed setup (“dishwasher, front angle”) costs less than free-form (“any kitchen task, your choice”). The tighter the brief, the lower the unit cost.

Clip length

30 seconds is the baseline. Longer clips scale roughly proportionally. Contributor time is the main cost driver.

Angle mix

First-person POV commands a premium — harder to produce, more valuable for imitation learning. Third-person is cheaper per clip.

Volume

Hundreds of clips vs. thousands. Unit economics improve at scale; ask about committed volumes.

Geography

Any-country is cheapest. Specific countries or balanced global mixes cost more. Rare markets take longer to fill.

Turnaround

Standard is ~2 weeks for a 1,000-clip brief. Rush options available. Priority routing adds a premium.

Tell us your task.
We’ll quote it.

30-minute discovery call. We’ll talk through your spec and come back with a concrete quote within 24 hours.

Book a discovery call

06 · Start shipping

Real data. Recorded by hand. Ready when you are.

30 minutes. No pitch. We’ll talk about what you’re training, where your current data is falling short, and whether a RoboReels pilot makes sense.