PICO VR · HEAD-MOUNTED CAPTURE

First-person egocentric demonstrations captured on a consumer Pico VR HMD. Operators perform real household manipulation tasks while wearing the headset; every modality the device exposes is recorded synchronously.

STEREO RGB · TOF DEPTH · IMU · HANDS

Each episode ships a stereo color pair (60 fps), a 320×240 ToF depth stream (5 fps, uint16 mm), 6-axis IMU, head 6-DoF pose, and per-frame skeletal tracking of 25 joints on each hand.
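
Because the stereo pair runs at 60 fps and the ToF stream at 5 fps, roughly one depth frame exists for every 12 color frames. A minimal sketch of pairing the two by nearest timestamp, and of converting the uint16 millimetre depth to metres, might look like the following. The per-frame timestamp lists and the function names are assumptions for illustration, since the on-disk episode layout is not described here. Treating a raw value of 0 as an invalid pixel is a common ToF convention, not something this page confirms.

```python
from bisect import bisect_left


def nearest_index(sorted_ts, t):
    """Index of the entry in sorted_ts (ascending) closest to t."""
    if not sorted_ts:
        raise ValueError("sorted_ts must not be empty")
    i = bisect_left(sorted_ts, t)
    if i == 0:
        return 0
    if i == len(sorted_ts):
        return len(sorted_ts) - 1
    before, after = sorted_ts[i - 1], sorted_ts[i]
    # On an exact tie, prefer the earlier depth frame.
    return i - 1 if t - before <= after - t else i


def depth_mm_to_m(raw_mm):
    """Convert one uint16 millimetre sample to metres.

    ASSUMPTION: 0 marks an invalid pixel and maps to None.
    """
    return None if raw_mm == 0 else raw_mm / 1000.0


# Hypothetical timestamps in microseconds for one second of data.
rgb_ts = [round(k * 1_000_000 / 60) for k in range(60)]  # 60 fps stereo
depth_ts = [k * 200_000 for k in range(5)]               # 5 fps ToF
rgb_to_depth = [nearest_index(depth_ts, t) for t in rgb_ts]
```

Matching on timestamps rather than on `frame_index // 12` keeps the pairing usable if either stream drops frames or the two streams do not start at the same instant.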

PRE-RECTIFIED & CALIBRATED

Left, right, and depth cameras come with full intrinsics and 4×4 extrinsics referenced to the HMD body frame. The stereo baseline is shipped per episode, so the data is usable for triangulation, point-cloud lifting, or vision-language pretraining.
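
As a rough illustration of how this calibration could be used, the sketch below lifts a depth pixel to a 3-D point, moves it into the HMD body frame, and recovers depth from stereo disparity. It rests on several assumptions that this page does not confirm: an undistorted pinhole model, ToF depth measured along the optical axis rather than as radial range, and extrinsics that map camera coordinates into the body frame (the inverse convention would need the inverse matrix). All function names and numeric values are hypothetical.

```python
def backproject(u, v, depth_mm, fx, fy, cx, cy):
    """Lift pixel (u, v) with depth in mm to a camera-frame point in metres.

    ASSUMPTIONS: undistorted pinhole model; depth lies along the optical axis.
    """
    z = depth_mm / 1000.0
    return ((u - cx) * z / fx, (v - cy) * z / fy, z)


def apply_extrinsic(T, p):
    """Apply a 4x4 row-major rigid transform T to a 3-D point p.

    ASSUMPTION: T maps camera coordinates into the HMD body frame.
    """
    x, y, z = p
    return tuple(T[r][0] * x + T[r][1] * y + T[r][2] * z + T[r][3] for r in range(3))


def depth_from_disparity(focal_px, baseline_m, disparity_px):
    """Rectified-stereo depth in metres: Z = f * B / d."""
    if disparity_px <= 0:
        raise ValueError("disparity must be positive")
    return focal_px * baseline_m / disparity_px


# Made-up calibration values, for illustration only.
T_cam_to_body = [
    [1.0, 0.0, 0.0, 0.1],
    [0.0, 1.0, 0.0, 0.0],
    [0.0, 0.0, 1.0, 0.0],
    [0.0, 0.0, 0.0, 1.0],
]
p_cam = backproject(260, 120, 2000, fx=200.0, fy=200.0, cx=160.0, cy=120.0)
p_body = apply_extrinsic(T_cam_to_body, p_cam)
```

Applying `backproject` to every valid pixel of a 320×240 depth frame yields a point cloud; for full frames a vectorized array library would be the more practical choice.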

FRAME-LEVEL CHINESE ACTION LABELS

Each episode is segmented into hierarchical pick / place / wipe / rearrange / fold spans with start & end frames, bilingual text, and a clear “Action end” sentinel marking when the operator releases the task.
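
One way to work with such spans is to look up every label that covers a given frame, coarse to fine. The sketch below is illustrative only: the `Span` record, its field names, the inclusive end frame, and the sample annotations are all assumptions, since the label file format is not described here.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Span:
    start: int      # first frame of the span
    end: int        # last frame of the span (ASSUMED inclusive)
    text_zh: str    # Chinese label
    text_en: str    # English label


def spans_at(spans, frame):
    """Spans covering `frame`, ordered from longest (coarse) to shortest (fine)."""
    hits = [s for s in spans if s.start <= frame <= s.end]
    return sorted(hits, key=lambda s: s.end - s.start, reverse=True)


# Invented sample annotations, not taken from the dataset.
demo = [
    Span(0, 899, "擦桌子", "wipe the table"),
    Span(0, 240, "拿起抹布", "pick up the cloth"),
    Span(241, 899, "擦拭桌面", "wipe the tabletop"),
    Span(900, 900, "动作结束", "Action end"),
]
labels_at_100 = [s.text_en for s in spans_at(demo, 100)]
```

Sorting by span length is a simple stand-in for an explicit parent/child field; if the shipped labels carry a hierarchy level, that field would be the better sort key.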

SAMPLE COVERAGE
10 episodes · 2 lens generations · 60 fps stereo + 5 fps ToF depth
Preview on this page
01 cushion dust · 02 pick & throw trash · 03 wipe & stack · 04 wipe & restack · 05 arrange tableware · 06 sink & chopsticks · 07 fold cloth & wipe · 08 cushion beat dust · 09 pick floor trash · 10 place chairs
[Episode viewer panels: WIDE FoV / STANDARD · left eye (60 fps, pre-rectified) · right eye (stereo pair) · depth (320×240, 5 fps, uint16 mm, jet colormap) · depth stats in m (scene span: nearest, median, farthest) · IMU (accel, gyro) · head 6-DoF position (x/y/z, m)]
SHIPPED WITH EPISODE
Stereo RGB (L|R) · ToF depth (320×240, uint16 mm) · Audio AAC + Opus · Head 6-DoF · IMU 6-axis · L/R wrist 6-DoF · L/R hand 25 joints · Intrinsic × 3 · Extrinsic × 3 · Stereo baseline · ZH action labels