Data Collection
The SO-101 is one of the most common data collection arms in the LeRobot community. This guide covers everything from hardware connections to recording episodes and pushing your dataset to HuggingFace.
Hardware Setup for Recording
The SO-101 data collection setup is simpler than CAN-bus arms — everything runs over USB. Here is what to connect.
Workspace Camera
USB webcam pointed at the workspace from above or from the side. Mount at a fixed position — do not move it between episodes. Verify: ls /dev/video*
Wrist Camera (optional)
Small USB camera mounted on the end-effector. Adds a first-person view. Requires a second USB port. LeRobot supports multi-camera sync.
Follower Arm (USB serial)
The arm that executes actions. Connect via USB servo controller. Verify port with ls /dev/ttyUSB*
Leader Arm (USB serial)
A second SO-101 in compliance mode — move it with your hand to drive the follower. Connect on a second USB port. Gives highest quality demonstrations.
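Conceptually, leader-follower teleoperation is just a loop that reads the leader's joint positions each tick and writes them as goal positions to the follower. A minimal sketch with mock servo buses (MockBus is a stand-in for illustration, not the real LeRobot driver API):

```python
class MockBus:
    """Stand-in for a Feetech servo bus; not the real LeRobot driver API."""
    def __init__(self, positions):
        self.positions = list(positions)  # joint positions in degrees

    def read_positions(self):
        return list(self.positions)

    def write_goal_positions(self, goals):
        self.positions = list(goals)

def teleop_step(leader, follower):
    """One tick of leader -> follower mirroring."""
    goals = leader.read_positions()       # where the human moved the leader
    follower.write_goal_positions(goals)  # follower servos track those angles
    return goals                          # this also becomes the recorded action

leader = MockBus([0, 45, -30, 10, 0, 20])
follower = MockBus([0, 0, 0, 0, 0, 0])
teleop_step(leader, follower)
print(follower.read_positions())  # mirrors the leader: [0, 45, -30, 10, 0, 20]
```

The returned goal positions are what get logged as the action at each frame, which is why leader demonstrations produce clean action labels.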
No ROS, no kernel drivers: Unlike CAN-bus setups, the SO-101 data collection stack runs entirely over USB serial. You can record on a MacBook or Windows laptop — no Ubuntu required.
Step-by-Step Recording Workflow
Verify calibration is current
Run calibration before each new session if the arm was disassembled or moved. See Software → Calibration.
python -m lerobot.scripts.control_robot \
--robot.type=so101 --robot.port=/dev/ttyUSB0 \
--control.type=calibrate
Verify camera feeds
python -c "
import cv2
for i in range(4):
    cap = cv2.VideoCapture(i)
    if cap.isOpened():
        print(f'Camera {i}: OK')
        cap.release()
"
Move arm to home position
Place the follower arm at the home position (fully extended, end-effector pointing forward). Reset leader arm to the same position before starting teleop.
Set up the task scene
Place objects in their consistent starting positions. Mark the table if needed — consistent initial conditions are critical for policy generalization.
Start LeRobot recording
python -m lerobot.scripts.control_robot \
--robot.type=so101 \
--robot.port=/dev/ttyUSB1 \
--robot.leader_arms.main.type=so101 \
--robot.leader_arms.main.port=/dev/ttyUSB0 \
--control.type=record \
--control.fps=30 \
--control.repo_id=your-username/so101-pick-place-v1 \
--control.num_episodes=50 \
--control.single_task="Pick the red block and place it in the bin" \
--control.warmup_time_s=3 \
--control.reset_time_s=8
LeRobot prompts before each episode. During warmup you can adjust your grip on the leader arm before recording starts.
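The warmup/reset cadence above can be sketched as a simple phase loop. This is an illustration of what the operator experiences per episode, not the LeRobot internals; the record duration here is a placeholder since episode length is actually controlled by the operator:

```python
def episode_phases(num_episodes, warmup_s, record_s, reset_s):
    """Yield (episode, phase, duration_s) in the order the operator sees them."""
    for ep in range(num_episodes):
        yield ep, "warmup", warmup_s  # adjust your grip on the leader arm
        yield ep, "record", record_s  # frames and joint states are logged
        yield ep, "reset", reset_s    # return objects to start positions

schedule = list(episode_phases(num_episodes=2, warmup_s=3, record_s=30, reset_s=8))
print(schedule[0])  # (0, 'warmup', 3)
```

With --control.warmup_time_s=3 and --control.reset_time_s=8 as in the command above, each episode costs roughly 11 seconds of overhead on top of the demonstration itself.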
Review and replay episodes
python -m lerobot.scripts.visualize_dataset \
--repo_id=your-username/so101-pick-place-v1 \
--episode_index=0
Delete poor-quality episodes immediately. Check for dropped camera frames, erratic joint velocities, or incomplete task execution.
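Erratic joint velocities are easy to flag programmatically. A sketch using NumPy finite differences (the 150 deg/s threshold is an illustrative value; tune it for your setup):

```python
import numpy as np

def has_velocity_spike(positions_deg, fps=30, max_deg_per_s=150.0):
    """Flag an episode whose frame-to-frame joint velocity exceeds a threshold."""
    pos = np.asarray(positions_deg, dtype=float)  # shape (T, num_joints)
    vel = np.abs(np.diff(pos, axis=0)) * fps      # deg/s between consecutive frames
    return bool((vel > max_deg_per_s).any())

smooth = [[0, 0], [1, 1], [2, 2]]    # ~30 deg/s per joint, fine
dropout = [[0, 0], [1, 1], [40, 1]]  # 39-degree jump in one frame
print(has_velocity_spike(smooth))    # False
print(has_velocity_spike(dropout))   # True
```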
Push to HuggingFace Hub
huggingface-cli login
python -m lerobot.scripts.push_dataset_to_hub \
--repo_id=your-username/so101-pick-place-v1
SO-101 Dataset Format
The SO-101 uses the standard LeRobot / HuggingFace dataset format — identical schema to OpenArm, Koch, and other LeRobot arms. This means your datasets are directly compatible with the full LeRobot training ecosystem.
Episode data schema
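Each frame in a LeRobot dataset is a flat mapping of features. A representative SO-101 frame is sketched below; observation.state, action, timestamp, frame_index, and episode_index are standard LeRobot fields, while the camera key name and all values are illustrative and depend on your configuration:

```python
# One frame of a LeRobot-format episode (illustrative values; the camera key
# name depends on how you configured your cameras).
frame = {
    "observation.state": [12.3, -45.0, 30.1, 5.7, 0.0, 42.0],  # 6 joint positions, degrees
    "observation.images.workspace": "<HxWx3 uint8 image>",     # workspace camera frame
    "action": [12.5, -44.8, 30.0, 5.7, 0.0, 45.0],             # leader-arm goal positions
    "timestamp": 1.2333,   # seconds since episode start
    "frame_index": 37,     # index within the episode
    "episode_index": 0,    # which episode this frame belongs to
}
assert len(frame["observation.state"]) == 6  # one entry per SO-101 joint
```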
SO-101 specific notes
The SO-101 action space uses joint positions in degrees (Feetech servo units), not radians. When mixing SO-101 and OpenArm datasets for cross-platform training, normalize both to radians first using the stats in meta/stats.json.
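A minimal sketch of that normalization step. The mean/std values here are placeholders; in practice you would load the per-joint statistics from each dataset's meta/stats.json:

```python
import numpy as np

def normalize(state_deg, mean_rad, std_rad):
    """Convert SO-101 joint positions from degrees to radians, then z-score
    with per-joint stats (as stored in meta/stats.json)."""
    state_rad = np.deg2rad(np.asarray(state_deg, dtype=float))
    return (state_rad - mean_rad) / std_rad

state_deg = np.array([0.0, 90.0, -90.0])
mean_rad = np.zeros(3)  # placeholder; load from meta/stats.json
std_rad = np.ones(3)    # placeholder
print(normalize(state_deg, mean_rad, std_rad))  # ~[0., 1.5708, -1.5708]
```

Apply the same radian-space normalization to both datasets before concatenating them for cross-platform training.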
Quality Checklist for Collected Data
Run through this after each recording session before pushing to the Hub.
1. Episode lengths are consistent. Outlier-length episodes usually mean the operator paused, the gripper slipped, or recording was interrupted. Keep within ±30% of median length.
2. No servo velocity spikes. The STS3215 servos have limited bandwidth — sudden velocity spikes in observation.state indicate a serial bus dropout. Delete those episodes.
3. Camera frames are aligned with joint data. Check that camera timestamps and joint timestamps are within 20 ms of each other. USB serial latency can cause drift over long recordings. Re-sync cameras every 100 episodes.
4. Leader arm tracking was smooth. If the follower lagged noticeably during recording (due to USB serial latency), the action labels will be time-shifted from observations. Replay to check.
5. Task scene was consistent at the start of each episode. Objects in the same position and orientation. The SO-101's lower repeatability (vs CAN arms) makes this especially important — variance in initial conditions hurts policy training.
6. Gripper open/close is clearly recorded. The SO-101 gripper state is joint 6. Verify that grasp events show a clear joint position transition (open → closed) in the data, not a gradual drift.
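The ±30% episode-length rule is straightforward to automate. A sketch, assuming you have already collected per-episode frame counts from your dataset's metadata:

```python
import statistics

def length_outliers(episode_lengths, tolerance=0.30):
    """Return indices of episodes whose frame count deviates more than
    `tolerance` (as a fraction) from the median episode length."""
    median = statistics.median(episode_lengths)
    return [i for i, n in enumerate(episode_lengths)
            if abs(n - median) > tolerance * median]

lengths = [300, 310, 295, 305, 520, 120]  # frames per episode
print(length_outliers(lengths))  # [4, 5]: the 520- and 120-frame episodes
```

Review the flagged episodes with the visualization script before deciding whether to delete them.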
Training a Policy from Your Dataset
Once your dataset passes quality checks, train ACT or Diffusion Policy with LeRobot.
Train ACT
python -m lerobot.scripts.train \
--policy.type=act \
--dataset.repo_id=your-username/so101-pick-place-v1 \
--policy.chunk_size=100 \
--training.num_epochs=5000 \
--output_dir=outputs/act-so101-pick-place
Train Diffusion Policy
python -m lerobot.scripts.train \
--policy.type=diffusion \
--dataset.repo_id=your-username/so101-pick-place-v1 \
--training.num_epochs=8000 \
--output_dir=outputs/diffusion-so101-pick-place
Community datasets: The SO-101 has one of the largest community dataset collections in the LeRobot ecosystem. Before collecting your own data, check HuggingFace Hub for existing SO-101 datasets — you may be able to fine-tune from an existing base dataset and save recording time.