REAL-WORLD_POSTPROCESS.md

Post-process

Self-collected data and pre-training datasets often have issues that can hurt model performance. Below, we outline these issues and provide standardized solutions for data formatting and post-processing.

  • ⚠️ Varied Action Labels: Different embodiments, and sometimes even identical ones, may use diverse action labels such as:
    • Absolute target joint positions (qpos)
    • Absolute target joint velocities (qvel)
    • Absolute end-effector poses (ee-pose)
    • Delta target end-effector poses (delta ee-pose)

    Furthermore, rotation representations may vary, including quaternions, Euler angles, rotation vectors, and rotation matrices.
  • ⚠️ Jittering and Long Pauses: Inexperienced data collectors often hesitate, which produces long pauses or jittery motion in the recorded demonstrations. Without proper filtering, such data significantly degrades model performance.
  • ⚠️ Quick Gripper Open/Close Actions: A frequency mismatch between camera capture and gripper control often results in abrupt changes in gripper states, especially during grasping or releasing motions.
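
To make the point about rotation representations concrete, here is a minimal sketch that converts Euler angles to a rotation matrix. The function name `euler_xyz_to_matrix` is hypothetical, and the extrinsic x-y-z convention is an assumption; check which convention your robot driver actually uses before relying on it.

```python
import numpy as np

def euler_xyz_to_matrix(euler):
    """Rotation matrix for extrinsic x-y-z Euler angles in radians.

    The extrinsic x-y-z convention is an assumption, not something
    specified by this document.
    """
    rx, ry, rz = euler
    cx, sx = np.cos(rx), np.sin(rx)
    cy, sy = np.cos(ry), np.sin(ry)
    cz, sz = np.cos(rz), np.sin(rz)
    rot_x = np.array([[1.0, 0.0, 0.0], [0.0, cx, -sx], [0.0, sx, cx]])
    rot_y = np.array([[cy, 0.0, sy], [0.0, 1.0, 0.0], [-sy, 0.0, cy]])
    rot_z = np.array([[cz, -sz, 0.0], [sz, cz, 0.0], [0.0, 0.0, 1.0]])
    # Rotate about x first, then y, then z: the first rotation sits rightmost.
    return rot_z @ rot_y @ rot_x
```

Converting every dataset to one representation before training avoids mixing labels that look alike numerically but mean different things.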

To address these issues, we recommend a uniform, clear, and effective format for saving self-collected data and provide tools for post-processing.

❗ Data Format

For each task, we collect 100 demos. The recommended directory structure is:

0000 (exp_id)
|—— 000000 (episode_id)
    |—— steps 
        |—— 0000 (timestep_id, start)
            |—— image_primary.jpg (Eye-on-Base camera rgb image)
            |—— image_wrist.jpg (Eye-on-Hand camera rgb image)
            └── other.npz (robot state, language, action)
        |—— ......
        └── xxxx (timestep_id, end)
|—— 000001 (episode_id)
    |—— steps
        |—— ......
|—— ......
└── 000099 (episode_id)
    |—— steps
        |—— ......
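
As a rough illustration of how this layout can be traversed, the sketch below maps each episode to its ordered timestep folders. The helper name `list_episode_steps` is hypothetical and is not part of the provided scripts; it relies only on the zero-padded folder names shown above, which sort correctly as plain strings.

```python
from pathlib import Path

def list_episode_steps(exp_dir):
    """Map each episode_id to its sorted list of timestep directories."""
    episodes = {}
    for episode_dir in sorted(Path(exp_dir).iterdir()):
        steps_dir = episode_dir / "steps"
        if not steps_dir.is_dir():
            continue  # skip anything that is not an episode folder
        # Zero-padded timestep names sort in temporal order.
        episodes[episode_dir.name] = sorted(
            p for p in steps_dir.iterdir() if p.is_dir()
        )
    return episodes
```

Calling `list_episode_steps("0000")` on an experiment folder would return a dictionary keyed by episode id, each value being the list of timestep directories from start to end.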

File Details:

  • image_primary.jpg and image_wrist.jpg: Images saved with a resolution of 640 x 480 pixels.
  • other.npz: Contains key robot metadata. An example of the saved format is:
# at each timestep i
import numpy as np

npz_path = "other.npz"

# x, y, z, euler_*, q0..q6, target_* and gripper_is_open are placeholders for values read from your robot.

# absolute current gripper pose in robot space: position (m) + Euler angles (rad)
gripper_pose = np.array([x, y, z, euler_x, euler_y, euler_z])

# absolute current gripper open state: 1.0 if the gripper is open, -1.0 if it is closed
gripper_open_state = np.array([1.0]) if gripper_is_open else np.array([-1.0])

# absolute current joint positions (qpos)
joints = np.array([q0, q1, q2, q3, q4, q5, q6])

# language instruction
language_instruction = "Pick the apple."

# absolute target pose action label (target_gripper_open_or_close is 1.0 if targeting open, else -1.0)
action_gripper_pose = np.array([target_x, target_y, target_z, target_euler_x, target_euler_y, target_euler_z, target_gripper_open_or_close])

# delta pose action label
delta_cur_2_last_action = np.array([target_delta_x, target_delta_y, target_delta_z, target_delta_euler_x, target_delta_euler_y, target_delta_euler_z, target_gripper_open_or_close])

# save npz
np.savez_compressed(
    npz_path,
    joints=joints,
    gripper_pose=gripper_pose,
    gripper_open_state=gripper_open_state,
    action_gripper_pose=action_gripper_pose,
    delta_cur_2_last_action=delta_cur_2_last_action,
    language_instruction=language_instruction,
)
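
To sanity-check a saved file, something like the following sketch can load one `other.npz` and verify the array shapes. The helper `load_step` and the `EXPECTED_SHAPES` table are hypothetical; the shapes are taken from the example above and assume a 7-joint arm, so adjust them for your robot.

```python
import numpy as np

# Shapes taken from the example format above (7-joint arm assumed).
EXPECTED_SHAPES = {
    "joints": (7,),
    "gripper_pose": (6,),
    "gripper_open_state": (1,),
    "action_gripper_pose": (7,),
    "delta_cur_2_last_action": (7,),
}

def load_step(npz_path):
    """Load one other.npz and check array shapes against the example format."""
    with np.load(npz_path) as data:
        step = {key: data[key] for key in data.files}
    for key, shape in EXPECTED_SHAPES.items():
        if step[key].shape != shape:
            raise ValueError(f"{key}: expected shape {shape}, got {step[key].shape}")
    # The instruction is stored as a 0-d string array; convert it back to str.
    step["language_instruction"] = step["language_instruction"].item()
    return step
```

Running this check over a few timesteps right after collection is a cheap way to catch a missing key or a wrongly sized array before training.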

For most robotic systems, all metadata except delta_cur_2_last_action can be directly extracted. We provide a helper function to compute the delta pose action label in the script:

compute_delta_action(
    data_list,
)
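
For intuition only, a delta pose label can be approximated as the difference between the target pose and the current pose. The sketch below is not the implementation of `compute_delta_action`, and its exact definition of the delta may differ; `naive_delta_action` is a hypothetical name, and subtracting Euler angles component-wise is an assumption that is only reasonable for small rotations.

```python
import numpy as np

def naive_delta_action(gripper_pose, action_gripper_pose):
    """Approximate delta action: target pose minus current pose.

    gripper_pose:        [x, y, z, euler_x, euler_y, euler_z]
    action_gripper_pose: the same six target values plus the gripper command.
    """
    current = np.asarray(gripper_pose, dtype=float)
    action = np.asarray(action_gripper_pose, dtype=float)
    delta = action[:6] - current
    # Wrap angle differences into [-pi, pi) so a step across +/-pi stays small.
    delta[3:] = (delta[3:] + np.pi) % (2 * np.pi) - np.pi
    # The gripper command is copied unchanged from the absolute action.
    return np.concatenate([delta, action[6:]])
```

The wrapping step matters near the ±π boundary, where a small physical rotation would otherwise show up as a jump of almost 2π in the label.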

⭐ Post-processing Self-collected Data

  • Filtering Jitter and Pauses: To filter out jittering and long pauses, use the following function in the script:
filter_real_data(
    exp_id, 
    root_path, # path to your raw data 
    save_data_path, # a desired path to save filtered data
    save_gif_path # a desired path to save the filtered gif (only for visualization and debugging)
)
  • Data Augmentation for Gripper Actions: To augment data by increasing sampling ratios during gripper open/close events, use the following function in the same script:
make_aug_short_real_dataset_info(
    root_path, # path to your filtered data 
    root_info_path, # path to your data info, it should be like xxx/Seer/data_info
    dataset_name, # your dataset name, e.g. "ft"
    select_ratio=1.0,
    sequence_length=7, 
    action_pred_steps=3, 
    replicate_steps=10
)
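
The two ideas above can be illustrated with a small sketch. It is not the code behind `filter_real_data` or `make_aug_short_real_dataset_info`: the function names, the thresholds `pos_eps` and `rot_eps`, and the `repeats` parameter are all hypothetical, and the provided functions may use different criteria. The first function drops timesteps during pauses (it does not smooth jitter), and the second oversamples timesteps where the gripper state changes. Both assume a non-empty episode.

```python
import numpy as np

def moving_step_indices(gripper_poses, gripper_states, pos_eps=1e-3, rot_eps=1e-2):
    """Keep step 0, plus every step whose pose or gripper state differs
    enough from the last kept step (thresholds in m and rad, assumed)."""
    poses = np.asarray(gripper_poses, dtype=float)
    states = np.asarray(gripper_states, dtype=float).ravel()
    keep = [0]
    for i in range(1, len(poses)):
        last = keep[-1]
        moved = np.linalg.norm(poses[i, :3] - poses[last, :3]) > pos_eps
        rotated = np.abs(poses[i, 3:] - poses[last, 3:]).max() > rot_eps
        toggled = states[i] != states[last]
        if moved or rotated or toggled:
            keep.append(i)
    return keep

def oversample_gripper_events(gripper_states, repeats=10):
    """Sampling indices in which each step where the gripper state
    changes appears `repeats` times and every other step appears once."""
    states = np.asarray(gripper_states, dtype=float).ravel()
    indices = []
    for i in range(len(states)):
        changed = i > 0 and states[i] != states[i - 1]
        indices.extend([i] * (repeats if changed else 1))
    return indices
```

Comparing against the last kept step, rather than the previous step, means slow drift during a long pause still accumulates until it crosses the threshold instead of being discarded forever.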