# Active_Table_TSDF

Active 3D reconstruction of a tabletop scene with TSDF fusion, based on dual robot manipulators.

This README explains how to use the provided Python script to build a TSDF-based 3D map from an RGB-D sequence and camera poses, using GPU acceleration via Open3D's Tensor `VoxelBlockGrid`.


## 1. Overview

This code provides:

  1. Camera intrinsic loading from cam_params.json
  2. Pose loading from traj.txt (supports both 12-float R+t and 16-float formats)
  3. RGB & depth frame pairing with name-based ordering
  4. Optional ROI mask input per frame
  5. GPU-based TSDF fusion using Open3D Tensor VoxelBlockGrid
  6. Mesh or point-cloud extraction from the TSDF volume

When ROI masks are enabled, only depth values inside the masked region are fused into TSDF, effectively restricting reconstruction to selected objects or regions.


## 2. Environment & Dependencies

### 2.1 Recommended Python Version

* Python 3.8 – 3.11

### 2.2 Install Dependencies

```bash
# 1) Create and activate a virtual environment
conda create -n tsdf_env python=3.10 -y
conda activate tsdf_env

# 2) Install required packages
pip install numpy scipy tifffile imageio open3d pillow matplotlib
```

**Dependency roles:**

* `numpy` — array operations
* `scipy` — rotation utilities
* `tifffile` — reading `.tif` depth
* `imageio` — RGB reading
* `open3d` — TSDF, visualization, mesh/point cloud
* `Pillow` — loading `.jpg/.png`
* `matplotlib` — debugging plots

### 2.3 CUDA Requirements

The TSDF module uses:

```python
device = o3c.Device("CUDA:0")
vbg = create_gpu_tsdf(..., device=device)
```

You need:

* an NVIDIA GPU
* CUDA drivers installed
* a CUDA-enabled Open3D build

If you have no GPU, fall back to the CPU:

```python
device = o3c.Device("CPU:0")
```

Integration still works on CPU, just slower.


## 3. Dataset Structure & File Formats

### 3.1 Directory Layout

```python
scene_dir = "/path/to/scene"
json_path  = os.path.join(scene_dir, "cam_params.json")
pose_txt   = os.path.join(scene_dir, "traj.txt")
rgb_dir    = os.path.join(scene_dir, "results")
depth_dir  = os.path.join(scene_dir, "results")
```

Suggested directory structure:

```
scene/
├── cam_params.json
├── traj.txt
├── results/
│   ├── frame0000.jpg
│   ├── frame0001.jpg
│   ├── depth0000.png
│   ├── depth0001.png
│   └── ...
└── masks/          # OPTIONAL
    ├── mask0000.png
    ├── mask0001.png
    └── ...
```

File matching used in the script:

```python
rgb_paths   = sorted(glob.glob(os.path.join(rgb_dir, "frame*.jpg")))[::stride]
depth_paths = sorted(glob.glob(os.path.join(depth_dir, "depth*.png")))[::stride]
```

Adjust these patterns if your file naming differs.
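As a sanity check, the pairing can be wrapped in a small helper that fails fast on count mismatches (`collect_frames` is illustrative and not part of the original script):

```python
import glob
import os

def collect_frames(rgb_dir, depth_dir, stride=1):
    """Pair RGB and depth frames by sorted filename order, subsampled by stride."""
    rgb_paths = sorted(glob.glob(os.path.join(rgb_dir, "frame*.jpg")))[::stride]
    depth_paths = sorted(glob.glob(os.path.join(depth_dir, "depth*.png")))[::stride]
    if len(rgb_paths) != len(depth_paths):
        raise ValueError(
            f"RGB/depth count mismatch: {len(rgb_paths)} vs {len(depth_paths)}")
    return rgb_paths, depth_paths
```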


### 3.2 Camera Intrinsic File `cam_params.json`

Example:

```json
{
  "camera": {
    "fx": 1000.0,
    "fy": 1000.0,
    "cx": 640.0,
    "cy": 360.0,
    "scale": 1000.0,
    "w": 1280,
    "h": 720
  }
}
```

Meaning:

* `fx`, `fy`, `cx`, `cy` — intrinsic parameters
* `scale` — depth scaling; real depth in meters = pixel_value / scale
* `w`, `h` — image size in pixels

Loaded via:

```python
K, scale, (w, h) = load_camera_intrinsics(json_path)
```
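`load_camera_intrinsics` itself is not shown in this README; a minimal implementation matching that call signature and the JSON layout above might look like:

```python
import json

import numpy as np

def load_camera_intrinsics(json_path):
    """Build the 3x3 intrinsic matrix K, plus the depth scale and image size."""
    with open(json_path) as f:
        cam = json.load(f)["camera"]
    K = np.array([[cam["fx"], 0.0,       cam["cx"]],
                  [0.0,       cam["fy"], cam["cy"]],
                  [0.0,       0.0,       1.0]])
    return K, cam["scale"], (cam["w"], cam["h"])
```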

### 3.3 Pose File `traj.txt` (16-float 4×4 format)

Used by:

```python
poses = load_poses_mat16(pose_txt)
```

Format per line (16 floats, row-major 4×4):

```
t00 t01 t02 t03 t10 t11 t12 t13 t20 t21 t22 t23 t30 t31 t32 t33
```

The 12-float (9 rotation + 3 translation) format is also supported, via `load_poses_txt`:

```
r11 r12 r13 r21 r22 r23 r31 r32 r33 tx ty tz
```

The TSDF path uses the 16-float loader by default.

Assumptions:

* Each pose is the camera-to-world transform `world_T_cam` (the camera pose in world coordinates).
* TSDF integration needs the world-to-camera extrinsic, i.e. the inverse:

```python
extrinsic = np.linalg.inv(world_T_cam)
```
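Under these assumptions, `load_poses_mat16` can be sketched as follows (one row-major 4×4 matrix per line; lines with a different field count are skipped):

```python
import numpy as np

def load_poses_mat16(path):
    """Parse one 4x4 camera-to-world pose per line (16 floats, row-major)."""
    poses = []
    with open(path) as f:
        for line in f:
            vals = [float(v) for v in line.split()]
            if len(vals) == 16:
                poses.append(np.asarray(vals).reshape(4, 4))
    return poses
```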

### 3.4 Depth Map Format

Example from the main script:

```python
depth_raw = np.array(Image.open(depth_paths[i])).astype(np.uint16)
depth_np = depth_raw
```

Requirements:

* Depth stored as `uint16`
* Actual depth (meters) = `depth_np / depth_scale`
* `depth_max` is specified in meters
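The uint16-to-meters conversion with range clipping can be sketched as below (the helper name is illustrative; the actual script lets Open3D apply `depth_scale` and `depth_max` during integration):

```python
import numpy as np

def depth_to_meters(depth_u16, depth_scale, depth_max):
    """Convert raw uint16 depth to meters; zero out invalid / out-of-range pixels."""
    d = depth_u16.astype(np.float32) / depth_scale
    d[(d <= 0.0) | (d > depth_max)] = 0.0
    return d
```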

### 3.5 Mask Requirements

* Shape: (H, W)
* Type: `uint8`
* Semantics:
  * `mask == 1` → valid ROI
  * `mask == 0` → ignored region
* Mask resolution must match the RGB and depth resolution.
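Many annotation tools save binary masks as 0/255 rather than 0/1; a small helper (not in the original script) to normalize and validate them against these requirements:

```python
import numpy as np

def normalize_mask(mask, expected_shape):
    """Coerce a 0/255 or 0/1 mask to uint8 {0, 1} and check its resolution."""
    if mask.shape != expected_shape:
        raise ValueError(f"mask shape {mask.shape} != depth shape {expected_shape}")
    return (mask > 0).astype(np.uint8)
```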

## 4. Enabling Mask Support

### 4.1 Configuration Flag

```python
use_masks = True
```

When enabled, masks are loaded and applied per frame.

### 4.2 Loading Mask Paths

```python
if use_masks:
    mask_dir = os.path.join(scene_dir, "masks")
    mask_paths = sorted(glob.glob(os.path.join(mask_dir, "mask*.png")))[::stride]
```

The number of masks must match the number of RGB / depth frames.


### 4.3 Applying the Mask Before TSDF Integration (Critical Step)

Inside the main streaming loop:

```python
rgb_np = np.array(Image.open(rgb_paths[i]))       # (H, W, 3), uint8
depth_raw = np.array(Image.open(depth_paths[i])).astype(np.uint16)

depth_np = depth_raw

if use_masks:
    mask = np.array(Image.open(mask_paths[i])).astype(np.uint8)
    depth_np[mask == 0] = 0     # <-- ROI filtering happens here
```

This ensures that only masked pixels contribute to TSDF fusion; the ROI logic is handled entirely at the depth-image level.


## 5. Script Structure Summary

**Pose Loading**

* `load_poses_txt` — 12-float R+t → 4×4
* `load_poses_mat16` — 16-float → 4×4

**RGB-D → Point Cloud (non-TSDF)**

* `rgbd_to_pcd(...)`

**Non-TSDF Fusion**

* `integrate_sequence`
* `integrate_sequence_streaming`

**TSDF Core**

* `load_camera_from_K`
* `create_gpu_tsdf`
* `integrate_frame_gpu`
* `extract_open3d_pcd_from_tsdf`

**Main Execution**

* Load intrinsics
* Load poses
* Build the TSDF `VoxelBlockGrid`
* Loop through frames and integrate
* Extract the mesh and visualize

## 6. Step-By-Step Usage

### Step 0 — Prepare Environment

Install dependencies (Section 2).

### Step 1 — Prepare Data

Ensure:

* `cam_params.json` exists
* `traj.txt` uses the 16-float pose format
* `results/` contains `frame*.jpg` and `depth*.png`

### Step 2 — Edit Script Paths

```python
if __name__ == "__main__":
    scene_dir = "/your/absolute/path"

    json_path = os.path.join(scene_dir, "cam_params.json")
    K, scale, _ = load_camera_intrinsics(json_path)

    pose_txt  = os.path.join(scene_dir, "traj.txt")
    rgb_dir   = os.path.join(scene_dir, "results")
    depth_dir = os.path.join(scene_dir, "results")
```

TSDF hyperparameters:

```python
voxel_size = 0.02   # meters
sdf_trunc  = 0.08   # meters
depth_max  = 4.0    # meters
stride     = 1
max_steps  = None
```

TSDF voxel grid:

```python
vbg = create_gpu_tsdf(
    voxel_size=voxel_size,
    block_resolution=8,
    block_count=100000,
    device=device,
)
```

### Step 3 — Run TSDF Fusion

```bash
python tsdf_tsdf_mapping.py
```

Pipeline executed:

1. Load intrinsics
2. Load & subsample poses
3. Create the TSDF grid
4. For each RGB-D frame:
   * Load RGB
   * Load depth
   * Load pose
   * Integrate
5. Extract mesh → visualize

If you want a coordinate frame in the viewer, uncomment:

```python
axis = o3d.geometry.TriangleMesh.create_coordinate_frame(
    size=0.5, origin=[0, 0, 0]
)
o3d.visualization.draw_geometries([mesh, axis])
```

Otherwise use:

```python
o3d.visualization.draw_geometries([mesh])
```

## 7. Important Parameters

**voxel_size**

* Smaller → higher detail, higher memory
* Larger → faster, coarser
* Recommended: 0.01–0.05 m

**sdf_trunc**

* Typically 3 × voxel_size
* Too large → blurred edges
* Too small → missing surfaces

**depth_max**

* Valid depth range
* Indoor scenes: 3–6 m

**stride**

* 1 = use all frames
* \>1 = skip frames (faster)

**TSDF Grid Size**

* `block_resolution` — voxels per block edge
* `block_count` — maximum number of allocated blocks
* Large scenes require a larger `block_count`
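A back-of-envelope memory estimate helps size `block_count`: assuming `tsdf` and `weight` as float32 and `color` as 3 × float32 (the layout sketched in Step 2), each voxel costs about 20 bytes. A worst-case estimate if every block were allocated:

```python
def tsdf_memory_bytes(block_count, block_resolution, bytes_per_voxel=20):
    """Worst-case voxel attribute memory if every block were allocated."""
    voxels_per_block = block_resolution ** 3
    return block_count * voxels_per_block * bytes_per_voxel

# 100000 blocks of 8^3 voxels at ~20 B/voxel ≈ 1.02 GB
print(tsdf_memory_bytes(100000, 8) / 1e9)
```

In practice the sparse grid only allocates blocks near observed surfaces, so real usage is usually far below this bound.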

For the non-TSDF fusion path (Section 5), use:

```python
integrate_sequence(...)
```

or

```python
integrate_sequence_streaming(...)
```
