# Active 3D Reconstruction with TSDF on a Table, Based on Dual Robot Manipulators
This README explains how to use the provided Python script to build a TSDF-based 3D map from an RGB-D sequence and camera poses, using GPU acceleration via Open3D’s Tensor VoxelBlockGrid.
This code provides:

- Camera intrinsic loading from `cam_params.json`
- Pose loading from `traj.txt` (supports both 9+3 and 16-float formats)
- RGB & depth frame pairing with name-based ordering
- Optional ROI mask input per frame
- GPU-based TSDF fusion using Open3D's Tensor `VoxelBlockGrid`
- Mesh or point-cloud extraction from the TSDF volume
When ROI masks are enabled, only depth values inside the masked region are fused into the TSDF volume, effectively restricting reconstruction to selected objects or regions.
Requirements:
- Python 3.8 – 3.11
```bash
# 1) Create and activate a virtual environment
conda create -n tsdf_env python=3.10 -y
conda activate tsdf_env

# 2) Install required packages
pip install numpy scipy tifffile imageio open3d pillow matplotlib
```
**Dependency roles:**
* `numpy` — array operations
* `scipy` — rotation utilities
* `tifffile` — reading `.tif` depth images
* `imageio` — RGB reading
* `open3d` — TSDF, visualization, mesh/point cloud
* `Pillow` — loading `.jpg/.png`
* `matplotlib` — debugging plots
### 2.3 CUDA Requirements
The TSDF module uses:
```python
device = o3c.Device("CUDA:0")
vbg = create_gpu_tsdf(..., device=device)
```

You need:
- NVIDIA GPU
- CUDA drivers installed
- CUDA-enabled Open3D
If you have no GPU:

```python
device = o3c.Device("CPU:0")
```

Performance will be slower.
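If you would rather select the device automatically, here is a minimal sketch using Open3D's `open3d.core.cuda.is_available()`; the one-liner itself is just a suggestion, not part of the script:

```python
import open3d.core as o3c

# Use the first GPU when CUDA is available, otherwise fall back to CPU.
device = o3c.Device("CUDA:0") if o3c.cuda.is_available() else o3c.Device("CPU:0")
```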
```python
scene_dir = "/path/to/scene"
json_path = os.path.join(scene_dir, "cam_params.json")
pose_txt = os.path.join(scene_dir, "traj.txt")
rgb_dir = os.path.join(scene_dir, "results")
depth_dir = os.path.join(scene_dir, "results")
```

Suggested directory structure:
```
scene/
├── cam_params.json
├── traj.txt
├── results/
│   ├── frame0000.jpg
│   ├── frame0001.jpg
│   ├── depth0000.png
│   ├── depth0001.png
│   └── ...
└── masks/            # OPTIONAL
    ├── mask0000.png
    ├── mask0001.png
    └── ...
```
File matching used in the script:

```python
rgb_paths = sorted(glob.glob(os.path.join(rgb_dir, "frame*.jpg")))[::stride]
depth_paths = sorted(glob.glob(os.path.join(depth_dir, "depth*.png")))[::stride]
```

Modify this if your naming differs.
Example:

```json
{
  "camera": {
    "fx": 1000.0,
    "fy": 1000.0,
    "cx": 640.0,
    "cy": 360.0,
    "scale": 1000.0,
    "w": 1280,
    "h": 720
  }
}
```

Meaning:
- `fx`, `fy`, `cx`, `cy` — intrinsic parameters
- `scale` — depth scaling; real depth = `pixel_value / scale`
- `w`, `h` — image size
Loaded via:

```python
K, scale, (w, h) = load_camera_intrinsics(json_path)
```
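`load_camera_intrinsics` is defined in the script; as a minimal sketch, and assuming the `cam_params.json` layout above, it can be read as:

```python
import json
import numpy as np

def load_camera_intrinsics(json_path):
    # Read the "camera" block and assemble the 3x3 pinhole matrix K.
    with open(json_path, "r") as f:
        cam = json.load(f)["camera"]
    K = np.array([[cam["fx"], 0.0,       cam["cx"]],
                  [0.0,       cam["fy"], cam["cy"]],
                  [0.0,       0.0,       1.0]])
    return K, cam["scale"], (cam["w"], cam["h"])
```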
Poses are loaded via:

```python
poses = load_poses_mat16(pose_txt)
```

Format per line:
```
t00 t01 t02 t03 t10 t11 t12 t13 t20 t21 t22 t23 t30 t31 t32 t33
```
Also supports 9+3 format:
```
r11 r12 r13 r21 r22 r23 r31 r32 r33 tx ty tz
```
TSDF path uses the 16-float version by default.
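A minimal sketch of the 16-float loader, assuming one row-major 4×4 matrix per line (the function body is an illustration, not the script's exact implementation):

```python
import numpy as np

def load_poses_mat16(path):
    # Parse one pose (16 whitespace-separated floats) per line into a 4x4 matrix.
    poses = []
    with open(path, "r") as f:
        for line in f:
            vals = [float(v) for v in line.split()]
            if len(vals) == 16:
                poses.append(np.array(vals).reshape(4, 4))
    return poses
```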
Assumptions:
- Each pose represents `world_T_cam`
- The solver uses its inverse as the TSDF extrinsic:

```python
cam_T_world = np.linalg.inv(world_T_cam)
```

Example from the main script:
```python
depth_raw = np.array(Image.open(depth_paths[i])).astype(np.uint16)
depth_np = depth_raw
```

Requirements:
- Depth stored as `uint16`
- Actual depth (meters) = `depth_np / depth_scale`
- `depth_max` is defined in meters
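For reference, a short sketch of how the raw values map to metric depth and how `depth_max` bounds the fused range (`depth_scale` and `depth_max` as configured in the script):

```python
# Convert raw uint16 values to meters and keep only the valid range.
depth_m = depth_np.astype(np.float32) / depth_scale
valid = (depth_m > 0) & (depth_m <= depth_max)   # pixels outside are ignored
```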
Mask requirements:
- Shape: `(H, W)`
- Type: `uint8`
- Semantics: `mask == 1` → valid ROI, `mask == 0` → ignored region
- Mask resolution must match the RGB & depth resolution
Enable masks with:

```python
use_masks = True
```

When enabled, masks are loaded and applied per frame.
```python
if use_masks:
    mask_dir = os.path.join(scene_dir, "masks")
    mask_paths = sorted(glob.glob(os.path.join(mask_dir, "mask*.png")))[::stride]
```

The number of masks must match the number of RGB / depth frames.
Inside the main streaming loop:

```python
rgb_np = np.array(Image.open(rgb_paths[i]))                    # (H, W, 3), uint8
depth_raw = np.array(Image.open(depth_paths[i])).astype(np.uint16)
depth_np = depth_raw

if use_masks:
    mask = np.array(Image.open(mask_paths[i])).astype(np.uint8)
    depth_np[mask == 0] = 0  # <-- ROI filtering happens here
```

This ensures that only masked pixels contribute to TSDF fusion.
The ROI logic is fully handled at the depth-image level.
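`integrate_frame_gpu` is defined in the script; below is a hedged sketch of one way to implement it with Open3D's tensor `VoxelBlockGrid` API (the argument names are assumptions, not the script's exact signature):

```python
import open3d as o3d
import open3d.core as o3c

def integrate_frame_gpu(vbg, depth_np, rgb_np, K, cam_T_world,
                        depth_scale, depth_max, device):
    # Wrap the numpy images as tensor Images on the target device.
    depth = o3d.t.geometry.Image(o3c.Tensor(depth_np, device=device))
    color = o3d.t.geometry.Image(o3c.Tensor(rgb_np, device=device))
    intrinsic = o3c.Tensor(K, o3c.float64)
    extrinsic = o3c.Tensor(cam_T_world, o3c.float64)
    # Activate only the blocks seen by this frame, then fuse into them.
    block_coords = vbg.compute_unique_block_coordinates(
        depth, intrinsic, extrinsic, depth_scale, depth_max)
    vbg.integrate(block_coords, depth, color, intrinsic, intrinsic,
                  extrinsic, depth_scale, depth_max)
```

Because the ROI filtering zeroes out masked depth pixels before this call, those pixels never receive TSDF updates.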
Pose loaders:
- `load_poses_txt` — 12-float R+t → 4×4
- `load_poses_mat16` — 16-float → 4×4

Other helpers:
- `rgbd_to_pcd(...)`
- `integrate_sequence`, `integrate_sequence_streaming`
- `load_camera_from_K`, `create_gpu_tsdf`, `integrate_frame_gpu`, `extract_open3d_pcd_from_tsdf`
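A hedged sketch of `create_gpu_tsdf`, assuming it wraps the standard `o3d.t.geometry.VoxelBlockGrid` constructor with TSDF, weight, and color channels:

```python
import open3d as o3d
import open3d.core as o3c

def create_gpu_tsdf(voxel_size, block_resolution, block_count, device):
    # Allocate a sparse voxel grid: one TSDF, one weight, three color channels.
    return o3d.t.geometry.VoxelBlockGrid(
        attr_names=("tsdf", "weight", "color"),
        attr_dtypes=(o3c.float32, o3c.float32, o3c.float32),
        attr_channels=((1,), (1,), (3,)),
        voxel_size=voxel_size,
        block_resolution=block_resolution,
        block_count=block_count,
        device=device,
    )
```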
The main script:
- Load intrinsics
- Load poses
- Build the TSDF `VoxelBlockGrid`
- Loop through frames and integrate
- Extract mesh and visualize
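The final extraction step can look like this; a sketch using the standard `VoxelBlockGrid` surface-extraction calls, converting to legacy geometry for the viewer:

```python
# Pull surfaces out of the fused volume and visualize.
mesh = vbg.extract_triangle_mesh().to_legacy()
mesh.compute_vertex_normals()
o3d.visualization.draw_geometries([mesh])

# Or extract a point cloud instead of a mesh:
pcd = vbg.extract_point_cloud().to_legacy()
```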
Install dependencies (Section 2).

Ensure:
- `cam_params.json` exists
- `traj.txt` uses the 16-float pose format
- `results/` contains `frame*.jpg` and `depth*.png`
```python
if __name__ == "__main__":
    scene_dir = "/your/absolute/path"
    json_path = os.path.join(scene_dir, "cam_params.json")
    K, scale, _ = load_camera_intrinsics(json_path)
    pose_txt = os.path.join(scene_dir, "traj.txt")
    rgb_dir = os.path.join(scene_dir, "results")
    depth_dir = os.path.join(scene_dir, "results")
```

TSDF hyperparameters:
```python
voxel_size = 0.02
sdf_trunc = 0.08
depth_max = 4.0
stride = 1
max_steps = None
```

TSDF voxel grid:
```python
vbg = create_gpu_tsdf(
    voxel_size=voxel_size,
    block_resolution=8,
    block_count=100000,
    device=device,
)
```

Run:

```bash
python tsdf_tsdf_mapping.py
```

Pipeline executed:
1. Load intrinsics
2. Load & subsample poses
3. Create the TSDF grid
4. For each RGB-D frame:
   - Load RGB
   - Load depth
   - Load pose
   - Integrate
5. Extract mesh → visualize
If you want a coordinate frame, uncomment:

```python
axis = o3d.geometry.TriangleMesh.create_coordinate_frame(
    size=0.5, origin=[0, 0, 0]
)
o3d.visualization.draw_geometries([mesh, axis])
```

Otherwise use:

```python
o3d.visualization.draw_geometries([mesh])
```

`voxel_size`:
- Smaller → higher detail, higher memory
- Larger → faster, coarser
- Recommended: 0.01–0.05
`sdf_trunc`:
- Typically `3 * voxel_size`
- Too large → blurred edges
- Too small → missing surfaces
`depth_max`:
- Valid depth range cutoff
- Indoor scenes: 3–6 m
`stride`:
- `1` = use all frames
- `>1` = skip frames (faster)
`block_resolution` and `block_count`:
- `block_resolution` — voxels per block side
- `block_count` — maximum number of allocated blocks
- Large scenes require a larger `block_count`
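As a rough sizing estimate (assuming the usual five float32 channels per voxel: TSDF, weight, and RGB): `block_count = 100000` with `block_resolution = 8` allows 100000 × 8³ ≈ 51.2 M voxels, i.e. about 51.2 M × 20 B ≈ 1 GB of GPU memory when fully allocated.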
Use `integrate_sequence(...)` or `integrate_sequence_streaming(...)`.
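Both are defined in the script. As a rough sketch, the streaming variant can be thought of as fusing one frame at a time (reusing the hypothetical `integrate_frame_gpu` sketch above), so host memory stays flat on long sequences; the signature below is an assumption for illustration:

```python
import numpy as np
from PIL import Image

def integrate_sequence_streaming(vbg, rgb_paths, depth_paths, poses, K,
                                 depth_scale, depth_max, device,
                                 mask_paths=None, max_steps=None):
    # Fuse frames one by one instead of preloading the whole sequence.
    n = len(rgb_paths) if max_steps is None else min(max_steps, len(rgb_paths))
    for i in range(n):
        rgb_np = np.array(Image.open(rgb_paths[i]))
        depth_np = np.array(Image.open(depth_paths[i])).astype(np.uint16)
        if mask_paths is not None:
            mask = np.array(Image.open(mask_paths[i])).astype(np.uint8)
            depth_np[mask == 0] = 0  # ROI filtering
        cam_T_world = np.linalg.inv(poses[i])  # poses are world_T_cam
        integrate_frame_gpu(vbg, depth_np, rgb_np, K, cam_T_world,
                            depth_scale, depth_max, device)
    return vbg
```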