# SUDS: Scalable Urban Dynamic Scenes

[Haithem Turki](https://haithemturki.com), [Jason Y. Zhang](https://jasonyzhang.com/), [Francesco Ferroni](https://www.francescoferroni.com/), [Deva Ramanan](http://www.cs.cmu.edu/~deva)

[Project Page](https://haithemturki.com/suds) / [Paper](https://haithemturki.com/suds/paper.pdf)

This repository contains the code needed to train [SUDS](https://haithemturki.com/suds/) models.

## Citation

```
@misc{turki2023suds,
      title={SUDS: Scalable Urban Dynamic Scenes},
      author={Haithem Turki and Jason Y. Zhang and Francesco Ferroni and Deva Ramanan},
      year={2023},
      eprint={2303.14536},
      archivePrefix={arXiv},
      primaryClass={cs.CV}
}
```

## Setup

```
conda env create -f environment.yml
conda activate suds
python setup.py install
```

The codebase has mainly been tested against CUDA >= 11.3 and A100/A6000 GPUs. GPUs with compute capability greater than or equal to 7.5 should generally work, although you may need to adjust batch sizes to fit within GPU memory constraints.

## Data Preparation

### KITTI

1. Download the following from the [KITTI MOT dataset](http://www.cvlibs.net/datasets/kitti/eval_tracking.php):
    1. [Left color images](http://www.cvlibs.net/download.php?file=data_tracking_image_2.zip)
    2. [Right color images](http://www.cvlibs.net/download.php?file=data_tracking_image_3.zip)
    3. [GPS/IMU data](http://www.cvlibs.net/download.php?file=data_tracking_oxts.zip)
    4. [Camera calibration files](http://www.cvlibs.net/download.php?file=data_tracking_calib.zip)
    5. [Velodyne point clouds](http://www.cvlibs.net/download.php?file=data_tracking_velodyne.zip)
    6. (Optional) [Semantic labels](https://storage.googleapis.com/gresearch/tf-deeplab/data/kitti-step.tar.gz)
2. Extract everything to ```./data/kitti```, keeping the original directory structure.
3. Generate depth maps from the Velodyne point clouds: ```python scripts/create_kitti_depth_maps.py --kitti_sequence $SEQUENCE```
4. (Optional) Generate sky and static masks from the semantic labels: ```python scripts/create_kitti_masks.py --kitti_sequence $SEQUENCE```
5. Create the metadata file: ```python scripts/create_kitti_metadata.py --config_file scripts/configs/$CONFIG_FILE```
6. Extract DINO features:
    1. ```python scripts/extract_dino_features.py --metadata_path $METADATA_PATH```, or ```python -m torch.distributed.run --standalone --nnodes=1 --nproc_per_node $NUM_GPUS scripts/extract_dino_features.py --metadata_path $METADATA_PATH``` for multi-GPU extraction
    2. ```python scripts/run_pca.py --metadata_path $METADATA_PATH```
7. Extract DINO correspondences: ```python scripts/extract_dino_correspondences.py --metadata_path $METADATA_PATH```, or ```python -m torch.distributed.run --standalone --nnodes=1 --nproc_per_node $NUM_GPUS scripts/extract_dino_correspondences.py --metadata_path $METADATA_PATH``` for multi-GPU extraction
8. (Optional) Generate feature clusters for visualization: ```python scripts/create_kitti_feature_clusters.py --metadata_path $METADATA_PATH --output_path $OUTPUT_PATH```
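Putting steps 3-8 together, the sketch below chains the KITTI preprocessing commands for a single sequence, using the multi-GPU variants for the DINO steps. The sequence ID, config file name, metadata path, output path, and GPU count are placeholder values rather than paths that ship with the repository; in particular, `METADATA_PATH` should point at the metadata file written in step 5.

```bash
#!/usr/bin/env bash
set -e  # stop at the first failing step

# Placeholder values -- substitute your own sequence, config, and paths.
SEQUENCE=0006                                  # hypothetical KITTI MOT sequence ID
CONFIG_FILE=my_kitti_sequence.yaml             # hypothetical; use one of the config files under scripts/configs/
METADATA_PATH=./data/kitti/my_metadata.json    # hypothetical; point this at the metadata file created in step 5
OUTPUT_PATH=./data/kitti/my_feature_clusters   # hypothetical output location for feature clusters
NUM_GPUS=2

# Step 3: depth maps from the Velodyne point clouds
python scripts/create_kitti_depth_maps.py --kitti_sequence $SEQUENCE

# Step 4 (optional): sky and static masks from the semantic labels
python scripts/create_kitti_masks.py --kitti_sequence $SEQUENCE

# Step 5: metadata file consumed by the remaining steps
python scripts/create_kitti_metadata.py --config_file scripts/configs/$CONFIG_FILE

# Step 6: DINO features (multi-GPU), then PCA
python -m torch.distributed.run --standalone --nnodes=1 --nproc_per_node $NUM_GPUS \
    scripts/extract_dino_features.py --metadata_path $METADATA_PATH
python scripts/run_pca.py --metadata_path $METADATA_PATH

# Step 7: DINO correspondences (multi-GPU)
python -m torch.distributed.run --standalone --nnodes=1 --nproc_per_node $NUM_GPUS \
    scripts/extract_dino_correspondences.py --metadata_path $METADATA_PATH

# Step 8 (optional): feature clusters for visualization
python scripts/create_kitti_feature_clusters.py --metadata_path $METADATA_PATH --output_path $OUTPUT_PATH
```

The VKITTI2 pipeline below follows the same pattern with the corresponding `create_vkitti2_*` scripts.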
### VKITTI2

1. Download the following from the [VKITTI2 dataset](https://europe.naverlabs.com/research/computer-vision/proxy-virtual-worlds-vkitti-2/):
    1. [RGB images](http://download.europe.naverlabs.com//virtual_kitti_2.0.3/vkitti_2.0.3_rgb.tar)
    2. [Depth images](http://download.europe.naverlabs.com//virtual_kitti_2.0.3/vkitti_2.0.3_depth.tar)
    3. [Camera intrinsics/extrinsics](http://download.europe.naverlabs.com//virtual_kitti_2.0.3/vkitti_2.0.3_textgt.tar.gz)
    4. (Optional) [Ground truth forward flow](http://download.europe.naverlabs.com//virtual_kitti_2.0.3/vkitti_2.0.3_forwardFlow.tar)
    5. (Optional) [Ground truth backward flow](http://download.europe.naverlabs.com//virtual_kitti_2.0.3/vkitti_2.0.3_backwardFlow.tar)
    6. (Optional) [Semantic labels](http://download.europe.naverlabs.com//virtual_kitti_2.0.3/vkitti_2.0.3_classSegmentation.tar)
2. Extract everything to ```./data/vkitti2```, keeping the original directory structure.
3. (Optional) Generate sky and static masks from the semantic labels: ```python scripts/create_vkitti2_masks.py --vkitti2_path $SCENE_PATH```
4. Create the metadata file: ```python scripts/create_vkitti2_metadata.py --config_file scripts/configs/$CONFIG_FILE```
5. Extract DINO features:
    1. ```python scripts/extract_dino_features.py --metadata_path $METADATA_PATH```, or ```python -m torch.distributed.run --standalone --nnodes=1 --nproc_per_node $NUM_GPUS scripts/extract_dino_features.py --metadata_path $METADATA_PATH``` for multi-GPU extraction
    2. ```python scripts/run_pca.py --metadata_path $METADATA_PATH```
6. If not using the ground truth flow provided by VKITTI2, extract DINO correspondences: ```python scripts/extract_dino_correspondences.py --metadata_path $METADATA_PATH```, or ```python -m torch.distributed.run --standalone --nnodes=1 --nproc_per_node $NUM_GPUS scripts/extract_dino_correspondences.py --metadata_path $METADATA_PATH``` for multi-GPU extraction
7. (Optional) Generate feature clusters for visualization: ```python scripts/create_vkitti2_feature_clusters.py --metadata_path $METADATA_PATH --vkitti2_path $SCENE_PATH --output_path $OUTPUT_PATH```

## Training

```python suds/train.py suds --experiment-name $EXPERIMENT_NAME --pipeline.datamanager.dataparser.metadata_path $METADATA_PATH [--pipeline.feature_clusters $FEATURE_CLUSTERS]```

## Evaluation

```python suds/eval.py --load_config $SAVED_MODEL_PATH```, or ```python -m torch.distributed.run --standalone --nnodes=1 --nproc_per_node $NUM_GPUS suds/eval.py --load_config $SAVED_MODEL_PATH``` for multi-GPU evaluation.

A combined training-and-evaluation sketch with placeholder values is given at the end of this README.

## Acknowledgements

This project is built on [Nerfstudio](https://github.com/nerfstudio-project/nerfstudio) and [tiny-cuda-nn](https://github.com/NVlabs/tiny-cuda-nn). The DINO feature extraction scripts are based on [ShirAmir's implementation](https://github.com/ShirAmir/dino-vit-features), and parts of the KITTI processing code are adapted from [Neural Scene Graphs](https://github.com/princeton-computational-imaging/neural-scene-graphs).
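## End-to-End Example

The sketch below chains the training and evaluation commands above for a single scene. The experiment name, metadata path, feature-cluster path, saved-model path, and GPU count are all placeholder values; in particular, the exact location of the config file written by training depends on your output directory, so point `SAVED_MODEL_PATH` at the config saved by your own run.

```bash
#!/usr/bin/env bash
set -e

# Placeholder values -- substitute your own experiment name and paths.
EXPERIMENT_NAME=my_suds_experiment                 # hypothetical experiment name
METADATA_PATH=./data/kitti/my_metadata.json        # hypothetical; the metadata file created during data preparation
FEATURE_CLUSTERS=./data/kitti/my_feature_clusters  # hypothetical; only needed if you generated feature clusters
NUM_GPUS=2

# Train (the --pipeline.feature_clusters flag is optional)
python suds/train.py suds \
    --experiment-name $EXPERIMENT_NAME \
    --pipeline.datamanager.dataparser.metadata_path $METADATA_PATH \
    --pipeline.feature_clusters $FEATURE_CLUSTERS

# Evaluate the trained model (multi-GPU variant shown).
# SAVED_MODEL_PATH is hypothetical -- point it at the config file written by your training run.
SAVED_MODEL_PATH=outputs/$EXPERIMENT_NAME/suds/config.yml
python -m torch.distributed.run --standalone --nnodes=1 --nproc_per_node $NUM_GPUS \
    suds/eval.py --load_config $SAVED_MODEL_PATH
```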