# HED-document-detection

**Repository Path**: xsdf1985/HED-document-detection

## Basic Information

- **Project Name**: HED-document-detection
- **Description**: No description available
- **Primary Language**: Unknown
- **License**: MIT
- **Default Branch**: master
- **Homepage**: None
- **GVP Project**: No

## Statistics

- **Stars**: 0
- **Forks**: 0
- **Created**: 2023-10-26
- **Last Updated**: 2023-10-26

## Categories & Tags

**Categories**: Uncategorized
**Tags**: None

## README

# A Pytorch Implementation of HED for Document Detection

### Introduction

This project is a PyTorch implementation of the HED algorithm ([Holistically-Nested Edge Detection](https://arxiv.org/abs/1504.06375)) for document detection. Its features are summarized below:

- It is pure PyTorch code.
- It can automatically generate training samples to train HED. Some examples are shown below:
  - Similar colors of the document and background:
  - The results are better when there is a clear distinction between foreground and background:

### Requirements

- Python 3
- Any PyTorch version > 1.0 should work

### Demo

Download the trained model:

- Baidu Cloud Disk: [Download](https://pan.baidu.com/s/1bVM_38M-GIkS7tSHslXAHw) Password: `rb3r`

put it in `./savecheckpoint`, and run:

```
python testHED.py --testImgPath=./demo \
    --saveOutPath=./demoout \
    --checkpointPath=./savecheckpoint/58000_net.pkl \
    --gpu_list=2
```

The resulting edge images will then be written to the `saveOutPath` directory.

### Train

We have implemented two methods for training HED for document detection:
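As described in the linked paper, HED supervises each side output with a class-balanced sigmoid cross-entropy loss: because edge pixels are far rarer than non-edge pixels, each class is weighted by the fraction of pixels in the opposite class. The following is a minimal pure-Python sketch of that loss, not code from this repository; the function name and flat-list interface are illustrative only.

```python
import math

def class_balanced_bce(logits, targets):
    """Class-balanced sigmoid cross-entropy in the style of the HED paper.

    logits  -- raw side-output scores, one per pixel (flat list of floats)
    targets -- ground-truth labels, 1 = edge pixel, 0 = non-edge pixel
    """
    n = len(targets)
    n_pos = sum(targets)        # number of edge pixels
    n_neg = n - n_pos           # number of non-edge pixels
    w_pos = n_neg / n           # edge pixels weighted by non-edge frequency
    w_neg = n_pos / n           # non-edge pixels weighted by edge frequency
    loss = 0.0
    for x, y in zip(logits, targets):
        p = 1.0 / (1.0 + math.exp(-x))   # sigmoid probability of "edge"
        if y == 1:
            loss -= w_pos * math.log(p)
        else:
            loss -= w_neg * math.log(1.0 - p)
    return loss / n
```

The balancing keeps the abundant non-edge pixels from dominating the gradient: with mostly-background images, `w_neg` is small, so confident edge predictions matter as much as the sea of easy negatives.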
- Train without annotated data
- Train with annotated data

#### Train without annotated data

This implementation of HED document detection can be trained without any annotated data; all that is needed is a set of foreground images and a set of background images.

- Put the foreground images in `./DATASET/source_image/foreground_images/`
- Put the background images in `./DATASET/source_image/background_images/`
- Provide some validation images in `./DATASET/dataset/testData/`

```
python trainHED_Online.py --fgpath=./DATASET/source_image/foreground_images/ \
    --bgpath=./DATASET/source_image/background_images/ \
    --test_data_dir=./DATASET/dataset/testData/ \
    --SaveCheckpointPath=./saveModelTrainedOnline \
    --SaveOutImgPath=./saveImgTrainedOnline \
    --gpu_list=2
```

#### Train with data saved on the hard disk

- To train the model on your own dataset, provide a .csv file listing the image and ground-truth pairs; see [HED_Dataset.csv](./DATASET/dataset/HED_Dataset.csv) for an example. A script for generating the .csv file is provided at [./tools/createDatasetListCSV.py](./tools/createDatasetListCSV.py).
- Annotated images for training an HED model are difficult to obtain, so a script to generate training samples is provided; see [./tools/generate_data.py](./tools/generate_data.py) for details. An example of an image/ground-truth pair is shown below:

```
python trainHED.py --train_csv_file=./DATASET/dataset/HED_Dataset.csv \
    --test_data_dir=./DATASET/dataset/testData \
    --rootdir=./DATASET/dataset/ \
    --SaveCheckpointPath=./saveModelTrainedOnline \
    --SaveOutImgPath=./saveImgTrainedOnline \
    --gpu_list=2
```
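The idea behind generating training samples from unannotated foreground/background images can be illustrated with a toy sketch: composite a "document" region onto a background image and mark the boundary pixels of the pasted region as the edge ground truth. This is a deliberate simplification of what a real generator would do (perspective warps, lighting changes, blur, and so on); the function name and list-of-lists image representation below are hypothetical, not taken from this repository's scripts.

```python
def make_sample(bg, doc, top, left):
    """Paste a rectangular `doc` onto `bg` and build the edge ground truth.

    bg, doc   -- 2-D grids (lists of lists) of grey-level pixel values
    top, left -- paste position of the document's upper-left corner
    Returns (image, gt), where gt marks the document's boundary with 1s.
    """
    h, w = len(doc), len(doc[0])
    image = [row[:] for row in bg]           # copy the background
    gt = [[0] * len(bg[0]) for _ in bg]      # all-zero edge map
    for i in range(h):
        for j in range(w):
            image[top + i][left + j] = doc[i][j]
            # only the border of the pasted rectangle is labelled as edge
            if i in (0, h - 1) or j in (0, w - 1):
                gt[top + i][left + j] = 1
    return image, gt

# Usage: a 3x3 "document" pasted onto a 6x6 background
bg = [[0] * 6 for _ in range(6)]
doc = [[9] * 3 for _ in range(3)]
image, gt = make_sample(bg, doc, 1, 1)
```

Because the document's position and outline are chosen by the generator, the edge ground truth comes for free, which is why no manual annotation is needed for this training mode.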