# deep-learning-for-image-processing

**Repository Path**: netweather/deep-learning-for-image-processing

## Basic Information

- **Project Name**: deep-learning-for-image-processing
- **Description**: deep learning for image processing including classification and object-detection etc.
- **Primary Language**: Unknown
- **License**: GPL-3.0
- **Default Branch**: master
- **Homepage**: None
- **GVP Project**: No

## Statistics

- **Stars**: 0
- **Forks**: 1
- **Created**: 2024-11-03
- **Last Updated**: 2024-11-03

## Categories & Tags

**Categories**: Uncategorized
**Tags**: None

## README

## 241026

* Encoder-decoder: added a decoder on top of the original model's encoder. The original encoder:

```python
import torch.nn as nn

def make_features(cfg: list):
    layers = []  # encoder part
    in_channels = 3  # the input is a 3-channel RGB image
    for v in cfg:  # "M" marks a max-pooling layer
        if v == "M":
            # all max-pooling downsampling in VGG uses kernel size 2 and stride 2
            layers += [nn.MaxPool2d(kernel_size=2, stride=2)]
        else:
            # arguments: input channels, output channels, kernel size, padding
            conv2d = nn.Conv2d(in_channels, v, kernel_size=3, padding=1)
            layers += [conv2d, nn.ReLU(True)]
            in_channels = v
    # the decoder is appended here ===========================
    return nn.Sequential(*layers)


cfgs = {
    'vgg11': [64, 'M', 128, 'M', 256, 256, 'M', 512, 512, 'M', 512, 512, 'M'],
    'vgg13': [64, 64, 'M', 128, 128, 'M', 256, 256, 'M', 512, 512, 'M', 512, 512, 'M'],
    'vgg16': [64, 64, 'M', 128, 128, 'M', 256, 256, 256, 'M', 512, 512, 512, 'M', 512, 512, 512, 'M'],
    'vgg19': [64, 64, 'M', 128, 128, 'M', 256, 256, 256, 256, 'M', 512, 512, 512, 512, 'M', 512, 512, 512, 512, 'M'],
}
```

The decoder added at the marked point (`# the decoder is appended here`):

```python
for v in [512, 512, 512, 512, 512, 512, 256, 256, 256, 128, 128, 64, 64]:
    # for v in [512, 512, 512, 512, 512, 512, 256, 256]:
    layers += [
        nn.ConvTranspose2d(in_channels, v, kernel_size=3, stride=2,
                           padding=1, output_padding=1),
        nn.ReLU(True)
    ]
    in_channels = v
```

Training runs out of GPU memory:

```
torch.OutOfMemoryError: CUDA out of memory. Tried to allocate 12.25 GiB. GPU 0 has a total capacity of 15.56 GiB of which 10.15 GiB is free. Including non-PyTorch memory, this process has 5.40 GiB memory in use. Of the allocated memory 4.98 GiB is allocated by PyTorch, and 251.37 MiB is reserved by PyTorch but unallocated. If reserved but unallocated memory is large try setting PYTORCH_CUDA_ALLOC_CONF=expandable_segments:True to avoid fragmentation.
See documentation for Memory Management (https://pytorch.org/docs/stable/notes/cuda.html#environment-variables)
```
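One plausible cause of the huge allocation (my own analysis, not from the repo): the decoder above stacks 13 stride-2 transposed convolutions, each of which exactly doubles the feature map, while the VGG encoder only halves it 5 times. A quick arithmetic check, assuming the standard 224×224 VGG input:

```python
# Hypothetical sanity check (not part of the repo): trace the spatial size of a
# 224x224 input through the VGG16 encoder's 5 max-pool stages and the
# decoder's 13 transposed-conv stages defined above.
size = 224
for _ in range(5):   # each MaxPool2d(kernel_size=2, stride=2) halves the map
    size //= 2
print("encoder output:", size)  # 7

for i in range(13):  # ConvTranspose2d(k=3, s=2, p=1, output_padding=1)
    size = (size - 1) * 2 - 2 * 1 + 3 + 1  # out = 2 * in, i.e. exact doubling
    print(f"after decoder layer {i + 1:2d}: {size}x{size}")
# the last layers reach 57344x57344 -- 256x the input per side, so the
# activations explode regardless of batch size
```

Under this assumption, a decoder symmetric to the encoder would need only 5 stride-2 upsampling stages (one per pooling stage), with any extra convolutions kept at stride 1.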
Reducing batch_size step by step from 32 down to 1 and training on multiple GPUs still did not solve it.

```python
# multi-GPU training with DataParallel
if torch.cuda.device_count() > 1:
    print("Let's use", torch.cuda.device_count(), "GPUs!")
    # wrap the model in DataParallel
    net = nn.DataParallel(net)
```

* Files involved:

>[model.py](./pytorch_classification/Test3_vggnet/model.py)
>[train.py](./pytorch_classification/Test3_vggnet/train.py)

## 241025

* Recognition results with self-trained weights:

| Image / class | Prediction / probability |
|--|--|
| For image [sunflower2.jpg](./pytorch_classification/Test3_vggnet/predict_demo/sunflower2.jpg) | Predictions: |
| class: daisy | probability: 0.00689 |
| class: dandelion | probability: 0.277 |
| class: roses | probability: 0.00516 |
| class: **sunflowers** | probability: **0.698** √ |
| class: tulips | probability: 0.0127 |
| For image [daisy2.jpg](./pytorch_classification/Test3_vggnet/predict_demo/daisy2.jpg) | Predictions: |
| class: **daisy** | probability: **0.999** √ |
| class: dandelion | probability: 0.000496 |
| class: roses | probability: 2.19e-05 |
| class: sunflowers | probability: 2.61e-05 |
| class: tulips | probability: 4.24e-06 |
| For image [dandelion1.jpg](./pytorch_classification/Test3_vggnet/predict_demo/dandelion1.jpg) | Predictions: |
| class: daisy | probability: 0.0192 |
| class: **dandelion** | probability: **0.963** √ |
| class: roses | probability: 0.01 |
| class: sunflowers | probability: 0.00309 |
| class: tulips | probability: 0.00441 |
| For image [tulips1.jpg](./pytorch_classification/Test3_vggnet/predict_demo/tulips1.jpg) | Predictions: |
| class: daisy | probability: 0.00132 |
| class: dandelion | probability: 0.000501 |
| class: roses | probability: 0.146 |
| class: sunflowers | probability: 0.00369 |
| class: **tulips** | probability: **0.848** √ |
| For image [rose3.jpg](./pytorch_classification/Test3_vggnet/predict_demo/rose3.jpg) | Predictions: |
| class: daisy | probability: 0.00105 |
| class: dandelion | probability: 0.000177 |
| class: **roses** | probability: **0.309** |
| class: tulips | probability: 0.688 √ |
| class: sunflowers | probability: 0.00194 |

## 241025

* Read through and understood train.py and model.py
* Experimented with different training hyperparameters
* The default train script uses only 1 GPU; tried the following code to call multiple GPUs:

```python
# multi-GPU training with DataParallel
if torch.cuda.device_count() > 1:
    print("Let's use", torch.cuda.device_count(), "GPUs!")
    # wrap the model in DataParallel
    net = nn.DataParallel(net)
```

Training speed did not improve, though; it even dropped slightly. So batch_size was raised from 32 → 128 and lr from 0.0001 → 0.0004. Training became faster, but the model never converged.

> What are the optimal settings for batch_size, lr, and epochs?
> How can multi-GPU training be used efficiently? (see the DistributedDataParallel sketch after this entry)

* In the end, kept the defaults: batch_size=32, lr=0.0001, single-GPU training
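`nn.DataParallel` replicates the model on every forward pass and gathers outputs on GPU 0, which often erases any speedup; PyTorch's recommended path is `DistributedDataParallel` with one process per GPU. A minimal sketch of that pattern (assumptions: torchvision's `vgg16` and random tensors stand in for the repo's actual model and flower dataset, and the linear lr-scaling rule is a common heuristic, not the repo's setting):

```python
# Minimal DistributedDataParallel sketch -- launch with:
#   torchrun --nproc_per_node=<num_gpus> ddp_demo.py
import os
import torch
import torch.distributed as dist
import torch.nn.functional as F
from torch.nn.parallel import DistributedDataParallel as DDP
from torch.utils.data import DataLoader, DistributedSampler, TensorDataset
from torchvision.models import vgg16

dist.init_process_group(backend="nccl")          # torchrun provides rank/world env vars
local_rank = int(os.environ["LOCAL_RANK"])
torch.cuda.set_device(local_rank)

# stand-in for the repo's 5-class VGG model
net = DDP(vgg16(num_classes=5).cuda(local_rank), device_ids=[local_rank])

# stand-in for the flower dataset: 256 random 224x224 images with 5 labels
dataset = TensorDataset(torch.randn(256, 3, 224, 224), torch.randint(0, 5, (256,)))
sampler = DistributedSampler(dataset)            # each rank sees a distinct shard
loader = DataLoader(dataset, batch_size=32, sampler=sampler)

# linear-scaling heuristic: scale lr with the number of ranks (global batch size)
optimizer = torch.optim.Adam(net.parameters(), lr=0.0001 * dist.get_world_size())

for epoch in range(3):
    sampler.set_epoch(epoch)                     # reshuffle shards each epoch
    for images, labels in loader:
        images, labels = images.cuda(local_rank), labels.cuda(local_rank)
        optimizer.zero_grad()
        loss = F.cross_entropy(net(images), labels)
        loss.backward()
        optimizer.step()

dist.destroy_process_group()
```

Each process keeps its own model replica and only gradients are synchronized during `backward()`, which is why this usually scales better than `DataParallel`'s replicate-and-gather scheme.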
## 241025

* Deployed VGG and other related projects on the server
* Understood and debugged predict.py
* ~~Runtime error to be resolved:~~

```sh
(VGG) (base) wbw@gpuadmin-2288H-V6:~/Arcadias_exercises/deep-learning-for-image-processing/pytorch_classification/Test3_vggnet$ /home/wbw/anaconda3/envs/VGG/bin/python /home/wbw/Arcadias_exercises/deep-learning-for-image-processing/pytorch_classification/Test3_vggnet/predict.py
Current working directory: /home/wbw/Arcadias_exercises/deep-learning-for-image-processing/pytorch_classification/Test3_vggnet
/home/wbw/Arcadias_exercises/deep-learning-for-image-processing/pytorch_classification/Test3_vggnet/predict.py:56: FutureWarning: You are using `torch.load` with `weights_only=False` (the current default value), which uses the default pickle module implicitly. It is possible to construct malicious pickle data which will execute arbitrary code during unpickling (See https://github.com/pytorch/pytorch/blob/main/SECURITY.md#untrusted-models for more details). In a future release, the default value for `weights_only` will be flipped to `True`. This limits the functions that could be executed during unpickling. Arbitrary objects will no longer be allowed to be loaded via this mode unless they are explicitly allowlisted by the user via `torch.serialization.add_safe_globals`. We recommend you start setting `weights_only=True` for any use case where you don't have full control of the loaded file.
Please open an issue on GitHub for any issues related to this experimental feature.
  model.load_state_dict(torch.load(weights_path, map_location=device), strict=False)  # strict=False: do not require state-dict keys to match exactly
Traceback (most recent call last):
  File "/home/wbw/Arcadias_exercises/deep-learning-for-image-processing/pytorch_classification/Test3_vggnet/predict.py", line 80, in <module>
    main()
  File "/home/wbw/Arcadias_exercises/deep-learning-for-image-processing/pytorch_classification/Test3_vggnet/predict.py", line 56, in main
    model.load_state_dict(torch.load(weights_path, map_location=device), strict=False)  # strict=False: do not require state-dict keys to match exactly
  File "/home/wbw/anaconda3/envs/VGG/lib/python3.8/site-packages/torch/nn/modules/module.py", line 2215, in load_state_dict
    raise RuntimeError('Error(s) in loading state_dict for {}:\n\t{}'.format(
RuntimeError: Error(s) in loading state_dict for VGG:
	size mismatch for classifier.6.weight: copying a param with shape torch.Size([1000, 4096]) from checkpoint, the shape in current model is torch.Size([5, 4096]).
	size mismatch for classifier.6.bias: copying a param with shape torch.Size([1000]) from checkpoint, the shape in current model is torch.Size([5]).
```

> ~~Possibly caused by a mismatch between the weight file's version and the current PyTorch version?~~
> ~~The output layer's parameter shapes do not match the loaded weights: the current model expects `5` classes, while the loaded weights were trained for `1000` classes (typically weights for the ImageNet dataset).~~
>
> With a self-trained weight file, prediction works fine
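For reference, a 1000-class ImageNet checkpoint can also be loaded into the 5-class model by filtering out the mismatched `classifier.6.*` entries before calling `load_state_dict` (and passing `weights_only=True` silences the FutureWarning above). A hedged sketch, using torchvision's `vgg16` and its published checkpoint file name as stand-ins for the repo's model and `weights_path`:

```python
# Sketch: load an ImageNet (1000-class) VGG16 state dict into a 5-class model
# by dropping the entries whose shapes do not match the current model.
# "vgg16-397923af.pth" is torchvision's checkpoint file; torchvision's vgg16
# stands in for the repo's own VGG here.
import torch
from torchvision.models import vgg16

model = vgg16(num_classes=5)
state = torch.load("vgg16-397923af.pth", map_location="cpu", weights_only=True)

# keep only parameters whose shapes match the current model
model_state = model.state_dict()
filtered = {k: v for k, v in state.items()
            if k in model_state and v.shape == model_state[k].shape}
missing = set(model_state) - set(filtered)
print("randomly initialized (not loaded):", sorted(missing))  # classifier.6.*

model.load_state_dict(filtered, strict=False)  # strict=False: allow missing keys
```

The final classifier layer then starts from random initialization and still has to be fine-tuned on the 5 flower classes, which is consistent with the observation that only the self-trained weight file predicts correctly out of the box.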