Object Detection with MMDetection
OpenMMLab is an AI algorithm framework for computer vision, open-sourced in October 2018 by MMLab, the joint lab of the Chinese University of Hong Kong and SenseTime. It contains algorithm frameworks for many computer-vision directions. This post introduces the MMDetection library, running on an Ubuntu 18.04 server. The examples use mmdet 2.x, so some modules, classes, and functions may have changed in the latest release; check the official docs for updates.
Installation

MMDetection sits on top of a few base libraries such as PyTorch and mmcv. Assuming the GPU driver, CUDA, and cuDNN are already installed and configured, what remains is installing Python and a few packages. Here we use miniconda to install Python and manage the environment.
```bash
wget https://repo.anaconda.com/miniconda/Miniconda3-latest-Linux-x86_64.sh -O miniconda.sh
chmod +x miniconda.sh
bash ./miniconda.sh -b -p /opt/miniconda
echo "export MINICONDA_HOME=/opt/miniconda" >> ~/.bashrc
echo 'export PATH=$MINICONDA_HOME/bin:$PATH' >> ~/.bashrc
source ~/.bashrc

conda create -n mm python=3.9 -y
conda activate mm

# PyTorch matching CUDA 11.1
pip install torch==1.10.0+cu111 torchvision==0.11.0+cu111 torchaudio==0.10.0 -f https://download.pytorch.org/whl/torch_stable.html

# mmcv build matching the torch/CUDA combination above (mmdet 2.x)
pip install mmcv -f https://download.openmmlab.com/mmcv/dist/cu111/torch1.10.0/index.html

# install mmdetection 2.28.0 from source
git clone -b v2.28.0 https://github.com/open-mmlab/mmdetection.git
cd mmdetection
pip install -v -e .

# (mmdet 3.x alternative) install the dependencies via mim instead
pip install -U openmim
mim install mmengine
mim install "mmcv>=2.0.0"
```
For version-specific installation instructions, see the official docs: https://mmdetection.readthedocs.io/zh_CN/latest/get_started.html (the version can be switched at the bottom left).
Open a Python prompt and enter the following; if the version string prints normally, mmdetection is installed correctly:
```python
import mmdet
print(mmdet.__version__)
```
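Beyond the mmdet version string, it helps to confirm that the PyTorch build matches the CUDA toolchain; a quick check (plain PyTorch, nothing mmdet-specific):

```python
import torch

print(torch.__version__, torch.version.cuda)  # expect 1.10.0+cu111 / 11.1 for the install above
print(torch.cuda.is_available())              # True if the driver and CUDA setup are correct
```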
Image object detection with MMDetection

After installation, the cloned repository contains the source code, including configuration files for many algorithms. Here we use faster-rcnn as an example of image object detection.
First, download a pretrained model: visit mmdetection-model_zoo and choose faster-rcnn:
```bash
cd mmdetection
mkdir checkpoints
wget https://download.openmmlab.com/mmdetection/v2.0/faster_rcnn/faster_rcnn_r50_fpn_1x_coco/faster_rcnn_r50_fpn_1x_coco_20200130-047c8118.pth -P checkpoints
```
Then run detection on an image (the code below assumes the working directory is mmdetection). Note that the show_result_pyplot function was removed in mmdet 3.x:
```python
from mmdet.apis import inference_detector, init_detector, show_result_pyplot

config = 'configs/faster_rcnn/faster_rcnn_r50_fpn_1x_coco.py'
checkpoint = 'checkpoints/faster_rcnn_r50_fpn_1x_coco_20200130-047c8118.pth'
device = 'cuda:0'

# build the model from the config and load the pretrained weights
model = init_detector(config, checkpoint, device=device)

img = 'demo/demo.jpg'
result = inference_detector(model, img)

# draw only boxes with score >= 0.9
show_result_pyplot(model, img, result, score_thr=0.9)
```
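For a pure box detector like this faster-rcnn, the mmdet 2.x result is a plain list with one (N, 5) array per class, the columns being [x1, y1, x2, y2, score]. A small sketch of walking it manually (the 0.9 threshold mirrors score_thr above):

```python
# iterate per-class detections; model.CLASSES holds the 80 COCO class names
for class_id, dets in enumerate(result):
    for *bbox, score in dets:
        if score >= 0.9:
            print(model.CLASSES[class_id],
                  [round(float(v), 1) for v in bbox],
                  round(float(score), 2))
```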
The mmdet 3.x visualization approach:
```python
from mmdet.apis import inference_detector, init_detector

config = 'configs/faster_rcnn/faster_rcnn_r50_fpn_1x_coco.py'
checkpoint = 'checkpoints/faster_rcnn_r50_fpn_1x_coco_20200130-047c8118.pth'
device = 'cuda:0'

model = init_detector(config, checkpoint, device=device)

img = 'demo/demo.jpg'
result = inference_detector(model, img)

import mmcv
from mmdet.registry import VISUALIZERS

# build the visualizer defined in the model config and attach the
# dataset metadata (class names, palette) so labels render correctly
visualizer = VISUALIZERS.build(model.cfg.visualizer)
visualizer.dataset_meta = model.dataset_meta

image = mmcv.imread('demo/demo.jpg', channel_order='rgb')
visualizer.add_datasample(
    'result',
    image,
    data_sample=result,
    draw_gt=None,
    wait_time=0)
visualizer.show()
```
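To write the rendering to disk instead of opening a window, add_datasample also takes out_file and pred_score_thr arguments in mmdet 3.x; a sketch (the output path is an arbitrary choice):

```python
# save the drawn prediction instead of showing it interactively
visualizer.add_datasample(
    'result',
    image,
    data_sample=result,
    draw_gt=False,
    show=False,
    pred_score_thr=0.3,
    out_file='outputs/demo_result.jpg')
```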
RPN region proposals

First download the pretrained model: visit mmdetection-model_zoo and choose rpn:
```bash
cd mmdetection
wget https://download.openmmlab.com/mmdetection/v2.0/rpn/rpn_r50_fpn_1x_coco/rpn_r50_fpn_1x_coco_20200218-5525fa2e.pth -P checkpoints
```
Then run it on an image (again assuming the working directory is mmdetection):
```python
from mmdet.apis import inference_detector, init_detector

config = 'configs/rpn/rpn_r50_fpn_1x_coco.py'
checkpoint = 'checkpoints/rpn_r50_fpn_1x_coco_20200218-5525fa2e.pth'
device = 'cuda:0'

model = init_detector(config, checkpoint, device=device)

img = 'demo/demo.jpg'
rpn_result = inference_detector(model, img)

# draw the 100 highest-scoring proposals (mmdet 2.x API)
model.show_result(img, rpn_result, top_k=100)
```
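Unlike a full detector, the RPN returns class-agnostic proposals rather than per-class detections. A quick way to peek at them (assuming, as mmdet 2.x RPN inference suggests, an (N, 5) array of [x1, y1, x2, y2, objectness]):

```python
import numpy as np

proposals = np.asarray(rpn_result)
print(proposals.shape)  # expected (N, 5)
print(proposals[:3])    # highest-scoring proposals first
```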
Finetuning with mmdetection

With the framework and a pretrained model in hand, all that is left is to train again on our own dataset so the model predicts more accurately on our own data.
For a personal dataset, there are three ways to organize the data for finetuning with mmdetection:
1. Convert the dataset to the COCO format;
2. Use the CustomDataset type provided by mmdetection and dump the data to a local pkl file;
3. Subclass CustomDataset and write your own dataset type, skipping the pkl, which saves disk space and speeds things up.
First, download the dataset; here we use the kitti_tiny dataset:
```bash
wget https://download.openmmlab.com/mmdetection/data/kitti_tiny.zip -P data
```
Next, unzip the downloaded archive into the directory:
```bash
cd data
unzip -q kitti_tiny.zip
```
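After unzipping, the layout should look roughly like this (matching what the conversion code below expects):

```
data/kitti_tiny
├── training
│   ├── image_2      # .jpeg images
│   └── label_2      # KITTI-style .txt annotations, one per image
├── train.txt        # image ids of the training split
└── val.txt          # image ids of the validation split
```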
Finally, generate the config file and train the model according to the chosen data organization.
COCO format

Organizing the annotations into COCO JSON is left to the reader; a possible outline is sketched below.
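For completeness, here is a minimal, untested sketch of what such a converter could look like (kitti_tiny_to_coco is a hypothetical helper; the field names follow the COCO annotation spec, and the label parsing mirrors the pkl converter below):

```python
import json
import os.path as osp

import mmcv


def kitti_tiny_to_coco(ann_file, out_file, img_prefix):
    """Hypothetical sketch: dump KITTI-tiny txt labels as a COCO-style json."""
    classes = ('Car', 'Pedestrian', 'Cyclist')
    categories = [dict(id=i, name=n) for i, n in enumerate(classes)]
    images, annotations, ann_id = [], [], 0
    for img_id, image_id in enumerate(mmcv.list_from_file(ann_file)):
        h, w = mmcv.imread(f'{img_prefix}/{image_id}.jpeg').shape[:2]
        images.append(dict(id=img_id, file_name=f'{image_id}.jpeg', height=h, width=w))
        label_file = osp.join(img_prefix.replace('image_2', 'label_2'), f'{image_id}.txt')
        for line in mmcv.list_from_file(label_file):
            parts = line.strip().split(' ')
            if parts[0] not in classes:
                continue  # skip DontCare and other ignored categories
            x1, y1, x2, y2 = map(float, parts[4:8])
            annotations.append(dict(
                id=ann_id, image_id=img_id, category_id=classes.index(parts[0]),
                bbox=[x1, y1, x2 - x1, y2 - y1],  # COCO boxes are [x, y, w, h]
                area=(x2 - x1) * (y2 - y1), iscrowd=0))
            ann_id += 1
    with open(out_file, 'w') as f:
        json.dump(dict(images=images, annotations=annotations, categories=categories), f)
```

With such a json in place, the stock CocoDataset type and metric='bbox' evaluation can be used without any custom dataset code.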
CustomDataset format

Convert the data to the intermediate pkl format:
```python
import os.path as osp

import mmcv
import numpy as np


def convert_kitti_to_middle(ann_file, out_file, img_prefix):
    CLASSES = ('Car', 'Pedestrian', 'Cyclist')
    cat2label = {k: i for i, k in enumerate(CLASSES)}

    image_list = mmcv.list_from_file(ann_file)
    data_infos = []
    for image_id in image_list:
        filename = f'{img_prefix}/{image_id}.jpeg'
        image = mmcv.imread(filename)
        height, width = image.shape[:2]
        data_info = dict(filename=f'{image_id}.jpeg', width=width, height=height)

        # KITTI labels live next to the images: image_2 -> label_2
        label_prefix = img_prefix.replace('image_2', 'label_2')
        lines = mmcv.list_from_file(osp.join(label_prefix, f'{image_id}.txt'))
        content = [line.strip().split(' ') for line in lines]
        bbox_names = [x[0] for x in content]
        # columns 4-7 of a KITTI label line are the 2D box (x1, y1, x2, y2)
        bboxes = [[float(info) for info in x[4:8]] for x in content]

        gt_bboxes = []
        gt_labels = []
        gt_bboxes_ignore = []
        gt_labels_ignore = []
        for bbox_name, bbox in zip(bbox_names, bboxes):
            if bbox_name in cat2label:
                gt_labels.append(cat2label[bbox_name])
                gt_bboxes.append(bbox)
            else:
                gt_labels_ignore.append(-1)
                gt_bboxes_ignore.append(bbox)

        data_anno = dict(
            bboxes=np.array(gt_bboxes, dtype=np.float32).reshape(-1, 4),
            labels=np.array(gt_labels, dtype=np.longlong),
            bboxes_ignore=np.array(gt_bboxes_ignore, dtype=np.float32).reshape(-1, 4),
            labels_ignore=np.array(gt_labels_ignore, dtype=np.longlong))
        data_info.update(ann=data_anno)
        data_infos.append(data_info)

    mmcv.dump(data_infos, out_file)


convert_kitti_to_middle('data/kitti_tiny/train.txt',
                        'data/kitti_tiny/train_middle.pkl',
                        'data/kitti_tiny/training/image_2')
convert_kitti_to_middle('data/kitti_tiny/val.txt',
                        'data/kitti_tiny/val_middle.pkl',
                        'data/kitti_tiny/training/image_2')
```
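A quick sanity check on the generated files (mmcv.load is the counterpart of mmcv.dump):

```python
import mmcv

infos = mmcv.load('data/kitti_tiny/train_middle.pkl')
print(len(infos))                                             # number of training images
print(infos[0]['filename'], infos[0]['ann']['bboxes'].shape)  # e.g. 000000.jpeg (n, 4)
```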
Generate the config file:
```python
import os

from mmcv import Config
from mmdet.apis import set_random_seed

cfg = Config.fromfile('configs/faster_rcnn/faster_rcnn_r50_caffe_fpn_mstrain_1x_coco.py')
cfg.device = 'cuda'

classes = ('Car', 'Pedestrian', 'Cyclist')
cfg.dataset_type = 'CustomDataset'
cfg.data_root = 'data/kitti_tiny'
cfg.classes = classes

dtype = 'CustomDataset'
droot = 'data/kitti_tiny'

cfg.data.test.type = dtype
cfg.data.test.data_root = droot
cfg.data.test.ann_file = 'train_middle.pkl'
cfg.data.test.img_prefix = 'training/image_2'
cfg.data.test.classes = classes

cfg.data.train.type = dtype
cfg.data.train.data_root = droot
cfg.data.train.ann_file = 'train_middle.pkl'
cfg.data.train.img_prefix = 'training/image_2'
cfg.data.train.classes = classes

cfg.data.val.type = dtype
cfg.data.val.data_root = droot
cfg.data.val.ann_file = 'val_middle.pkl'
cfg.data.val.img_prefix = 'training/image_2'
cfg.data.val.classes = classes

# 3 foreground classes instead of the 80 COCO classes
cfg.model.roi_head.bbox_head.num_classes = 3
cfg.load_from = 'checkpoints/faster_rcnn_r50_caffe_fpn_mstrain_1x_coco-5324cff8.pth'

cfg.work_dir = 'work_dir'
if not os.path.exists(cfg.work_dir):
    os.makedirs(cfg.work_dir)

# the base config assumes 8 GPUs; scale the LR down for a single GPU
cfg.optimizer.lr = 0.02 / 8
cfg.lr_config.warmup = None
cfg.log_config.interval = 10

cfg.evaluation.metric = 'mAP'
cfg.evaluation.interval = 12
cfg.checkpoint_config.interval = 12

cfg.seed = 0
set_random_seed(0, deterministic=False)
cfg.gpu_ids = range(1)

print(f'Config:\n{cfg.pretty_text}')
cfg.dump(f'{cfg.work_dir}/customformat.py')
```
Train the model:
```python
import os.path as osp

import mmcv
from mmcv import Config
from mmdet.apis import train_detector
from mmdet.datasets import build_dataset
from mmdet.models import build_detector

cfg = Config.fromfile('work_dir/customformat.py')

datasets = [build_dataset(cfg.data.train)]
model = build_detector(cfg.model)
model.CLASSES = datasets[0].CLASSES

mmcv.mkdir_or_exist(osp.abspath(cfg.work_dir))
train_detector(model, datasets, cfg, distributed=False, validate=True)
```
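Once training finishes, a quick smoke test with the finetuned weights (a sketch: 'latest.pth' assumes the default checkpoint-hook naming, and the image id is an arbitrary pick from the dataset):

```python
from mmdet.apis import inference_detector, init_detector, show_result_pyplot

# load the dumped config together with the newest checkpoint from the work dir
model = init_detector('work_dir/customformat.py', 'work_dir/latest.pth', device='cuda:0')
img = 'data/kitti_tiny/training/image_2/000068.jpeg'
result = inference_detector(model, img)
show_result_pyplot(model, img, result, score_thr=0.8)
```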
Custom KittiTinyDataset format

Instead of going through an intermediate pkl, subclass the CustomDataset type, register it, and parse the KITTI txt annotations on the fly:

```python
import os
import os.path as osp

import mmcv
import numpy as np
from mmcv import Config
from mmdet.apis import set_random_seed, train_detector
from mmdet.datasets import build_dataset
from mmdet.datasets.builder import DATASETS
from mmdet.datasets.custom import CustomDataset
from mmdet.models import build_detector


@DATASETS.register_module()
class KittiTinyDataset(CustomDataset):

    CLASSES = ('Car', 'Pedestrian', 'Cyclist')

    def load_annotations(self, ann_file):
        cat2label = {k: i for i, k in enumerate(self.CLASSES)}
        image_list = mmcv.list_from_file(self.ann_file)

        data_infos = []
        for image_id in image_list:
            filename = f'{self.img_prefix}/{image_id}.jpeg'
            image = mmcv.imread(filename)
            height, width = image.shape[:2]
            data_info = dict(filename=f'{image_id}.jpeg', width=width, height=height)

            label_prefix = self.img_prefix.replace('image_2', 'label_2')
            lines = mmcv.list_from_file(osp.join(label_prefix, f'{image_id}.txt'))
            content = [line.strip().split(' ') for line in lines]
            bbox_names = [x[0] for x in content]
            bboxes = [[float(info) for info in x[4:8]] for x in content]

            gt_bboxes = []
            gt_labels = []
            gt_bboxes_ignore = []
            gt_labels_ignore = []
            for bbox_name, bbox in zip(bbox_names, bboxes):
                if bbox_name in cat2label:
                    gt_labels.append(cat2label[bbox_name])
                    gt_bboxes.append(bbox)
                else:
                    gt_labels_ignore.append(-1)
                    gt_bboxes_ignore.append(bbox)

            data_anno = dict(
                bboxes=np.array(gt_bboxes, dtype=np.float32).reshape(-1, 4),
                labels=np.array(gt_labels, dtype=np.longlong),
                bboxes_ignore=np.array(gt_bboxes_ignore, dtype=np.float32).reshape(-1, 4),
                labels_ignore=np.array(gt_labels_ignore, dtype=np.longlong))
            data_info.update(ann=data_anno)
            data_infos.append(data_info)
        return data_infos


cfg = Config.fromfile('configs/faster_rcnn/faster_rcnn_r50_caffe_fpn_mstrain_1x_coco.py')
cfg.device = 'cuda'

classes = ('Car', 'Pedestrian', 'Cyclist')
cfg.dataset_type = 'KittiTinyDataset'
cfg.data_root = 'data/kitti_tiny'
cfg.classes = classes

dtype = 'KittiTinyDataset'
droot = 'data/kitti_tiny'

cfg.data.test.type = dtype
cfg.data.test.data_root = droot
cfg.data.test.ann_file = 'train.txt'
cfg.data.test.img_prefix = 'training/image_2'
cfg.data.test.classes = classes

cfg.data.train.type = dtype
cfg.data.train.data_root = droot
cfg.data.train.ann_file = 'train.txt'
cfg.data.train.img_prefix = 'training/image_2'
cfg.data.train.classes = classes

cfg.data.val.type = dtype
cfg.data.val.data_root = droot
cfg.data.val.ann_file = 'val.txt'
cfg.data.val.img_prefix = 'training/image_2'
cfg.data.val.classes = classes

cfg.model.roi_head.bbox_head.num_classes = 3
cfg.load_from = 'checkpoints/faster_rcnn_r50_caffe_fpn_mstrain_1x_coco-5324cff8.pth'

cfg.work_dir = 'work_dir_2'
if not os.path.exists(cfg.work_dir):
    os.makedirs(cfg.work_dir)

cfg.optimizer.lr = 0.02 / 8
cfg.lr_config.warmup = None
cfg.log_config.interval = 10

cfg.evaluation.metric = 'mAP'
cfg.evaluation.interval = 12
cfg.checkpoint_config.interval = 12

cfg.seed = 0
set_random_seed(0, deterministic=False)
cfg.gpu_ids = range(1)

print(f'Config:\n{cfg.pretty_text}')
cfg.dump(f'{cfg.work_dir}/customKittiFormat.py')

datasets = [build_dataset(cfg.data.train)]
model = build_detector(cfg.model)
model.CLASSES = datasets[0].CLASSES

mmcv.mkdir_or_exist(osp.abspath(cfg.work_dir))
train_detector(model, datasets, cfg, distributed=False, validate=True)
```
QA
What is the difference between workflow and evaluation for validation? workflow defaults to [('train', 1)]. If you want to see the loss and accuracy of the current model on the validation set, you can set workflow = [('train', 1), ('val', 1)], which runs 1 training epoch followed by 1 validation epoch. The val phase does not affect the total number of epochs: total_epochs / max_epochs only controls the number of training epochs. In mmdetection, the developers did not wire up a dedicated val workflow for the validation set; specifying two phases in the workflow requires two dataloaders. For detection, the mAP evaluation run during training (the val split in evaluation, with metric='bbox' or ['bbox', 'segm']) already serves as validation, so a val phase in the workflow is unnecessary. If you do want to run val epochs and see the validation loss, you can modify the code to build two dataloaders for the runner. For more discussion see these GitHub issues: 1. 271. workflow not work; 2. 171. Load train_dataloader and val_dataloader; 3. 1093. Allowing validation dataset for computing validation loss; 4. [Bug] 'ConfigDict' object has no attribute 'dataset' #9633.
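In config terms the two mechanisms sit side by side; a sketch of the relevant fragment as it appears in mmdet 2.x configs:

```python
# default: one train phase per epoch; mAP evaluation is driven by `evaluation`, not by workflow
workflow = [('train', 1)]
evaluation = dict(interval=1, metric='bbox')

# to additionally run a val epoch (needs a val dataloader, see the issues above):
# workflow = [('train', 1), ('val', 1)]
```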
What do time and data_time in the training log mean? Training with mmdetection prints logs like the following:
```
2023-08-06 07:46:54,707 - mmdet - INFO - Epoch [1][9000/29317] lr: 2.000e-02, eta: 14:11:42, time: 0.229, data_time: 0.006, memory: 3947, loss_rpn_cls: 0.0809, loss_rpn_bbox: 0.0757, loss_cls: 0.3745, acc: 90.4531, loss_bbox: 0.2941, loss: 0.8252
2023-08-06 07:47:17,408 - mmdet - INFO - Epoch [1][9100/29317] lr: 2.000e-02, eta: 14:06:03, time: 0.227, data_time: 0.006, memory: 3947, loss_rpn_cls: 0.0820, loss_rpn_bbox: 0.0685, loss_cls: 0.3663, acc: 90.9038, loss_bbox: 0.2829, loss: 0.7996
```
"2023-08-06 07:46:54,707" is the current system time, "mmdet" names the OpenMMLab detection package, and "Epoch [1][9000/29317]" means we are in epoch 1, with a log interval of 100 iterations (one log line every 100 iterations). The training set here has 117268 images, trained on 2 GPUs in parallel with a batch size of 2 per GPU, so one epoch takes 117268 / 2 / 2 = 29317 iterations. time: 0.229 is the average wall time of one iteration over a batch (forward pass and post-processing; note that mmcv's IterTimerHook measures this timer across the whole iteration, so it includes the data_time: 0.006 spent loading data). 100 iterations therefore take roughly 0.227 * 100 = 22.7 seconds, which matches the difference between the two timestamps (07:47:17,408 minus 07:46:54,707), i.e. the wait a user sees between log lines. For the logging details, see runner/hooks/logger/text.py in mmcv 1.7.1 for mmdet 2.x, or mmengine for mmdet 3.x. eta (Estimated Time of Arrival) here means the estimated time remaining until training finishes. memory is the per-GPU memory footprint in MB, obtained via torch.cuda.max_memory_allocated(device=device). The remaining fields are the model's current metrics: the individual losses, the classification accuracy, and the total loss. Note that acc here is not the mAP computed by evaluation with metric='bbox'.
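The bookkeeping in that paragraph can be reproduced directly; a small sketch of the arithmetic, with the numbers taken from the log above:

```python
num_images = 117268              # training-set size
num_gpus, samples_per_gpu = 2, 2

iters_per_epoch = num_images // (num_gpus * samples_per_gpu)
print(iters_per_epoch)           # 29317, matching "Epoch [1][9000/29317]"

log_interval, iter_time = 100, 0.227
print(log_interval * iter_time)  # ~22.7 s between two log lines, matching the timestamps
```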
References
open-mmlab/mmdetection
[OpenMMLab Open Course] Object Detection and MMDetection (Part 2)