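OpenMMLab is an open-source computer vision algorithm framework that MMLab, the joint laboratory of The Chinese University of Hong Kong and SenseTime, has been releasing since October 2018. It comprises libraries for many computer vision tasks. This post covers the MMClassification library, running on an Ubuntu 18.04 server.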

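Installation

MMClassification is installed on top of several base libraries, such as PyTorch and mmcv. Assuming the GPU driver, CUDA, and cuDNN are already installed and configured, what remains is to install Python and a few packages. Here we use Miniconda to install Python and set up the environment.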

# Install Miniconda
wget https://repo.anaconda.com/miniconda/Miniconda3-latest-Linux-x86_64.sh -O miniconda.sh
chmod +x miniconda.sh
bash ./miniconda.sh -b -p /opt/miniconda

# Configure the Miniconda environment
# (single quotes keep $MINICONDA_HOME and $PATH unexpanded until ~/.bashrc is sourced)
echo 'export MINICONDA_HOME=/opt/miniconda' >> ~/.bashrc
echo 'export PATH=$MINICONDA_HOME/bin:$PATH' >> ~/.bashrc
source ~/.bashrc

# Create and activate a virtual environment
# (if `conda activate` reports that the shell is not initialized, run `conda init bash` and reopen the shell)
conda create -n mm python=3.8
conda activate mm

# Install PyTorch, assuming the CUDA version here is 11.1
pip install torch==1.10.0+cu111 torchvision==0.11.0+cu111 torchaudio==0.10.0 -f https://download.pytorch.org/whl/torch_stable.html

# Install mmcv; the index URL must match the CUDA and torch versions chosen above
pip install mmcv -f https://download.openmmlab.com/mmcv/dist/cu111/torch1.10.0/index.html

# Install mmclassification
git clone https://github.com/open-mmlab/mmclassification.git
cd mmclassification
pip install -e .
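
Because the mmcv index URL has to match the CUDA and torch versions, it can help to build it rather than edit it by hand. The following is a minimal sketch; `mmcv_index_url` is a hypothetical helper, and the URL pattern is inferred only from the single link used above, so an index may not exist for every version pair.

```python
def mmcv_index_url(cuda_version: str, torch_version: str) -> str:
    """Build the mmcv wheel index URL for a CUDA / torch version pair.

    Assumption: the pattern is generalized from the one URL in this post.
    """
    # "11.1" -> "cu111"
    cuda_tag = "cu" + cuda_version.replace(".", "")
    return (
        "https://download.openmmlab.com/mmcv/dist/"
        f"{cuda_tag}/torch{torch_version}/index.html"
    )


print(mmcv_index_url("11.1", "1.10.0"))
```

The printed URL can then be passed to `pip install ... -f`.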

Open a Python prompt and enter the following. If the version prints without error, mmclassification is installed correctly:

import mmcls
print(mmcls.__version__)
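
If the import fails, it is useful to see which of the base packages are actually present in the environment. Below is a small sketch that reads installed versions with the standard library instead of importing each package. The distribution names in the list are assumptions about how the packages are registered with pip; adjust them if your environment differs.

```python
from importlib import metadata


def installed_versions(packages):
    """Return {package: version or None} without importing the packages."""
    versions = {}
    for name in packages:
        try:
            versions[name] = metadata.version(name)
        except metadata.PackageNotFoundError:
            versions[name] = None
    return versions


# Distribution names below are assumed, not taken from the post.
wanted = ["torch", "torchvision", "mmcv", "mmcv-full", "mmcls"]
for name, version in installed_versions(wanted).items():
    print(f"{name}: {version or 'not installed'}")
```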

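Image classification with mmclassification

After installation, the cloned repository contains the source code along with configuration files for many algorithms. Here we take mobilenet-v2 as an example and use it to classify an image.

First, download a pretrained model. The available models are listed at: mmclassification-model_zoo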

cd mmclassification
mkdir checkpoints
wget https://download.openmmlab.com/mmclassification/v0/mobilenet_v2/mobilenet_v2_batch256_imagenet_20200708-3b2dc3af.pth -P checkpoints
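
A truncated download only shows up later as a confusing load error, so a quick integrity check can save time. The sketch below assumes that the eight hex characters before `.pth` in the file name are the first eight characters of the file's SHA-256 digest. That naming convention is an assumption, not something stated in this post, and the helper names are hypothetical.

```python
import hashlib
import re
from pathlib import Path


def sha256_prefix(path, length=8):
    """Return the first `length` hex characters of the file's SHA-256 digest."""
    digest = hashlib.sha256()
    with open(path, "rb") as f:
        # Read in 1 MiB chunks so large checkpoints need little memory.
        for chunk in iter(lambda: f.read(1 << 20), b""):
            digest.update(chunk)
    return digest.hexdigest()[:length]


def name_matches_hash(path):
    """True if the '-xxxxxxxx.pth' suffix equals the file's SHA-256 prefix."""
    match = re.search(r"-([0-9a-f]{8})\.pth$", Path(path).name)
    return bool(match) and match.group(1) == sha256_prefix(path)


checkpoint = Path("checkpoints/mobilenet_v2_batch256_imagenet_20200708-3b2dc3af.pth")
if checkpoint.exists():
    print("checksum ok:", name_matches_hash(checkpoint))
```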

Next, download a test image:

wget https://media-cldnry.s-nbcnews.com/image/upload/t_fit-760w,f_auto,q_auto:best/rockcms/2022-10/bananas-mc-221004-02-3ddd88.jpg -O demo/banana.jpg

Finally, predict the image class (the code below assumes it is run from the mmclassification directory):

# Import dependencies
from mmcls.apis import inference_model, init_model, show_result_pyplot
# Config file and checkpoint
config_file = "configs/mobilenet_v2/mobilenet-v2_8xb32_in1k.py"
checkpoint_file = "checkpoints/mobilenet_v2_batch256_imagenet_20200708-3b2dc3af.pth"
# Device; here the first GPU
device = "cuda:0"
# Initialize the model
model = init_model(config_file, checkpoint_file, device=device)
# Run inference
img = "demo/banana.jpg"
result = inference_model(model, img)
# Draw the result on the image
show_result_pyplot(model, img, result)
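
To make the returned `result` less of a black box, here is a dependency-free sketch of how a top-1 prediction can be derived from raw class scores: apply softmax, then pick the largest probability. The key names `pred_label`, `pred_score`, and `pred_class` are an assumption about what `inference_model` returns in this version of mmcls, and the scores and class names are made up for illustration.

```python
import math


def top1(logits, classes):
    """Softmax the raw scores and return the best class in a result-like dict."""
    shift = max(logits)  # subtract the max for numerical stability
    exps = [math.exp(x - shift) for x in logits]
    total = sum(exps)
    probs = [e / total for e in exps]
    label = max(range(len(probs)), key=probs.__getitem__)
    return {
        "pred_label": label,
        "pred_score": probs[label],
        "pred_class": classes[label],
    }


# Made-up scores for three made-up classes.
demo = top1([0.2, 3.1, -1.0], ["apple", "banana", "cherry"])
print(demo["pred_class"], round(demo["pred_score"], 3))
```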

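Fine-tuning a model with mmclassification

We now have both the framework and a pretrained model. All that remains is to train again on our own dataset so that the model predicts more accurately on our own data.
First, download the dataset. Here we use the cats-and-dogs dataset, available at: microsoft-cats_dogs_dataset
Sign in to a Microsoft account to download it.

Next, save the downloaded archive and extract it into the mmclassification/data directory: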

unzip -q kagglecatsanddogs_5340.zip -d ./data/

Then arrange the data in the ImageNet layout:

import os
import shutil

import cv2

# Clean the data first; otherwise bad files make training fail
# (the error seen here was NameError: name 'dataset_type' is not defined)
# Remove non-JPEG files such as thumbnail caches
path = "data/PetImages"
catPath = "data/PetImages/Cat"
dogPath = "data/PetImages/Dog"
for x in os.listdir(catPath):
    if x.split(".")[-1] != "jpg":
        print(x)
        os.remove(os.path.join(catPath, x))

for x in os.listdir(dogPath):
    if x.split(".")[-1] != "jpg":
        print(x)
        os.remove(os.path.join(dogPath, x))
# Remove unreadable images
for root, dirs, files in os.walk(path):
    for f in files:
        if os.path.splitext(f)[1] == ".jpg":
            img_path = os.path.join(root, f)
            try:
                img = cv2.imread(img_path)
                if img is None:
                    # Delete images that cannot be decoded
                    print(img_path)
                    os.remove(img_path)
                else:
                    # Rewrite the file to get rid of "Corrupt JPEG data" warnings
                    cv2.imwrite(img_path, img)
            except Exception as e:
                print("Bad file:", img_path, e)
# Split the dataset 8:1:1
cats = sorted(os.listdir(catPath))
dogs = sorted(os.listdir(dogPath))

# zip() below stops at the shorter list, so size the split by the smaller class
total = min(len(cats), len(dogs))
train_num = int(total * 0.8)
val_num = int(total * 0.1)
test_num = total - train_num - val_num

training_cat = "data/cats_dogs_dataset/training_set/training_set/cats"
training_dog = "data/cats_dogs_dataset/training_set/training_set/dogs"
val_cat = "data/cats_dogs_dataset/val_set/val_set/cats"
val_dog = "data/cats_dogs_dataset/val_set/val_set/dogs"
test_cat = "data/cats_dogs_dataset/test_set/test_set/cats"
test_dog = "data/cats_dogs_dataset/test_set/test_set/dogs"

for p in [training_cat, training_dog, val_cat, val_dog, test_cat, test_dog]:
    # exist_ok lets the script be re-run without failing on existing folders
    os.makedirs(p, exist_ok=True)
# Write the class-name file
with open("data/cats_dogs_dataset/classes.txt", "w") as f:
    f.write("cats")
    f.write("\n")
    f.write("dogs")
# Copy the images and write the annotation files (label 0 = cats, 1 = dogs)
with open("data/cats_dogs_dataset/val.txt", "w") as fval, \
        open("data/cats_dogs_dataset/test.txt", "w") as ftest:
    for i, (c, d) in enumerate(zip(cats, dogs)):
        c_path = os.path.join(catPath, c)
        d_path = os.path.join(dogPath, d)
        c_new = "cat." + c
        d_new = "dog." + d
        if i < train_num:
            shutil.copy(c_path, os.path.join(training_cat, c_new))
            shutil.copy(d_path, os.path.join(training_dog, d_new))
        elif i < train_num + val_num:
            shutil.copy(c_path, os.path.join(val_cat, c_new))
            shutil.copy(d_path, os.path.join(val_dog, d_new))
            fval.write(f"cats/{c_new} 0\n")
            fval.write(f"dogs/{d_new} 1\n")
        else:
            shutil.copy(c_path, os.path.join(test_cat, c_new))
            shutil.copy(d_path, os.path.join(test_dog, d_new))
            ftest.write(f"cats/{c_new} 0\n")
            ftest.write(f"dogs/{d_new} 1\n")
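
The split arithmetic and the annotation format are easier to check in isolation, without touching the file system. The sketch below reproduces the 8:1:1 split and the `folder/filename label` lines on a toy list of ten file names. `split_names` and `annotation_lines` are hypothetical helpers written for this illustration.

```python
def split_names(names, train_frac=0.8, val_frac=0.1):
    """Split a list into train/val/test parts; the remainder goes to test."""
    total = len(names)
    train_num = int(total * train_frac)
    val_num = int(total * val_frac)
    train = names[:train_num]
    val = names[train_num:train_num + val_num]
    test = names[train_num + val_num:]
    return train, val, test


def annotation_lines(names, folder, prefix, label):
    """Format annotation lines as '<folder>/<prefix><name> <label>'."""
    return [f"{folder}/{prefix}{name} {label}" for name in names]


# Toy file names standing in for the real cat images.
toy_cats = [f"{i}.jpg" for i in range(10)]
train, val, test = split_names(toy_cats)
print(len(train), len(val), len(test))
print(annotation_lines(val, "cats", "cat.", 0))
```

Since the test part takes whatever is left after the train and validation parts, the three parts always add up to the full list, even when the fractions do not divide it evenly.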

After that, create the config file:
Put the following content into a new config file, configs/mobilenet_v2/mobilenet-v2_cats_dogs.py

_base_ = [
    '../_base_/models/mobilenet_v2_1x.py',
    '../_base_/datasets/imagenet_bs32_pil_resize.py',
    '../_base_/schedules/imagenet_bs256.py',
    '../_base_/default_runtime.py'
]

# model settings
model = dict(
    head=dict(
        num_classes=2,  # two classes
        topk=(1,),
    )
)

# dataset settings
dataset_type = 'ImageNet'
data = dict(
    samples_per_gpu=32,
    workers_per_gpu=1,
    train=dict(
        type=dataset_type,
        data_prefix='/workspace/mmclassification/data/cats_dogs_dataset/training_set/training_set',  # absolute path used here; replace as needed
        classes='/workspace/mmclassification/data/cats_dogs_dataset/classes.txt'
    ),
    val=dict(
        data_prefix='/workspace/mmclassification/data/cats_dogs_dataset/val_set/val_set',
        ann_file='/workspace/mmclassification/data/cats_dogs_dataset/val.txt',
        classes='/workspace/mmclassification/data/cats_dogs_dataset/classes.txt'
    ),
    test=dict(
        # held-out test split
        data_prefix='/workspace/mmclassification/data/cats_dogs_dataset/test_set/test_set',
        ann_file='/workspace/mmclassification/data/cats_dogs_dataset/test.txt',
        classes='/workspace/mmclassification/data/cats_dogs_dataset/classes.txt'
    )
)

evaluation = dict(metric_options={"topk": (1, )})

# optimizer
optimizer = dict(type='SGD', lr=0.01, momentum=0.9, weight_decay=0.0001)
optimizer_config = dict(grad_clip=None)
# learning policy
lr_config = dict(policy='step', step=[3, 6, 9])
runner = dict(type='EpochBasedRunner', max_epochs=12)


# initialize from the pretrained checkpoint
load_from = "/workspace/mmclassification/checkpoints/mobilenet_v2_batch256_imagenet_20200708-3b2dc3af.pth"
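
The config only lists the fields that differ from the files in `_base_`; everything else is inherited. The sketch below illustrates the idea with a plain recursive dictionary merge. It is a simplified stand-in and not mmcv's actual implementation, and the `base` dictionary is an abbreviated, assumed excerpt of the base model config.

```python
def merge_config(base, override):
    """Recursively merge `override` into a copy of `base`."""
    merged = dict(base)
    for key, value in override.items():
        if isinstance(value, dict) and isinstance(merged.get(key), dict):
            # Nested dicts are merged field by field.
            merged[key] = merge_config(merged[key], value)
        else:
            # Anything else replaces the inherited value.
            merged[key] = value
    return merged


# Assumed, abbreviated excerpt of the inherited model settings.
base = {"model": {"head": {"type": "LinearClsHead", "num_classes": 1000, "topk": (1, 5)}}}
override = {"model": {"head": {"num_classes": 2, "topk": (1,)}}}
print(merge_config(base, override))
```

This is why the `val` and `test` entries can omit `type`: fields that are not overridden keep the value from the base dataset config.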

Finally, start training:

cd mmclassification
# The work_dirs/mobilenet-v2 folder is created automatically under the current directory; results, logs, and so on are saved there
python tools/train.py configs/mobilenet_v2/mobilenet-v2_cats_dogs.py --work-dir work_dirs/mobilenet-v2
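
During this run the learning rate follows the `step` policy from the config, with `lr=0.01` and `step=[3, 6, 9]` over 12 epochs. The sketch below shows the resulting schedule under the assumption that the rate is multiplied by a decay factor of 0.1 at each listed epoch. That factor is not set in the config above, so treat it as an assumed default.

```python
def step_lr(epoch, base_lr=0.01, steps=(3, 6, 9), gamma=0.1):
    """Learning rate for a 0-based epoch under a step decay policy.

    gamma=0.1 is an assumed default; it does not appear in the config.
    """
    num_decays = sum(1 for s in steps if epoch >= s)
    return base_lr * gamma ** num_decays


for epoch in range(12):
    print(epoch, step_lr(epoch))
```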

After training, the new model achieves higher accuracy on the cats-and-dogs data.
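
The metric configured above is top-1 accuracy: the fraction of images whose predicted label equals the label in the annotation file. A minimal sketch of that computation follows. The helper names are hypothetical, and the predictions are made up purely to show the arithmetic; they are not results from this experiment.

```python
def labels_from_annotations(lines):
    """Extract the integer label from '<path> <label>' annotation lines."""
    return [int(line.rsplit(" ", 1)[1]) for line in lines if line.strip()]


def top1_accuracy(predicted, expected):
    """Fraction of positions where the predicted label equals the expected one."""
    if len(predicted) != len(expected):
        raise ValueError("predicted and expected must have the same length")
    if not expected:
        return 0.0
    correct = sum(p == e for p, e in zip(predicted, expected))
    return correct / len(expected)


expected = labels_from_annotations([
    "cats/cat.1.jpg 0",
    "dogs/dog.1.jpg 1",
    "cats/cat.2.jpg 0",
    "dogs/dog.2.jpg 1",
])
predicted = [0, 1, 1, 1]  # made-up predictions, for illustration only
print(top1_accuracy(predicted, expected))
```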
