06月26, 2021

Image Classification using Pre-trained models

使用预训练模型完成图像分类

ImageNet 数据集介绍

共有1000个分类, 1400万张图片. 大量模型使用imagenet作为基准数据集训练模型

ImageNet dataset has over 14 million images maintained by Stanford University. It is extensively used for a large variety of Image related deep learning projects. The images belong to various classes or labels. Even though we can use both the terms interchangeably, we will stick to classes. The aim of the pre-trained models like AlexNet and ResNet101 is to take an image as an input and predict it’s class.

执行步骤

  1. Reading the input image
  2. Performing transformations on the image. For example – resize, center crop, normalization, etc.
  3. Forward Pass: Use the pre-trained weights to find out the output vector. Each element in this output vector describes the confidence with which the model predicts the input image to belong to a particular class.
  4. Based on the scores obtained (elements of the output vector we mentioned in step-3), display the predictions.

加载预训练模型

from torchvision import models
import torch

dir(models)
alexnet = models.alexnet(pretrained=True)

# You will see a similar output as below
# Downloading: "https://download.pytorch.org/models/alexnet-owt- 4df8aa71.pth" to /home/hp/.cache/torch/checkpoints/alexnet-owt-4df8aa71.pth

### 指定图像转换

from torchvision import transforms
transform = transforms.Compose([            #[1]
 transforms.Resize(256),                    #[2]
 transforms.CenterCrop(224),                #[3]
 transforms.ToTensor(),                     #[4]
 transforms.Normalize(                      #[5]
 mean=[0.485, 0.456, 0.406],                #[6]
 std=[0.229, 0.224, 0.225]                  #[7]
 )])

Line [1]: Here we are defining a variable transform which is a combination of all the image transformations to be carried out on the input image.

Line [2]: Resize the image to 256×256 pixels.

Line [3]: Crop the image to 224×224 pixels about the center.

Line [4]: Convert the image to PyTorch Tensor data type.

Line [5-7]: Normalize the image by setting its mean and standard deviation to the specified values.

加载图片和预处理

# Import Pillow
from PIL import Image
img = Image.open("dog.jpg")
img_t = transform(img)
batch_t = torch.unsqueeze(img_t, 0)

alexnet.eval()
out = alexnet(batch_t)
print(out.shape)

读取类别标签并转换成预测结果

with open('imagenet_classes.txt') as f:
  classes = [line.strip() for line in f.readlines()]

_, index = torch.max(out, 1)

percentage = torch.nn.functional.softmax(out, dim=1)[0] * 100

print(labels[index[0]], percentage[index[0]].item())

_, indices = torch.sort(out, descending=True)
[(labels[idx], percentage[idx].item()) for idx in indices[0][:5]]

本文链接:http://57km.cc/post/Image Classification using Pre-trained models.html

-- EOF --

Comments