Project: Dog vs Cat Classifier using PyTorch

Overview

The dog vs. cat classifier is a "hello world" project for anyone learning deep learning. It gives you a brief idea of how to build an image classifier with the PyTorch library. This article offers a head start with PyTorch, computer vision, and classification in deep learning. So, without further ado, let's explore and learn.

What Are We Building?

In this project, we will build a model that predicts whether an image shows a dog or a cat. Along the way, we will explore how to approach a classification problem with the PyTorch library and learn how to train, validate, and test the model.

Pre-requisites

The prerequisites for this project are knowledge of Python, the concepts of neural networks (convolutional neural networks in particular), and basic PyTorch. We will write the code in a Jupyter Notebook or Google Colab, so some familiarity with these tools will help.

How Are We Going to Build This?

We will follow part of the machine learning life cycle to complete the project. The general steps of a machine learning project are:

  1. Acquisition of Data
  2. Preprocessing and extracting features from data
  3. Model Preparation, training, and validating
  4. Testing and reiterating some of the previous steps for performance enhancement
  5. Deployment of the model

We will focus on most of these steps in this project. The general approach is visualized in the block diagram below.

Figure: Block diagram of the dog vs. cat classifier project

Requirements

The requirements for this project are divided into three sections.

  1. Python Libraries used in the project

  2. A Kaggle account for downloading the data, along with the kaggle.json API token file.

  3. Google Colab or any other notebook platform that can run the code with GPU acceleration. Training on a CPU is slow because of the amount of computation involved, so a GPU-backed platform is preferred. If your own system has a GPU, you can use a local Jupyter Notebook.

Building the Classifier

Downloading Cat vs. Dog Image Data from Kaggle

Before downloading the data, you must have a kaggle.json file in the project directory. This is the API token file, which you can generate from your Kaggle account settings. If you are using Google Colab, upload the file to the Colab session before running the download.
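As a rough sketch of this step, the snippet below copies the token to the place where the Kaggle command-line tool normally looks for it (~/.kaggle/kaggle.json) and restricts its permissions. The helper name install_kaggle_token and its parameters are hypothetical. In Colab, you would first upload the file, for example with files.upload() from the google.colab module.

```python
import os
import shutil
from pathlib import Path


def install_kaggle_token(token_path="kaggle.json", config_dir=None):
    """Copy the Kaggle API token into the config directory (hypothetical helper)."""
    # Assumption: the Kaggle CLI reads its token from ~/.kaggle/kaggle.json.
    config_dir = Path(config_dir) if config_dir else Path.home() / ".kaggle"
    config_dir.mkdir(parents=True, exist_ok=True)
    target = config_dir / "kaggle.json"
    shutil.copy(token_path, target)
    # Restrict access to the owner, since the file contains an API key.
    os.chmod(target, 0o600)
    return target
```

Calling install_kaggle_token() with no arguments copies kaggle.json from the current directory into the default location.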

Now, let's write a Bash script that downloads the data, places it in the right directories, and removes the junk files.

Create the Bash script

Script to download Data

Write the script in a file named "download_data.sh". The script downloads the data zip files, extracts them, and places the contents in their assigned directories. You can customize it to perform more tasks.
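The script itself is written in Bash. For readers who prefer to stay inside the notebook, here is a Python sketch of the extract-and-clean part only (not the download), assuming the zip files are already on disk. The helper name extract_archive is hypothetical, and treating __MACOSX folders and .DS_Store files as the "junk" is an assumption.

```python
import shutil
import zipfile
from pathlib import Path


def extract_archive(zip_path, target_dir, junk_names=("__MACOSX", ".DS_Store")):
    """Extract zip_path into target_dir and delete junk entries (hypothetical helper)."""
    target_dir = Path(target_dir)
    target_dir.mkdir(parents=True, exist_ok=True)
    with zipfile.ZipFile(zip_path) as archive:
        archive.extractall(target_dir)
    # Assumption: "junk" means OS metadata such as __MACOSX folders or .DS_Store files.
    # Reverse order visits children before their parent directories.
    for path in sorted(target_dir.rglob("*"), reverse=True):
        if path.name in junk_names:
            if path.is_dir():
                shutil.rmtree(path)
            else:
                path.unlink()
    return target_dir
```

For example, extract_archive("train.zip", "data/train") would unpack a hypothetical train.zip into data/train.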

Run the Bash script

After the script finishes, the data is available in the data/ directory, with the train and test data in their respective subdirectories.

1. Importing Libraries

Let's import the libraries and functions we need to build this classifier.

2. Loading the Data

First, let's organize the data into a fixed directory layout, which makes it straightforward to write our data loader module.

The code for structuring the data is given below.
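One possible way to structure the data is sketched here. It assumes the extracted training images sit in a single folder and are named with their class as a prefix (for example cat.0.jpg or dog.12.jpg), and it moves them into one subfolder per class. The helper name organize_by_class, the file-naming pattern, and the one-folder-per-class layout are all assumptions.

```python
import shutil
from pathlib import Path


def organize_by_class(source_dir, target_dir, classes=("cat", "dog")):
    """Move images into target_dir/<class>/ based on their file-name prefix."""
    source_dir, target_dir = Path(source_dir), Path(target_dir)
    counts = {name: 0 for name in classes}
    for name in classes:
        (target_dir / name).mkdir(parents=True, exist_ok=True)
    # Assumption: files are named like "cat.0.jpg" or "dog.12.jpg".
    for image_path in sorted(source_dir.glob("*.jpg")):
        label = image_path.name.split(".")[0]
        if label in counts:
            shutil.move(str(image_path), str(target_dir / label / image_path.name))
            counts[label] += 1
    return counts
```

The function returns the number of images moved per class, which is a quick way to confirm that the two classes are roughly balanced.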

Now, we will write the code that splits the data into training and validation sets. We will also write the data loader, which loads the data batch by batch while training and validating the model. Our dataset class inherits from the Dataset class in torch.utils.data, which provides the features used during training. Three magic methods in this class matter most (__init__, __len__, and __getitem__); we will write them to suit our application.

3. Visualizing the Data

Next, we will visualize the data from the train data loader, using Matplotlib to display the augmented images. The images arrive in batches; we will display one batch of 16 images.
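A small plotting helper along these lines can display a batch. It is a sketch that assumes the batch is a tensor of shape (N, 3, H, W) with pixel values roughly in [0, 1] and that label 0 means cat and 1 means dog. The name show_batch is hypothetical.

```python
import matplotlib.pyplot as plt


def show_batch(images, labels, class_names=("cat", "dog"), columns=4):
    """Plot a batch of CHW image tensors in a grid and return the figure."""
    rows = (len(images) + columns - 1) // columns
    fig, axes = plt.subplots(rows, columns, squeeze=False,
                             figsize=(2.5 * columns, 2.5 * rows))
    for ax in axes.flat:
        ax.axis("off")
    for ax, image, label in zip(axes.flat, images, labels):
        # Matplotlib expects height x width x channels with values in [0, 1].
        ax.imshow(image.permute(1, 2, 0).clamp(0, 1).numpy())
        ax.set_title(class_names[int(label.item())])
    fig.tight_layout()
    return fig
```

In a notebook, calling show_batch(images, labels) on one batch from the train loader and then plt.show() renders a 4x4 grid for a batch of 16.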

Figure: One batch of augmented dog and cat training images

4. Building the Model

Let's build the model using a pre-trained neural network called VGG16. Using a pre-trained network in this way is known as transfer learning; it lets us train the model faster and reach better results, since the network already carries knowledge from its earlier training. The main points to remember before building the model are:

  1. The input has 3 channels (RGB).
  2. The last layer (output layer) should be either a sigmoid with one output or a softmax with two outputs.

5. Training the Model

We will write some functions to make our tasks easier. The points to remember before writing the code are:

  1. The loss function should be Binary Cross Entropy Loss (BCELoss) if the last layer is a sigmoid. For a two-output classifier, use Cross Entropy Loss instead; note that PyTorch's CrossEntropyLoss applies the softmax internally, so the model should then output raw scores (logits).
  2. Any optimizer can be used; in this case, we use Adam. You can also adjust the learning rate.
  3. The data, labels, and model must be on the same device.
  4. While writing the train_batch snippet, remember the below points:
    • Call optimizer.zero_grad() first; it resets the gradients left over from the previous batch so that they do not accumulate.
    • After computing the loss with the criterion, call loss.backward(); this performs the backward propagation and computes the gradients.
    • Call optimizer.step() to update the model parameters using the computed gradients.
  5. The points to note for the validate_batch code are:
    • Do not call .backward() or .step(), because the model is not being trained on this batch.
    • Run the model under torch.no_grad(); this disables gradient tracking, which saves memory and computation.
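The points above can be sketched as the two functions below. The sketch assumes a model that ends in a single sigmoid output and labels of shape (batch, 1); the exact signatures and the 0.5 decision threshold are illustrative choices. Switching between model.train() and model.eval() is standard practice rather than something listed above.

```python
import torch


def train_batch(model, images, labels, criterion, optimizer, device):
    """Run one optimization step and return (loss, accuracy) for the batch."""
    model.train()
    images, labels = images.to(device), labels.to(device)
    optimizer.zero_grad()              # clear gradients left from the previous batch
    outputs = model(images)            # sigmoid probabilities, shape (batch, 1)
    loss = criterion(outputs, labels)
    loss.backward()                    # backpropagate to compute gradients
    optimizer.step()                   # update the parameters
    accuracy = ((outputs > 0.5).float() == labels).float().mean().item()
    return loss.item(), accuracy


@torch.no_grad()
def validate_batch(model, images, labels, criterion, device):
    """Evaluate one batch without tracking gradients or updating weights."""
    model.eval()
    images, labels = images.to(device), labels.to(device)
    outputs = model(images)
    loss = criterion(outputs, labels)
    accuracy = ((outputs > 0.5).float() == labels).float().mean().item()
    return loss.item(), accuracy
```

Both functions return plain Python floats, so the per-batch values can be averaged per epoch without holding on to the computation graph.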

Next, we will write the loop that trains and validates the model. The loop runs for the number of epochs we choose, and the tqdm library is used to display training and validation progress. Inside the main loop are two inner loops, one for training and one for validation, which run one after the other. Each inner loop yields the data batch by batch and passes it to train_batch or validate_batch, which we defined earlier. Finally, we compute and print the losses and accuracy.
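The structure of that loop can be sketched in plain Python as follows. To keep the sketch independent of any particular model, it takes the per-batch work as two callables, train_step and val_step, each accepting (images, labels) and returning (loss, accuracy). These names, the fit function, and the history format are hypothetical, and the tqdm progress bar is left out for brevity.

```python
def fit(train_loader, val_loader, train_step, val_step, epochs=5):
    """Run training and validation passes for each epoch; return a history list."""
    history = []
    for epoch in range(1, epochs + 1):
        stats = {"epoch": epoch}
        # The two passes run one after the other within each epoch.
        for phase, loader, step in (("train", train_loader, train_step),
                                    ("val", val_loader, val_step)):
            losses, accuracies = [], []
            for images, labels in loader:  # wrap loader in tqdm(...) for a progress bar
                loss, accuracy = step(images, labels)
                losses.append(loss)
                accuracies.append(accuracy)
            # Average the per-batch values to get one number per epoch.
            stats[f"{phase}_loss"] = sum(losses) / max(len(losses), 1)
            stats[f"{phase}_acc"] = sum(accuracies) / max(len(accuracies), 1)
        print(stats)
        history.append(stats)
    return history
```

In practice, train_step would be a small wrapper (a lambda or functools.partial) that calls train_batch with the model, criterion, optimizer, and device already filled in, and val_step would wrap validate_batch in the same way.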

6. Results & Inferences

The results of this model are:

Process     BCE Loss    Accuracy (%)    F1-Score (%)
Train       0.072       98.66           98.84
Validate    0.0592      98.08           0.0

You can improve the model by training it for more epochs or by tuning the hyperparameters.
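For reference, accuracy and F1-score for a binary task can be computed from 0/1 predictions as in this sketch. It treats label 1 as the positive class and reports both metrics in percent; the function name binary_metrics is hypothetical, and this is not necessarily how the numbers in the table were produced.

```python
def binary_metrics(predictions, targets):
    """Return accuracy and F1-score (both in percent) for 0/1 labels."""
    pairs = list(zip(predictions, targets))
    tp = sum(1 for p, t in pairs if p == 1 and t == 1)  # true positives
    fp = sum(1 for p, t in pairs if p == 1 and t == 0)  # false positives
    fn = sum(1 for p, t in pairs if p == 0 and t == 1)  # false negatives
    correct = sum(1 for p, t in pairs if p == t)
    accuracy = 100.0 * correct / len(pairs)
    # F1 = 2*TP / (2*TP + FP + FN); defined here as 0 when the denominator is 0.
    denominator = 2 * tp + fp + fn
    f1 = 100.0 * 2 * tp / denominator if denominator else 0.0
    return accuracy, f1
```

With predictions [1, 1, 0, 0] and targets [1, 0, 0, 1], there is one true positive, one false positive, and one false negative, so both accuracy and F1 come out at 50.0.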

7. Save the Model to the Disk

Finally, save the trained model to disk so that it can be reloaded later without retraining.
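A common way to do this in PyTorch is to save the model's state dict, as sketched here. The file name cat_dog_vgg16.pth and the helper names save_model and load_model are hypothetical, and loading assumes you first rebuild a model with the same architecture.

```python
import torch


def save_model(model, path="cat_dog_vgg16.pth"):
    """Save only the learned parameters (the state dict) to disk."""
    torch.save(model.state_dict(), path)
    return path


def load_model(model, path="cat_dog_vgg16.pth", device="cpu"):
    """Load saved parameters into a model built with the same architecture."""
    model.load_state_dict(torch.load(path, map_location=device))
    model.eval()  # switch to inference mode after loading
    return model
```

Saving the state dict rather than the whole model object keeps the file independent of the exact class definition, at the cost of having to rebuild the architecture before loading.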

Conclusion

Here are some takeaways from this project:

  1. Process the data according to your needs, and write scripts or functions to make tasks easier.
  2. Use augmentations, especially for images, together with the DataLoader and Dataset classes.
  3. Prefer a transfer learning approach over training a neural network from scratch; the pre-trained network's existing knowledge helps the model learn faster and better.
  4. It is good practice to track more than one metric unless you have a specific reason to rely on a single one.