Optimizing Models for CPU-based Deployments in Keras

Overview

Model optimization plays an important role in deployment. Currently, it is much easier to train a machine learning or deep learning model than to deploy it in production. Deployment involves many factors, such as model size, loading time, and inference latency, and this is where optimization comes into play. This article discusses how to optimize your machine learning models for CPU-based deployments.

Introduction to ONNX

Open Neural Network Exchange, known as ONNX, is an open format for representing machine learning models, with supporting libraries in Python and other languages. Its main objective is to serve as a common format for deep learning models so that they can move easily from one framework to another. It was originally developed by Facebook and Microsoft.

To learn more about ONNX, refer to the official ONNX documentation.

Advantages of ONNX

  • You can convert models between ONNX and frameworks such as TensorFlow, Keras, and PyTorch.
  • ONNX models served with an optimized runtime often have lower inference latency than the same model run in its original framework.
  • You can deploy ONNX models in C, C++, or Java environments.

Optimizing Keras Models with ONNX

This section walks through converting a Keras model to the ONNX format.

Setup

We will need a few Python libraries for this walkthrough.

Note: You can install these libraries via pip. For example, to install numpy, run "pip install numpy" in your terminal.
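
Before continuing, it can help to check which libraries are already installed. The sketch below assumes the walkthrough needs numpy, tensorflow, tf2onnx, and onnxruntime. Only numpy is named in the text, so the rest of the list is an assumption, and `missing_packages` is a hypothetical helper name.

```python
from importlib.util import find_spec

# Assumed package list for this walkthrough; only numpy is named in the text.
ASSUMED_PACKAGES = ["numpy", "tensorflow", "tf2onnx", "onnxruntime"]


def missing_packages(names):
    """Return the names that cannot be found in the current environment."""
    return [name for name in names if find_spec(name) is None]


missing = missing_packages(ASSUMED_PACKAGES)
if missing:
    # Build the pip command for whatever is not installed yet.
    print("Install with: pip install " + " ".join(missing))
else:
    print("All assumed packages are available.")
```

The check only looks up each package without importing it, so it stays fast even when TensorFlow is installed.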


Convert the Model

We will use a pre-trained ResNet50 model to demonstrate converting a Keras model into an ONNX model. You can follow the same steps with your own custom Keras model.

Let's look into the code,
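
A minimal sketch of such a conversion follows. It assumes TensorFlow's bundled Keras and the tf2onnx package, in versions that are compatible with each other. Neither library is named in the text, and the helper names and the opset value are illustrative choices rather than the article's own code.

```python
def output_paths(stem):
    """Return the (Keras HDF5, ONNX) file names for a given file stem."""
    return stem + ".h5", stem + ".onnx"


def build_pretrained_resnet50():
    """Load ResNet50 with ImageNet weights (downloads them on first use)."""
    import tensorflow as tf  # imported lazily: heavy dependency

    return tf.keras.applications.ResNet50(weights="imagenet")


def convert_keras_to_onnx(model, stem, opset=13):
    """Save `model` as HDF5 and as ONNX, and return both paths."""
    import tf2onnx.convert  # assumed converter; not named in the text

    h5_path, onnx_path = output_paths(stem)

    # The .h5 suffix selects the legacy HDF5 format.
    model.save(h5_path)

    # from_keras writes the ONNX file when output_path is given.
    tf2onnx.convert.from_keras(model, opset=opset, output_path=onnx_path)
    return h5_path, onnx_path
```

Calling `convert_keras_to_onnx(build_pretrained_resnet50(), "model-resnet50-final")` would write files with the names used in this article. A custom model can be passed in place of the pre-trained one.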

After executing the above code, we get two models, i.e., model-resnet50-final.h5 and model-resnet50-final.onnx. This is how we can convert Keras models into ONNX format.

Inference of the ONNX Model

Now let's run inference with the ONNX model.
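
One way to do this is with ONNX Runtime on the CPU. The sketch below is an assumption-laden illustration: it presumes the onnxruntime package, a model that takes a single float32 image batch of shape (1, 224, 224, 3), and a random array standing in for a real preprocessed image. The function names are hypothetical.

```python
def argmax(values):
    """Index of the largest element in a flat sequence."""
    return max(range(len(values)), key=lambda i: values[i])


def predict_onnx(onnx_path, batch):
    """Run one batch through the ONNX model on the CPU; return the first output."""
    import onnxruntime as ort  # imported lazily: assumed runtime

    session = ort.InferenceSession(onnx_path, providers=["CPUExecutionProvider"])
    # Look up the input name instead of hard-coding it.
    input_name = session.get_inputs()[0].name
    outputs = session.run(None, {input_name: batch})
    return outputs[0]


def demo(onnx_path="model-resnet50-final.onnx"):
    """Classify a random stand-in image and print the top class index."""
    import numpy as np

    # Random data with the assumed input shape; replace with a real image.
    batch = np.random.rand(1, 224, 224, 3).astype(np.float32)
    scores = predict_onnx(onnx_path, batch)
    print("output shape:", scores.shape)
    print("predicted class index:", argmax(scores[0].tolist()))
```

Reading the input name from the session keeps the sketch independent of whatever name the converter assigned to the input tensor.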

Comparison Between Evaluation Metric, Model Sizes, Latency, and Throughput

Now let's compare the two models.

Let's start with the size of both models. The Keras model file is 317.4 MB, whereas the ONNX model file is about 286.3 MB, so the ONNX file is roughly 31 MB (about 10%) smaller.
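
The size comparison can be reproduced with the standard library. This sketch assumes 1 MB means 1,000,000 bytes, which the text does not specify, and the helper names are illustrative.

```python
import os


def size_mb(path):
    """File size in megabytes, assuming 1 MB = 1,000,000 bytes."""
    return os.path.getsize(path) / 1_000_000


def size_reduction_percent(original_mb, converted_mb):
    """How much smaller the converted file is, as a percentage of the original."""
    return 100.0 * (original_mb - converted_mb) / original_mb


# Using the two sizes quoted above.
print(round(size_reduction_percent(317.4, 286.3), 1))  # → 9.8
```

On real files, `size_mb("model-resnet50-final.h5")` and `size_mb("model-resnet50-final.onnx")` would supply the two inputs.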

Now let's measure the model loading time for Keras and ONNX, starting with the Keras model.
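
A minimal way to time the Keras load is to wrap the call with a wall-clock timer. The sketch assumes TensorFlow's bundled Keras and the .h5 file name used earlier; `time_call` and `load_keras` are hypothetical helper names.

```python
import time


def time_call(fn, *args):
    """Return (result, elapsed_seconds) for a single call to fn(*args)."""
    start = time.perf_counter()
    result = fn(*args)
    return result, time.perf_counter() - start


def load_keras(path):
    """Load a saved Keras model from disk."""
    import tensorflow as tf  # imported lazily: heavy dependency

    return tf.keras.models.load_model(path)


def report_keras_load(path="model-resnet50-final.h5"):
    """Print how long a single load of the Keras model takes."""
    _, seconds = time_call(load_keras, path)
    print(f"Keras load time: {seconds:.3f} s")
```

A single measurement includes one-off costs such as disk caching, so the first run in a fresh process may be slower than later ones.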

Next, let's measure the loading time for the ONNX model.
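
For the ONNX side, loading amounts to creating an inference session. The sketch below assumes ONNX Runtime as the loader and takes the median of several runs to reduce noise. Both choices are assumptions and not necessarily how the figures in this article were obtained.

```python
import statistics
import time


def median_time(fn, repeats=5):
    """Median wall-clock seconds over `repeats` calls to fn()."""
    samples = []
    for _ in range(repeats):
        start = time.perf_counter()
        fn()
        samples.append(time.perf_counter() - start)
    return statistics.median(samples)


def load_onnx(path="model-resnet50-final.onnx"):
    """Create a CPU inference session for the ONNX model."""
    import onnxruntime as ort  # imported lazily: assumed runtime

    return ort.InferenceSession(path, providers=["CPUExecutionProvider"])


def report_onnx_load(path="model-resnet50-final.onnx"):
    """Print the median time needed to create the ONNX session."""
    seconds = median_time(lambda: load_onnx(path))
    print(f"ONNX load time (median): {seconds:.3f} s")
```

For a fair comparison, the Keras load should be measured with the same number of repeats.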

Comparing the two measurements shows which model loads faster.

Now let's measure the inference time of both models.
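
Latency and throughput can be measured with one small harness that accepts any prediction function. This is a sketch under assumptions: the warm-up count, run count, and helper names are illustrative, and the article's own benchmark settings are unknown.

```python
import time


def benchmark(predict, batch, batch_size, runs=20, warmup=3):
    """Return (mean latency in seconds per call, throughput in samples per second)."""
    # Warm-up calls are excluded so one-off setup costs do not skew the result.
    for _ in range(warmup):
        predict(batch)

    start = time.perf_counter()
    for _ in range(runs):
        predict(batch)
    elapsed = time.perf_counter() - start

    latency = elapsed / runs
    throughput = batch_size * runs / elapsed
    return latency, throughput


def speedup(baseline_latency, candidate_latency):
    """How many times faster the candidate is than the baseline."""
    return baseline_latency / candidate_latency
```

With a loaded Keras model, `predict` could be `lambda x: model.predict(x, verbose=0)`. With an ONNX Runtime session it could be `lambda x: session.run(None, {input_name: x})`. Passing the two resulting latencies to `speedup` gives the ratio between the models.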

In this test, we can conclude that the ONNX model is about 2.7 times faster than the Keras model at inference.

Conclusion

  • In this article, we discussed model optimization with ONNX.
  • We went through the process of converting a Keras model to the ONNX format.
  • We compared the ONNX model with the original Keras model in terms of file size, loading time, and inference speed.