Efficient Multilabel Classification in E-commerce

Updated 23 October 2023


Multilabel classification is an important task in computer vision that involves predicting multiple labels or categories from an image.

Multilabel classification is useful on e-commerce platforms such as Bagisto, where it can power product search by image. Google Lens, for example, lets users search for products by image.

Traditionally, this has been achieved by training a separate model for each label and then merging their predictions. However, this approach can be time-consuming and inefficient, because every prediction requires running several models.

Slow response time is bad for the user experience. To reduce the response time, we use a single model that predicts all labels at once. In this blog post, I describe how to develop an efficient multilabel classification architecture and how to use it in e-commerce.

1. Single Multiclass Classification Model:

I began with a multiclass classification model, which assigns each image exactly one label from a set of categories.

However, this approach required training a different model for each label and then combining their predictions to obtain the final multilabel output.


Load Different Models:

In this step, we load the trained models using the Keras library.

Define Prediction Function:

In this step, we predict a label from each model using model.predict. model.predict returns a probability for each category, and np.argmax selects the index of the highest probability, which we map to a label.
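The prediction step described above can be sketched roughly as follows. This is a minimal illustration, not the original code: the label lists, the function name `predict_with_separate_models`, and the `StubModel` class are assumptions made up for the example. In practice each entry of `models` would be a trained Keras model (for instance one loaded with `tensorflow.keras.models.load_model`), and the label lists would come from the training data.

```python
import numpy as np

# Hypothetical label vocabularies; the real lists come from the training data.
LABELS = {
    "color": ["Black", "Blue", "Red"],
    "master_category": ["Accessories", "Clothes", "Footwear"],
    "category": ["Jeans", "Shirt", "T-shirt"],
    "sub_category": ["Casual", "Formal", "Sports"],
}


def predict_with_separate_models(models, image_batch):
    """Run every single-label model on the same batch and merge the labels."""
    result = {}
    for name, model in models.items():
        probabilities = model.predict(image_batch)      # shape: (batch, n_classes)
        class_index = int(np.argmax(probabilities[0]))  # most likely class of image 0
        result[name] = LABELS[name][class_index]
    return result


class StubModel:
    """Stand-in for a loaded Keras model; always returns fixed probabilities."""

    def __init__(self, probabilities):
        self._probabilities = np.array([probabilities])

    def predict(self, image_batch):
        return self._probabilities


stub_models = {
    "color": StubModel([0.1, 0.2, 0.7]),
    "master_category": StubModel([0.1, 0.8, 0.1]),
    "category": StubModel([0.2, 0.1, 0.7]),
    "sub_category": StubModel([0.6, 0.3, 0.1]),
}
dummy_batch = np.zeros((1, 64, 64, 3), dtype="float32")
merged = predict_with_separate_models(stub_models, dummy_batch)
print(merged)
```

Note that the loop calls `predict` once per model, so the cost of a request grows with the number of labels.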


The result looks like this: color: Red, master_category: Clothes, category: T-shirt, and sub_category: Casual.

I trained separate models for the color, master category, category, and sub-category of a product. Each model predicts its own label, and we merge their predictions into one result.

Suppose these models power visual shopping on an e-commerce website: a customer uploads a photo, the models predict the product's attributes, and the site shows similar products to the customer.

Although accurate, this method suffered from slow response times due to the need for multiple models.


2. Researching Efficient Multilabel Classification:

To overcome the slow response time, I did further research and found an alternative approach that could yield more efficient results.

The key insight was to design a single model architecture capable of predicting all labels simultaneously.

By combining the architecture of the individual models, I could eliminate the need for merging predictions and achieve faster inference times.

3. Building the Multilabel Classification Model:

To implement this approach, I organized the model architecture as a class that encapsulates the architecture design and functionality.

Within this class, I defined separate functions, each responsible for constructing the architecture of one label.


Import Necessary libraries

We import the following layers from tensorflow.keras.layers:

Conv2D, MaxPooling2D: for feature extraction from images.
Flatten: for flattening the feature maps into a vector.
Dense: a fully connected neural network layer.
Dropout, Activation, and BatchNormalization: commonly used in deep learning architectures to improve model performance, prevent overfitting, and accelerate convergence during training.


The label-specific functions contain the layers and connections unique to each label. By linking these functions together within the “build_model” function, I created a single architecture capable of handling all labels simultaneously.

The class takes the pixel dimensions of the image as input. Once these values are set for training, images must have the same dimensions at prediction time.

The “build_model” function initializes the CNN model structure and connects it to the subsequent label-specific architecture functions, setting up the input layer and shared convolutional layers that are common to all labels.
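A class along these lines might look like the sketch below. It is an assumption-laden illustration rather than the original architecture: the class name `MultiLabelNet`, the method names, the layer sizes, and the dropout rate are all hypothetical, and only the overall shape (shared convolutional layers in `build_model`, one function per label) follows the description above.

```python
from tensorflow.keras import Input, Model
from tensorflow.keras.layers import (
    Activation,
    BatchNormalization,
    Conv2D,
    Dense,
    Dropout,
    Flatten,
    MaxPooling2D,
)


class MultiLabelNet:
    """Hypothetical sketch: shared convolutional layers plus one branch per label."""

    def __init__(self, height, width, num_classes):
        self.input_shape = (height, width, 3)  # fixed once the model is built
        self.num_classes = num_classes         # e.g. {"color": 10, ...}

    def _branch(self, features, name):
        # Label-specific layers; the sizes here are illustrative only.
        x = Dense(64)(features)
        x = Activation("relu")(x)
        x = BatchNormalization()(x)
        x = Dropout(0.5)(x)
        return Dense(self.num_classes[name], activation="softmax", name=name)(x)

    # One function per label, mirroring the description in the text.
    def color_branch(self, features):
        return self._branch(features, "color")

    def master_category_branch(self, features):
        return self._branch(features, "master_category")

    def category_branch(self, features):
        return self._branch(features, "category")

    def sub_category_branch(self, features):
        return self._branch(features, "sub_category")

    def build_model(self):
        inputs = Input(shape=self.input_shape)
        x = inputs
        for filters in (32, 64):  # convolutional layers shared by all labels
            x = Conv2D(filters, (3, 3), padding="same", activation="relu")(x)
            x = MaxPooling2D((2, 2))(x)
        features = Flatten()(x)
        outputs = [
            self.color_branch(features),
            self.master_category_branch(features),
            self.category_branch(features),
            self.sub_category_branch(features),
        ]
        return Model(inputs=inputs, outputs=outputs, name="multilabel_net")
```

Because every branch reads the same `features` tensor, the convolutional layers run once per image regardless of how many labels are predicted.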

Train & Save Model

In this step, we build the model from the class, then train it and save it for later inference.
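As a rough sketch of this step, a multi-output Keras model can be compiled with one loss per output and trained on a list of label arrays. The helper name `compile_and_train`, the optimizer, the loss choice, and the file name in the comment are assumptions for illustration; the tiny two-output model and random data below exist only to show the call shape and stand in for the real model and dataset.

```python
import numpy as np
from tensorflow.keras import Input, Model
from tensorflow.keras.layers import Dense, Flatten


def compile_and_train(model, images, labels, epochs=1, batch_size=8, save_path=None):
    """Compile a multi-output model with one loss per output, fit it, optionally save it.

    `labels` is a list of integer class-index arrays, in the same order as
    the model's outputs.
    """
    model.compile(
        optimizer="adam",
        loss=["sparse_categorical_crossentropy"] * len(model.outputs),
    )
    history = model.fit(images, labels, epochs=epochs, batch_size=batch_size, verbose=0)
    if save_path is not None:
        model.save(save_path)  # e.g. "multilabel_model.keras" (hypothetical name)
    return history


# Tiny stand-in model and random data, only to show the call shape.
inputs = Input(shape=(8, 8, 3))
features = Flatten()(inputs)
demo_model = Model(
    inputs,
    [
        Dense(3, activation="softmax", name="color")(features),
        Dense(2, activation="softmax", name="category")(features),
    ],
)
rng = np.random.default_rng(0)
images = rng.random((16, 8, 8, 3)).astype("float32")
labels = [rng.integers(0, 3, size=16), rng.integers(0, 2, size=16)]
history = compile_and_train(demo_model, images, labels)
```

Keras sums the per-output losses into a single training loss, so all branches and the shared layers are optimized together.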

4. Inference and Response Time:

Load and inference:

First we write the prediction function. We open and process the image using the PIL image library, convert it into an array, normalize the pixel values by dividing by 255, and add a batch dimension so the model can take it as input.
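The preprocessing steps above might be written as in the following sketch. The function name `preprocess_image` is hypothetical, and the RGB conversion and resize are assumptions added here so that the image matches the fixed input dimensions the model was trained with.

```python
import numpy as np
from PIL import Image


def preprocess_image(image, target_size):
    """Turn a PIL image (or a path to one) into a normalized batch of one.

    `target_size` is (width, height), the order PIL's resize expects.
    """
    if not isinstance(image, Image.Image):
        image = Image.open(image)
    image = image.convert("RGB").resize(target_size)
    array = np.asarray(image, dtype="float32") / 255.0  # scale pixels to [0, 1]
    return np.expand_dims(array, axis=0)                # shape: (1, height, width, 3)


# Usage with an in-memory image instead of an uploaded file.
red_image = Image.new("RGB", (100, 50), (255, 0, 0))
batch = preprocess_image(red_image, (64, 32))
print(batch.shape)
```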

Next we predict the labels. The model returns a list with one prediction per category, so we select each one with predictions[index_number] and use np.argmax to get the label for that category.
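Decoding that list could look like the sketch below. The output order, the label lists, and the function name `decode_predictions` are assumptions for illustration; `fake_predictions` imitates the list of arrays that `model.predict` returns for a four-output Keras model, so no trained model is needed to run it.

```python
import numpy as np

# Assumed output order; it must match the order of the model's outputs.
HEAD_ORDER = ["color", "master_category", "category", "sub_category"]

# Hypothetical label vocabularies; the real lists come from the training data.
LABELS = {
    "color": ["Black", "Blue", "Red"],
    "master_category": ["Accessories", "Clothes", "Footwear"],
    "category": ["Jeans", "Shirt", "T-shirt"],
    "sub_category": ["Casual", "Formal", "Sports"],
}


def decode_predictions(predictions):
    """Map each output's probability vector to its most likely label."""
    decoded = {}
    for index, head in enumerate(HEAD_ORDER):
        probabilities = predictions[index][0]  # first (only) image in the batch
        decoded[head] = LABELS[head][int(np.argmax(probabilities))]
    return decoded


# One array of shape (1, n_classes) per output, as a multi-output model returns.
fake_predictions = [
    np.array([[0.1, 0.2, 0.7]]),
    np.array([[0.1, 0.8, 0.1]]),
    np.array([[0.2, 0.1, 0.7]]),
    np.array([[0.6, 0.3, 0.1]]),
]
decoded = decode_predictions(fake_predictions)
print(decoded)
```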

Our prediction has the same form as before: one label each for color, master category, category, and sub-category.


Because all labels come from a single model, inference is faster, which makes the model more efficient and practical for real-time applications. For e-commerce, this single multilabel classification model is a better fit than the multi-model approach.

With this approach, the model predicts multiple categories, such as the color, master category, category, and sub-category of a product, in a single inference pass. As a result, customers on an e-commerce website can upload a photo and receive quick and accurate predictions.

5. Maintenance:

The efficient multilabel classification architecture not only enhances the user experience but also reduces deployment and maintenance work. With a single model to manage, the system is simpler to operate, without the complexity of managing and updating multiple models separately.


In this blog post, I have shared an efficient approach to multilabel classification using a single CNN model architecture in e-commerce.

By adopting this technique, developers and researchers can enhance the efficiency and performance of their multilabel classification models.

Overall, this approach to multilabel classification in e-commerce is beneficial, offering faster response times, improved user satisfaction, and simplified model maintenance.
