In the fast-growing world of online shopping, built on tech stacks like Symfony and Laravel eCommerce platforms, the scope of AI and Natural Language Processing keeps expanding.
When working with NLP, we often encounter the terms fine-tuning a pre-trained model and embeddings. People frequently confuse the two and don't know which to use when.
In this blog, we will clarify the difference between them and their use cases.
Embeddings: Embeddings are multi-dimensional vector representations of words. These vectors express the meaning of a word in numerical form and are the building blocks of Natural Language Processing, so they are used in every NLP task, directly or indirectly.
- Need for Embeddings: Computers don't understand text; they only understand numbers. Natural language tasks require a deep understanding of text, so words are converted into numerical vectors that capture the meaning of human language in a form computers can work with. Suppose we have a small vocabulary of four words: “cat,” “dog,” “fish,” and “bird.” We can represent each word as a 20-dimensional embedding like this:
1234"cat": [0.2, 0.5, -0.1, 0.8, 0.3, -0.4, -0.2, 0.6, 0.7, -0.5, 0.2, 0.9, -0.3, -0.7, -0.6, 0.4, -0.8, 0.1, -0.9, 0.0]"dog": [0.1, 0.6, -0.2, 0.7, 0.4, -0.5, -0.3, 0.5, 0.8, -0.4, 0.1, 0.7, -0.2, -0.6, -0.5, 0.3, -0.7, 0.2, -0.8, 0.0]"fish": [0.3, 0.4, -0.3, 0.9, 0.2, -0.3, -0.1, 0.7, 0.6, -0.3, 0.3, 0.8, -0.1, -0.5, -0.4, 0.5, -0.9, 0.0, -0.7, 0.1]"bird": [0.4, 0.3, -0.4, 0.6, 0.1, -0.2, -0.4, 0.8, 0.5, -0.2, 0.4, 0.5, -0.4, -0.4, -0.3, 0.6, -0.6, 0.3, -0.6, 0.2]
- Use Cases: In e-commerce, embeddings can power many tasks such as Semantic Search, Recommendation Models, Product Review Classification, Sentiment Analysis, etc. These tasks are simple and don't require text generation, so embeddings can be extracted from LLMs and used directly to train a model; fine-tuning a pre-trained model is not required. A sketch of semantic search built this way follows below.
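Here is a minimal semantic-search sketch. It assumes the open-source sentence-transformers package; the model name and product descriptions are illustrative only:

```python
import numpy as np
from sentence_transformers import SentenceTransformer

# Load a small open-source sentence embedding model (illustrative choice).
model = SentenceTransformer("all-MiniLM-L6-v2")

# Toy product catalog; in practice these would come from your store.
products = ["waterproof hiking boots", "running shoes", "leather office sandals"]
product_vecs = model.encode(products)

# Embed the user's query with the same model.
query_vec = model.encode("shoes for trekking in the rain")

# Cosine similarity between the query and every product description.
scores = product_vecs @ query_vec / (
    np.linalg.norm(product_vecs, axis=1) * np.linalg.norm(query_vec))

print(products[int(np.argmax(scores))])  # the most semantically similar product
```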
- How to get embeddings of words: There are pre-trained embedding sets such as GloVe and Word2Vec. You can also get embeddings for your text from open-source language models like this:
```python
from transformers import AutoModel, AutoTokenizer, pipeline

tokenizer = AutoTokenizer.from_pretrained("xlnet-base-cased")
model = AutoModel.from_pretrained("xlnet-base-cased")

# Add a padding token and resize the embedding matrix to match.
tokenizer.add_special_tokens({"pad_token": "[PAD]"})
model.resize_token_embeddings(len(tokenizer))

# Pass the resized model object (not the model name) so the new pad token is used.
model_pipeline = pipeline("feature-extraction", model=model, tokenizer=tokenizer)
data = model_pipeline("My name is Amit")
data
```

Output:

```
[[[0.7087899446487427, -7.2828168869018555, -0.6374418139457703, -2.0751280784606934, -0.45991745591163635, 1.2415050268173218, -2.2894093990325928, 1.744140863418579, ..., -4.4170989990234375, 1.4603062868118286, -2.608621835708618, -0.1163523867726326, 1.0641734600067139]]]
```
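Note that the feature-extraction pipeline returns one vector per token. If you need a single vector for the whole sentence (e.g., for semantic search), a common approach, sketched below, is to mean-pool the token vectors:

```python
import numpy as np

# data has shape [batch][num_tokens][hidden_size]; averaging over the tokens
# gives one fixed-size vector for the whole sentence.
token_embeddings = np.array(data[0])
sentence_embedding = token_embeddings.mean(axis=0)
print(sentence_embedding.shape)  # (768,) for xlnet-base-cased
```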
Or you can get embeddings from state-of-the-art proprietary models, such as OpenAI's embedding models, like this:
```python
from langchain.embeddings import OpenAIEmbeddings

embeddings = OpenAIEmbeddings(openai_api_key="your openai api key")

query = "I love Football."
# embed_documents returns one embedding per input string,
# so splitting the query gives one vector per word.
query_embeddings = embeddings.embed_documents(query.split(" "))
print(query_embeddings)
```

Output:

```
[[-0.013148742383117574, -0.03375915215991781, 0.007804009480188148, -0.014490166394041703, -0.020861926487810194, 0.019394745052703704, -0.014420300434022799, -0.021267148683390785, 0.012925171869850654, -0.005603236861867104, 0.01721493231552059, 0.017717965737540503, 0.015971320462358402, 0.005302814071976177, ..., -0.012498991190116058, 0.0016217598017991166, -0.011206473723734208, -0.015929401631405153, -0.009536681190837422, 0.009382976637589406]]
```
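If you want a single vector for the whole sentence instead of one per word, LangChain also provides embed_query; the dimensionality noted below assumes the default text-embedding-ada-002 model:

```python
# One embedding for the entire sentence rather than one per word.
sentence_embedding = embeddings.embed_query("I love Football.")
print(len(sentence_embedding))  # 1536 with the default text-embedding-ada-002
```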
Fine-Tuning a Pre-Trained LLM: Fine-tuning of pre-trained LLMs is done for complex tasks like text generation. Building such a model from scratch requires a lot of data, compute resources, and a dedicated team of Data Scientists and ML Engineers.
- Need for fine-tuning a pre-trained model: Text generation is a complex task that demands significant resources and effort, so fine-tuning a pre-trained model is the best option. For this, we import the trained model and train it further on our own data, following the model's official documentation, as sketched below.
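As a minimal sketch of that workflow, here is what fine-tuning an open-source causal language model on your own text could look like with the Hugging Face Trainer; the model choice, file name, and hyperparameters are placeholders, not values from this article:

```python
from datasets import load_dataset
from transformers import (AutoModelForCausalLM, AutoTokenizer,
                          DataCollatorForLanguageModeling, Trainer,
                          TrainingArguments)

tokenizer = AutoTokenizer.from_pretrained("gpt2")
tokenizer.pad_token = tokenizer.eos_token  # GPT-2 has no pad token by default
model = AutoModelForCausalLM.from_pretrained("gpt2")

# "our_data.txt": one training example per line (placeholder file name).
dataset = load_dataset("text", data_files={"train": "our_data.txt"})

def tokenize(batch):
    return tokenizer(batch["text"], truncation=True, max_length=512)

tokenized = dataset["train"].map(tokenize, batched=True, remove_columns=["text"])

# mlm=False -> standard next-token (causal) language modeling objective.
collator = DataCollatorForLanguageModeling(tokenizer=tokenizer, mlm=False)

trainer = Trainer(
    model=model,
    args=TrainingArguments(output_dir="finetuned-gpt2",
                           num_train_epochs=1,
                           per_device_train_batch_size=4),
    train_dataset=tokenized,
    data_collator=collator,
)
trainer.train()
```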
- Use Cases: Applications like AI chatbots and conversational agents, where text is generated, are the best use cases for fine-tuning a pre-trained model. For example, in the Chatbot Module of Bagisto, we have fine-tuned the OpenAI LLM on our own data to respond to user queries.
- How to fine-tune a pre-trained model: In this example, we will see how to make the OpenAI model answer from our own data. Note that the code below actually uses retrieval-augmented generation via LangChain's ConversationalRetrievalQAChain, a lightweight alternative to full fine-tuning that grounds the model in your documents without retraining it:
```typescript
import { ChatOpenAI } from "langchain/chat_models/openai";
import { ConversationalRetrievalQAChain } from "langchain/chains";
import { HNSWLib } from "langchain/vectorstores/hnswlib";
import { OpenAIEmbeddings } from "langchain/embeddings/openai";
import { RecursiveCharacterTextSplitter } from "langchain/text_splitter";
import { BufferMemory } from "langchain/memory";
import * as fs from "fs";

export const run = async () => {
  /* Initialize the LLM to use to answer the question */
  const model = new ChatOpenAI({});

  /* Load in the file we want to do question answering over */
  const text = fs.readFileSync("state_of_the_union.txt", "utf8");

  /* Split the text into chunks */
  const textSplitter = new RecursiveCharacterTextSplitter({ chunkSize: 1000 });
  const docs = await textSplitter.createDocuments([text]);

  /* Create the vectorstore */
  const vectorStore = await HNSWLib.fromDocuments(docs, new OpenAIEmbeddings());

  /* Create the chain */
  const chain = ConversationalRetrievalQAChain.fromLLM(
    model,
    vectorStore.asRetriever(),
    {
      memory: new BufferMemory({
        memoryKey: "chat_history", // Must be set to "chat_history"
      }),
    }
  );

  /* Ask it a question */
  const question = "What did the president say about Justice Breyer?";
  const res = await chain.call({ question });
  console.log(res);

  /* Ask it a follow up question */
  const followUpRes = await chain.call({
    question: "Was that nice?",
  });
  console.log(followUpRes);
};
```
In summary, the embedding approach suits simpler tasks like semantic search or recommendation models, where we only need to fetch the text most similar to the user's query rather than generate new text as in an AI chatbot. For complex NLP tasks like text generation, we go with fine-tuning a pre-trained model.