What is Semantic Search? A Brief Introduction
How does Google know which sport you’re referring to when you search for “football?” Google uses semantic search to understand the intent behind your questions. So, if you’re someone from the U.K., it will show you the results of the latest soccer match, and if you’re in the U.S., it will show you results from the NFL.
Semantic search is a powerful tool in machine learning that uses vector embeddings to add contextual information to your questions. So, when you ask your investment provider a domain-specific question, it can give you domain-specific answers.
This is useful because you want your systems to provide personalized answers and recommendations. This article explores the basic implementation of semantic search and how it works in customer service operations. We’re covering:
1. Keyword Search
2. Semantic Search
3. Benefits of Semantic Search
4. Conclusion
Keyword Search
Before we understand semantic search, it’s essential to understand keyword search. This is the basic search function that you can implement anywhere.
It’s even available on this blog page, and you can just press “Ctrl/Cmd + F” to access particular keywords. Keyword search and statistical rankings were the basic search forms implemented in search engines.
How Does it Work?
This is an exact match algorithm. So, in a basic keyword search, you will go through your documents and try to find matching keywords.
So, if you’re searching for “needle” in the following sentence:
“Hay hay hay hay needle hay hay hay.”
Then, this algorithm will provide you with the word “needle” fairly quickly.
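To make this concrete, here is a minimal sketch of exact-match keyword search in Python. The helper name keyword_search and the whitespace-based tokenization are assumptions made for illustration, not part of any particular search library:

```python
def keyword_search(keyword: str, document: str) -> list[int]:
    """Return the positions of every word that exactly matches the keyword."""
    # Lowercase each word and strip surrounding punctuation,
    # so "Hay" and "hay." are both treated as "hay".
    words = [w.strip(".,!?\"'").lower() for w in document.split()]
    return [i for i, w in enumerate(words) if w == keyword.lower()]


sentence = "Hay hay hay hay needle hay hay hay."
positions = keyword_search("needle", sentence)
print(positions)  # [4]
```

The search only succeeds because the word “needle” appears in the sentence exactly as typed; a query such as “pin” would return an empty list.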
However, this approach has several problems; let’s explore them.
Problems with Keyword Search
Since your keyword search algorithm will look for an exact match to your keyword, several problems arise:
1. Similar Keywords get Missed – If an eCommerce customer looks for comfortable shoes, the algorithm will not show them “Sneakers,” “Ergonomic Shoes,” or “Orthopedic Shoes.” These products match the customer’s use case, but the keyword algorithm can’t show all options.
2. Lack of Industry Context – If you’re in finance, the term “bps” refers to basis points; however, in the larger IT industry, it refers to “bits per second.” A keyword search algorithm will serve both kinds of documents to a finance professional and confuse the user.
3. Lack of Customer Context – If a customer goes to an eCommerce website and searches for “Apple,” it will show both the fruit and the phone. This is because keyword search lacks any context around customer intent.
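The problems above can be reproduced with a small sketch. The catalog below is made up for illustration, and phrase_match is a hypothetical helper that does a simple case-insensitive phrase match:

```python
# Hypothetical product catalog, invented for this example.
catalog = [
    "Sneakers",
    "Ergonomic Shoes",
    "Orthopedic Shoes",
    "Comfortable Shoes",
    "Apple iPhone",
    "Apple (fruit, 1 kg)",
]


def phrase_match(query: str, items: list[str]) -> list[str]:
    """Return the items whose text contains the query phrase, ignoring case."""
    q = query.lower()
    return [item for item in items if q in item.lower()]


shoe_results = phrase_match("comfortable shoes", catalog)
apple_results = phrase_match("apple", catalog)

print(shoe_results)   # ['Comfortable Shoes']
print(apple_results)  # ['Apple iPhone', 'Apple (fruit, 1 kg)']
```

The shoe query misses the sneakers, ergonomic, and orthopedic options, and the “apple” query returns the phone and the fruit together because nothing in the match tells them apart.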
As these three examples demonstrate, keyword search algorithms can’t help people navigate around their problems well. This is why Google’s search algorithms have also employed semantic search since the Hummingbird update was released in 2013.
So, what is semantic search? Let’s understand.
Semantic Search
Every language has two primary components: Syntax and Semantics.
1. Syntax – A collection of the grammatical rules that govern the language.
2. Semantics – The meanings of words and how those meanings relate to one another.
In a keyword search, we’re trying to find syntactic similarity between the searched keyword and the document. In semantic search, we try to match on the actual meaning of the words.
To explain this in practical terms, let’s take a simple example. Assume you explain a particular failure in your devices as a “Power Off Failure” in all your documents. If a customer asks: “My computer won’t switch on.”
1. Keyword Search will say that it has no information.
2. Semantic Search will connect the word “power” to “switch” and provide the correct troubleshooting procedure.
This elucidates why AI in customer service uses semantic search. Your documentation and training cannot encompass all possible variations of a query to give the correct results to keyword search. A semantic search makes it much easier for you to provide the correct answer to your customers, regardless of the words they use in the questions.
How Does it Work?
There are several different ways to perform semantic search today. You can power up one of the top 5 NLP libraries and build your applications for semantic search.
However, to illustrate how it works in real life, we will take a basic similarity search algorithm, Jaccard distance.
First, a note on distance-based algorithms for similarity search.
Such an algorithm maps words or phrases into a vector space and finds the closest match to your query. Because words in that space are positioned according to their similarities and differences, the nearest neighbors tend to be the relevant results.
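As a rough illustration of searching a vector space, the sketch below uses cosine similarity over three tiny vectors. The vectors are hand-picked, hypothetical values chosen only for this example; a real system would get them from a trained embedding model:

```python
import math

# Hypothetical 3-dimensional word vectors, hand-picked for illustration only.
vectors = {
    "power": [0.9, 0.8, 0.1],
    "switch": [0.8, 0.9, 0.2],
    "banana": [0.1, 0.0, 0.9],
}


def cosine_similarity(u: list[float], v: list[float]) -> float:
    """Cosine of the angle between two vectors (1.0 means same direction)."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v)


def most_similar(word: str) -> str:
    """Return the other word whose vector is closest to the given word."""
    others = (w for w in vectors if w != word)
    return max(others, key=lambda w: cosine_similarity(vectors[word], vectors[w]))


print(most_similar("power"))  # switch
```

With these made-up values, “power” lands next to “switch” and far from “banana,” which is the kind of relationship a semantic search relies on.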
The Jaccard Algorithm
Let’s take two sentences:
Sentence a = “I want to get my computer fixed.”
Sentence b = “The computer in my house isn’t fixed yet.”
1. Turn Sentences into Sets
A set is a collection of the unique elements in each of your sentences. This should give you:
“I want to get my computer fixed.” = {"I", "want", "to", "get", "my", "computer", "fixed"}
“The computer in my house isn’t fixed yet.” = {"The", "computer", "in", "my", "house", "isn't", "fixed", "yet"}
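One possible way to build these sets in Python is sketched below. The helper name to_word_set is an assumption for illustration, and it strips punctuation only from the ends of words so that “isn't” stays intact:

```python
import string


def to_word_set(sentence: str) -> set[str]:
    """Split a sentence on whitespace and return its unique words."""
    # Strip punctuation from the ends of each word, keeping inner apostrophes.
    words = (w.strip(string.punctuation) for w in sentence.split())
    return {w for w in words if w}


a = to_word_set("I want to get my computer fixed.")
b = to_word_set("The computer in my house isn't fixed yet.")

print(sorted(a))
print(sorted(b))
```

Note that this sketch keeps the original capitalization, so “The” and “the” would count as different elements unless you lowercase the text first.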
2. Find the Intersection of These Sets
The intersection is the set of elements shared between the two sentences. In this instance,
a∩b = {"computer", "my", "fixed"}
3. Find the Union of These Sets
The union of two sets contains every element that occurs in at least one of the sets, with shared elements counted only once. This should give you:
a∪b = {"I", "want", "to", "get", "The", "in", "house", "isn't", "yet", "computer", "my", "fixed"}
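If you want to check the intersection and union yourself, Python’s built-in set operators do the work. This is a small sketch using the two sets from step 1 written out by hand:

```python
a = {"I", "want", "to", "get", "my", "computer", "fixed"}
b = {"The", "computer", "in", "my", "house", "isn't", "fixed", "yet"}

shared = a & b    # intersection: elements present in both sets
combined = a | b  # union: elements present in at least one set

print(sorted(shared))  # ['computer', 'fixed', 'my']
print(len(combined))   # 12
```

The union has 12 elements rather than 15 because the three shared words are counted once.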
4. Find the Jaccard Similarity
The Jaccard Similarity is defined by:
J(a, b) = |a∩b| / |a∪b|, where |s| is the number of elements in the set s.
Here, we have the sets for a∩b and a∪b. So, we just need to calculate the coefficient directly:
Length of a∩b = 3
Length of a∪b = 12
So, Jaccard Similarity = len(a∩b)/len(a∪b) = 3/12 = 1/4 = 0.25
And Jaccard Distance = 1 – 1/4 = 3/4 = 0.75
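Jaccard Distance ranges from 0 (identical sets) to 1 (no shared elements). So, with a distance of 0.75, the algorithm considers these two sentences to be quite different.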
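You can implement this yourself in a few lines of Python. Just use the code:
# Lowercased, punctuation-free versions of the two sentences, split into words
a = "i want to get my computer fixed".split()
b = "the computer in my house isn't fixed yet".split()
# Convert the word lists into sets of unique words
a = set(a)
b = set(b)
def jac(x: set, y: set) -> float:
    """Return the Jaccard similarity of two sets."""
    shared = x.intersection(y)
    print(len(shared))      # size of the intersection: 3
    print(len(x.union(y)))  # size of the union: 12
    return len(shared) / len(x.union(y))
print(jac(a, b))  # 0.25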
Now, we know intuitively that these sentences are indeed semantically similar. However, this was one of the first algorithms for similarity search, and ever since, the algorithms have become more elegant and better at understanding words.
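To see this limitation in practice, the sketch below scores the earlier “won’t switch on” query against two document titles. The titles are hypothetical and were made up for this example:

```python
def jaccard_similarity(x: set, y: set) -> float:
    """Size of the intersection divided by size of the union."""
    union = x | y
    return len(x & y) / len(union) if union else 0.0


query = set("my computer won't switch on".split())

# Hypothetical support document titles, invented for illustration.
documents = ["power off failure", "switch on my printer"]

scores = {doc: jaccard_similarity(query, set(doc.split())) for doc in documents}
best = max(scores, key=scores.get)

print(scores["power off failure"])     # 0.0
print(scores["switch on my printer"])  # 0.5
print(best)                            # switch on my printer
```

Because Jaccard only counts shared words, it ranks the printer document above the “power off failure” document, even though the second one is what the customer actually needs.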
When you use our Dialogflow CX chatbots, for example, you’re tapping into Google’s latest Gemini and PaLM-based models to perform similarity searches, and you’ll get much better results.
Now that we know a basic primer about how semantic search works, let’s understand why we use it for customer service.
Benefits of Semantic Search
We use semantic search in Kommunicate to give better customer service. This is because semantic search gives you the following:
1. Context-Aware Answers
With semantic search, chatbots can adapt to any industry-specific problem. So, the answers a chatbot gives customers will depend on the business’s industry.
2. Improved Understanding
Semantic search helps chatbots understand the intent and meaning behind your customer’s queries. This allows the chatbot to formulate better answers overall.
3. Better CSAT
Since your chatbot can better understand your customers, they get their solution faster. So, you get a better CSAT score with your chatbot responses.
4. Handling of Complex Queries
Semantic search can handle multi-turn and complex queries much better than a basic keyword search could.
These benefits create a better and more holistic customer experience. They also make it easier for businesses to solve support tickets at scale.
Conclusion
Search has made rapid advances since 2013. Since the introduction of word2vec, semantic search has become standard across most applications.
Now, we’re seeing more practical business use cases for this with generative AI chatbots. This new generation of chatbots performs NLP to understand queries and can answer far more complex queries than earlier chatbots ever could.
AI is only set to improve: multiple companies have set goals to publish better AI models soon. With these innovations, it’s not hard to imagine a world where AI will be one of the primary tools for businesses worldwide.