Semantic Search
Tanvi Vishwasrao
Published on: January 18, 2024
Imagine typing 'best coffee places near me' into a search engine and, instead of getting a random list of coffee shops, you're presented with cosy cafes that match your love for organic blends and indie music. This is semantic search at work. Semantic search goes beyond the surface of words to grasp the intent and context of your queries, offering results that feel surprisingly personal. Now, isn't that a refreshing way to start your search – and your day?
What is Semantic Search?
I prefer to understand the logic behind a query rather than just the surface act of typing it out; I'm fascinated by the 'why' and 'how' behind it. That's precisely where Semantic Search plays its part. It's not limited to the 'what' of your query. Instead, it digs into the reasons and the methods behind it. Through sophisticated algorithms, it picks up on the subtle nuances of our language, be it idioms, synonyms, or even the specific quirks of regional dialects. This approach goes beyond traditional keyword matching. It leverages Natural Language Processing (NLP) and Machine Learning (ML) to understand not just the words we type but the intent and meaning we embed within them.
Imagine engaging in a dialogue with someone who is part linguist, part psychologist - the search engine becomes an entity that comprehends your language and intentions. This is the kind of intelligent, user-centric approach that elevates our search experiences to new heights, making them not only more efficient but also more intuitive and personalized.
Why is Semantic Search a Game Changer for User Engagement?
Imagine a user searching for lifestyle products on your platform. With Semantic Search, their journey becomes more than a transaction; it becomes a personalized experience. Whether they are casually browsing or looking for something specific, the technology ensures that they find exactly what they need, and often, something more. This isn't just about enhancing user satisfaction; it's about transforming their interaction into a series of meaningful discoveries. The impact? Your customers spend less time searching and more time engaging with content that resonates with them, leading to higher conversion rates and, importantly, a stronger connection with your brand.
Google is perhaps the most prominent example of a search engine that utilizes semantic search extensively. Over the years, Google has continuously refined its algorithms to better understand the intent behind users' queries. With updates like Hummingbird and the introduction of the Knowledge Graph, Google has shifted from mere keyword matching to understanding the context and relationships between words and phrases in a query. This allows Google to provide more accurate and contextually relevant search results.
You can see the knowledge graph in action on the results page yielded upon searching for “chocolate chip cookies.” The SERP does contain standard organic results and links to suitable websites, but it also contains a rich set of knowledge graph data, including an answer box with a recipe, a right-hand knowledge panel featuring nutritional facts about this dessert, and suggestions for related search subjects.
To understand the process of Semantic Search, it's helpful to look at the steps involved clearly and concisely. Initially, the journey starts with data collection, often organized in a CSV file. This data is the backbone of the search system. The next step is to feed this data into an embedding model. In this stage, the model converts the data into a series of vectors. These vectors are crucial because they represent the data in a format that the search system can efficiently process.
Once we have these vectors, they are stored in a system ready for retrieval. When a user enters a query, the system begins its work. It first encodes the user's query into a similar vector format. With both the query and the data now in vector form, the system uses a method called ‘cosine similarity’ to compare them. This method calculates how closely the query vector matches the data vectors.
The closer the match, the more relevant the result. This process allows the search engine to retrieve and present results that are not just based on keywords but are aligned with the user's intent and the context of their query. It's a methodical process that combines data management, machine learning, and sophisticated matching techniques to deliver a search experience that is both intuitive and precise.
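The steps above (load the data, embed it, store the vectors, embed the query, rank by cosine similarity) can be sketched in a few lines of Python. This is only a minimal illustration under stated assumptions: the sample CSV rows and the names `embed`, `build_index`, and `search` are hypothetical, and the bag-of-words `embed` function is a toy stand-in for a real embedding model. It only matches shared words, whereas a trained model would also place synonyms and related phrases close together.

```python
import csv
import io
import math
from collections import Counter

# Hypothetical sample data standing in for the CSV file of collected content.
CSV_TEXT = """id,title
1,How to brew organic coffee at home
2,A beginner guide to indie music
3,Cosy cafes with great coffee blends
"""


def embed(text):
    """Toy stand-in for an embedding model: a sparse word-count vector."""
    return Counter(text.lower().split())


def cosine_similarity(a, b):
    """Cosine similarity between two sparse vectors stored as Counters."""
    dot = sum(a[term] * b[term] for term in a.keys() & b.keys())
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    if norm_a == 0 or norm_b == 0:
        return 0.0  # treat an empty text as matching nothing
    return dot / (norm_a * norm_b)


def build_index(csv_text):
    """Read the CSV rows and store one vector per title."""
    rows = csv.DictReader(io.StringIO(csv_text))
    return [(row["title"], embed(row["title"])) for row in rows]


def search(query, index, top_k=2):
    """Encode the query, score it against every stored vector, return the best."""
    query_vec = embed(query)
    scored = [(cosine_similarity(query_vec, vec), title) for title, vec in index]
    scored.sort(key=lambda pair: pair[0], reverse=True)
    return scored[:top_k]


index = build_index(CSV_TEXT)
results = search("organic coffee", index)
for score, title in results:
    print(f"{score:.3f}  {title}")
```

In a production setup the list returned by `build_index` would typically be replaced by a dedicated vector store, but the flow of the query through the system stays the same.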
In Semantic Search, the cosine similarity rule is used to determine how similar two pieces of information are. The formula for this is quite elegant in its simplicity. It calculates the cosine of the angle between two vectors – think of these vectors as arrows pointing in different directions. The formula looks like this:
Cosine Similarity Rule:

$$\cos(\theta) = \frac{A \cdot B}{\|A\| \, \|B\|}$$
Here, A⋅B represents the dot product of the two vectors, which is a way of multiplying them. The bottom part, ∥A∥∥B∥, is the product of the magnitudes (or lengths) of the two vectors. When the vectors are very similar, the angle between them is small, and the cosine similarity is close to 1. If they are unrelated, the angle approaches 90 degrees and the cosine similarity gets closer to 0; vectors pointing in opposite directions give a value of -1. This formula helps the search engine to understand how closely the content of your query matches potential search results, ensuring that the results you see are relevant to what you're looking for.
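To make the calculation concrete, here is a small sketch of cosine similarity in plain Python. The function name and the sample vectors are illustrative assumptions, not taken from any particular search library; real embedding vectors usually have hundreds of dimensions rather than two or three.

```python
import math


def cosine_similarity(a, b):
    """Return (A . B) / (||A|| * ||B||) for two equal-length vectors."""
    if len(a) != len(b):
        raise ValueError("vectors must have the same number of dimensions")
    dot = sum(x * y for x, y in zip(a, b))   # A . B
    norm_a = math.sqrt(sum(x * x for x in a))  # ||A||
    norm_b = math.sqrt(sum(y * y for y in b))  # ||B||
    if norm_a == 0 or norm_b == 0:
        raise ValueError("cosine similarity is undefined for a zero vector")
    return dot / (norm_a * norm_b)


# Same direction, different lengths: the score is (approximately) 1.
same_direction = cosine_similarity([1.0, 2.0, 3.0], [2.0, 4.0, 6.0])

# Perpendicular vectors: the score is 0.
orthogonal = cosine_similarity([1.0, 0.0], [0.0, 1.0])

# Opposite directions: the score is -1.
opposite = cosine_similarity([1.0, 0.0], [-1.0, 0.0])

print(same_direction, orthogonal, opposite)
```

Note that the first pair scores as a near-perfect match even though one vector is twice as long as the other: cosine similarity compares direction only, which is why it is a common choice for comparing embeddings.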
Process of Semantic Search
Example in action:
Let's see an example of a semantic search module in action.
We will first need a data set/resource to perform a semantic search on. For this, I will download all the blog article data that I have on Garchi CMS