Speed to Search Success: Synonyms
Another quick one this week. This time, we're talking about synonyms and how to use them for search matching.
Last time we talked about how typo tolerance can reduce the time from query to success for a searcher. Typos aren't the only way a query can fail to match the content inside your records. Sometimes every word in the query is spelled correctly and refers to items that really are in the index, yet the search still finds no match.
This is where synonyms and alternatives come into play.
Unless you speak a completely logical language like Lojban (you don't), you are going to run into ambiguity, including cases where the same concept can be referred to by different words. When the words are close enough in meaning, we call them synonyms.
How synonyms improve speed to success is fairly obvious: rather than forcing searchers to know exactly which words are contained inside your records, let searchers use their own words. Indeed, this goes beyond speed to success and will often mean the difference between success and no results at all.
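To make the idea concrete, here is a minimal sketch of synonym expansion at query time. It is not how any particular search engine implements synonyms; the synonym groups, the records, and the function names are all made up for illustration, and matching is a naive whole-word comparison.

```python
# Hypothetical synonym groups; a real list would come from your own catalog and logs.
SYNONYM_GROUPS = [
    {"sofa", "couch"},
    {"sneakers", "trainers"},
]

# Hypothetical records standing in for an index.
RECORDS = ["Grey three-seat sofa", "Leather armchair", "White running sneakers"]


def build_synonym_map(groups):
    """Map every word to the full set of words it can stand in for (itself included)."""
    mapping = {}
    for group in groups:
        for word in group:
            mapping[word] = set(group)
    return mapping


def search(query, records, synonym_map):
    """Return records that contain every query word or one of its synonyms."""
    # Each query word becomes a set of acceptable alternatives.
    query_terms = [synonym_map.get(word, {word}) for word in query.lower().split()]
    hits = []
    for record in records:
        record_words = set(record.lower().split())
        # A record matches when each query term has at least one alternative in it.
        if all(alternatives & record_words for alternatives in query_terms):
            hits.append(record)
    return hits


SYNONYM_MAP = build_synonym_map(SYNONYM_GROUPS)
```

With the map in place, `search("couch", RECORDS, SYNONYM_MAP)` finds the sofa record, while the same query against an empty map (`{}`) finds nothing. That is the "success versus no results" gap in miniature.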
Of course, creating synonyms can be a bit of a hassle. Most people will set up a large batch of synonyms when first building search and then add to the list iteratively. In some cases, it might even be possible to download a list of synonyms that someone has already put together.
A better way to handle creating synonyms is to look at user behavior. When searchers don't find something the first time, they will often refine their query with other terms. Those refinements are the key here. This blog post illustrates one way to do it:
[The] algorithm studies the customer behavior and uses it as a pillar for building the list of synonyms. Such an approach is better than starting with predetermined synonyms or related words. Why? It studies what people type in the search box and what links they click. For example, when the same queries lead to the same search results, they are considered similar queries.
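The quoted post doesn't spell out its algorithm, so the following is only a rough sketch of the general idea: treat two queries as synonym candidates when searchers click largely the same results for both. The click log, the record IDs, the `min_overlap` threshold, and the use of Jaccard overlap as the similarity measure are all assumptions for illustration.

```python
from collections import defaultdict
from itertools import combinations

# Hypothetical click log: (query, clicked_record_id) pairs.
CLICK_LOG = [
    ("couch", "rec-1"), ("couch", "rec-2"),
    ("sofa", "rec-1"), ("sofa", "rec-2"), ("sofa", "rec-3"),
    ("armchair", "rec-4"),
]


def synonym_candidates(click_log, min_overlap=0.5):
    """Return (query_a, query_b, overlap) for query pairs with similar clicks."""
    # Collect the set of records clicked for each distinct query.
    clicks_by_query = defaultdict(set)
    for query, record_id in click_log:
        clicks_by_query[query.lower()].add(record_id)

    candidates = []
    for a, b in combinations(sorted(clicks_by_query), 2):
        shared = clicks_by_query[a] & clicks_by_query[b]
        union = clicks_by_query[a] | clicks_by_query[b]
        # Jaccard overlap: shared clicks divided by all clicks for either query.
        overlap = len(shared) / len(union)
        if overlap >= min_overlap:
            candidates.append((a, b, overlap))
    # Strongest candidates first.
    return sorted(candidates, key=lambda c: c[2], reverse=True)
```

On this toy log, only "couch" and "sofa" clear the threshold. In practice you would likely want a human to review the candidates before adding them, since queries can share clicks without being true synonyms.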
There's a problem with synonyms: while they catch the straightforward alternatives, they are difficult to use for matching on concepts. You're not going to have a synonym for long-tail concepts like "portable video game player." This is where vector search comes into play, which we'll look at next time.