When Pinecone announced a vector database early last year, it was building something specifically designed for machine learning and aimed at data scientists. The idea was that you could query this data in a format that machines understand, making it much faster.
Originally, the database supported semantic search, where users could search by meaning rather than specific words. However, it turned out that when people put Pinecone to work, there were cases where specific keywords mattered, and today the company announced that it is now possible to run searches that combine semantic and keyword search, which the company’s founder and CEO, Edo Liberty, calls hybrid search.
“We’ve done a lot of research on this topic and we’ve found that hybrid search eventually ends up being better [in many cases]. It’s better in the sense that you can combine both: semantic search, the deep NLP encoding of sentences that gets the context and meaning and so on, and you can also imbue that with specific keywords… the combination of those two ends up being significantly better,” Liberty told londonbusinessblog.com.
In fact, he says the two complement each other well, especially in cases where industry-specific terms matter. This could be something like a doctor searching for information related to a specific disease. In that medical context, combining a question with a few specific keywords around the disease can yield better results than either approach alone.
He says that the keywords never take precedence over the semantic question the user is asking, but they provide some additional information to give more meaningful results.
“Maybe you know exactly what you’re looking for, and you can give it extra appeal if you make your semantic search keyword-aware, and that really helps a lot. So I don’t want to throw out the good parts of keyword search [by relying completely on semantic search]. I don’t want the keywords in the driver’s seat, but I don’t want to completely ignore them either,” he said.
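To make the idea concrete, here is a minimal sketch of one common way to blend the two signals: a weighted sum in which the semantic score carries most of the weight and the keyword score adds a smaller boost. This is an illustration under stated assumptions, not Pinecone's implementation. The function names, the `alpha` weight, the toy three-dimensional vectors and the sample documents are all hypothetical; in a real system the vectors would come from an embedding model and the keyword score from something like BM25.

```python
import math

def cosine(a, b):
    """Cosine similarity between two dense vectors (the semantic signal)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0

def keyword_score(query_terms, text):
    """Fraction of query keywords that appear in the text (a toy keyword signal)."""
    words = set(text.lower().split())
    terms = [t.lower() for t in query_terms]
    if not terms:
        return 0.0
    return sum(t in words for t in terms) / len(terms)

def hybrid_search(query_vec, query_terms, docs, alpha=0.8):
    """Rank documents by alpha * semantic + (1 - alpha) * keyword.

    alpha is a hypothetical weight; keeping it above 0.5 means the
    semantic score stays dominant and keywords only adjust the ranking.
    """
    scored = []
    for doc in docs:
        semantic = cosine(query_vec, doc["vector"])
        keyword = keyword_score(query_terms, doc["text"])
        scored.append((alpha * semantic + (1 - alpha) * keyword, doc["id"]))
    return [doc_id for _, doc_id in sorted(scored, reverse=True)]

# Toy data: the vectors are made up for illustration, not real embeddings.
docs = [
    {"id": "a", "text": "managing chronic joint pain", "vector": [0.9, 0.1, 0.0]},
    {"id": "b", "text": "rheumatoid arthritis treatment options", "vector": [0.85, 0.2, 0.0]},
    {"id": "c", "text": "quarterly earnings report", "vector": [0.0, 0.1, 0.9]},
]
query_vec = [0.9, 0.1, 0.0]

print(hybrid_search(query_vec, ["arthritis"], docs, alpha=0.8))
print(hybrid_search(query_vec, ["arthritis"], docs, alpha=1.0))
```

With these toy values, the document that mentions "arthritis" moves ahead of a semantically similar one when `alpha` is 0.8, while setting `alpha` to 1.0 falls back to a purely semantic ranking. The unrelated document stays last in both cases, which mirrors the point that keywords adjust the results without taking over.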
As Liberty told us last year during the company’s $28 million Series A, search has become a big use case for the company.
“The vector databases are mainly used for searching, and searching in the broad sense of the word. It’s document search, but you can think of search as information retrieval in general, discovery, recommendation, anomaly detection and so on,” he said at the time.
Pinecone launched in 2019 and has raised $38 million, per Crunchbase.