Proposed Search algorithm for Add-On Index

reubenv · June 12, 2017, 10:19pm

@darius Thanks for reminding me about that, and also for the encouraging feedback!

Yes, we do have one limitation: the name field is currently tokenized. Say we have a module named “Reference application module”. Elasticsearch stores it as 3 separate words, i.e. “Reference”, “application” and “module”. Now suppose the search query is “Reference” and other modules also have the query somewhere in their names, say “Metadata Reference module”. We’d like modules with “Reference” as their first word to rank higher, but elasticsearch cannot distinguish between these 2 cases and sometimes ranks the 2nd module higher. This is because elasticsearch only sees that both module names contain the search query, without taking the position of the word into account. This is not the behavior we want.
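To make the problem concrete, here is a minimal sketch in plain Python. It is not how elasticsearch is implemented: `tokenize`, `matches_token` and `starts_with_query` are hypothetical helpers that only imitate splitting a name into lowercase words, and real scoring depends on more than whether a token is present.

```python
def tokenize(name):
    """Very rough stand-in for an analyzer: split on whitespace and lowercase."""
    return [word.lower() for word in name.split()]


def matches_token(name, query):
    """True if the query equals one of the name's tokens, wherever it appears."""
    return query.lower() in tokenize(name)


def starts_with_query(name, query):
    """True only if the untokenized name begins with the query."""
    return name.lower().startswith(query.lower())


names = ["Reference application module", "Metadata Reference module"]
for name in names:
    # Print the name, the token-level match, and the whole-name prefix match.
    print(name, matches_token(name, "Reference"), starts_with_query(name, "Reference"))
```

Both names match at the token level, so a token match alone gives no reason to prefer one over the other. Only the check against the whole, untokenized name separates them.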

What’s the solution? The solution is to set the name field to “Not analyzed”, i.e. we tell elasticsearch to treat the whole name as one single term and NOT tokenize it.

Then why haven’t we done it yet? Because our search algorithm currently also relies on the name being tokenized.

Then what’s the final solution? We will have to create a duplicate field that is a copy of the “Name” field and set the copy to “Not analyzed”, while the original field stays tokenized.
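A rough sketch of what this could look like, written as Python dicts that would be sent to elasticsearch as JSON. This is one possible way to do it, not necessarily what we will end up with. It assumes an elasticsearch 2.x style mapping (`"type": "string"` with `"index": "not_analyzed"`; from 5.x on the equivalent is `"type": "keyword"`) and uses a multi-field as the duplicate. The sub-field name `raw`, the boost value and the `build_query` helper are made up for illustration.

```python
import json

# Hypothetical mapping fragment: "name" stays tokenized, while "name.raw"
# holds the same value as a single, untokenized term.
NAME_MAPPING = {
    "properties": {
        "name": {
            "type": "string",  # analyzed (tokenized) by default in ES 2.x
            "fields": {
                "raw": {"type": "string", "index": "not_analyzed"}
            },
        }
    }
}


def build_query(text, boost=2.0):
    """Match on the tokenized name; boost names that begin with the text."""
    return {
        "query": {
            "bool": {
                "should": [
                    # Existing behavior: any token of the name may match.
                    {"match": {"name": text}},
                    # Extra score when the whole name starts with the query.
                    {"prefix": {"name.raw": {"value": text, "boost": boost}}},
                ]
            }
        }
    }


print(json.dumps(build_query("Reference"), indent=2))
```

One thing to watch out for: a not-analyzed field is not lowercased, so the prefix clause is case sensitive and “reference” would not match “Reference application module” on `name.raw`.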