Chartsearch Module -- search box behavior / syntax

We are currently improving the Chart Search module and adapting it for OpenMRS 2.0. The module gives clinicians near-immediate access to patient information using search technology. Understanding how users expect the search box to behave is critical to good usability.

Here are a few questions:

  • Should a search for ‘Blood Pressure’ match the two individual words or the phrase?
  • Should we understand AND / OR within the search box?
  • Should " " indicate that the entry should be searched as a phrase (as with Google)?
  • If the user searches for 36, should this also return 36.2?
  • Should there be a way to preselect filters using the search box? For example, obs:blood would search only observations for blood.
  • Should we support wildcards (*) in searches? What would this look like, more specifically?
  • Should we support searching parts of words? For example, searching “hem” would return “hemoglobin” and “hematocrit”.

Here are the Google search operators for reference, which might be relevant: https://support.google.com/websearch/answer/136861?hl=en

Blood Pressure should search for something like (blood or blood*) or (pressure or pressure*). That is, the union of individual words and taking advantage of Lucene’s stemming (e.g., finding “pressures”).
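To make the expansion above concrete, here is a minimal sketch in Python. The function name `expand_terms` is made up for illustration, and lowercasing the input is an assumption (wildcard terms often bypass the analyzer, so the case has to match what was indexed). It does not escape Lucene special characters, which a real implementation would need to do.

```python
def expand_terms(query: str) -> str:
    """Turn each whitespace-separated word into (word OR word*) and OR the groups.

    Hypothetical helper: the name and the lowercasing step are assumptions,
    not part of the Chart Search module.
    """
    words = query.lower().split()
    groups = [f"({w} OR {w}*)" for w in words]
    return " OR ".join(groups)


print(expand_terms("Blood Pressure"))
# (blood OR blood*) OR (pressure OR pressure*)
```

The stemming part (finding “pressures” from “pressure”) would come from the analyzer configured on the index, not from this string rewrite.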

We use eDisMax to allow for more complex phrasing. It’s a nice feature to have, but Chart Search can still be very clinically useful without this feature, so I wouldn’t make it your top priority.

Yes. Again, not a top priority, but it’s handy to have. A regular expression like s/(?!and\b|or\b)(\b[^\s]+)\b(?=([^"]*"[^"]*")*[^"]*$)/$1*/g can turn a query like mammo and this and "mammo" or "match this phrase exactly" into a Lucene query for you (try it out on the regex playground).
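For anyone who wants to try that substitution outside a regex playground, here is a rough Python port. The pattern is copied from the expression above; the function name `add_wildcards` is only illustrative. Note that, as written, the pattern only skips lowercase `and` / `or`, so uppercase operators would need a case-insensitive flag or a wider alternation.

```python
import re

# Same pattern as the s/.../$1*/g expression above: match a word that is not
# "and"/"or" and that is followed by an even number of double quotes
# (i.e., the word is not inside a quoted phrase).
WILDCARD_RE = re.compile(
    r'(?!and\b|or\b)(\b[^\s]+)\b(?=([^"]*"[^"]*")*[^"]*$)'
)


def add_wildcards(query: str) -> str:
    """Append * to every unquoted term except the operators and/or."""
    return WILDCARD_RE.sub(r"\1*", query)


print(add_wildcards('mammo and this and "mammo" or "match this phrase exactly"'))
# mammo* and this* and "mammo" or "match this phrase exactly"
```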

Yes.

Sure, though I wouldn’t use “obs”. Create clinical categories, not data model categories. At Regenstrief, we’re planning to support lab:glucose as well as glucose in lab. FYI – you want a magic term for each category, but make sure to allow for common variations (e.g., lab:blood and labs:blood should be equivalent, rad:foot and radiology:foot would both search for foot in radiology results).
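A small sketch of how the category prefixes and their variations could be parsed, in Python. Everything here is hypothetical: the alias table, the canonical category names ("lab", "radiology"), and the function name `parse_category` are assumptions for illustration, not how Regenstrief or the module does it.

```python
import re

# Hypothetical alias table: every accepted spelling maps to one canonical category.
CATEGORY_ALIASES = {
    "lab": "lab",
    "labs": "lab",
    "rad": "radiology",
    "radiology": "radiology",
}


def parse_category(query: str):
    """Return (category, terms); category is None when no known category is named."""
    q = query.strip()

    # Prefix form, e.g. "lab:glucose"
    m = re.match(r"^(\w+):\s*(.+)$", q)
    if m and m.group(1).lower() in CATEGORY_ALIASES:
        return CATEGORY_ALIASES[m.group(1).lower()], m.group(2)

    # Suffix form, e.g. "glucose in lab"
    m = re.match(r"^(.+?)\s+in\s+(\w+)$", q, re.IGNORECASE)
    if m and m.group(2).lower() in CATEGORY_ALIASES:
        return CATEGORY_ALIASES[m.group(2).lower()], m.group(1)

    # No recognized category: search everything with the query unchanged.
    return None, q


print(parse_category("labs:blood"))      # ('lab', 'blood')
print(parse_category("glucose in lab"))  # ('lab', 'glucose')
print(parse_category("pain in foot"))    # (None, 'pain in foot')
```

Falling back to an unfiltered search when the word after "in" is not a known category keeps ordinary phrases like "pain in foot" from being misread as a filter.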

We prioritized saving default filtering over this feature, since being able to define your own starting filters is more generally useful.

Using Lucene, you can get some of these automatically. And, for the end of words, I would always apply the wildcard – i.e., foo becomes (foo or foo*). Most non-geeks aren’t going to know about * for wildcards, so I wouldn’t bother with this feature.

Absolutely. As I’ve said above, I would always provide wildcard & stemming on all search terms that aren’t within quotes.

Hi @tgreensweig and @burke; I am planning a serious implementation of this feature this week. To avoid returning so many results that some users may not even need, I am thinking that we could use Solr's default search on the chart search page and then add a "manage search syntax" page where users can customize this behavior for themselves according to their needs. This would help those who may not always remember the search syntax to use directly in the search box. What do you think about this? Wouldn’t it be better?

I think most users may prefer to sit back and let the module deal with this behavior rather than handling it themselves in the search phrase :smile:

I would assume that most people will expect the search box to behave just like Google – i.e., “read their mind.” :wink:

  • A search for CD4 returns anything with “CD4” in the name or in the results including CD4%, CD4 Count, narrative reports containing the word “CD4”, etc.

  • A search for fracture returns any results with “fracture”, “fractures”, “fracturing”, “fx”, etc. in the name or answer.

  • A search for blood pressure would return anything that contains the words “blood” and (“pressure” or “pressures”), including a report containing the phrase “Given the pressures we drew some additional blood tests.” If they wanted to exclude the latter, they could search for "blood pressure".
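The expectations in the list above can be written down as a toy matcher, which may be handy as a set of test cases later. This is only a sketch in Python: the function name `matches` is made up, and prefix matching is a crude stand-in for real stemming (it finds “fractures” from “fracture”, but not “fracturing” or the abbreviation “fx”, which would need a stemmer and a synonym list).

```python
import re


def matches(query: str, text: str) -> bool:
    """Toy matcher: every unquoted term must prefix-match some word in the text,
    and every double-quoted phrase must appear verbatim (case-insensitive)."""
    text_lower = text.lower()
    words = re.findall(r"\w+", text_lower)

    # Quoted phrases are matched exactly, as a contiguous substring.
    phrases = re.findall(r'"([^"]+)"', query)
    for phrase in phrases:
        if phrase.lower() not in text_lower:
            return False

    # Remaining terms: all must match (AND), using prefix match as fake stemming.
    unquoted = re.sub(r'"[^"]*"', " ", query)
    for term in unquoted.lower().split():
        if not any(word.startswith(term) for word in words):
            return False
    return True


report = "Given the pressures we drew some additional blood tests."
print(matches("blood pressure", report))    # True
print(matches('"blood pressure"', report))  # False
print(matches("CD4", "CD4 Count"))          # True
```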

The number of results does not matter as long as the reason for each result appearing makes sense to the user. Having results not showing up when expected is worse than getting lots of results for short, ambiguous searches.

Getting your module hosted where we can play with it will be immensely helpful in refining the behavior. Getting your module to where people can test it out against their production data will be even more helpful.