I'm a programmer specialising in performant and scalable systems using PHP and Ruby and cooking


Published:

Sorting string fields with Elasticsearch

If you take the mapping example from my previous article, Getting Results for Nested Objects from Elasticsearch, and then attempt to sort your results by the "title" field, you will most likely receive this error:

Can't sort on string types with more than one value per doc, or more than one token per field

Simply put, you can't sort on a plain string field because by default it is analyzed and split into tokens. Each document then holds several values for that field, so there is no single value to sort by.
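To make the problem concrete, here is a minimal Python sketch of what analysis does to a title. The `analyze` function is my own rough stand-in (lowercase, then split on anything that isn't a letter or digit), not Elasticsearch's actual analyzer, and the sample titles are made up for illustration.

```python
import re


def analyze(text):
    """Rough stand-in for an analyzer: lowercase, then split on non-alphanumerics.

    This is a simplification for illustration, not Elasticsearch's real analyzer.
    """
    return [token for token in re.split(r"[^0-9a-z]+", text.lower()) if token]


# Hypothetical sample titles.
titles = ["Sorting string fields", "Nested objects"]

for title in titles:
    # An analyzed field ends up with several tokens per document,
    # so there is no single value to order the document by.
    print(f"analyzed:     {analyze(title)}")
    # A field that is not analyzed keeps the whole string as one value.
    print(f"not analyzed: {[title]}")

# With exactly one un-split value per document, ordering is unambiguous.
sorted_titles = sorted(titles)
print(sorted_titles)
```

The analyzed form is what you want for full-text search, while the untouched form is what you want for ordering, which is why we end up needing both.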

Enter MultiField

In our mapping we need to use the multi_field type, which lets us keep two versions of the same field: an analyzed version for searching and a raw version for sorting.

This is quite a simple adjustment to make to the mapping. Replace the existing definition for the title field with this:

"title": {
  "type": "multi_field",
  "fields": {
     "title": { "type": "string", "index": "analyzed", "store": "yes" },
     "raw_title": { "type": "string", "index": "not_analyzed", "store": "yes" }
  }
 }

This tells Elasticsearch that we want two fields mapped for the title field. The first, called "title", acts as the default for the title field, so we can keep using it for searching. You'll note that it is marked as analyzed.

The second field is "raw_title". It can be called anything you want, and it is marked as not_analyzed so that we can use it for sorting. For example:

"sort": { "title.raw_title": "ASC" }