I ran into this issue with Elasticsearch while implementing the sorting feature for one of my projects. The problem is that sorting did not work the way I expected for my string fields.
This is what the sorting problem looks like. Say you have this data:
[
{
"name": "Juan Andres"
},
{
"name": "Jasmin Mae"
},
{
"name": "Jorge"
}
]
When I do a regular sort on the name field in ASC order with a sample request like the one below:
{
"sort": [
{
"name": "asc"
}
]
}
I get this result:
[
{
"_source": {
"name": "Juan Andres"
},
"sort": ["Andres"]
},
{
"_source": {
"name": "Jasmin Mae"
},
"sort": ["Jasmin"]
},
{
"_source": {
"name": "Jorge"
},
"sort": ["Jorge"]
}
]
This is not the order I want. It happens because my name field is analyzed: Elasticsearch sorts on the field's individual tokens rather than on the whole value, which is why "Juan Andres" ends up sorted by "Andres". Sorting on an analyzed field is not advisable.
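To make the token-based ordering concrete, here is a rough Python model of it. It is only a sketch, not Elasticsearch's actual code: the `analyze` and `asc_sort_key` helpers are hypothetical, the tokenizer just splits on whitespace (the real standard analyzer also lowercases tokens), and it assumes an ascending sort uses the smallest token of each field.

```python
# Simplified model of an ascending sort on an analyzed field.
names = ["Juan Andres", "Jasmin Mae", "Jorge"]


def analyze(value):
    """Hypothetical analyzer: split the value into whitespace-separated tokens."""
    return value.split()


def asc_sort_key(value):
    """Assumption: an ascending sort compares documents by their smallest token."""
    return min(analyze(value))


token_sorted = sorted(names, key=asc_sort_key)
sort_values = [asc_sort_key(name) for name in token_sorted]

print(token_sorted)
print(sort_values)
```

Under these assumptions, "Juan Andres" comes first because its smallest token, "Andres", sorts before "Jasmin" and "Jorge".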
I could simply change the string's index setting to not_analyzed, but I need the analyzed version for my full-text search.
To make this work, the field needs to be both analyzed and not_analyzed. This can be done with the fields property, which is the newer replacement for multi-field. The property then has two parts: one tokenized and one unanalyzed. With both available, you can use the not_analyzed part for sorting and the analyzed part for full-text search. You can read more about it here.
I modified my type-mapping for the field to this:
"name" : {
"type": "string",
"fields": {
"raw": { "type": "string", "index": "not_analyzed" }
}
}
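For context, here is a sketch of how that fragment might sit inside a complete index-creation body, built as a plain Python dict. The type name `person` is a made-up placeholder, since the post does not say what the real one is, and the layout assumes an Elasticsearch version that still has the `string` type and per-type mappings, as the mapping above does.

```python
import json

# Hypothetical type name, used only for illustration.
TYPE_NAME = "person"

name_mapping = {
    "type": "string",  # analyzed by default, used for full-text search
    "fields": {
        # Untokenized copy of the same value, used for sorting.
        "raw": {"type": "string", "index": "not_analyzed"}
    },
}

# Assumed request body for creating the index with this mapping.
create_index_body = {
    "mappings": {TYPE_NAME: {"properties": {"name": name_mapping}}}
}

print(json.dumps(create_index_body, indent=2))
```

Both versions are filled from the same source value, so documents are still indexed with a single name property.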
Once name was configured this way, I was able to sort it properly. I needed to set the sort field to name.raw instead of plain name. My new sort request looks like this:
{
"sort": [
{
"name.raw": "asc"
}
]
}
Once I ran the request above, I got the correctly sorted list:
[
{
"_source": {
"name": "Jasmin Mae"
},
"sort": ["Jasmin Mae"]
},
{
"_source": {
"name": "Jorge"
},
"sort": ["Jorge"]
},
{
"_source": {
"name": "Juan Andres"
},
"sort": ["Juan Andres"]
}
]
The sort criterion is now the not_analyzed version of the field, so each document is sorted by its full name.
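Since the whole point of keeping both versions is to search on one and sort on the other, a combined request might look roughly like the sketch below. The `build_search_body` helper and the search term "Juan" are hypothetical. The body is just a dict you could serialize and send with whatever client you use.

```python
import json


def build_search_body(text):
    """Full-text match on the analyzed field, sorted by the not_analyzed sub-field."""
    return {
        "query": {"match": {"name": text}},  # searches the tokens of name
        "sort": [{"name.raw": "asc"}],       # orders by the whole, unanalyzed value
    }


body = build_search_body("Juan")
print(json.dumps(body, indent=2))
```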