Fun MySQL fact of the day: good-enough sort
Over the last few days, we've discussed some of the traits of filesort
and we'd discussed two important variables: sort_buffer_size
and max_length_for_sort_data
. But there's 1 more important variable that you may even find yourself wanting to change on a query-by-query basis: max_sort_length
.
While it's rather trivial to confuse max_sort_length
with max_length_for_sort_data
, they actually have some interplay as we'll see, but while max_length_for_sort_data
determines the filesort
mode, max_sort_length
tells MySQL how many bytes to compare for each field when sorting rows. For example, SET SESSION max_sort_length=32
would tell MySQL to compare only the first 32 bytes of each field, rather than the entire thing. While this can seem pretty weird, it might make sense to use when you need faster sorts on string data or temporal data that may not necessarily need to be perfect.
For example, if you had an ORDER BY blog.date_created, blog.description
, we may only want to sort the first 32 bytes of a user's blog post description instead of using the default 1024
bytes. This may not mean too much when you have a small number of rows to sort, but if you have 5,000, it can mean the difference between 4.9MB (requiring disk-based temporary files to sort) and 156KB, all which fits in the default 256KB sort buffer.
So, what do we now know about max_sort_length
?
- You can dynamically lower it to prevent a
<sort_key, rowid>
sort mode. - You can dynamically lower it to reduce the number of merge passes in a query with a large number of sorted rows.
- You can dynamically lower it to ensure the sorted rows all fit into the sort buffer.
So, next time you find yourself sorting by large strings, ask yourself if you can trade off some accuracy for sorting efficiency.