Fun MySQL fact of the day: OFFSET
On Friday, we discussed a simple fact about LIMIT
, and that wasn't just because it was Friday (well, maybe a bit), but because I felt it important to have a refresher on how MySQL and the storage engine interact before we look at OFFSET
.
It took some time, but after finally(!) finding the 3 lines of code in MySQL that's responsible for implementing OFFSET
, I had a bit of a chuckle. It was not at all what I expected, but I have a hard time arguing that it's wrong. Today, let's consider a table, (id,a,b,c)
, with no secondary indices, but with 30 rows in sequential order (id=1,id=2,...,id=30)
that we query as SELECT * FROM my_table LIMIT 10 OFFSET 10
. Knowing what we know, our expectation is that we will get back rows with id=11..id=20
, and, well, we will. But not until MySQL has first seen all rows from id=1..id=20
.
Huh? Surely there's some kind of optimisation where MySQL can just jump to id=11
, right? Unfortunately, no. There isn't. And there can't be, because for all MySQL knows, there is no id=11
. And, further, like the example on Friday, the storage engine also has no idea if MySQL will need to perform supplemental filtering on the rows it has returned. So, knowing this, I think we can agree that the OFFSET
implementation can't be terribly intelligent. But what exactly does it look like?
For a query with an OFFSET
, MySQL behaves no differently than a query without an OFFSET
: it fetches each row from the storage engine one-by-one, evaluates them against the applicable conditions in the WHERE
clause one-by-one, and then attempts to write it to the network. But this is where things differ. To see how, we need to consider just 3 variables in the SELECT
code path:
offset_limit_cnt
, which is initially set to the value ofOFFSET
(10 in our case)select_limit_cnt
, which is initially set to the value ofLIMIT
plusoffset_limit_cnt
(10+10=20 in our case)send_records
, which is initially 0 and tracks the number of rows sent to the network.
Now, just before the row is sent to the network in select_send::send_data
, MySQL checks if offset_limit_cnt > 0
. If so, it subtracts 1 and returns, effectively dropping the row. At the same time, MySQL increments send_records
. Eventually, offset_limit_cnt == 0
and the rows start getting sent to the client. Then, eventually, when send_records >= select_limit_cnt
, MySQL knows to stop asking for rows and cleans up. So, in effect, OFFSET
is (probably necessarily) more like a DONT WRITE x ROWS TO THE NETWORK
feature. Fun.
Fortunately for us, even though MySQL isn't "smart" enough to jump straight to id=11
, we are, but that's a fun fact for later this week.