Fun MySQL fact of the day: in defence of auto_increment
Monday's introduction to auto_increment
maybe wasn't as glowing as it should have been, but we really shouldn't let one little bug well-documented issue stop us from using it. You see, auto_increment
really is an invaluable tool, and today we'll consider why.
As we discussed yesterday, in InnoDB, a "table" is actually a b+tree index (the PRIMARY
index, or just PRIMARY
) in which "rows" are actually "records". In the PRIMARY
index, records are sorted in ascending order of the primary key. While designing a schema, deciding on a primary key is as much a science as it is an art, and the field(s) you eventually choose is critical to the performance of your database. Today, we're just going to focus on an OLTP (Online Transaction Processing) use case, which primarily optimises for write throughput and read latency of small data sets. Most web-based systems fit this category.
I'll make a bold proclamation: in any system that optimises for write throughput, a surrogate primary key (that is, one that's unrelated to the data) consisting of an auto_increment
value will nearly always win-out in terms of inserts per second. To explain why, let's consider that we currently have a PRIMARY
index with the following keys: [ 1, 3, 5, 6, 8 ]
.
Good. Now, let's insert a new record with a key value of 9
. Simple: we stick it right at the end. Then 10
? Again, right at the end. What about 522
or 33977344
? Right smack at the end, again. It's "almost" as "simple" as that (read: come along with me on my imaginary optimism trip into make-believe database land!).
Now, what happens if we instead need to insert 4
? It goes after 3
but before 5
. And what about 7
? You get the point: unordered inserts require extra work, which slows down the process of inserts, which, as we recall from the last few weeks, slows everything down.
Of course, there's a whole lot more complexity, but one thing remain constant: inserting (increasing) monotonic records is a heck of a lot more efficient than random inserts. So, should you use an auto_increment
column (which provides monotonic, sequential values) for your primary key? It highly depends on your scenario, but you'd be crazy not to at least consider it.