Fun MySQL fact of the day: REPLACE
your expectations
So, if we agree that we should avoid using INSERT ... ON DUPLICATE KEY UPDATE
and INSERT IGNORE
, surely we can use REPLACE INTO
to upsert a row, right?! And yes, you can, if you want a DoS bug feature or account/identity takeover vulnerability in your multi-tenant system. Yep, this is going to be another fun fact.
Starting with the basics, in MySQL, REPLACE INTO
acts like an INSERT
, but if another row in the primary index or any unique index already has the same value, MySQL will first DELETE
the existing row and then issue an INSERT
. This is probably the behaviour you think you want, but if we go back to Thursday's schema, I'll convince you why it isn't and why it's a liability:
CREATE TABLE secrets(
id INT AUTO_INCREMENT PRIMARY KEY,
global_secret_number INT NOT NULL UNIQUE,
account_id INT NOT NULL UNIQUE,
secret VARCHAR(50) NOT NULL
);
INSERT INTO secrets VALUES(0, 1, 10, 'account 10''s secret');
INSERT INTO secrets VALUES(0, 2, 20, 'account 20''s secret');
In this table, we again see that both global_secret_number
and account_id
each have a UNIQUE
index. This means no two rows may have the same value for either column. As a refresher, in InnoDB, indexes are created in the order in which they are specified, and, when inserting data, InnoDB inserts into the PRIMARY
index first (where the data is stored), followed by the rest of the indices in the order in which they were created. Let's go-ahead and try to REPLACE
account 20's secret:
REPLACE INTO secrets VALUES(0, 1, 20, 'account 20''s new secret');
When we run this, we probably expect one of two things:
- the row with
account_id=20
andglobal_secret_number=1
to be updated iff it exists (or inserted if not), otherwise - an error because
global_secret_id=1
already exists for a differentaccount_id
.
Let's see what actually happens:
SELECT * FROM secrets;
+------+------------------------+--------------+-------------------------+
| id | global_secret_number | account_id | secret |
|------+------------------------+--------------+-------------------------|
| 3 | 1 | 20 | account 20's new secret |
+------+------------------------+--------------+-------------------------+
Uh oh. What happened to account_id=10
's secret? A really fun (documented!) quirk about REPLACE INTO
is that after the DELETE
(in this case, a delete of global_secret_id=1
, the first matching UNIQUE
index), InnoDB will "change" the INSERT
into an UPDATE
due to the duplicate key error on account_id
. We probably expected an error, but, instead, account_id=20
has stolen global_secret_number=1
from account_id=10
, who is now denied service while you, the developer, are left mumbling, "this isn't possible" while responding to an irate customer at 3 in the morning.
With any luck, global_secret_number=1
, which had previously been associated with account_id=10
, isn't linked to any sensitive information! So, my advice today? Still the same as last week: instead of using REPLACE INTO
, do a SELECT
(preferably on a replica) and INSERT
or UPDATE
by primary key accordingly, safely handling the rare duplicate key error in your code.