MySQL Foreign Keys

January 10, 2021   

At Shopify we discourage our developers from using foreign key constraints with MySQL. When I’ve told my developer friends this they are shocked. They ask: How do you maintain referential integrity in your database? Your data must be a nightmare.

To this I usually stumble a response that we have to enforce referential integrity in our Rails Models and submit tests to keep this in sync. Usually my friend will respond that’s unrealistic and your typical developer is going to make a mistake. At this point I usually concede that this is true and I observe see lots of maintenance tasks submitted by developers to fix data.

My developer friends will suggest I use a different DB like MSSQL or Postgres that support these features. Since I’m not an expert in these systems I don’t know if this is correct or not and usually our conversation shifts to something else.

At Shopify the reasons described for not using foreign keys with MySQL are

  • Online schema migration tools eg (pt-osc, lhm, gh-ost) handle Foreign keys poorly (pt-osc) or not at all( (gh-ost, lhm).
  • Coupling your tables together with foreign keys results in performance issues with very large tables.
  • Some schema migrations on columns will require removing the FK, for example changing the column type on a FK from int to int unsigned

I’m lucky enough to belong to the “DBA Chat” slack group. Which where some of the longtime experts in the MySQL community reside. At some point in 2019? or 2020? this topic was discussed at length and I copy+pasted the chat log into a file. Since DBA Chat is not archived I will reproduce the chat log here as a reference. There are names attached to these logs which I think is ok as the group is semi-public, but if someone is upset for me reproducing this here they can email me…

hey quick question/poll. do you hate fk’s? do you dis/allow them? do you actively discourage their use from devs? asking for a friend :wink:

@chuck almost always disallow unless it’s been historically on the system. mostly because of online schema change issues and just the general annoyance it comes with. I would also argue that these days, applications treat a bit of data stored in the database backend as an object with its own attributes and handle these fk constraints on their own in the object’s context

It depends on the size of the data I suppose, it’s extra logic to enforce at the db layer which arguably is more scalable to manage at the app layer.. but then if the data isn’t huge I guess it’s nice sugar for anyone trying to understand or wanting to map the schema - all schema mapping tools I’ve seen rely on the fk to map the join.

I'm solidly in the FK hater camp, at least for mysql/innodb. OSC tool complications, schema change complications in general (requiring ordering DDL, problems with circular references), they're far less useful in a sharded environment, their locking is very painful at high write rates, their error messages are super vague in InnoDB, and the behavior of foreign_key_checks=0 is somewhat inconsistent (it allows FKs pointing to nonexistent tables, but not nonexistent columns of existing tables)

Ugh, but then you have to trust all of those devs to re-implement referential integrity logic that the DB just (sorta) handles. What happened to the DBA protecting the data from devs? /sarcasm

@evanelias that's a scalability argument though right?   One of the services we manage is an internal dbaas, in many cases the data isn't huge on a per schema basis, and having the fk's doesn't hurt when the data isn't going to scale.  (and tbh it's a handy jump start to understand the schema when jumping in to help troubleshoot query issues on a schema you've never seen before :D)

That's fair, yeah I'd say maybe 90% scalability concerns, 10% operational annoyance concerns :slightly_smiling_face:
Some of those operational annoyances happen at any scale, e.g. try changing the column type on an FK from int to int unsigned... AFAIK this is impossible without removing the FK first and then re-adding it after

my avoidance has always been about management issues: pt-osc mostly
but lately i have had a lot of devs using orm’s that automatically use fk’s and seemingly have no way not to
i can either block them in dev or have pain when they go to prod

We strictly don’t support it because of gh-ost or otherwise OSC. But also we’re extracting many tables these days to their own clusters - we’re very grateful there’s no foreign keys there!
@chuck use skeema to identify an addition of FK. We’re going in that direction.

billkarwin 11:59 AM
We also have a "soft prohibition" on FK's, primarily because it makes OSC's difficult. But we don't enforce the prohibition, so there are some apps that have FK's. When they want to do a schema change, we tell them they can't do an OSC. They have to deal with running a direct ALTER TABLE. Fortunately the apps that use FK's also tend to be apps that have small tables, so ALTER TABLE doesn't take more than a second or two anyway. Once the tables get large enough to need an OSC, they drop their FK's so that they can do that.

i think thats the way we are headed as well

I have a vision of a schema inspector that runs, say, monthly, and emails app teams about the things we see in their schemas that need improvement. Like having FK's, table with no PK, presence of FLOAT columns, tables that are growing too fast since the last monthly report, stuff like that. Not things that need an alert per se, but some regular awareness.

Ah presence of FLOAT columns is interesting, is it just motivated by lack of precision? I could see that being a huge problem for financial data like yours
I was recently horrified to learn that mysql supports float auto_increment :scream: it was only deprecated two months ago in mysql 8

Yeah, my assumption is that there is practically no valid use for a FLOAT column at our company. Of course exceptions might exist to an assumption like mine, but exceptions should be made very carefully by someone who actually understands the precision issues of FLOAT.

I think they’re great in the dev phase though, helps you find out all sort of logic errors in your app.

As a developer early in my career I really liked the idea of a 'self maintaining database' Cross table deletes, relationship data being maintained automatically. However the headaches and performance issues involved in actually using them quickly tempered that. As a DBA of course I obviously don't feel positively about them. Especially considering the number of times that the data wasn't in sync between tables despite having foreign keys to maintain the consistency.

Yes, as a developer I found that without FK constraints, it was easy to make orphan records by deleting data without deleting the child records. I had to write nightly quality-control scripts to check for orphaned records. In every database I've analyzed since then, if there are no FK constraints, it's almost certain that there's a bunch of orphaned data. Who knows how many?
And cleaning up orphaned data is a time sink. Can we delete these? Who has the authority to decide? What if there are 25,000 orphans and they have to be considered one by one?
I like the idea of designing software that prevents anomalous data from occurring, instead of trying to clean it up after it happens. That's why relational database normalization is so attractive too. But the implementation of FK's gets in the way and makes it difficult to do other kinds of operations.
It's tempting to say "the app will handle referential integrity" but then you have different code paths that aren't all in sync on the business rules -- possibly in different client apps written in different languages.

FKs inherently aren't friendly to shadow tables / atomic swap. I don't think it's anything specific to triggers or replication
see also gh-ost not supporting FKs despite not being trigger-based (edited) 
@shlomi can fill in more detail there I'm sure

yeah the atomic rename is a big pain

As for whether "hate" is too strong a word -- just speaking from personal experience, I personally hate FKs. Building tooling around schema changes, edge cases around FKs waste massive amounts of my time ¯\_(ツ)_/¯

Some OSC operations just can't work with FK. In particular: if you have a "parent" table, then you can't OSC it and still maintain relationship with "child" tables; the FK will not migrate to the "ghost" table. So to somehow be able to make that happen, you need to simultaneously migrate the parent table and every child table and any grandchild table etc., which is not feasible, and still need to worry about how to ensure consistency while you're halfway building table data.
https://dbachat.slack.com/archives/C027R4PCV/p1570465919178700

maybe i could humbly suggest that while FKs are certainly something that tends to fall away due to scale. the fact that we are avoiding them because of OSCs because of a bug that has existed since the dawn of innodb is pretty pathetic.

Posted in #general | Today at 11:31 AM | View message
I can agree to that ^

The same sentiment is true to so many other aspects of databases, relational or not, and to computers in general.

There's always some pathetic limitation to something that you need to work around.

I'd be super happy if ALTER table would just work instantaneously or completely async and across replicas.

So OSC is a technical reason for us to not use FK

but is not the only one, and certainly not a cause for "hate"

The sharding issue is real, and I've seen it many times, regardless of the fact many companies will never see it.

I've seen it in the last three companies I've worked for, and as a wild guess, at least a dozen companies represented by people in this chatroom.

It's a matter of growth, where you suddenly realize "OMG I need to shard this out of here"

If you have foreign keys, then it's not just the issue of remove-the-FK, but to now entrust your app to do something it previously expected the DB to do.

So that's a massive change to the app, and scary.

To @ngaitanis’s question, I fully agree with @evanelias’s response; and I should add that there's a ton of other integrity constraints only your app knows about, and not the database...
like: this status column can only change from ready to running but never directly to complete.

sougou 11:56 AM
I was about to type what my main objection to FKs was, but @shlomi :point_up: highlighted it. It's an incomplete solution for data integrity. The bigger problem lies with the app.
ngaitanis 11:58 AM
And that reservation, I understand.

sougou 11:59 AM
Also, I think databases should do what's difficult for apps to do, like transactions, joins, etc. Data integrity is the app's responsibility, because it goes beyond FKs. Also, it's more efficient for apps to enforce it than the database.

billkarwin 12:01 PM
Except unique checks...
:heavy_check_mark:

shlomi 12:02 PM
those do tend to be extremely useful :slightly_smiling_face:
we the MySQL people are also not used to CHECK constraints, which other DB users enjoy

billkarwin 12:03 PM
Cascading FK operations are also harder for the app...

sougou 12:04 PM
They're hard for the DB too. I had a conversation with Luis Soares about it. You can corrupt your replication (using RBR) and are using cascades. (edited) 

billkarwin 12:06 PM
The point is that you can't make blanket rules like "data integrity is the app's responsibility."
Unequivocal statements are always wrong. :wink:

shlomi 12:06 PM
I see it in a different light

sougou 12:06 PM
Agree, let's call them guidelines

shlomi 12:06 PM
"Foreign keys are going to cause major operational pains/limitations which we cannot allow"
"in our company"
which is a blanket rule I can live with
Also, I was so very sad to miss everyone in Amsterdam last week. Rest assured I was gloomy :slightly_smiling_face:

billkarwin 12:10 PM
I'd phrase it like: "There are tradeoffs with any technology or technique. So far, we haven't found a case where we could tolerate the disadvantages of foreign keys, and we can find other implementations to match their advantages."
It leaves open the possibility of some other scenario where FK's would be worth it.
I.e. the least bad option available.