MySQL Parallel Replication
January 14, 2021
Notes from MySQL Parallel Replication (LOGICAL_CLOCK): all the 5.7 (and some of the 8.0) details
Given by Jean-François Gagné @ Percona Live 2017
Types of Parallel Replication in the various MySQL Versions
MySQL 5.6 - schema based parallel replication
MySQL 5.7 - Logical Clock Parallel Replication
- Early very was based on group commits (pre 5.7.6)
- in middle version the logical clock is “interval” based (5.7.6)
- In later version write sets were added (5.7.22)
MySQL 8.0: Write Sets
Configuration Settings
slave_parallel_type - setting
- “DATABASE” - 5.6 schema based (crap & default)
- “LOGICAL_CLOCK” - This type is implemented by putting interval information in the binary logs This is not the same as Group Commit! which is what may be referenced in the doc (Bug #85977)
Slowing down the master to speedup the slave is done by tuning
binlog_group_commit_sync_delaybinlog_group_commit_sync_no_delay_count
Downsides:
- Long/Big Transactions reduce parallelism
- Intermediate Masters reduce parallelism
Understanding Intervals in MySQL 5.7
MariaDB 10 added the cid “group commit id” which shows which transactions can be done concurrently.
MySQL 5.7 uses two numbers which is not he cid
sequence_numberincreasing id for each trx (not to be confuse with GTID)last_committedthesequence_numberof the latest trx on whch this trx depends (this can be understood as the “write view” of the current transaction)- This is not the commit of the transaction but rather its lock.
- “The sequence number of the most recently committed transaction when the current transaction gets its last lock”
Write Set in MySQL 8.0
The concept of last_commited in 5.7 was pretty pessimistic as two transactions could be running on completely unrelated data.
8.0.1 changed how last_committed is set, it is now “The sequence number of the last transaction
that updated that same rows as the current transaction.”
MySQL 8.0 remembers the tuples (rows) which are modified by each transaction this is the “write set”
The write set is stored in memory as a hash.
The write set is NOT stored in the binlog, its used to generate the intervals and will make them much wider.
Write Set Configuration
transaction_write_set_extraction- default OFF, XXHASH64 to enablebinlog_transaction_dependancy_history_size- size of the data structure. default setting 250000 rows, which is a few MB of memory and not to be concerned about.binlog_transaction_dependancy_tracking- COMMIT_ORDER, WRITESET_SESSION, WRITESETCOMMIT_ORDER: The old way of calculation as in 5.7WRITESET_SESSION: no two updates from the same session can be reorderedWRITESET: any transactions which write different tuples can be parallelized
WRITESET_SESSION will not work well with connection proxies (proxysql) because recycling the sessions
will will confuse its session tracking.
You can test write sets on an intermediate master! No need to enable it on the master. Downstream replicas will see the writesets for parallelism testing. Or you can parse the binlogs and see what’s going on.
You can disable write set via:
binlog_transaction_dependancy_tracking= COMMIT_ORDERtransaction_write_set_extraction= OFF
Pro/Cons of Write Sets
Advantages of Write Set
- Master does not need “slowed down” to create wider intervals
- Will work even at low concurrency on the master
- Can test without Upgrading the master
Cons of WriteSet
- Needs Row based replication
- Not working for trx updating tables without PK and trx updating tables having foreign keys
- There’s a barrier at each DDL
- Barrier on binary log rotation, no transactions in different binlogs can be run in //
- With
WRITESET_SESSIONdoes not paly well with connection recycling (proxysql)
Seems to have few drawbacks EXCEPT this is newer code and may have many bugs!
Examples from 2017:
- Bug #86078 - Running an Insert before a Delete
- Bug #86079 - Deadlock Bug
- These bugs are normal and probably resolved by now in 2021 4 years later :)
Performance notes
Jean noticed when running with write sets when adding a delay
(binlog_group_commit_sync_delay,binlog_group_commit_sync_no_delay_count).
The write set performance improves with a similar affect on logical clock
Jean likes who Oracle simplified tracking the transactions in the binlog without using a dependency graph which is easier to understand.
In practice Jean saw very big performance improvements anywhere from 60% to 700% improvement (on SSD). On HDD not so great.
Write Set comes from MySQL Group Replication (MySQL NBD Cluster)
Other Notes
Jean has a huge volume of content on his blog about these topics
In addition he appeared to give LONG tutorial on this in 2018, the slides for this are here