MySQL Parallel Replication

January 14, 2021   

Notes from MySQL Parallel Replication (LOGICAL_CLOCK): all the 5.7 (and some of the 8.0) details

Given by Jean-François Gagné @ Percona Live 2017

Types of Parallel Replication in the various MySQL Versions

MySQL 5.6 - schema based parallel replication

MySQL 5.7 - Logical Clock Parallel Replication

  • Early very was based on group commits (pre 5.7.6)
  • in middle version the logical clock is “interval” based (5.7.6)
  • In later version write sets were added (5.7.22)

MySQL 8.0: Write Sets

Configuration Settings

slave_parallel_type - setting

  • “DATABASE” - 5.6 schema based (crap & default)
  • “LOGICAL_CLOCK” - This type is implemented by putting interval information in the binary logs This is not the same as Group Commit! which is what may be referenced in the doc (Bug #85977)

Slowing down the master to speedup the slave is done by tuning

  • binlog_group_commit_sync_delay
  • binlog_group_commit_sync_no_delay_count

Downsides:

  • Long/Big Transactions reduce parallelism
  • Intermediate Masters reduce parallelism

Understanding Intervals in MySQL 5.7

MariaDB 10 added the cid “group commit id” which shows which transactions can be done concurrently.

MySQL 5.7 uses two numbers which is not he cid

  • sequence_number increasing id for each trx (not to be confuse with GTID)
  • last_committed the sequence_number of the latest trx on whch this trx depends (this can be understood as the “write view” of the current transaction)
  • This is not the commit of the transaction but rather its lock.
  • “The sequence number of the most recently committed transaction when the current transaction gets its last lock”

Write Set in MySQL 8.0

The concept of last_commited in 5.7 was pretty pessimistic as two transactions could be running on completely unrelated data.

8.0.1 changed how last_committed is set, it is now “The sequence number of the last transaction that updated that same rows as the current transaction.”

MySQL 8.0 remembers the tuples (rows) which are modified by each transaction this is the “write set”

The write set is stored in memory as a hash.

The write set is NOT stored in the binlog, its used to generate the intervals and will make them much wider.

Write Set Configuration

  • transaction_write_set_extraction - default OFF, XXHASH64 to enable
  • binlog_transaction_dependancy_history_size - size of the data structure. default setting 250000 rows, which is a few MB of memory and not to be concerned about.
  • binlog_transaction_dependancy_tracking - COMMIT_ORDER, WRITESET_SESSION, WRITESET
  • COMMIT_ORDER: The old way of calculation as in 5.7
  • WRITESET_SESSION: no two updates from the same session can be reordered
  • WRITESET: any transactions which write different tuples can be parallelized

WRITESET_SESSION will not work well with connection proxies (proxysql) because recycling the sessions will will confuse its session tracking.

You can test write sets on an intermediate master! No need to enable it on the master. Downstream replicas will see the writesets for parallelism testing. Or you can parse the binlogs and see what’s going on.

You can disable write set via:

  • binlog_transaction_dependancy_tracking = COMMIT_ORDER
  • transaction_write_set_extraction = OFF

Pro/Cons of Write Sets

Advantages of Write Set

  • Master does not need “slowed down” to create wider intervals
  • Will work even at low concurrency on the master
  • Can test without Upgrading the master

Cons of WriteSet

  • Needs Row based replication
  • Not working for trx updating tables without PK and trx updating tables having foreign keys
  • There’s a barrier at each DDL
  • Barrier on binary log rotation, no transactions in different binlogs can be run in //
  • With WRITESET_SESSION does not paly well with connection recycling (proxysql)

Seems to have few drawbacks EXCEPT this is newer code and may have many bugs!

Examples from 2017:

  • Bug #86078 - Running an Insert before a Delete
  • Bug #86079 - Deadlock Bug
  • These bugs are normal and probably resolved by now in 2021 4 years later :)

Performance notes

Jean noticed when running with write sets when adding a delay (binlog_group_commit_sync_delay,binlog_group_commit_sync_no_delay_count). The write set performance improves with a similar affect on logical clock

Jean likes who Oracle simplified tracking the transactions in the binlog without using a dependency graph which is easier to understand.

In practice Jean saw very big performance improvements anywhere from 60% to 700% improvement (on SSD). On HDD not so great.

Write Set comes from MySQL Group Replication (MySQL NBD Cluster)

Other Notes

Jean has a huge volume of content on his blog about these topics

In addition he appeared to give LONG tutorial on this in 2018, the slides for this are here