Skip to main content

How does mysql replication works? io thread, sql thread and more


MySQL replication is a process that enables data from one MySQL database server (the master) to be copied automatically to one or more MySQL database servers (the slaves). It is usually used to spread read access on multiple servers for scalability, although it can also be used for other purposes such as for failover, or analyzing data on the slave in order not to overload the master.
  • At first, replication was based on propagation of SQL statements from the master server to the slave server. This replication format is called statement-based replication (SBR) and is default in older versions of MySQL (<= 5.1.4). Just as a note that in later versions of MySQL, especially with the Innodb Plugin, you NEED to run your transaction_isolation with REPEATABLE-READ.
  • The newer replication type is row-based replication (RBR), which logs changes in individual table rows to the binary log. Basically, logging the actual change and not the SQL statement itself.
  • MySQL also has the ability to change its binary logging format in real time depending on the type of event using the mixed format binary logging. When the mixed format is turned on, statement-based replication is on by default but will change to row-based replication is particular cases. For more information on Mixed Binary Logging please see Mixed Binary Logging Format in the MySQL Documentation.

1. IO thread
This process called IO thread connects to a master, reads binary log events from the master as they come in and just copies them over to a local log file called relay log. That’s all.
Even though there’s only one thread reading binary log from the master and one writing relay log on the slave, very rarely copying of replication events is a slower element of the replication. There could be a network delay, causing a steady delay of few hundred milliseconds, but that’s about it.
If you want to see where IO thread currently is, check the following in “show slave statusG”:
  • Master_Log_File – last file copied from the master (most of the time it would be the same as last binary log written by a master)
  • Read_Master_Log_Pos – binary log from master is copied over to the relay log on the slave up until this position.
And then you can compare it to the output of “show master statusG” from the master.
2. SQL thread
The second process – SQL thread – reads events from a relay log stored locally on the replication slave (the file that was written by IO thread) and then applies them as fast as possible.
This thread is what people often blame for being single-threaded. Going back to “show slave statusG”, you can get the current status of SQL thread from the following variables:
  • Relay_Master_Log_File – binary log from master, that SQL thread is “working on” (in reality it is working on relay log, so it’s just a convenient way to display information)
  • Exec_Master_Log_Pos – which position from master binary log is being executed by SQL thread.


Comments