MySQL Archive/Binary Logs: Replication, Recovery, and More
MySQL Binary Logs: The Engine Behind Replication, Point-in-Time Recovery, and Change Data Capture
If you manage MySQL in production, you have likely heard of the binary log (often mistakenly called the “archive log”). Unlike PostgreSQL’s Write-Ahead Log (WAL) or Oracle’s Redo/Archive logs, MySQL’s binary log serves a unique, dual purpose: it is the backbone of replication and the key to point-in-time recovery.
In this post, we will demystify the MySQL binary log, how it differs from traditional archive logs, and how to leverage it for robust replication, disaster recovery, and even streaming analytics.
1. What is the MySQL Binary Log?
The binary log (binlog) is a set of files that record all changes to your database INSERT, UPDATE, DELETE, CREATE TABLE, ALTER, etc. It does not record SELECT or SHOW statements.
Each binary log file is a sequence of events. MySQL flushes these events to disk when a transaction commits or when statement execution completes (depending on sync_binlog). Two critical auxiliary files accompany the binlog:
-
Binlog Index File (
<hostname>-bin.index): Lists all active binlog files. -
Relay Log (replica side): The replica’s copy of the source’s binlog, applied via the SQL thread.
Binary Log ≠ “Archive Log”
In Oracle, the “archive log” is a copy of the online redo log after a log switch, used for media recovery. In MySQL, you rotate binary logs manually or automatically. Retained logs effectively act as archive logs, but MySQL does not have a separate “archive log” phase. All binlogs are both “online” (for replication) and “archived” (for recovery) until purged.
2. Key Components & Configuration
Enabling the Binary Log
You need log_bin set in my.cnf or my.ini. Restart MySQL after enabling.
[mysqld] log_bin = /var/log/mysql/mysql-bin.log binlog_format = ROW # ROW | STATEMENT | MIXED expire_logs_days = 7 # Auto-purge after 7 days max_binlog_size = 1G sync_binlog = 1 # Most durable (safe for replication)
Binary Log Format
| Format | Description | Use case |
|---|---|---|
STATEMENT |
Logs actual SQL statements | Simple, compact but unsafe with nondeterministic functions (NOW(), UUID()) |
ROW |
Logs changed rows (before/after images) | Safe, recommended for replication, enables point-in-time row recovery |
MIXED |
MySQL chooses safest format per statement | Legacy systems only |
Recommendation: Use ROW format. It is compatible with all replication topologies, GTID, and tools like mysqlbinlog for recovery.
Viewing Binary Logs
List binary logs on the server:
SHOW BINARY LOGS;
Show current binary log file and position:
SHOW MASTER STATUS;
3. Binary Logs & Replication: How It Works
Replication is a “copy, transfer, apply” pipeline:
-
Source writes
UPDATEevents to its own binary log. -
Replica connects and requests all events after a certain file + position (or GTID).
-
Replica writes events to its relay log (I/O thread).
-
Replica reads from relay log and applies changes (SQL thread).
Traditional Position-Based Replication
-- On replica CHANGE REPLICATION SOURCE TO SOURCE_HOST='10.0.0.1', SOURCE_USER='repl_user', SOURCE_PASSWORD='strongpass', SOURCE_LOG_FILE='mysql-bin.000042', SOURCE_LOG_POS=123456; START REPLICA;
GTID Replication (Modern way)
GTID (Global Transaction ID) makes failover easier. Instead of file + position, replicas just remember which GTIDs they have applied.
CHANGE REPLICATION SOURCE TO SOURCE_HOST='10.0.0.1', SOURCE_AUTO_POSITION=1;
Benefits: Auto-failover, no need to find log_pos, works seamlessly with mysqlbinlog utilities.
4. Point-in-Time Recovery (PITR) Using Binary Logs
The classic backup + binlog recovery strategy:
-
Restore a full physical or logical backup (e.g.,
mysqldumporxtrabackup). -
Replay binary logs from the backup time up to a specific moment.
Real-World PITR Example
Assume:
-
Full backup at
2025-06-01 00:00:00 -
Accident occurred at
2025-06-01 14:32:17
Step 1: Restore backup.
Step 2: Find binlog files covering 00:00 → 14:32:17.
mysqlbinlog --start-datetime="2025-06-01 00:00:00" \ --stop-datetime="2025-06-01 14:32:17" \ /var/log/mysql/mysql-bin.000042 \ /var/log/mysql/mysql-bin.000043 | mysql -u root -p
Recovering a single dropped table
Using --start-position and --stop-position (find wrong DROP TABLE position via mysqlbinlog --verbose).
mysqlbinlog --start-position=4 --stop-position=4096 mysql-bin.000042 | mysql
5. Beyond Replication & Recovery: Change Data Capture (CDC)
Binary logs are excellent for streaming database changes to external systems (Kafka, Elasticsearch, data warehouses).
Tools like Debezium (CDC connector) read the binlog with ROW format and produce events:
{ "before": {"id": 2, "name": "Jane", "status": "active"}, "after": {"id": 2, "name": "Jane", "status": "suspended"}, "op": "u", "ts_ms": 1685632827000 }
Advantages of binlog-based CDC:
-
No application code changes.
-
Low overhead (sequential reads).
-
Captures
DELETEoperations (unlikeUPDATEorINSERTtriggers).
Important: MySQL binlog must be ROW format and binlog_row_image = FULL (default) for complete before/after images.
6. Maintenance & Best Practices
Log Rotation & Purging
Manual purge:
PURGE BINARY LOGS TO 'mysql-bin.000042'; PURGE BINARY LOGS BEFORE '2025-07-01 00:00:00';
Auto-purge: expire_logs_days or binlog_expire_logs_seconds (MySQL 8.0+).
Performance Implications
-
Disk space: Each
UPDATEwrites an entire row image. Size can be huge at 1000+ writes/sec. -
sync_binlog=1 ensures durability but adds fsync per commit. Use a fast SSD or battery-backed RAID.
-
Binary log compression (MySQL 8.0.20+):
SET GLOBAL binlog_transaction_compression = ON;
Security
-
Binlog files contain sensitive data (passwords, PII). Encrypt at rest:
INSTALL COMPONENT 'file://component_keyring_file'; SET GLOBAL binlog_encryption = ON;
-
Restrict access:
REPLICATION SLAVEprivilege is enough for replicas, but read access exposes all data mutations.
7. Troubleshooting Binary Log Issues
| Symptom | Likely Cause | Solution |
|---|---|---|
| Disk full | Binlog cleanup failed | PURGE BINARY LOGS or reduce expire_logs_days |
| Replica falls behind | Slow SQL thread or large transactions | Enable parallel replication (replica_parallel_workers) |
| “Binlog corrupted” | Server crash during write | mysqlbinlog --verify-binlog; rebuild replica |
| Missing events after recovery | Incorrect --stop-position |
Use mysqlbinlog --base64-output=decode-rows to view |
Conclusion
The MySQL binary log is far more than a replication requirement. It is:
-
A replication backbone (source→replica)
-
A PITR journal (post-backup recovery)
-
A CDC source (real-time event streaming)
Treat your binary logs as critical a resource as your data files. Monitor their size, encrypt them, and test your recovery procedures regularly. With ROW format, GTID replication, and modern tools like Debezium or Kafka, the binlog transforms MySQL from a simple database into an event-driven platform.
Next step: Run SHOW BINARY LOGS; now on your server and verify your expire_logs_days setting. You might be holding months of change history without knowing it.
Understanding MySQL binary logs is not optional for a production DBA. At Learnomate Technologies, we’ve trained hundreds of engineers to move beyond basic replication into advanced recovery strategies, CDC pipelines, and binlog forensics. This post distills those same battle-tested concepts.





