Today's article is from ushuz (GitHub), who joined Xiachufang in 2014 and is currently responsible for the Xiachufang platform and infrastructure.
Xiachufang is one of the first customers of JuiceFS. Initially, they utilized JuiceFS for NGINX logs and MySQL backup, and now using JuiceFS as the primary storage for Hadoop / Hive / HBase, their usage scenarios are quite extensive. Today's article is highly practical. When we first heard that they used the JuiceFS Snapshot feature to do MySQL backup verification, we also agreed this idea is genius!
Here is the article:
Xiachufang is the largest community dedicated to family delicious in China. It focuses on recipes and craft sharing. Their business involves e-commerce, paid content, short video, etc. Currently, they have more than 20 million registered users, and the entire platform DAU is close to 3 million. With more than 1 million recipes and nearly 40 million crafts, the number of likes and favorites is close to 1 billion.
The Xiachufang has founded for seven years. The company and its business have undergone many changes and growth. Up to now, we have maintained a streamlined team to support business growth with the most pragmatic solutions. Let's take this opportunity to share with you about how small teams can efficiently build database redundancy, backup, and backup verification mechanisms.
Database Architecture Overview
Xiachufang is using the most commonly MySQL replication architecture.
Before migrating to the cloud service, we use 2 dedicated machines as the database servers. Each dedicated machine runs only one MySQL instance, and the database is split based on services. The two instances are in Active-Standby mode. After migrating to the cloud service, the databases are grouped by load and business availability requirements. Each group includes at least 2 MySQL instances (each cloud host runs only 1 instance). They are placed in Active-Standby mode and are scattered in different availability zones to reduce the possibility of data loss due to fatal accidents such as fires and lightning strikes. For a group with a larger load, a number of read-only instances are supplied, shred the load by partial read and write.
As the business grows, our database scale is also increasing. There are currently about 10 active/standby pairs, with a total data size of nearly 2TiB, a maximum table of nearly 1 billion rows, and a peak database total QPS of nearly 150,000.
The main ways to use the database are:
- Percona Server 5.6 / 5.7 + InnoDB
- Row-based Replication (RBR) + GTID
- Online schema changes using pt-online-schema-change in the Percona Toolkit
- Monitor with Percona Monitoring and Management(PMM)
Redundancy and Backup
Incomplete redundancy terminate the days, deficient backup sinister than demon! – Xin Li (@delphij) 6 Jan 2010
Xiachufang has a deep understanding of this. After accidentally deleting database files in June 2013, due to redundancy and backup interruptions for up to 2 months, it took immense time to recover data from the hard disk.
Redundancy and backup are the top priorities of database management: on the one hand, there are unpredictable situations, from unawareness to natural disasters, which may lead to loss of database data; on the other hand, data is the most crucial asset of an internet company. While the green hills last, there'll be wood to burn. However, if the green hills are gone, we may have to burn ourselves.
In terms of redundancy, the primary and backup architectures across the availability zones provide the first layer of protection. Even if the primary node database file is accidentally deleted, the host is lost, or even the available area is down, the standby node retains almost all data. Percona XtraBackup supports hot backup, and it can create a slave in a few simple steps without suspending the master node, making it easy to set up the master and slave nodes.
# on master, take a backup to /backup/mysql root@master $ juicefs mount <volume> /backup root@master $ innobackupex --no-timestamp --slave-info /backup/mysql/ # on slave, copy back the backup # before copying, shutdown mysql on slave root@slave $ juicefs mount <volume> /backup root@slave $ rsync -avP /backup/mysql/ /path/to/mysql/datadir # prepare the backup # it’s faster to apply logs on SSD than on a network filesystem (in our case JuiceFS) root@slave $ innobackupex --apply-log --use-memory=16G /path/to/mysql/datadir # make sure datadir ownership # then start mysql on slave, setup replication based on /backup/mysql/xtrabackup_binlog_info # if the backup was taken from a slave, use /backup/mysql/xtrabackup_slave_info root@slave $ chown -R mysql:mysql /path/to/mysql/datadir
In terms of backup, the daily backup of the entire library and the binlog provide a second layer of protection. When the first layer of protection fails, such as unaware DROP, the data can still be recovered. However, the entire database backup size is enormous, and the amount of data generated by daily scheduled backups is considerable large. JuiceFS relies on object storage to provide unlimited storage space, making it ideal for backup scenarios, so you can keep backups on demand for long periods without worrying about space usage. Our current strategy is to keep a Percona XtraBackup full database backup in 7 days, a binlog in 3 years, and a weekly mysqldump in 1 year.
Compared to self-built NFS, JuiceFS is a highly available service of the same level as object storage. It also enables data redundancy across the cloud for easy cross-regional or even cross-cloud data transport. When accessing JuiceFS across regions or public cloud, the data is transfer through the public network outbound of the object store. The bandwidth is quite sufficient. We used to copy data from UCloud UFile-based JuiceFS in AWS Beijing, and the speed can reach 800+Mbps.
Finally, I got why my company set up a size check alert for database backup – 荡师傅 (@inntran) 13 Aug 2016
Regular backups don't mean peace of mind, because backups can also be problematic, so verifying backups is essential. Checking the size of the backup is the easiest way, but for a database backup, the only way to verify whether the backup is reliable is to perform a recovery. A sign of a successful recovery is that the MySQL process is started and replication is normal.
In the normal recovery process, copy the backup files is required. For a large backup, the copy costs a long time, and the target machine used to run the test also need enough storage space. Therefore the time cost of the backup test is pretty high.
Fortunately, JuiceFS provides a snapshot function, which can quickly create a snapshot for a path on JuiceFS. The changes made to the snapshot will not affect the files in the original path. With the snapshot feature of JuiceFS, we can save a lot of time.
Xiachufang built an easy-to-use backup test solution based on Docker and JuiceFS snapshot: mainly including
verify-backup.sh running on the cloud host, performing snapshot creation, launching the container, and cleaning up the snapshot.
#!/usr/bin/env bash juicefs snapshot /backup/mysql /backup/snapshot # run a percona:5.7 container to verify the backup # make sure container can reach master with --add-host or --network docker run --rm \ -v /path/to/my-cnf:/root/.my.cnf:ro \ -v /path/to/report.sh:/report.sh:ro \ -v /backup/snapshot:/var/lib/mysql \ percona:5.7 \ /report.sh juicefs snapshot -d /backup/snapshot
As well as
report.sh which running in the container, prepare backup, configure replication, and report replication status.
#!/usr/bin/env bash set -e # switch to domestic mirrors sed -i 's|deb.debian.org/debian|ftp.cn.debian.org/debian|g' /etc/apt/sources.list sed -i 's|security.debian.org|ftp.cn.debian.org/debian-security|g' /etc/apt/sources.list # install percona-xtrabackup-24 apt-get update && apt-get install -y percona-xtrabackup-24 # prepare backup innobackupex --apply-log --use-memory=16G /var/lib/mysql > /dev/null chown -R mysql:mysql /var/lib/mysql # start mysql mysqld \ --default-storage-engine=InnoDB \ --character-set-server=utf8mb4 \ --collation-server=utf8mb4_unicode_ci \ --server-id=1234 \ --gtid-mode=on \ --enforce-gtid-consistency=on \ --read-only=on \ --transaction-isolation=READ-COMMITTED \ --binlog-format=ROW \ --master-info-repository=TABLE \ --relay-log-info-repository=TABLE \ --log-bin=/var/log/mysql/mysql-bin \ --relay-log=mysqld-relay-bin \ --log-error=/var/log/mysql/error.log \ --innodb-lru-scan-depth=256 \ --disable-partition-engine-check & # check mysql ready, timeout 300s # it seems that commands inside loop aren't affected by "set -e" n=0 until [ $n -ge 100 ]; do mysqladmin ping && break n=$[$n+1] sleep 3 done # setup gtid, connect to mysql server using /root/.my.cnf to avoid password input mysql < <(printf "RESET SLAVE;\nRESET MASTER;\n$(cat /var/lib/mysql/xtrabackup_slave_info);\nSTART SLAVE;\n") # check replication status # Slave_IO_Running / Slave_SQL_Running should be Yes # Seconds_Behind_Master should be reasonable sleep 10 mysql -e 'SHOW SLAVE STATUS\G' | grep -E '(Running:|Seconds)' # report and exit exit $?
After the backup is complete,
verify-backup.sh is triggered by a scheduled task or other methods. If the backup verification is successful, the script should print out the status of the MySQL replication thread and the time behind the master. The replication threads should all be in the Running state, and the time behind the master should be within a reasonable range.
Slave_IO_Running: Yes Slave_SQL_Running: Yes Seconds_Behind_Master: 3625
Besides, given the particularity and importance of backup and backup verification tasks, additional attention to the notification policy is required. If the notification is only sent when the error occurs, then when the notification is not received, the task may run successfully, or may not run at all. For backup and backup verification tasks, the task fails if it is not running.
Xiachufang engineer team is relatively small with limited technical resources. However, with Percona products and JuiceFS, they quickly implemented database redundancy and backup, as well as an effective and easy-to-use backup verification solution that allowed them to focus on market development with greater confidence.
This article is from Xiachufang Engineering Blog. If you are a MySQL DBA and interested in discussing the infrastructure with us, you can open the chat window in the lower right corner of the website, or shoot an email to firstname.lastname@example.org.