Titles in this page

Wednesday, April 13, 2011

Slides: Automated, Non-Stop MySQL operations and failover

 On Tuesday I presented "Automated, Non-Stop MySQL Operations and Failover" at the MySQL Conference and Expo 2011, and published the slides at SlideShare. I thought this talk was very complicated and it was not easy to understand in 45-minute session. Now slides are online so I assume attendees will be easier to understand what steps are needed for master failover and slave promotion.
 As I mentioned during the talk, I'm planning to release the tool (monitoring master failure, promoting slave automatically or manually, and optionally switching alive master manually) as an open source software soon. The tool does all the steps what I covered at the talk so actually you don't need to do the steps manually. If you find any missing steps that need to be covered, I'd appreciate if you point out.
 I'm also writing English documentation (user's guide) so please stay tuned if you're interested.

Slides: Linux and H/W optimizations for MySQL

On Monday I presented 3-hour tutorial "Linux and H/W optimizations for MySQL" at the MySQL Conference and Expo 2011, and published the slides at SlideShare.

Wednesday, April 6, 2011

Tracking long running transactions in MySQL

 Sometimes I want to trace transactions that have taken long time to execute. If a transaction is holding row locks for 30 seconds, all other clients requesting locks for the same rows will be blocked for up to 30 seconds (or will get Lock Wait Timeout errors if you set innodb_lock_wait_timeout lower). This is serious.

 The important step to solve such issues is identifying what kinds of queries are executed by problematic transactions. But tracking long running transactions is not as easy as tracking just slow queries. Suppose you execute the following transaction.
[client1]$ mysql --host=mysql_server db1
mysql> BEGIN;
mysql> SELECT * FROM t1 WHERE id=2 FOR UPDATE;
... (do nothing for 10 seconds)
mysql> UPDATE t1 SET value=20 WHERE id=2;
mysql> COMMIT;
 This transaction takes more than 10 seconds to complete, but each query finishes very quickly. So nothing is written to the slow query log. Analyzing slow query logs is not helpful in this case.

 If you enable general query log, all queries including above are written. But this hurts performance (writing to general query log is serialized inside MySQL), and logging all queries makes log file size huge. Analyzing too big log files is not so fun. In addition, neither query execution time nor error code is written to the general query log. So just analyzing general query log is not enough to verify whether the above SELECT .. FOR UPDATE took 10 seconds or finished quickly.

 How about analyzing statement based binary logs? Unfortunately this is not always helpful. First, uncommitted transactions are not written to the binary log. There are many cases that transactions take long time and finally do rollback. Second, SELECT statements (including SELECT ... FOR UPDATE/LOCK IN SHARE MODE) are not written to the binary log. When you use locking reads, lots of transactions will be likely to be blocked. You probably want to identify which SELECT statement has caused the problem.

 Using SHOW ENGINE INNODB STATUS? SHOW ENGINE INNODB STATUS prints active transactions like below.
---TRANSACTION 1B43B50, ACTIVE 24 sec, process no 7388, OS thread id 1235609920 starting index read
mysql tables in use 1, locked 1
LOCK WAIT 2 lock struct(s), heap size 376, 1 row lock(s)
MySQL thread id 23, query id 140396660 client2.example.com 192.168.0.2 root Updating
update t1 set value=100 where id=2
------- TRX HAS BEEN WAITING 24 SEC FOR THIS LOCK TO BE GRANTED:
RECORD LOCKS space id 0 page no 213041 n bits 80 index `PRIMARY` of table `test`.`t1`
trx id 1B43B50 lock_mode X locks rec but not gap waiting
Record lock, heap no 3 PHYSICAL RECORD: n_fields 4; compact format; info bits 0
0: len 4; hex 80000002; asc ;;
1: len 6; hex 000001b43b4e; asc ;N;;
2: len 7; hex 810000012d011c; asc - ;;
3: len 4; hex 80000002; asc ;;
------------------

---TRANSACTION 1B43B4F, ACTIVE 34 sec, process no 7388, OS thread id 1096673600
2 lock struct(s), heap size 376, 1 row lock(s)
MySQL thread id 22, query id 140396658 client1.example.com 192.168.0.1 root
 You can identify that a client from 192.168.0.2 was executing "update t1 set value=100 where id=2" and was waiting for 24 seconds to get some locks. But there is no information about what queries held the lock. A client from 192.168.0.1 (1B43B4F) was running a transaction for 34 seconds, so this was probably the one holding the lock. But there is no information about why it was running for 34 seconds. In this case I want to get all queries that the 1B43B4F executed so far.

 If you can modify application programs, tracing long running transactions is not so difficult. Adding below logics will help.

1. Checking transaction start time (t1)
2. Pushing queries (and current timestamp) by the transaction into some in-memory queue
3. Checking transaction end time (t2)
4. If (t2 - t1) exceeds N seconds, printing all queries that the transaction executed
5. Deleting the in-memory queue

 The problem is that this approach is not feasible in many cases. Because long running transactions may be executed from every client, you have to implement tracing logic on all application servers. Modifying core database access codes and re-deplying on all application servers is not fun. If you use O/R mappers, things become more difficult. Modifying database driver programs on all application servers? I don't want to do that.


 Based on the above reasons, I thought that the most practical approach is tracing slow transactions on MySQL servers, without modifying any existing program (including client programs and MySQL servers). Hopefully there is a way to do that: capturing MySQL packets and tracking transactions and queries. The above 1-5 tracing algorithm should work.
I implemented the tool "MySlowTranCapture" and published at GitHub. Here is an example output.
[mysql-server]# myslowtrancapture -i eth0
Monitoring eth0 interface..
Listening port 3306..
Logging transactions that take more than 4000 milliseconds..

From 192.168.0.1:24441
2011/02/23 09:12:17.258307 ->
begin
2011/02/23 09:12:17.258354 <-
GOT_OK
2011/02/23 09:12:17.264797 ->
select * from diary where diary_id=100 for update
2011/02/23 09:12:17.265087 <-
GOT_RES
2011/02/23 09:12:17.277622 ->
select 1
2011/02/23 09:12:17.277713 <-
GOT_RES
2011/02/23 09:13:01.232620 ->
update diary set diary_date=now() where diary_id=100
2011/02/23 09:13:01.232960 <-
GOT_OK
2011/02/23 09:13:17.360993 ->
commit

From 192.168.0.2:24442
2011/02/23 09:12:20.969288 ->
begin
2011/02/23 09:12:20.969483 <-
GOT_OK
2011/02/23 09:12:20.977699 ->
update diary set diary_date=now() where diary_id=100
2011/02/23 09:13:11.300935 <-
GOT_ERR:Lock wait timeout exceeded; try restarting transaction
2011/02/23 09:13:13.136967 ->
rollback
--------------------
 It is easy to identify that the first transaction caused problems. The first transaction held an exclusive lock for diary_id=100 at 09:12:17.264797, and didn't release until 09:13:17.360993. All queries by the transaction completed within a millisecond, so the first transaction was probably sleeping, or took long time to do other logics (i.e. accessing to remote servers and taking long time there) before closing the transaction.

 MySlowTranCapture uses libpcap to capture network packets, like tcpdump. Since libpcap loses packets sometimes, it is not guaranteed to capture 100% transactions.
 MySlowTranCapture approximately uses 10-30% CPU resources of mysqld process. This is not bad for short-term analyzing purposes (when I tested tcpdump, it used 140% CPU resources of mysqld, and most of CPU time seemed to be spent for writing network packets to local files).
 I sometimes use this tool for debugging, too. Sometimes developers execute unnecessarily many SQL statements within single transaction. MySlowTranCapture is useful to trace such transactions on development servers, by setting -t lower (i.e. 50ms).

Monday, April 4, 2011

Speaking at MySQL Conference and Expo 2011

Next week I'll stay at Hyatt Santa Clara to attend the MySQL Conference and Expo 2011. This year I'll present one tutorial and two sessions.

- Linux and H/W optimizations for MySQL (Apr 11, 9:00-12:30)

- Automated, Non-Stop MySQL Operations and Failover (Apr 12, 15:05-15:50)

- Using MySQL As NoSQL - Introduction to HandlerSocket Plugin (Apr 13, 17:15-18:00) (Co-present with Kazuho Oku. He is a former employee at Cybozu Labs, and recently joined DeNA. He is an author of Q4M and MyCached(origin of HandlerSocket))

These talks will be based on my experiences as a former MySQL consultant and our experiences from running real-world social gaming platforms (DeNA is not well-known in US yet, but our service has 2-3 billion page views per day, so it's not small). Speaking 4.5 hours is not easy for a non-native English speaker, but I'll do the best to make the sessions will be interesting and beneficial.
Due to the recent disasters and ongoing rolling blackouts in Japan, most of my colleagues had to cancel the flight, and I also have to go back to Tokyo just after the UC (I booked the flight leaving at Apr 14 6:55pm),so I can't attend to some interesting events scheduled in Apr 15 or later. But it's confirmed that I can stay from Apr 10 to 14 so I look forward to seeing ex-MySQL colleagues and friends!