Vlad Fedorkov

Performance consulting for MySQL and Sphinx

One, two, three, MySQLers… OOW 2019, Percona Live, ProxySQL Tech Day!

As people get back from vacation and application traffic starts hitting the database backend hard again, it is a good time to think about performance and prepare for the winter holiday spike. Whether you are on-premises or in the cloud, the questions are the same:

  • Are we satisfied with the current MySQL performance?
  • How much traffic are we currently serving?
  • Is there enough capacity to stay stable as traffic increases?
  • What would be the best way to stay cost-effective as we grow?

Unfortunately, the answers are not so obvious and require a thorough performance analysis. However, there is a chance that your problem has already been solved by someone else. I do not mean StackOverflow (which is a beautiful place, by the way), but the higher-level questions people usually discuss at conferences, and we have several of them ongoing or just a couple of weeks away.

Right at this moment René Cannaò and Peter Boros are preparing to deliver their talk MySQL with ProxySQL at Zuora at the Oracle Open World conference, 09:00 AM @ Room 301, Moscone South.

At the same time next door in Room 312, Oracle's Frederic Descamps will be talking about MySQL Shell for DBAs.

If you are a developer, then you probably won't miss Seventeen Things Developers Need to Know About MySQL by Peter Zaitsev, 11:15 AM @ Room 3.

But that’s California, and what if you’re in Europe? Please hold on, we’re crossing the Atlantic!

Let’s move to Amsterdam where you might want to attend Percona Live Europe, held from September 30th to October 2nd. As you land, you will not even need a taxi to get there, as the venue is right next to the airport! Besides useful tutorials on the first day, there will be two full days filled with Database Wisdom. Perconians and their guests will cover virtually every aspect of database performance and operations.

Here is a list of my favorite talks from day one @ Percona Live:

  • MyRocks and RocksDB Advanced Features and Performance by Yoshinori Matsunobu (2:30 PM),
  • ClickHouse Features to Blow Your Mind by Aleksei Milovidov (6:00 PM),
  • Tracing and Profiling MySQL by Valerii Kravchuk (also at 6:00 PM).

The next day is also full of great talks; here are just a few examples:

  • Sveta Smirnova helps you save lives with Histograms (9:00 AM),
  • Marco Tusa will take a deep dive into Using ProxySQL as Firewall (10:00 AM),
  • Morgan Tocker will share his First 90 Days with Vitess (11:00 AM),
  • And if you are looking for real hardcore stuff, please visit gdb basics for MySQL DBAs by Valerii Kravchuk (3:30 PM).

That’s a really intense line-up for three days, but that’s not the end: here comes the ProxySQL Tech Day!

Right after Percona Live, on Thursday, October 3rd, we are going to meet in person at a free event in Ghent, Belgium. We’ll be talking specifically about large-scale MySQL environments and how to serve a really huge volume of transactions with MySQL. Experts from 5 different companies will gather at Co.Station Gent and share their knowledge. Tricky questions, mind-breaking cases, crazy requirements and challenges of all kinds are super welcome! Please use this form and rest assured that we’ll tackle your issue.

We’re starting at 5:00 PM, so you will have some time for a walk across the nice City of Ghent!

P.S. The views expressed here are my own and do not necessarily reflect the opinion of my employer.

The ProxySQL team is coming to Ghent for the Technology Day

ProxySQL Technology Day in Ghent, Belgium

As a part of the ProxySQL team, I am happy to announce my trip to Ghent for the ProxySQL Technology Day. With great pleasure, I will join the experts from Oracle, Percona, and Pythian to talk about the various things that ProxySQL can do for a user. We are going to meet at 5 PM at Ghent’s Co.Station co-working space on Thursday, October 3rd to walk through the key features that ProxySQL adds to the MySQL ecosystem. Join me and others for five talks and a round table about building MySQL infrastructure to serve hundreds of thousands of queries per second.

What am I going to talk about?

In my 30-minute talk “Introduction to High Performance for MySQL” at 5 PM, I’m going to walk through the needs of today’s high-performance setups. These days you can’t just be a DBA looking after your MySQL server. You have to work closely with development teams to prepare new code for production-sized datasets, deal with potential load spikes and be able to fix certain things on the fly without even changing the code. In these cases ProxySQL acts as a bridge between the application and the database backend, adding the required flexibility.

In my second talk, “ProxySQL: Traffic Management and Performance Troubleshooting” at 6:10 PM, I’m going to dive into the details of MySQL traffic management and the metrics that ProxySQL provides to identify bottlenecks and time-consuming queries. I’ll show you some real-life use cases like read/write split, query-based and user-based routing, and some advanced techniques for highly loaded environments.
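To make the read/write split part concrete, here is a minimal sketch of query rules defined through the ProxySQL admin interface. The hostgroup numbers 10 and 20 and the default admin credentials are assumptions for the example, not a fixed convention:

# Define routing rules via the ProxySQL admin interface (hostgroups 10/20 are assumed)
mysql -u admin -padmin -h 127.0.0.1 -P 6032 <<'SQL'
-- locking SELECTs stay on the writer hostgroup (10), plain SELECTs go to the readers (20)
INSERT INTO mysql_query_rules (rule_id, active, match_digest, destination_hostgroup, apply)
VALUES (100, 1, '^SELECT.*FOR UPDATE', 10, 1),
       (200, 1, '^SELECT', 20, 1);
LOAD MYSQL QUERY RULES TO RUNTIME;
SAVE MYSQL QUERY RULES TO DISK;
SQL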

What else are we going to discuss?

- Fault-tolerant infrastructure with ProxySQL: MySQL InnoDB Cluster, PXC, Orchestrator, etc
- Load balancing and advanced MySQL traffic management
- Application troubleshooting: real-time queries and connection pool statistics
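
For the troubleshooting part, ProxySQL exposes its runtime statistics as plain tables in the same admin interface, so a couple of queries are usually enough to spot a saturated backend or a heavy query pattern (the admin credentials below are the defaults and, again, an assumption for the sketch):

# Connection pool health per backend, then the ten most time-consuming query digests
mysql -u admin -padmin -h 127.0.0.1 -P 6032 -e "
SELECT hostgroup, srv_host, status, ConnUsed, ConnFree, Queries
FROM stats_mysql_connection_pool ORDER BY hostgroup;
SELECT digest_text, count_star, sum_time
FROM stats_mysql_query_digest ORDER BY sum_time DESC LIMIT 10;"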

Hey! Just in case you forgot, ProxySQL and its creator René Cannaò have won the MySQL Community Awards three times in a row, so you know it will be worth spending the evening ;)

What is stuck in the MySQL server?

There are a few easy tricks to see what is stuck inside a MySQL instance. None of these techniques will give you the whole picture, but they might help you find the queries that hurt performance. Let’s start with what exactly your MySQL server is doing right now.

Which queries are running now?

This will give you an idea of what’s running right now, so you can find long-running queries which are slowing down MySQL and/or causing replication lag:

mysql -e "SHOW PROCESSLIST" | grep -v -i "sleep"

It is more convenient than just running “SHOW PROCESSLIST” as it hides all connected threads in the “Sleep” state, so you’ll get a clean output. You can also get the same output refreshed every second:

watch -n1 'mysql -e "SHOW FULL PROCESSLIST" | grep -v -i "Sleep"'

What to look for? This is a complex output, but you can start with the Time and State columns. When you see a query running for more than one second, it’s time for a query review. If you are sure it’s OK for a query to be slow (like queries in complex reports), you can move it away from the critical server to a dedicated “reporting” replica.

When you see states like “Copying to tmp table” or any kind of “Waiting …” such as “Waiting for query cache lock”, even if the query itself is fast, your server performance is less than stellar and it’s time to dig in.
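If you prefer to filter and sort directly on the server, the same picture is available from information_schema; here is a minimal sketch (the one-second threshold is just an arbitrary starting point):

# Long-running, non-sleeping threads, longest first
mysql -e "SELECT id, user, db, time, state, LEFT(info, 80) AS query
          FROM information_schema.PROCESSLIST
          WHERE command <> 'Sleep' AND time > 1
          ORDER BY time DESC"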

How to collect bad queries

How do you easily collect bad queries? A non-intrusive way is to listen to MySQL network traffic and dump it with tcpdump. This is extremely useful when you need to get an idea about the most time-consuming queries without touching the MySQL configuration or query logging at all. It also requires Percona Toolkit to be installed, but it’s worth it. Here is a quick example: let tcpdump run for some time and then run pt-query-digest to aggregate the tcpdump output into a readable report.

tcpdump -s 65535 -X -nn -q -tttt -i any port 3306 > mysql.tcp.txt
pt-query-digest --type tcpdump ./mysql.tcp.txt > digest.log

This will produce a report with the most resource-consuming queries at the top. Optimizing them one by one will improve the speed of your MySQL instances and the application response time, and so give you a chance to handle more traffic with less hardware.

Which engines are in use?

Where else should you look? Performance optimization techniques are quite complex and depend on the engine you use to store data. InnoDB, MyISAM, TokuDB – they all require a different approach to query tuning and troubleshooting. So the next step is to see how your data is distributed across the engines. The following query will help:

SELECT
    engine,
    count(*) TABLES,
    concat(round(sum(table_rows)/1000000,2),'M') `Rows`,
    concat(round(sum(data_length+index_length)/(1024*1024*1024),2),'G') Size
FROM
    information_schema.TABLES
WHERE
    table_schema NOT IN
      ('mysql', 'information_schema', 'performance_schema')
GROUP BY engine
ORDER BY sum(data_length+index_length) DESC;

Find all tables across all databases using a specific engine

Now you would probably like to see all tables that use, say, the MyISAM engine to store data. Here you go – an example for MyISAM with the biggest tables first.

SELECT
    concat(table_schema,'.',table_name),
    engine,
    concat(round(table_rows/1000000,2),'M') `Rows`,
    concat(round((data_length+index_length)/(1024*1024*1024),2),'G') Size
FROM
    information_schema.TABLES
WHERE
    engine = 'MyISAM' AND
    table_schema NOT IN
        ('mysql', 'information_schema', 'performance_schema')
ORDER BY
    (data_length+index_length) DESC;

Keeping your queries in good shape could save you money on hardware and make your application fast, but it also lets servers do less work, which means less electricity consumed and less CO2 and heat released into the atmosphere. So it is a good thing to do whether you believe in global warming or not :)

Full table scans and MySQL performance

High season is coming, so how do you make sure that MySQL will handle the increased load? Stress tests could help with that, but it’s not a good idea to run them in a production environment. In this case Select_scan, Select_full_join and other MySQL counters can quickly give you an idea of how many queries are not performing well and could cause performance degradation as the load goes up.

Select_scan from SHOW GLOBAL STATUS indicates how many full table scans have been done since the last MySQL restart. Scanning an entire table is a resource-intensive operation. It also forces MySQL to store unnecessary data in the buffer pool, wasting memory and IO resources.
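A minimal way to watch these counters is to sample them twice a few minutes apart and look at the delta; the exact set of counters below is just a reasonable starting point, not an exhaustive list:

# Full-table-scan related counters; take two snapshots and compare the growth
mysql -e "SHOW GLOBAL STATUS WHERE Variable_name IN
          ('Select_scan', 'Select_full_join', 'Select_range_check', 'Handler_read_rnd_next')"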

Tips and tricks while working with Production DBs

From time to time we have to work with live environments and production databases. For some of us this is a day-to-day job, and most of the time the cost of a mistake is way higher than the expected improvement, especially on the database side, because an issue there will affect everything else.

I have heard enough war stories about ruined production systems and can imagine well enough the speed of a DROP DATABASE command replicating across the cluster. So I’m scared to make changes in production. The more loss expected if things go wrong, the more scared I am when planning every change. But I still love making improvements, so the only question is how to make them safer.

This post is not intended to be a guide or a set of best practices on how to avoid issues altogether; it’s more an invitation to a discussion that started between me and @randomsurfer on Twitter about how to avoid production failures. It was hard for me to fit into 150 characters, so I’m switching to a more comfortable environment.

How to avoid two backups running at the same time

When your backup script runs for too long, it sometimes causes a second backup script to start while the previous backup is still running. This increases pressure on the database, makes the server slower, can start a chain of backup processes, and in some cases may break backup integrity.

The simplest solution is to avoid this undesired situation by adding locking to your backup script, preventing it from starting a second time when it’s already running.

Here is a working sample. You will need to replace the “sleep 10” line with the actual backup script call:

#!/bin/bash

LOCK_NAME="/tmp/my.lock"
if [[ -e $LOCK_NAME ]] ; then
        echo "re-entry, exiting"
        exit 1
fi

### Placing lock file
touch $LOCK_NAME
echo -n "Started..."

### Performing required work
sleep 10

### Removing lock
rm -f $LOCK_NAME

echo "Done."

It works perfectly most of the time. The problem is that you could still, in theory, start two scripts at the same moment so that both pass the lock file check and run together. To avoid that you would need to place a unique lock file just before the check and make sure no other process did the same.

Here is an improved version:

#!/bin/bash

UNIQSTR=$$
LOCK_PREFIX="/tmp/my.lock."
LOCK_NAME="$LOCK_PREFIX$UNIQSTR"

### Placing lock file
touch $LOCK_NAME
if [[ -e $LOCK_NAME && `ls -la $LOCK_PREFIX* | wc -l` == 1 ]] ; then
        echo -n "Started..."
        ### Performing required work
        sleep 10
        ### Removing lock
        rm -f $LOCK_NAME
        echo "Done."
else
        ### another process is running, removing our lock
        echo "re-entry, exiting"
        rm -f $LOCK_NAME
        exit 1
fi

Now even if you manage to run two scripts at the same time, only one of them can actually start the backup. In very rare situations both scripts will refuse to start (because two lock files exist at the same time), but you can catch this by simply monitoring the script exit code. Anyway, as soon as you receive a backup exit code different from zero, it’s time to review your backup setup and make sure it works as desired.
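If you can rely on flock(1) from util-linux, a hedged alternative sidesteps the race entirely: the kernel grants the lock atomically, so two copies can never both proceed, and a crashed run does not leave a stale lock behind. A minimal sketch:

#!/bin/bash

LOCK_NAME="/tmp/my.lock"

(
        ### Try to take an exclusive lock on fd 9; give up immediately if it is already held
        flock -n 9 || { echo "re-entry, exiting"; exit 1; }

        echo -n "Started..."
        ### Performing required work
        sleep 10
        echo "Done."
) 9>"$LOCK_NAME"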

Please note: if you terminate the lock-file based script manually, you will also need to remove the lock file so the script passes the check on the next startup. You can also use this wrapper for any periodic tasks you have, like Sphinx indexing, merging, or index consistency checking.

For your convenience the wrapper script is available for download directly or using wget:

wget http://astellar.com/downloads/backup-wrapper.sh

You could also find more about MySQL backup solutions here.

Keep your data safe and have a nice day!

Difference between myisam_sort_buffer_size and sort_buffer_size

At first glance, MySQL has two confusingly similar variables: myisam_sort_buffer_size and sort_buffer_size. The thing is that these two similarly named variables have absolutely different meanings.

sort_buffer_size is a per-connection variable and does not belong to any specific storage engine. It doesn’t matter whether you use MyISAM or InnoDB – MySQL will allocate sort_buffer_size for every sort (needed most of the time for ORDER BY and GROUP BY queries), so increasing its value might help speed up those queries. However, I would not recommend changing it from the default unless you are absolutely sure about all the drawbacks. The value for an out-of-the-box MySQL 5.1.41 installation on Ubuntu is 2MB, and it’s recommended to keep it that way.

On the other hand, myisam_sort_buffer_size is used by MyISAM to perform index sorting during relatively rare table-wide operations like ALTER TABLE or REPAIR TABLE. The stock value is 8MB, so if you use MyISAM tables intensively (please refer to my other post to see how to check your table engines) I would recommend setting it to a higher value, close to or even larger than key_buffer_size, but still small enough to keep it in memory and prevent MySQL from swapping.
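A quick way to compare the two settings on your server, and to raise the MyISAM one only for a single heavy operation, might look like this sketch (the table name below is just a placeholder):

# Compare the current values of the two variables and the MyISAM key cache
mysql -e "SHOW GLOBAL VARIABLES WHERE Variable_name IN
          ('sort_buffer_size', 'myisam_sort_buffer_size', 'key_buffer_size')"

# Raise myisam_sort_buffer_size for this session only, then rebuild a (placeholder) table
mysql -e "SET SESSION myisam_sort_buffer_size = 256*1024*1024;
          ALTER TABLE mydb.mytable ENGINE=MyISAM;"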

Why is stock MySQL slow?

“I’ve installed MySQL and it doesn’t work fast enough for me.” The MySQL server is the heart of a database-driven application (if it uses MySQL as its database, of course!), and any slowness in running queries affects all application layers.

MySQL server tuning and hunting for slow queries are always a step-by-step process, and without knowing all the data (SHOW GLOBAL VARIABLES, SHOW GLOBAL STATUS, SHOW TABLE STATUS LIKE ‘tablename’ and EXPLAIN details for the slow query are just some of the required information) it would generally be a blind guess. But there are still a few things specific to a newly installed MySQL server.

If you are using stock MySQL you might need to check the size of the memory pool MySQL uses to cache index data, to avoid slow IO requests and speed up queries. Connect to MySQL and fire two queries:
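A sketch of that kind of check, assuming the MyISAM key cache (key_buffer_size) is the pool in question, could be:

# How big is the MyISAM key cache, and how is it actually being used?
mysql -e "SHOW GLOBAL VARIABLES LIKE 'key_buffer_size'"
mysql -e "SHOW GLOBAL STATUS LIKE 'Key_%'"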