Vlad Fedorkov

Performance consulting for MySQL and Sphinx

Full table scans and MySQL performance

High season is coming, how do you make sure that MySQL will handle the increased load? Stress tests could help with that, but it’s not a good idea to run them in a production environment. In this case Select_scan, Select_full_join and other MySQL counters could quickly give you an idea of how many queries are not performing well and could cause a performance degradation as the load goes up.

Select_scan from SHOW GLOBAL STATUS indicates how many full table scans were done since last MySQL restart. Scanning the entire table is a resource intensive operation. It also forces MySQL to store unnecessary data in the buffer pool, wasting memory and IO resources.

Full scan of a tiny table would be quite fast so missing indexes could stay invisible until the load rises or the dataset grows up. This could also be the case for developers who work with too small data sets on their dev boxes. To prevent performance issues all newly added features need to be tested against production-size tables, so that the required indexes could be created on production servers before the code release.

Full table scans can be caused by several reasons. First, a missing index forces MySQL to walk though the entire table and filter rows one-by-one to match the condition in WHERE clause or sort them instead of using index. For example:

SELECT * FROM staff WHERE first_name = 'John';
SELECT * FROM staff ORDER BY first_name LIMIT 10;

Lack of the index on first_name column would force MySQL to read the entire table even when only a few rows need to be sent back to the client.

Second, sometimes MySQL is unable to utilize existing indexes and has to perform full table scan. Below are a few common examples.
Using function or math as a constraint:

SELECT * FROM table WHERE myfunc(a) = 5
SELECT * FROM sessions WHERE session_updated + INTERVAL 30 MINUTE < NOW();

Not equal and NOT IN constraints:

SELECT * FROM table WHERE a <> 1
SELECT * FROM table WHERE a NOT IN (1,2,3)

Bitwise operations on numeric columns:

SELECT * FROM table WHERE ( status & 4 ) = 0

Postfix LIKE clause:

SELECT * FROM table WHERE c LIKE ‘%omething’

All these kinds of queries need to be rewritten where possible, so that MySQL could use index lookups. Another performance trick is to move such queries out of MySQL, but that's another story.

Third, some queries would cause a full table scan at all times. My favorite example is ORDER BY RAND(). MySQL has to calculate a random value for every single row in the dataset and then perform sorting. This is a guaranteed performance killer.

Fourth, another special case is InnoDB doing a full table scan for SELECT COUNT(*) FROM table. This is one of the rare cases where MyISAM is faster because it keeps number or rows in the table metadata. InnoDB can't do that because of it's transactional nature, so it has to perform a full table scan to calculate the rows count on the fly. Please note that some admin statements like SHOW STATUS can increase Select_scan as well.

Getting back to counters, another one called 'Select_full_join' shows an even worse case, when MySQL has to perform a full table scan against a joined table, which is even slower. Good news is, it is so slow that people usually notice these cases right away.

How to get these numbers from MySQL? There are several ways:

1. Run SHOW GLOBAL STATUS LIKE 'Select_scan'; in any MySQL client and you'll get a raw value since MySQL start.

2. mysqladmin -r -i 10 extended-status | grep "Select_scan" will show the same as above and then continue to display live increments every 10 seconds.

3. Download poor man's health check. This tool will show you both historical and current per second averages for Select_scan, Select_full_join, and other useful statistics, as well as most resource consuming queries from the slow query log. Please see the report sample for details.

4. When MySQL server gets occasionally overloaded you can employ pt-stalk utility from Percona Toolkit to gather key metrics exactly when the issue appears.

5. Take a look on Select_scan graph in Cacti/Graphite/etc. Graph will show you how it changes over the time which is extremely useful in investigations, especially if you compare it with 'Com_select' statistics.

Your email address will not be published. Required fields are marked *