Vlad Fedorkov

Performance consulting for MySQL and Sphinx

One, two, three, MySQLers… OOW 2019, Percona Live, ProxySQL Tech Day!

As people are getting back from vacations and application traffic is going back and hitting database backend hard, it is a good time to think about performance and prepare for the winter holiday spike. Whenever you are on-premise or in the cloud, questions are the same:

  • Are we satisfied with the current MySQL performance?
  • How much traffic are we currently serving?
  • Is there enough capacity to stay stable as traffic increases?
  • What would be the best way to stay cost-effective as we grow?

Unfortunately, the answers are not so obvious and require a thorough performance analysis. However, there is a chance that your problem has already been solved by someone else. I do not mean StackOverflow (which is a beautiful place, by the way), but some higher level questions which people usually discuss at conferences, and we have several of them ongoing or just a couple of weeks away.

Right this moment René Cannào and Peter Boros are preparing to deliver their talk MySQL with ProxySQL at Zuora at Oracle Open World conference, 09:00 AM @ Room 301, Moscone South.

At the same time next door in the Room 312 Oracle’ Frederic Descamps will be talking about MySQL Shell for DBAs.

If you are a developer, than you probably won’t miss Seventeen Things Developers Need to Know About MySQL by Peter Zaitsev 11:15 AM @ Room 3.

But that’s California, and what if you’re in Europe? Please hold on, we’re crossing the Atlantic!

Let’s move to Amsterdam where you might want to attend Percona Live Europe held from September 30th to October 2nd. As you land, you will not even need a taxi to get there as your venue is next to the airport! Besides useful tutorials on the first day, there will be two complete days filled with Database Wisdom. Perconians and their guests will share virtually every aspect of the database performance and operations.

Here is a list of my favorite talks from day one @ Percona Live:

  • MyRocks and RocksDB Advanced Features and Performance by Yoshinori Matsunobu (2:30 PM),
  • ClickHouse Features to Blow Your Mind by  Aleksei Milovidov (6:00 PM),
  • Tracing and Profiling MySQL by Valerii Kravchuk (same 6:00 PM).

The next day is also full of great talks, here’re just some examples:

  • Sveta Smirnova helps you to save life with Histograms (9:00 AM),
  • Marco Tusa will make a deep dive into Using ProxySQL as Firewall (10:00 AM),
  • Morgan Tocker will share his First 90 days experience with Vitess (11:00 AM).
  • And if you are looking for real hardcore stuff, please visit gdb basics for MySQL DBAs  by Valerii Kravchuk (3:30 PM).

That’s a really intense line-up for three days, but that’s not the end, here comes the ProxySQL Tech Day!

Right after Percona Live, on Thursday Oct 3rd, we are going to meet in person at a free event in Ghent, Belgium. We’ll be talking specifically about large-scale MySQL environments and how to serve a really huge amount of transactions with MySQL. Experts from 5 different companies will gather at Co.Station Gent and share their knowledge. Tricky questions, mind-breaking cases, crazy requirements and challenges of all kinds are super welcome! Please use this form and be sure that we’ll tackle your issue.

We’re starting at 5:00 PM, so you will have some time for a walk across the nice City of Ghent!

P.S. The views expressed here are my own and do not necessarily reflect the opinion of my employer.

Sphinx events in New York City this fall

For some of you who situated near New York City I am happy to announce that you could attend two events related to leading Full-Text search engines in open source – Sphinx Search.

First meeting organized by NYPHP meetup on Tuesday, September 25th at IBM, 590 Madison Avenue, New York. I’ll be speaking about search services in cloud environment and distributed search tips and tricks. Event is free, please RSVP.

One week later on October 1st, I’ll be doing tutorial about MySQL and Sphinx “Full-text based services with Sphinx and MySQL” at greatest MySQL event in East Cost, Percona Live NY. Use “FlashSale” code to get exceptional 25% discount (valid until Sept 23rd).

Looking forward to meet you in New York City!

Difference between myisam_sort_buffer_size and sort_buffer_size

MySQL has two confusingly identical by the first look variables myisam_sort_buffer_size and sort_buffer_size. Thing is that those two confusingly similar variables has absolutely different meanings.

sort_buffer_size is a per-connection variable and do not belongs to any specific storage engine. It doesn’t matter do you use MyISAM or InnoDB – MySQL will allocate sort_buffer_size for every sort (required most of the times for ORDER BY and GROUP BY queries) so increasing it’s value might help speeding up those queries however I would not recommend to change it from the default value unless you are absolutely sure about all the drawbacks. Value for out-of-the-box MySQL-5.1.41 installation on Ubuntu is 2Mb and it’s recommended to keep it that way.

On the other side myisam_sort_buffer_size used by MyISAM to perform index sorting on relatively rare table-wide modifications like ALTER/REPAIR TABLE. Stock value is 8Mb so if you are using MyISAM tables intensively (please refer to my other post to see how to know your tables type) I would recommend to set it to some higher value close or even more than key_buffer_size but still small enough to keep it in memory and prevent MySQL from swapping.

Why is stock MySQL slow?

“I’ve installed MySQL and it doesn’t work fast enough for me”. MySQL server is heart of database driven application (if it uses MySQL as database of course!) and any slowness related to running queries is affecting all application layers.

MySQL server tuning and query slowness hunting are always step by step process and without knowing all the data (SHOW GLOBAL VARIABLES, SHOW GLOBAL STATUS, SHOW TABLE STATUS LIKE ‘tablename’, EXPLAIN details for slow query is just some of the required information) it would be generally a blind guess. But there are still few things which is related to newly installed MySQL server.

If you are using stock MySQL you might need to check memory pool size which MySQL used to load index data to avoid slow IO requests and increase queries speed. Connect to MySQL and fire two queries: Continue reading

Top 100 and top 500 stopwords for Sphinx Search

Back to year 2006 when I was working for my first sphinxsearch project I was playing with stopwords files. Stopwords is basically a small set of highly frequent words you often don’t want to search for (like “I”, “Am”, “The”, etc). For most sphinx instances they only wasting index space and slower your search queries by finding all occurrences of these non-important words.

Say if you are searching for “when is jane’s birthday” you are actually looking to find documents with “jane’s birthday”, and you don’t really care about lot’s of documents (blog posts, news articles, etc) with only “when” and “is” inside. Continue reading