<?xml version="1.0" encoding="UTF-8"?>
<rss version="2.0"
	xmlns:content="http://purl.org/rss/1.0/modules/content/"
	xmlns:wfw="http://wellformedweb.org/CommentAPI/"
	xmlns:dc="http://purl.org/dc/elements/1.1/"
	xmlns:atom="http://www.w3.org/2005/Atom"
	xmlns:sy="http://purl.org/rss/1.0/modules/syndication/"
	xmlns:slash="http://purl.org/rss/1.0/modules/slash/"
	>

<channel>
	<title>Vlad Fedorkov</title>
	<atom:link href="http://astellar.com/feed/" rel="self" type="application/rss+xml" />
	<link>http://astellar.com</link>
	<description>Performance consulting for MySQL and Sphinx</description>
	<lastBuildDate>Sat, 08 Dec 2012 09:38:31 +0000</lastBuildDate>
	<language>en</language>
	<sy:updatePeriod>hourly</sy:updatePeriod>
	<sy:updateFrequency>1</sy:updateFrequency>
	<generator>http://wordpress.org/?v=3.2.1</generator>
		<item>
		<title>How to avoid two backups running at the same time</title>
		<link>http://astellar.com/2012/10/backups-running-at-the-same-time/?utm_source=rss&#038;utm_medium=rss&#038;utm_campaign=backups-running-at-the-same-time</link>
		<comments>http://astellar.com/2012/10/backups-running-at-the-same-time/#comments</comments>
		<pubDate>Tue, 16 Oct 2012 10:11:47 +0000</pubDate>
		<dc:creator>vlad</dc:creator>
				<category><![CDATA[Guide]]></category>
		<category><![CDATA[Operations]]></category>
		<category><![CDATA[How to]]></category>
		<category><![CDATA[MySQL]]></category>

		<guid isPermaLink="false">http://astellar.com/?p=306</guid>
		<description><![CDATA[When a backup script runs for too long, the next scheduled backup can start while the previous one is still running. This increases pressure on the database, slows the server down, can start a chain of overlapping backup processes and in some cases may break backup integrity. The simplest solution is to avoid this [...]]]></description>
			<content:encoded><![CDATA[<p>When a backup script runs for too long, the next scheduled backup can start while the previous one is still running. This increases pressure on the database, slows the server down, can start a chain of overlapping backup processes and in some cases may break backup integrity.</p>
<p>The simplest solution is to add locking to your backup script so that it refuses to start a second time while it&#8217;s already running.</p>
<p>Here is a working sample. Replace the &#8220;sleep 10&#8221; line with your actual backup script call:</p>
<pre>#!/bin/bash

LOCK_NAME="/tmp/my.lock"
if [[ -e $LOCK_NAME ]] ; then
        echo "re-entry, exiting"
        exit 1
fi

### Placing lock file
touch $LOCK_NAME
echo -n "Started..."

### Performing required work
sleep 10

### Removing lock
rm -f $LOCK_NAME

echo "Done."</pre>
<p>It works well most of the time. The problem is that two copies of the script could still, in theory, start at almost the same moment, so both pass the lock file check and run together. To avoid that, each copy needs to place its own unique lock file before the check and then make sure no other process has done the same.</p>
<p>Here is an improved version:</p>
<pre>#!/bin/bash

UNIQSTR=$$
LOCK_PREFIX="/tmp/my.lock."
LOCK_NAME="$LOCK_PREFIX$UNIQSTR"

### Placing lock file named after our own process ID
touch "$LOCK_NAME"

### Counting lock files: exactly one means only this copy is running
LOCK_COUNT=$(ls "$LOCK_PREFIX"* | wc -l)
if [[ -e $LOCK_NAME &amp;&amp; $LOCK_COUNT -eq 1 ]] ; then
        echo -n "Started..."
        ### Performing required work
        sleep 10
        ### Removing lock
        rm -f "$LOCK_NAME"
        echo "Done."
else
        ### Another process is running, removing our own lock
        echo "re-entry, exiting"
        rm -f "$LOCK_NAME"
        exit 1
fi</pre>
<p>Now even if you manage to start two scripts at the same time, at most one of them will actually start the backup. In a very rare situation both scripts will refuse to start (because two lock files exist at the same time), but you can catch this by monitoring the script exit code. In any case, as soon as a backup returns a non-zero exit code it&#8217;s time to review your backup setup and make sure it works as desired.</p>
<p>Please note: when you terminate this script manually you also need to remove the lock file, otherwise the script will not pass the check on the next startup. You can also use this script for any periodic task you have, like Sphinx indexing, merging or <a href="http://astellar.com/2012/07/debuging-sphinx-index-with-indextool/">index consistency checking</a>.</p>
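<p>One way to reduce both problems mentioned above (the startup race and the stale lock after manual termination) is sketched below. It is not the downloadable script: it swaps the lock file for a lock directory, because <em>mkdir</em> either creates the directory or fails in a single step, and it uses <em>trap</em> to clean up. The <em>/tmp/my.lock.d</em> path and the <em>with_lock</em> function name are made up for this example.</p>
<pre>#!/bin/bash

# Hypothetical lock path, not taken from the original script
LOCK_DIR="/tmp/my.lock.d"

# Run the given command under the lock; return 1 if the lock is already held
with_lock() {
        # mkdir creates the directory or fails in one step,
        # so two processes can never both succeed
        if ! mkdir "$LOCK_DIR" 2>/dev/null ; then
                echo "re-entry, exiting"
                return 1
        fi
        # Remove the lock when the shell exits, including after INT or TERM
        trap 'rmdir "$LOCK_DIR" 2>/dev/null' EXIT
        trap 'exit 1' INT TERM

        ### Performing required work
        "$@"
        local status=$?

        ### Removing lock and restoring default signal handling
        rmdir "$LOCK_DIR"
        trap - EXIT INT TERM
        return $status
}</pre>
<p>Call it as <em>with_lock sleep 10</em>, replacing &#8220;sleep 10&#8221; with the actual backup command. A script stopped with kill -9 cannot run its traps, so a stale lock directory is still possible in that case.</p>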
<p>For your convenience this script is available for <a href="http://astellar.com/downloads/backup-wrapper.sh">download directly</a> or using wget:</p>
<pre>wget http://astellar.com/downloads/backup-wrapper.sh</pre>
<p>You can also find out <a href="http://astellar.com/mysql-and-sphinx-consulting-services/mysql-backup/">more about MySQL backup solutions here</a>.</p>
<p>Keep your data safe and have a nice day!</p>
]]></content:encoded>
			<wfw:commentRss>http://astellar.com/2012/10/backups-running-at-the-same-time/feed/</wfw:commentRss>
		<slash:comments>12</slash:comments>
		</item>
		<item>
		<title>Sphinx events in New York City this fall</title>
		<link>http://astellar.com/2012/09/sphinx-events-in-new-york-city-this-fall/?utm_source=rss&#038;utm_medium=rss&#038;utm_campaign=sphinx-events-in-new-york-city-this-fall</link>
		<comments>http://astellar.com/2012/09/sphinx-events-in-new-york-city-this-fall/#comments</comments>
		<pubDate>Sat, 22 Sep 2012 16:23:44 +0000</pubDate>
		<dc:creator>vlad</dc:creator>
				<category><![CDATA[Conferences]]></category>
		<category><![CDATA[General]]></category>
		<category><![CDATA[Performance]]></category>
		<category><![CDATA[Sphinx search]]></category>

		<guid isPermaLink="false">http://astellar.com/?p=270</guid>
		<description><![CDATA[For those of you located near New York City, I am happy to announce two events related to one of the leading open source full-text search engines, Sphinx Search. The first meeting is organized by NYPHP meetup on Tuesday, September 25th at IBM, 590 Madison Avenue, New York. I&#8217;ll be speaking about search [...]]]></description>
			<content:encoded><![CDATA[<p>For those of you located near New York City, I am happy to announce two events related to one of the leading open source full-text search engines, <a href="http://sphinxsearch.com">Sphinx Search</a>.</p>
<p>The first meeting is organized by <a href="http://www.nyphp.org/php-presentations/202_Full-Text-PHP-Sphinx">NYPHP meetup</a> on Tuesday, September 25th at <a href="http://www.nyphp.org/rsvp/202">IBM</a>, 590 Madison Avenue, New York. I&#8217;ll be speaking about search services in a cloud environment and distributed search tips and tricks. The event is free, <a href="http://www.nyphp.org/rsvp/202">please RSVP</a>.</p>
<p>One week later, on October 1st, I&#8217;ll be giving a tutorial about MySQL and Sphinx, “<a href="http://www.percona.com/live/nyc-2012/sessions/creating-full-text-based-services-sphinx-and-mysql">Full-text based services with Sphinx and MySQL</a>”, at the greatest MySQL event on the East Coast, <a href="http://www.percona.com/live/nyc-2012/">Percona Live NY</a>. Use the &#8220;FlashSale&#8221; code to get an exceptional 25% discount (valid until Sept 23rd).</p>
<p>Looking forward to meeting you in New York City!</p>
]]></content:encoded>
			<wfw:commentRss>http://astellar.com/2012/09/sphinx-events-in-new-york-city-this-fall/feed/</wfw:commentRss>
		<slash:comments>0</slash:comments>
		</item>
		<item>
		<title>Debugging Sphinx index with indextool</title>
		<link>http://astellar.com/2012/07/debuging-sphinx-index-with-indextool/?utm_source=rss&#038;utm_medium=rss&#038;utm_campaign=debuging-sphinx-index-with-indextool</link>
		<comments>http://astellar.com/2012/07/debuging-sphinx-index-with-indextool/#comments</comments>
		<pubDate>Tue, 10 Jul 2012 14:19:31 +0000</pubDate>
		<dc:creator>vlad</dc:creator>
				<category><![CDATA[Guide]]></category>
		<category><![CDATA[Sphinx insights]]></category>
		<category><![CDATA[Sphinx operations]]></category>
		<category><![CDATA[Sphinx search]]></category>

		<guid isPermaLink="false">http://astellar.com/?p=180</guid>
		<description><![CDATA[Sometimes you need to debug your Sphinx indexes: to know what&#8217;s inside, whether the index is okay, and whether the document you are trying to find is there. In this case the indextool utility can be very handy, as it gathers information directly from index files even when searchd is not started. Here are a few examples of indextool usage: Checking index consistency One [...]]]></description>
			<content:encoded><![CDATA[<p>Sometimes you need to debug your Sphinx indexes: to know what&#8217;s inside, whether the index is okay, and whether the document you are trying to find is there. In this case the <em>indextool</em> utility can be very handy, as it gathers information directly from index files even when <em>searchd</em> is not started. Here are a few examples of indextool usage:</p>
<h2>Checking index consistency</h2>
<p>One of the most important functions of <em>indextool</em> is checking index consistency. You will need the Sphinx config file and the index files.<br />
<code>/path/to/indextool -c sphinx.conf --check my_sphinx_index</code><br />
<span id="more-180"></span><br />
This checks <em>my_sphinx_index</em> for consistency between the document list, hit list, positions and other internal Sphinx index structures. Please note that indextool only checks disk indexes (starting from 2.0.2 it can also check the on-disk part of Real-Time indexes, but not the in-memory part). The usual output for a healthy index looks like this:</p>
<p><code><br />
using config file 'sphinx.conf'...<br />
checking index 'my_sphinx_index'...<br />
checking dictionary...<br />
checking data...<br />
checking rows...<br />
checking attribute blocks index...<br />
checking kill-list...<br />
check passed, 4.4 sec elapsed<br />
</code></p>
<p>indextool doesn&#8217;t fix issues itself; it only tells you whether the index is okay or not. In case of trouble you will need to rebuild the broken index. Usually you can do that with <em>indexer [--rotate] my_sphinx_index</em>, where --rotate is used to rebuild the index on the fly while searchd is running.</p>
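<p>If you run the check from a script, one option is to look for the &#8220;check passed&#8221; line shown in the sample output above. The sketch below does only that; it makes no assumption about the exit code of indextool, and the paths as well as the <em>index_is_healthy</em> and <em>report_index</em> names are placeholders for this example.</p>
<pre>#!/bin/bash

# Placeholder paths, adjust to your installation
INDEXTOOL="/path/to/indextool"
CONFIG="sphinx.conf"

# Succeeds only when the indextool output contains "check passed"
index_is_healthy() {
        "$INDEXTOOL" -c "$CONFIG" --check "$1" | grep -q "check passed"
}

# Print a one-line verdict for the given index name
report_index() {
        if index_is_healthy "$1" ; then
                echo "$1: OK"
        else
                echo "$1: FAILED, consider rebuilding with indexer --rotate $1"
        fi
}</pre>
<p>For example, <em>report_index my_sphinx_index</em> prints a single line that is easy to grep in cron mail or feed to monitoring.</p>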
<h2>Getting number of documents from index</h2>
<p>indextool also gives you a way to look inside an index and see its internal structures and settings, along with global information such as the number of documents stored, the number of bits per document identifier (32 or 64), the tokenizer, the morphology type and some other index settings. Usage:<br />
<code>indextool --dumpheader my_sphinx_index.sph</code></p>
<p>Shortened results would look like this:<code><br />
dumping header for index 'my_sphinx_index'...<br />
dumping header file 'my_sphinx_index.sph'...<br />
version: 23<br />
<strong>idbits</strong>: 64<br />
docinfo: extern<br />
<strong>fields</strong>: 5<br />
field 0: page<br />
field 1: title<br />
field 2: description<br />
field 3: tags<br />
<strong>attrs</strong>: 23<br />
attr 0: user_id, uint, bitoff 0<br />
attr 1: url_crc32, uint, bitoff 32<br />
[...]<br />
attr 12: deleted, uint, bitoff 384<br />
attr 13: private, uint, bitoff 416<br />
attr 14: external, uint, bitoff 448<br />
attr 15: enabled, uint, bitoff 480<br />
attr 16: lastactivity, timestamp, bitoff 512<br />
attr 17: url, ordinal, bitoff 544<br />
attr 18: has_picture, bool, bitoff 576<br />
[...]<br />
<strong>total-documents</strong>: 3438767<br />
<strong>total-bytes</strong>: 21164723870<br />
min-prefix-len: 0<br />
min-infix-len: 0<br />
exact-words: 0<br />
html-strip: 1<br />
[...]<br />
</code></p>
<p>Lots of interesting internal index information, as you can see. Besides <em>total-documents</em> and <em>total-bytes</em> you can find the names and internal types of all non-full-text attributes along with their bit offsets. The <em>bitoff</em> (offset) field of the last attr record gives you an idea of how much memory the attributes of one document take. For full-text fields you can find names and text processing settings like prefix/infix indexing, stemming length, stopwords, zones support and some other settings.</p>
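<p>As a rough worked example with the numbers from the sample header above: attr 18 starts at bit offset 576, and it is only the last attribute shown (the header reports 23), so the result is a lower bound rather than the real figure. The variable names are made up for this sketch.</p>
<pre>#!/bin/bash

# Values copied from the sample header above
BITOFF=576           # bit offset of attr 18, the last attribute shown
TOTAL_DOCS=3438767   # total-documents

# 8 bits per byte; everything before attr 18 takes at least this much per document
BYTES_PER_DOC=$(( BITOFF / 8 ))
TOTAL_BYTES=$(( BYTES_PER_DOC * TOTAL_DOCS ))

echo "at least $BYTES_PER_DOC bytes of attributes per document"
echo "at least $TOTAL_BYTES bytes of attributes for the whole index"</pre>
<p>For this index that is 72 bytes per document and 247591224 bytes in total, or roughly 236MB at the very least.</p>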
<h2>Are there documents in the index that match my keyword?</h2>
<p>Easy! Even when the daemon is not running:</p>
<pre>
# indextool -c sphinx.conf --dumphitlist &lt;my_sphinx_index&gt; &lt;keyword&gt;
</pre>
<p>or, more conveniently, just to get the number of docs:</p>
<pre>
# indextool -c sphinx.conf --dumphitlist &lt;my_sphinx_index&gt; &lt;keyword&gt; | grep docs | wc -l
</pre>
<p>You can find <em>indextool</em> in any Sphinx distribution since version 0.9.9; it&#8217;s usually located in the same directory as <em>searchd</em>.</p>
]]></content:encoded>
			<wfw:commentRss>http://astellar.com/2012/07/debuging-sphinx-index-with-indextool/feed/</wfw:commentRss>
		<slash:comments>2</slash:comments>
		</item>
		<item>
		<title>PHP libmysqlclient.so.16 error and MySQL Percona Server 5.5</title>
		<link>http://astellar.com/2012/06/php-libmysqlclient-so-16-error-and-percona-server-5-5/?utm_source=rss&#038;utm_medium=rss&#038;utm_campaign=php-libmysqlclient-so-16-error-and-percona-server-5-5</link>
		<comments>http://astellar.com/2012/06/php-libmysqlclient-so-16-error-and-percona-server-5-5/#comments</comments>
		<pubDate>Tue, 12 Jun 2012 14:11:01 +0000</pubDate>
		<dc:creator>vlad</dc:creator>
				<category><![CDATA[Operations]]></category>
		<category><![CDATA[How to]]></category>
		<category><![CDATA[MySQL]]></category>
		<category><![CDATA[PHP]]></category>
		<category><![CDATA[UNIX/Linux]]></category>

		<guid isPermaLink="false">http://astellar.com/?p=153</guid>
		<description><![CDATA[If you ever get an error with the MySQL client library: php: error while loading shared libraries: libmysqlclient.so.16: cannot open shared object file: No such file or directory while using Percona MySQL Server 5.5, just go ahead and install the Percona-Server-shared-compat package from the Percona Repo: yum install Percona-Server-shared-compat]]></description>
			<content:encoded><![CDATA[<p>If you ever get an error with the MySQL client library:</p>
<p><b><code>php: error while loading shared libraries: libmysqlclient.so.16: cannot open shared object file: No such file or directory</code></b></p>
<p>while using Percona MySQL Server 5.5, just go ahead and install the Percona-Server-shared-compat package from the <a href="http://www.percona.com/doc/percona-server/5.5/installation.html" title="Percona Repo">Percona Repo</a>:</p>
<p><code>yum install Percona-Server-shared-compat</code></p>
]]></content:encoded>
			<wfw:commentRss>http://astellar.com/2012/06/php-libmysqlclient-so-16-error-and-percona-server-5-5/feed/</wfw:commentRss>
		<slash:comments>0</slash:comments>
		</item>
		<item>
		<title>Difference between myisam_sort_buffer_size and sort_buffer_size</title>
		<link>http://astellar.com/2012/01/difference-between-myisam_sort_buffer_size-and-sort_buffer_size/?utm_source=rss&#038;utm_medium=rss&#038;utm_campaign=difference-between-myisam_sort_buffer_size-and-sort_buffer_size</link>
		<comments>http://astellar.com/2012/01/difference-between-myisam_sort_buffer_size-and-sort_buffer_size/#comments</comments>
		<pubDate>Tue, 24 Jan 2012 06:54:32 +0000</pubDate>
		<dc:creator>vlad</dc:creator>
				<category><![CDATA[Guide]]></category>
		<category><![CDATA[Performance]]></category>
		<category><![CDATA[MySQL]]></category>
		<category><![CDATA[MySQL performance]]></category>

		<guid isPermaLink="false">http://astellar.com/?p=112</guid>
		<description><![CDATA[MySQL has two variables that look confusingly similar at first glance: myisam_sort_buffer_size and sort_buffer_size. The thing is that these two variables have completely different meanings. sort_buffer_size is a per-connection variable and does not belong to any specific storage engine. It doesn&#8217;t matter whether you use MyISAM or InnoDB: MySQL will allocate sort_buffer_size for every [...]]]></description>
			<content:encoded><![CDATA[<p>MySQL has two variables that look confusingly similar at first glance: <strong>myisam_sort_buffer_size</strong> and <strong>sort_buffer_size</strong>. The thing is that these two variables have completely different meanings.</p>
<p><strong>sort_buffer_size</strong> is a <em>per-connection variable</em> and does not belong to any specific storage engine. It doesn&#8217;t matter whether you use MyISAM or InnoDB: MySQL will allocate <strong>sort_buffer_size</strong> for every sort (required most of the time for ORDER BY and GROUP BY queries), so increasing its value might help speed up those queries. However, I would not recommend changing it from the default value unless you are absolutely sure about all the drawbacks. The value for an out-of-the-box MySQL 5.1.41 installation on Ubuntu is 2MB and it&#8217;s recommended to <a href="http://www.xaprb.com/blog/2010/05/09/how-to-tune-mysqls-sort_buffer_size/">keep it that way</a>.</p>
<p>On the other hand, <strong>myisam_sort_buffer_size</strong> is used by MyISAM to sort indexes during relatively rare table-wide modifications like ALTER/REPAIR TABLE. The stock value is 8MB, so if you are using MyISAM tables intensively (please refer to my other post to see <a href="http://astellar.com/2011/12/why-is-stock-mysql-slow/">how to find out your table types</a>) I would recommend setting it to a higher value, close to or even more than key_buffer_size, but still small enough to fit in memory and keep MySQL from swapping.</p>
]]></content:encoded>
			<wfw:commentRss>http://astellar.com/2012/01/difference-between-myisam_sort_buffer_size-and-sort_buffer_size/feed/</wfw:commentRss>
		<slash:comments>0</slash:comments>
		</item>
		<item>
		<title>Let&#8217;s meet at FOSDEM!</title>
		<link>http://astellar.com/2012/01/lets-meet-on-fosdem/?utm_source=rss&#038;utm_medium=rss&#038;utm_campaign=lets-meet-on-fosdem</link>
		<comments>http://astellar.com/2012/01/lets-meet-on-fosdem/#comments</comments>
		<pubDate>Wed, 11 Jan 2012 13:41:46 +0000</pubDate>
		<dc:creator>vlad</dc:creator>
				<category><![CDATA[General]]></category>
		<category><![CDATA[Conferences]]></category>

		<guid isPermaLink="false">http://astellar.com/?p=100</guid>
		<description><![CDATA[I&#8217;ll be speaking at the FOSDEM 2012 conference in the MySQL and Friends track (schedule is yet to appear) with two talks: &#8220;How to offload MySQL server with Sphinx&#8221; and &#8220;Sphinx performance top secrets&#8221;. Additionally, I&#8217;ll be co-presenting &#8220;Sphinx users stories&#8221; with SkySQL engineers and customers. Looking forward to meeting you at FOSDEM 2012!]]></description>
			<content:encoded><![CDATA[<p>I&#8217;ll be speaking at the <a href="http://fosdem.org/2012/">FOSDEM 2012</a> conference in the <a href="http://fosdem.org/2012/schedule/track/mysql_and_friends_devroom">MySQL and Friends track</a> (schedule is yet to appear) with two talks: &#8220;How to offload MySQL server with Sphinx&#8221; and &#8220;Sphinx performance top secrets&#8221;. Additionally, I&#8217;ll be co-presenting &#8220;Sphinx users stories&#8221; with SkySQL engineers and customers.</p>
<p>Looking forward to meeting you at FOSDEM 2012!</p>
]]></content:encoded>
			<wfw:commentRss>http://astellar.com/2012/01/lets-meet-on-fosdem/feed/</wfw:commentRss>
		<slash:comments>0</slash:comments>
		</item>
		<item>
		<title>Why is stock MySQL slow?</title>
		<link>http://astellar.com/2011/12/why-is-stock-mysql-slow/?utm_source=rss&#038;utm_medium=rss&#038;utm_campaign=why-is-stock-mysql-slow</link>
		<comments>http://astellar.com/2011/12/why-is-stock-mysql-slow/#comments</comments>
		<pubDate>Fri, 30 Dec 2011 08:30:03 +0000</pubDate>
		<dc:creator>vlad</dc:creator>
				<category><![CDATA[Performance]]></category>
		<category><![CDATA[MySQL]]></category>
		<category><![CDATA[MySQL performance]]></category>

		<guid isPermaLink="false">http://astellar.com/?p=79</guid>
		<description><![CDATA[&#8220;I&#8217;ve installed MySQL and it doesn&#8217;t work fast enough for me&#8221;. The MySQL server is the heart of a database-driven application (if it uses MySQL as its database, of course!) and any slowness in running queries affects all application layers. MySQL server tuning and slow query hunting are always a step-by-step process, and without knowing [...]]]></description>
			<content:encoded><![CDATA[<p>&#8220;I&#8217;ve installed MySQL and it doesn&#8217;t work fast enough for me&#8221;. The MySQL server is the heart of a database-driven application (if it uses MySQL as its database, of course!) and any slowness in running queries affects all application layers.</p>
<p>MySQL server tuning and slow query hunting are always a step-by-step process, and without knowing all the data (SHOW GLOBAL VARIABLES, SHOW GLOBAL STATUS, SHOW TABLE STATUS LIKE &#8216;tablename&#8217; and EXPLAIN details for the slow query are just some of the required information) it would generally be a blind guess. But there are still a few things that apply to any newly installed MySQL server.</p>
<p>If you are using <strong>stock MySQL</strong> you might need to check the <strong>memory pool size</strong> which MySQL uses to keep index data in memory, avoiding slow IO requests and increasing query speed. Connect to MySQL and fire two queries:<span id="more-79"></span></p>
<pre>SHOW VARIABLES LIKE 'key_buffer_size';
SHOW VARIABLES LIKE 'innodb_buffer_pool_size';</pre>
<p>Default values are very small. If you see 128MB or less (8MB in older versions), that might be an issue. <strong>key_buffer_size</strong> is used by MyISAM tables, while <strong>innodb_buffer_pool_size</strong> is used by InnoDB tables. I would recommend setting one of them to a relatively big value (generally from 25% to 80% of total memory on the box, depending on the case).</p>
<p>How do you know which table type you are using? While connected to mysql, type:</p>
<pre>SHOW CREATE TABLE `table_name`;</pre>
<p>and find ENGINE=MyISAM or ENGINE=InnoDB on the last line. Then increase key_buffer_size (if you see MyISAM) or innodb_buffer_pool_size (if you see InnoDB) in my.cnf (/etc/my.cnf or /etc/mysql/my.cnf) and restart MySQL.</p>
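<p>As a rough illustration of that last step, a my.cnf fragment might look like the sketch below. The sizes are assumptions for a hypothetical dedicated server with 4GB of RAM, not recommendations; pick values that fit your own box and workload.</p>
<pre>[mysqld]
# Mostly MyISAM tables: 1G is 25% of the assumed 4GB of RAM
key_buffer_size = 1G

# Mostly InnoDB tables: use this line instead (2G is 50% of the assumed 4GB)
# innodb_buffer_pool_size = 2G</pre>
<p>After the restart, the two SHOW VARIABLES queries above should report the new values in bytes.</p>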
<p>Have fun!</p>
]]></content:encoded>
			<wfw:commentRss>http://astellar.com/2011/12/why-is-stock-mysql-slow/feed/</wfw:commentRss>
		<slash:comments>0</slash:comments>
		</item>
		<item>
		<title>Replacing MySQL Full-text search with Sphinx</title>
		<link>http://astellar.com/2011/12/replacing-mysql-full-text-search-with-sphinx/?utm_source=rss&#038;utm_medium=rss&#038;utm_campaign=replacing-mysql-full-text-search-with-sphinx</link>
		<comments>http://astellar.com/2011/12/replacing-mysql-full-text-search-with-sphinx/#comments</comments>
		<pubDate>Wed, 28 Dec 2011 08:50:49 +0000</pubDate>
		<dc:creator>vlad</dc:creator>
				<category><![CDATA[Guide]]></category>
		<category><![CDATA[How to]]></category>
		<category><![CDATA[Sphinx search]]></category>

		<guid isPermaLink="false">http://astellar.com/?p=42</guid>
		<description><![CDATA[It&#8217;s very handy to have FT search out of the box, but there are several drawbacks attached. The problem is that MyISAM full-text search is not designed to handle large amounts of text data. If you plan to index more than 1M documents you will probably need to look at an external search system [...]]]></description>
			<content:encoded><![CDATA[<p>It&#8217;s very handy to have FT search out of the box, but there are several drawbacks attached. The problem is that MyISAM full-text search is not designed to handle large amounts of text data. If you plan to index more than 1M documents you will probably need to look at an external search system like <a href="http://lucene.apache.org/java/docs/index.html">Lucene</a> or <a href="http://sphinxsearch.com/">Sphinx</a>. For the usual LAMP-based service I personally prefer Sphinx, as it provides a simple transition from MySQL FT and is easy to integrate into any application (Sphinx can be queried via native APIs or via the MySQL protocol).</p>
<p>Say we have a table called &lt;my_table&gt; with `title` and `content` text fields. In MySQL you have to fire a query like this:</p>
<pre>SELECT * FROM &lt;my_table&gt; WHERE MATCH(`title`,`content`) AGAINST ('I love Sphinx');</pre>
<p>Let&#8217;s see how we can run the same query with Sphinx.</p>
<p>There are two steps to run <a href="http://astellar.com/mysql-and-sphinx-consulting-services/sphinx-tuning-and-integration/">Sphinx as a MySQL FT replacement</a>. First you need to pull all the needed text data from MySQL into Sphinx. For that you have to configure the <strong>source</strong> and <strong>index definition</strong> in the Sphinx search config. The second step is to run the indexer program, which connects to MySQL and fetches all the desired data. Then fire up the search daemon, which will serve queries. A simplified Sphinx <a href="http://astellar.com/downloads/sphinx.conf">configuration example</a> is below: <span id="more-42"></span></p>
<p>You need to let Sphinx know where to look for the data (source configuration):</p>
<pre>source my_source
{
    type      = mysql
    sql_host  = localhost
    sql_user  = sphinx
    sql_pass  = ********
    sql_db    = &lt;my_database_name&gt;
    sql_port  = 3306
    sql_query = SELECT id, title, content FROM &lt;my_table&gt;
}</pre>
<p>Please note the <strong>id</strong> field in sql_query. This field MUST be a <strong>positive integer</strong> and has to be <strong>unique</strong> across all the documents in the collection. An auto-incremented integer primary key from a MySQL table will work like a charm in this case.</p>
<p>Now we need to tell Sphinx where to store all that data and <strong>configure the index:</strong></p>
<pre>index my_first_sphinx_index
{
    source        = my_source
    path          = &lt;sphinx_path&gt;/var/index1
    docinfo       = extern
    charset_type  = utf-8
}</pre>
<p>That&#8217;s it. Let&#8217;s add a few more required sections to complete the configuration:</p>
<p><strong>Indexer settings</strong>:</p>
<pre>indexer
{
    mem_limit    = 256M
    write_buffer = 8M
}</pre>
<p>and <strong>daemon configuration</strong>:</p>
<pre>searchd
{
    listen                  = 9312
    listen                  = 9306:mysql41
    pid_file                = &lt;sphinx_path&gt;/var/searchd.pid
    max_matches             = 1000
}</pre>
<p>Put the blocks above into the &lt;sphinx_path&gt;/etc/sphinx.conf file, which will be your main Sphinx configuration file.<br />
Please also make sure that the &lt;sphinx_path&gt;/var directory is writable by the user you plan to run the Sphinx daemon as.</p>
<p>Now we have to perform the initial indexing by running the indexer binary:</p>
<pre>&lt;sphinx_path&gt;/bin/indexer my_first_sphinx_index -c &lt;sphinx_path&gt;/etc/sphinx.conf</pre>
<p><em>indexer my_first_sphinx_index</em> tells indexer to create the index called my_first_sphinx_index described in the Sphinx config. To create all the indexes at once (if you have two or more) run <em>indexer --all -c &lt;sphinx_path&gt;/etc/sphinx.conf</em></p>
<p>Now you have to start the search daemon, aptly named searchd:</p>
<pre>&lt;sphinx_path&gt;/bin/searchd -c &lt;sphinx_path&gt;/etc/sphinx.conf</pre>
<p>Now Sphinx should be able to answer queries.</p>
<p>Fire up the mysql client and connect to the brand new Sphinx installation:</p>
<pre>$ mysql -h0 -P 9306
Welcome to the MySQL monitor.  Commands end with ; or \g.
Your MySQL connection id is 1
Server version: 2.0.3-id64-dev (rel20-r3043)

Type 'help;' or '\h' for help. Type '\c' to clear the current input statement.

mysql&gt;</pre>
<p>Please note the server version. It is Sphinx! Now you can harness all the power of the Sphinx full-text query language!</p>
<pre>mysql&gt; SELECT * FROM my_first_sphinx_index WHERE
MATCH('I love Sphinx') LIMIT 0,5; SHOW META;
+---------+--------+
| id      | weight |
+---------+--------+
| 7637682 |   2652 |
| 6598265 |   2612 |
| 6941386 |   2612 |
| 6913297 |   2584 |
| 7139957 |   1667 |
+---------+--------+
5 rows in set (0.01 sec)

+---------------+--------+
| Variable_name | Value  |
+---------------+--------+
| total         | 51     |
| total_found   | 51     |
| time          | 0.013  |
| keyword[0]    | love   |
| docs[0]       | 227990 |
| hits[0]       | 472541 |
| keyword[1]    | sphinx |
| docs[1]       | 114    |
| hits[1]       | 178    |
+---------------+--------+
9 rows in set (0.00 sec)
mysql&gt;</pre>
<p>Please note: Sphinx returns document IDs, not document content, so you need to query MySQL to fetch additional fields: SELECT * FROM &lt;my_table&gt; WHERE id IN (7637682, 6598265, &#8230;, 7139957)</p>
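<p>This two-step lookup can be scripted. The sketch below is only an illustration and not part of the downloadable config: <em>ids_to_csv</em> and <em>build_fetch_query</em> are made-up helper names, and <em>my_table</em> and <em>my_database</em> are placeholders. It keeps only the first column of the mysql client output, since the result above also carries a weight column.</p>
<pre>#!/bin/bash

# Turn tab-separated rows (stdin) into a comma-separated list of first columns
ids_to_csv() {
        cut -f1 | paste -sd, -
}

# Build the follow-up MySQL query from a table name and an ID list
build_fetch_query() {
        echo "SELECT * FROM $1 WHERE id IN ($2)"
}

# Hypothetical usage against a running searchd and MySQL server:
#
#   IDS=$(mysql -h0 -P 9306 -N -e "SELECT * FROM my_first_sphinx_index WHERE MATCH('I love Sphinx') LIMIT 0,5" | ids_to_csv)
#   mysql my_database -e "$(build_fetch_query my_table "$IDS")"</pre>
<p>Keep in mind that IN (...) does not preserve the relevance order returned by Sphinx, so re-sort the rows in your application if the order matters.</p>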
<p>The config above is fully working but very simple, and is provided as an example. You can download it from <a href="http://astellar.com/downloads/sphinx.conf" title="Download Sphinx config sample">this website directly</a> or using wget:</p>
<pre>wget http://astellar.com/downloads/sphinx.conf</pre>
<p>Another way to create an initial Sphinx configuration is to adapt the sample configuration called sphinx-min.conf.dist, bundled with the Sphinx RPM and Deb packages.</p>
<p>You can also learn more about Sphinx tips and tricks from <a href="http://astellar.com/about/talks/">my talks</a> at various conferences and meetups, read blog posts <a href="http://astellar.com/tag/sphinx-search/">about Sphinx</a> and follow me on <a href="http://twitter.com/vfedorkov">twitter</a>.</p>
<p>If you are looking for help with <a href="http://astellar.com/mysql-and-sphinx-consulting-services/sphinx-integration/">Sphinx installation and integration</a>, troubleshooting and fine-tuning, please <a href="http://astellar.com/contact-me/">contact me</a> for a quote with your problem description.</p>
<p>Enjoy!</p>
<p>P.S. If you found this article useful please share it!</p>
]]></content:encoded>
			<wfw:commentRss>http://astellar.com/2011/12/replacing-mysql-full-text-search-with-sphinx/feed/</wfw:commentRss>
		<slash:comments>4</slash:comments>
		</item>
		<item>
		<title>Integrity check of all Sphinx indexes at once</title>
		<link>http://astellar.com/2011/12/perform-integrity-check-of-all-sphinx-indexes-at-once/?utm_source=rss&#038;utm_medium=rss&#038;utm_campaign=perform-integrity-check-of-all-sphinx-indexes-at-once</link>
		<comments>http://astellar.com/2011/12/perform-integrity-check-of-all-sphinx-indexes-at-once/#comments</comments>
		<pubDate>Mon, 12 Dec 2011 18:58:02 +0000</pubDate>
		<dc:creator>vlad</dc:creator>
				<category><![CDATA[Operations]]></category>
		<category><![CDATA[Sphinx operations]]></category>
		<category><![CDATA[Sphinx search]]></category>
		<category><![CDATA[UNIX/Linux]]></category>

		<guid isPermaLink="false">http://astellar.com/?p=20</guid>
		<description><![CDATA[A quick tip on how to check the integrity of all on-disk Sphinx indexes at once: bash$ cd /path/to/your/indexes bash$ for i in *.spa; do /path/to/indextool -c /path/to/config --check "${i%.spa}" ; done Enjoy!]]></description>
			<content:encoded><![CDATA[<p>A quick tip on how to check the integrity of all on-disk Sphinx indexes at once:<br />
<code><br />
bash$ cd /path/to/your/indexes<br />
bash$ for i in *.spa; do /path/to/indextool -c /path/to/config --check "${i%.spa}" ; done<br />
</code></p>
<p>Enjoy!</p>
]]></content:encoded>
			<wfw:commentRss>http://astellar.com/2011/12/perform-integrity-check-of-all-sphinx-indexes-at-once/feed/</wfw:commentRss>
		<slash:comments>0</slash:comments>
		</item>
		<item>
		<title>Top 100 and top 500 stopwords for Sphinx Search</title>
		<link>http://astellar.com/2011/12/stopwords-for-sphinx-search/?utm_source=rss&#038;utm_medium=rss&#038;utm_campaign=stopwords-for-sphinx-search</link>
		<comments>http://astellar.com/2011/12/stopwords-for-sphinx-search/#comments</comments>
		<pubDate>Sun, 11 Dec 2011 15:36:39 +0000</pubDate>
		<dc:creator>vlad</dc:creator>
				<category><![CDATA[Performance]]></category>
		<category><![CDATA[Sphinx insights]]></category>
		<category><![CDATA[Sphinx Performance]]></category>
		<category><![CDATA[Sphinx search]]></category>

		<guid isPermaLink="false">http://astellar.com/?p=13</guid>
		<description><![CDATA[Back in 2006, when I was working on my first sphinxsearch project, I was playing with stopwords files. Stopwords are basically a small set of highly frequent words you often don&#8217;t want to search for (like &#8220;I&#8221;, &#8220;Am&#8221;, &#8220;The&#8221;, etc). For most Sphinx instances they only waste index space and slow down your search queries [...]]]></description>
			<content:encoded><![CDATA[<p>Back in 2006, when I was working on my first sphinxsearch project, I was playing with <a href="http://sphinxsearch.com/docs/current.html#conf-stopwords">stopwords</a> files. Stopwords are basically a small set of highly frequent words you often don&#8217;t want to search for (like &#8220;I&#8221;, &#8220;Am&#8221;, &#8220;The&#8221;, etc). For most Sphinx instances they only waste index space and slow down your search queries by finding all occurrences of these unimportant words.</p>
<p>Say you are searching for &#8220;when is jane&#8217;s birthday&#8221;: you are actually looking for documents with &#8220;jane&#8217;s birthday&#8221;, and you don&#8217;t really care about lots of documents (blog posts, news articles, etc) with only &#8220;when&#8221; and &#8220;is&#8221; inside. <span id="more-13"></span></p>
<p>Removing those high-frequency words from the search index is usually a smart move, and ages ago I created two stopword file samples which I am still using.</p>
<p><strong><a href="http://astellar.com/downloads/stopwords.txt">stopwords.txt</a></strong> contains the top 100 most frequent words in my blog post collection, while <strong><a href="http://astellar.com/downloads/stopwords-500.txt">stopwords-500.txt</a></strong>, as you might expect, contains the top 500. They are old, but not yet included in the Sphinx distribution, so I would suggest starting with <a href="http://astellar.com/downloads/stopwords.txt">stopwords.txt</a> and adding it to your Sphinx config file using the <a href="http://sphinxsearch.com/docs/current.html#conf-stopwords">stopwords option</a>, as below:</p>
<pre>
     stopwords = /path/to/stopwords.txt
</pre>
<p>You can also download the stopword files using wget:</p>
<pre>wget http://astellar.com/downloads/stopwords.txt
wget http://astellar.com/downloads/stopwords-500.txt</pre>
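<p>If you prefer a list built from your own texts, a frequency count can be sketched with standard UNIX tools. This is a generic pipeline, not how the files above were produced; <em>top_words</em> is a made-up name, and the sketch only handles plain ASCII letters.</p>
<pre>#!/bin/bash

# Read text on stdin and print the N most frequent words, one per line
top_words() {
        tr -cs 'A-Za-z' '\n' |      # put every run of letters on its own line
        tr 'A-Z' 'a-z' |            # lowercase everything
        grep -v '^$' |              # drop empty lines
        sort | uniq -c |            # count identical words
        sort -rn |                  # most frequent first
        head -n "$1" |              # keep the top N
        awk '{ print $2 }'          # strip the counts
}</pre>
<p>For example, piping your documents through <em>top_words 100</em> and saving the output gives a file in the same one-word-per-line shape that can be referenced from the stopwords option.</p>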
<p>Learn more about Sphinx tips and tricks from <a href="http://astellar.com/about/talks/">my talks</a> at various conferences and meetups, read blog posts <a href="http://astellar.com/tag/sphinx-search/">about Sphinx</a> and follow me on <a href="http://twitter.com/vfedorkov">twitter</a>.</p>
<p>If you are looking for help with <a href="http://astellar.com/mysql-and-sphinx-consulting-services/sphinx-integration/">Sphinx installation and integration</a>, troubleshooting and fine-tuning, please <a href="http://astellar.com/contact-me/">contact me</a> for a quote with your problem description.</p>
<p>Enjoy!</p>
<p>P.S. If you found this article useful please share it!</p>
]]></content:encoded>
			<wfw:commentRss>http://astellar.com/2011/12/stopwords-for-sphinx-search/feed/</wfw:commentRss>
		<slash:comments>3</slash:comments>
		</item>
	</channel>
</rss>
