May 072014
 
Article MySQL

A great number of websites are set up on top of the same set of software building blocks: a Linux operating system, an Apache web server, a MySQL dabase management system, and a set of PHP scripts that generate the dynamic content. Among these elements, the database might be one of the more influencial in the overall performance of the site.

This post uses a real-world example to show the existing possibilities when it comes to analyze the site performance, and tune the MySQL configuration to optimize it.

Analyzing the performance of the database

A good starting point is the report generated by the mysqltuner.pl script that can be downloaded from github:

This script gathers from the database server statistical information on the usage done since the last time it was started, and output recommentations on changes to the configuration that could be done to tune the system.

Executing the script in the web server used as example for this post results in the following summary being output:

As we can see, parameters that are found to be sub-optimal are flagged with the prefix [!!]

In the following sections we’ll see how to deal with them.

[!!] Total fragmented tables: 143

As new registers are added, others are modified and some others are deleted from a table, it becomes fragmented. The disk space used to hold the table information becomes larger than the minimum required, and this also impacts the performance in accessing the data in the table.

This issue can be fixed running the “OPTIMIZE TABLE” command on each table that has become fragmented.

The “SHOW TABLE STATUS” command can be executed to obtain a relation of all tables in the database. From it, those that show a value greater than zero in the “Data_free” column are eligible to be optimized.

Another possibility is to perform the optimization of all tables in a database from the command line, running the command:

It is also possible to optimize all databases with a single command:

[!!] Highest connection usage: 100% (152/151)

This message warns that at some point in time, the limit of concurrent connections established in the configuration file has been reached. The default value for this limit es 151, which should be enough for many small or medium-sized websites.

The current number of established connections can be obtained with the command:

We can also issue a more elaborate command, that returns the number of established connections as a percentage of the maximum allowed connections:

In the website that we are using to illustrate this post, we ran this command several times, spanning a period of normal load of the server, and found that the number of established connections was always less than ten.

In this case, we concluded that there is no point in increasing the maximum number of allowed concurrent connections. Instead we need to investigate what caused the sudden increase in the number of established connections reported by the script. It might well be that some other bad-behaved application was to blame, with no relationship with the web server itself.

Meanwhile, the output of mysqltuner.pl suggest a change to the wait_timeout and interactive_timeout parameters, that could alleviate the problem:

The default behaviour of a MySQL server is to keep open an inactive connection for 28800 seconds (8 hours).

In most cases, this value can be significatively reduced, both for interactive and non-interactive connections. For instance, a limit of  5 minutes (300 seconds) can be established adding to the mysql configuration file (/etc/mysql/my.cnf) the directives:

 

[!!] Query cache prunes per day: 1816027

The Query cache mechanism may be helpful in many cases to improve the performance of a mysql server, but this is not always the case. This page gives a clear explanation of the benefits and drawbacks of the query cache.

Anyway,  mysqltuner.pl suggests an increment in the query_cache_size parameter:

[!!] Joins performed without indexes: 79453

This message warns about queries that use more than one table, linked by means of a JOIN (INNER JOIN, LEFT OUTER JOIN,…) clause, where there is no index in place that can be used to retrieve the matching records in the second table for each record in the first table. This means that for each record in the first table, a sequential search of the whole second table has to be performed to lookup the matching records. This is highly inefficient.

The fix implies identifying what are the queries involved, and defining the required indexes to optimize their processing.

To do this, we can add the following directives to the mysql configuration file:

Using  these directives, all queries not using indexes, as well as any query that takes more than one second to execute, are recorded in the slow queries log file. Later, we can analyze the log, and carry out the required measures to fix the low-performing queries identified.

[!!] Table cache hit rate: 0% (400 open / 101K opened)

Data for the MyISAM tables in the MySQL database are stored in the file system, a file per table. When the processing of a SQL sentence requires access to the data in a table, mysql need to open a file descriptor for the corresponding file.

The table cache keeps the opened file descriptors, to be reused if processing of other sentences require access to the same tables.

The default value of the table_cache parameter is 64, but this value is often too low. We can get a snapshot on the number of opened file descriptors running the mysqladmin command:

In the output produced by this command, we can see “Open tables: 400”. Therefore, mysqltuner advises to increment the value of this parameter to more than 400.

Anyway, there is an ongoing discussion on the benefit that can be obtained increasing the value of this parameter, and there are some posts of well known authors that refer to flaws in the implementation of this functionality in MySQL, at least until version 5.4:

http://www.mysqlperformanceblog.com/2009/11/16/table_cache-negative-scalability/

The official documentation of MySQL suggest to set the value of the table_cache parameter with the formula table_cache = max_connections * N, where N is the maximum number of tables that could be accessed in one of the executed JOIN sentences.

Anyway, we must make sure that the value we intend to set for table_cache does not exceed the maximum number of open file descriptors allowed by the operating system. More information about how to get and modify this limit can be found here.

[!!] InnoDB buffer pool / data size: 128.0M/5.1G

This message just warns that the size of the data in InnoDB tables is much higher than the buffer pool set to keep data in RAM. Depending on the usage pattern of the database, this might mean that the execution of most queries invoving InnoDB tables might need to retrieve data from disk.

It is always good to increase the size of the buffer pool as much as possible, depending on the amount of RAM available in our system.

References:

 Posted by at 2:06 pm

 Leave a Reply

(required)

(required)