FastDomain - Optimizing MySQL: Queries and Indexes Article 4 of 4
 

FastDomain Web Hosting Help

Optimizing MySQL: Queries and Indexes Article 4 of 4

Optimizing MySQL: Queries and Indexes Article 4 of 4

Courtesy of:Ian Gilfillan

Most systems need to be highly optimized for selects - take a news site which performs millions of queries per day, but where the data arrives in large batches of text files. So for parts of the day, inserts need to be optimal, without noticeably affecting the millions trying to access the data...

Assuming a nicely formatted '|' delimited text file that we want to insert into the table above, take this piece of PHP code:

if (!($fp = fopen("datafile.txt","r"))) {               // open the file for reading
              print "nUnable to open datafile.txt for writing";    // display error
              exit();                                       // end the running of the program
            }
            
            
            while (!feof ($fp)) {                   // loop through the file line by line
              $sline = fgets($fp, 4096);            // put the data into the variable $sline
              $sline = chop($sline);                        // remove the newline
              list($eno,$fname,$sname,$telno,$salary) = split("|",$code);
                                    // split the line on "|", populating the ind. variables
              $db->query("insert into employee(employee_number,firstname,surname,
            tel_no, salary 
               values($eno,'$fname','$sname','$tel_no', $salary)");
            }                                               // end while loop
            

This would work, but would be very slow. The index buffer would be flushed after every insert. Until recently, MyISAM tables (The MySQL default) did not allow data to be inserted at the same time as being read. The new format does, but only if there are no deleted records present (highly unlikely in a heavily used system). So the entire table is locked for the duration of each insert. Fine on a low volume site, but when you're getting hundreds or thousands of queries per second, you'll soon notice the backlog!

There's a solution however - the best way to insert the data is to use MySQL's "LOAD DATA INFILE". This is much faster (20 times according to MySQL), and the only way some systems I've seen are still hanging in there!

The syntax is simple, and the code becomes a lot simpler too:

$db->query("LOAD DATA INFILE 'datafile.txt' INTO TABLE employee (employee_number,firstname,surname,tel_no,salary) FIELDS TERMINATED BY '|'");

LOAD DATA INFILE has defaults of:

FIELDS TERMINATED BY 't' ENCLOSED BY '' ESCAPED BY ''

if you don't specify any of these clauses. And, just as with an ordinary insert, you need to specify a field list if the order of the fields is different, or, as in the example above, you're not inserting data for every field. Always specifying a field list is good practice for all queries anyway - if someone adds a field to the table at a later stage, you don't want to go back and have to fix all your previous INSERT and SELECT * statements.

If you can't get this to work properly, have a look at the format of your text file - every problem I've seen with LOAD DATA has been because of a corrupted text file. Every field in every row must be delimited correctly!

You may not always be inserting from a text file - perhaps your application needs to do many unrelated inserts continually. There are ways to make sure the mass of users selecting are not badly affected... The first is to use INSERT LOW PRIORITY. This waits until there are no more reads waiting to happen, waiting for the gap, and not pushing in as it were. Of course, if your database is a rush hour special, there may never be a gap, and the client performing the INSERT LOW PRIORITY may start to grow cobwebs! An alternative here is INSERT DELAYED. The client is immediately freed, and the insert put into a queue (with all the other INSERT DELAYED's still waiting for the queue to end). This means that there can be no meaningful information passed back to the client, (such as the auto_increment value), as the INSERT has not been processed when the client is freed. Also, be aware that a catastrophe such as an unexpected power failure here will result in the queued INSERT's being lost. For neither of these methods do you have any idea when the data will be inserted, if at all, so I suggest you use with caution.

Conclusion

It's not only getting the data in that needs to be quick - sometimes you need to get it out quickly too. (Say you've accidentally loaded yesterday's classified ads, for example). Don't do a:

DELETE FROM classifieds;

Rather, use:

TRUNCATE TABLE classifieds;

The difference here is that DELETE drops records one by one, and that can be 1 million one by one's too slow! Note that this does not apply before version 4.0 of MySQL. At time of writing, most of you will still be using 3.x versions (if you do a DELETE FROM tablename on a non-empty table, and get 0 records back as a result, you're running an earlier version. To fix this problem, MySQL made DELETE remove records one by one so as to return the number of records deleted, but TRUNCATE still did the quick delete. Also, earlier versions than 3.23.33 used TRUNCATE tablename, not TRUNCATE TABLE tablename)

This has only been a brief introduction to optimizing queries and indexes. Once you've mastered these tips, you'll want to look at your more complex joins that just never seem to perform quickly. The MySQL site has lots of useful information, and there are a few good books out there as well. But don't forget EXPLAIN! Often the best way is to try and rewrite the query in as many different ways as possible, and see which one runs more efficiently. You can learn a lot by trying to figure out why one alternative ran faster than the other. Good luck!

resource list:
www.mysql.com - complete documentation for MySQL



Was this resource helpful?

Did this resolve your issue?


Please add any other comments or suggestions about this content:





Recommended Help Content

Optimizing MySQL: Queries and Indexes Article 3 of 4 Courtesy of: Ian Gilfillan Ordering by surname is a common requirement, so it would make sense to create an index on surname. But in this example o
Knowledgebase Article 222,358 views tags: cpu exceeded index indexes mysql optimize optimizing queries query

Optimizing MySQL: Queries and Indexes Article 1 of 4 Courtesy of: Ian Gilfillan Badly defined or non-existent indexes are one of the primary reasons for poor performance, understanding and then fixing
Knowledgebase Article 221,944 views tags: indexes mysql optimize optimizing queries query

Optimizing MySQL: Queries and Indexes Article 2 of 4 Courtesy of: Ian Gilfillan Some knowledge of how indexes work allows you to use them more efficiently. Firstly, note that when you update a table w
Knowledgebase Article 188,887 views tags: indexes mysql optimize optimizing queries query

Related Help Content

What is CPU Protection
Knowledgebase Article 387,273 views tags: cpu database mysql optimize peak performance query script slow

How to repair and optimize your database using phpMyAdmin:
Knowledgebase Article 443,040 views tags: database mysql optimizing repairing slow speed

A guide on performing a standard SQL query using the phpMyAdmin interface
Knowledgebase Article 192,978 views tags: databases mysql phpmyadmin

This article explains how to add CPU and memory upgrades to a CloudSites account.
Knowledgebase Article 125,834 views tags: add cloudsites cpu memory resources upgrade

How to restore MySQL databases using the Site Backup & Restore tool
Knowledgebase Article 259,484 views tags: backup database mysql restore site

How to backup all of your MySQL databases using the Site Backup & Restore tool
Knowledgebase Article 273,541 views tags: backup database mysql restore site

How can I optimize my images?
Knowledgebase Article 237,477 views tags: images optimize

How do I display my home page which is using a name other than index.html?
Knowledgebase Article 331,058 views tags: directory homepage htaccess index order