LOAD DATA INFILE

Inserting data into MySQL can be a slow affair. We work with a lot of data here and we generate some big aggregated tables before pushing data to other systems.

One of these tables now requires ~175,000 inserts, and it has started to show in the execution time of the cronjob.

Remembering back to a talk at PHPNW 2013, I decided to look at using the MySQL statement LOAD DATA INFILE.

Rather than run 175,000 inserts, we now build a CSV file, and once it's complete we run a LOAD DATA INFILE SQL statement.

[code language="php"]
// First create and truncate the file
$csvPath = sys_get_temp_dir() . DIRECTORY_SEPARATOR . 'agg_import_file.csv';
$fp = fopen($csvPath, 'w');
fclose($fp);

// Now, in each iteration of inserts (we have multiple), append to the file
$fp = fopen($csvPath, 'a');
foreach ($somearray as $someitem) {
    //...
    foreach ($someitem as $somerow) {
        //...
        fputcsv($fp, $somerow); // write one aggregated row as a CSV line
    }
}
fclose($fp);
[/code]

Now we have a big CSV file we can use to populate the table in MySQL.
[code language="sql"]
LOAD DATA CONCURRENT INFILE '/path/to/file.csv'
IGNORE INTO TABLE table_name
FIELDS TERMINATED BY ','
OPTIONALLY ENCLOSED BY '"'
LINES TERMINATED BY '\n'
(list, of, field, names…)
[/code]
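
If you want to fire that statement from the same PHP script that builds the CSV, something along these lines works. This is only a minimal sketch assuming a PDO connection; the DSN, credentials, table name and column list are placeholders, not our real schema.

[code language="php"]
// Minimal sketch: run the LOAD DATA statement from the script that built the CSV.
// DSN, credentials, table name and column list below are placeholders.
$pdo = new PDO('mysql:host=localhost;dbname=example_db', 'db_user', 'db_pass');

$csvPath = sys_get_temp_dir() . DIRECTORY_SEPARATOR . 'agg_import_file.csv';

$sql = "LOAD DATA CONCURRENT INFILE " . $pdo->quote($csvPath) . "
        IGNORE INTO TABLE table_name
        FIELDS TERMINATED BY ',' OPTIONALLY ENCLOSED BY '\"'
        LINES TERMINATED BY '\\n'
        (col_a, col_b, col_c)";

$rows = $pdo->exec($sql); // exec() returns the number of rows affected by the load
[/code]

One thing to keep in mind: because this is not LOAD DATA LOCAL INFILE, the file has to be readable by the MySQL server process itself and the connecting account needs the FILE privilege; if the database lives on another host, LOAD DATA LOCAL INFILE is the usual workaround.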

And that's it. We got our process down from just under two hours (!) to about 10 minutes (and the LOAD DATA import itself takes just over a minute).

More information can be found in the MySQL docs.

Happy days!
