Which is faster: include() or a database query?

A client is insisting that we store some vital and complex configuration data as PHP arrays, while I want it stored in the database. He brought up efficiency/optimization, saying that file I/O will be much faster than database queries. I'm pretty sure I heard somewhere that file includes are actually slow in PHP.

Any stats/real info on this?

I don't think that performance is a compelling argument either way. On my Mac, I ran the following tests.

First, 10,000 includes of a file that does nothing but set a variable:

<?php

// microtime(true) returns the current Unix timestamp as a float.
$starttime = microtime(true);

// Include the same file 10,000 times.
for ($i = 0; $i < 10000; $i++) {
    include("foo.php");
}

$totaltime = microtime(true) - $starttime;
echo 'Rendered in ' . $totaltime . ' seconds.';
?>

It took about 0.58 seconds to run each time. (Remember, that's 10,000 includes.)

Then I wrote another script that queries the database 10,000 times. It doesn't select any real data, just does a SELECT NOW().

<?php
// mysqli is used here because the old mysql_* functions were removed in PHP 7.
$link = mysqli_connect('127.0.0.1', 'root', '', 'test');

// Start timing after connecting, so connection overhead is excluded.
$starttime = microtime(true);

for ($i = 0; $i < 10000; $i++) {
    mysqli_query($link, "select now()");
}

$totaltime = microtime(true) - $starttime;
echo 'Rendered in ' . $totaltime . ' seconds.';

?>

This script takes roughly 0.76 seconds to run on my computer each time. Obviously there are a lot of factors that could make a difference in your specific case, but there is no meaningful performance difference in running MySQL queries versus using includes. (Note that I did not include the MySQL connection overhead in my test -- if you're connecting to the database only to get the included data, that would make a difference.)

Given that most people include 10-20 files in a script for a regular page, I have a feeling that includes are much faster than MySQL queries.

I could be wrong, though.

The real question is whether the values will ever change. If they never change unless you are making other modifications anyway (moving files, etc.), they should probably be stored in an include file.

If the data is dynamic in any way, it should be pulled from a database.
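For the static case, one common way to keep configuration in an include file is to have the file return an array. This is only a minimal sketch: the file name and the keys (site_name, cache) are made up for illustration, and the file is written to a temp directory just so the example runs on its own.

```php
<?php
// Write a sample config file so the example is self-contained.
// In a real project this file would live alongside the code.
$path = sys_get_temp_dir() . '/example_config.php';
$source = <<<'CONFIG'
<?php
return [
    'site_name' => 'Example',
    'cache'     => ['enabled' => true, 'ttl' => 300],
];
CONFIG;
file_put_contents($path, $source);

// require evaluates to whatever the included file returns.
$config = require $path;

echo $config['site_name'], "\n";
echo $config['cache']['ttl'], "\n";

// Clean up the temporary file.
unlink($path);
```

Because the file returns a value instead of setting globals, the caller decides what the variable is called, and nothing leaks into the surrounding scope.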

It's gonna vary heavily based on your specific case.

If the database is stored in memory and/or the data you're looking for is cached, then database I/O should be pretty fast. A really complex query on a large database can take a fair bit of time if it's not cached or it has to go to disk, though.

File I/O does have to read from the disk, which is slow, though there are also smart caching mechanisms for keeping often-accessed files in memory as well.

Profiling on your actual system is gonna be the most definitive.

This is a pretty obvious case of premature optimization. Don't ever try to optimize things like this unless you've actually identified it as a real bottleneck in a production environment.

That said, using an opcode cache like APC (you are using an opcode cache, right? Because that is the very first thing you should do to optimize PHP), my money is on the include file.
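On current PHP versions the opcode caching role is filled by OPcache, which has been bundled since PHP 5.5. A quick way to check whether it is active, as a rough sketch (the function name opcode_cache_enabled is my own, and note that OPcache is normally disabled for the CLI unless opcache.enable_cli is set):

```php
<?php
// Report whether the bundled OPcache extension is loaded and enabled.
function opcode_cache_enabled(): bool
{
    if (!function_exists('opcache_get_status')) {
        return false; // extension not loaded
    }
    // Passing false skips the per-script details; the call returns
    // false when the cache is unavailable or the API is restricted.
    $status = @opcache_get_status(false);
    return is_array($status) && !empty($status['opcache_enabled']);
}

echo opcode_cache_enabled() ? "opcode cache: on\n" : "opcode cache: off\n";
```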

But again, the difference will likely be negligible, so pick the solution which requires 1) the least code and 2) the least maintenance. Programmer time is much more expensive than CPU time.

Update: I did a quick benchmark of the inclusion of a PHP file defining a 1000-entry array. The script ran 5 times faster using APC than without.

A similar benchmark, fetching 1000 rows from a MySQL database (on localhost), only ran 15% faster using APC (since APC doesn't do anything for database queries).

However, once APC was enabled, there was no significant difference between using an include file and using a database.
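If you want to try the include side of this yourself, here is a rough sketch of that kind of benchmark. It is not the exact script I ran: the array contents, the temp file name, and the run count of 1000 are all arbitrary choices, and the numbers will depend on your machine and on whether an opcode cache is active.

```php
<?php
// Generate a file that returns a 1000-entry array (contents are made up).
$data = [];
for ($i = 0; $i < 1000; $i++) {
    $data["key_$i"] = "value_$i";
}
$path = sys_get_temp_dir() . '/bench_array.php';
file_put_contents($path, '<?php return ' . var_export($data, true) . ';');

// Time repeated includes of that file.
$runs = 1000;
$start = microtime(true);
for ($i = 0; $i < $runs; $i++) {
    $loaded = include $path;
}
$elapsed = microtime(true) - $start;

printf("%d includes of a %d-entry array: %.4f seconds\n", $runs, count($loaded), $elapsed);

// Clean up the temporary file.
unlink($path);
```

Run it once with the opcode cache off and once with it on to see how much of the cost is parsing and compiling.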

I don't think this decision should be based on performance. The question I'd ask myself: is this data going to be updated by the application? If the answer is "no", consider how much quicker and simpler it will be to implement and use as an included array.

I work with a large system where almost every possible thing is stored in the database, even data that has to be changed manually via a database alter written by a developer. I can tell you it has led to way more coding and way more complexity than if the information had been stored the way your client is suggesting.

If the data won't change often, has to be changed via manual intervention anyway, and doesn't need to be available in the database (for other systems, for instance), give the array a try. You can always put it in the database later and write all the necessary SQL.
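To give an idea of how small that later migration can be, here is a sketch that flattens a nested config array into name/value pairs, which is the shape you would need for a simple key/value table. The function name flatten_config, the dot-notation naming, and the sample keys are all assumptions for illustration; the actual table layout and INSERT statements would depend on your schema.

```php
<?php
// Flatten a nested config array into dot-notation name => value pairs,
// e.g. ['cache' => ['ttl' => 300]] becomes ['cache.ttl' => 300].
// Note: empty nested arrays produce no rows.
function flatten_config(array $config, string $prefix = ''): array
{
    $rows = [];
    foreach ($config as $key => $value) {
        $name = $prefix === '' ? (string) $key : "$prefix.$key";
        if (is_array($value)) {
            // Recurse into nested arrays, carrying the path so far.
            $rows += flatten_config($value, $name);
        } else {
            $rows[$name] = $value;
        }
    }
    return $rows;
}

// Hypothetical sample configuration.
$config = [
    'site_name' => 'Example',
    'cache'     => ['enabled' => true, 'ttl' => 300],
];

// Each pair printed here would become one row in the table.
foreach (flatten_config($config) as $name => $value) {
    echo $name, ' = ', var_export($value, true), "\n";
}
```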