Checking whether a dynamic filename exists among thousands of files

I am writing a cache module in PHP. It writes cache files named with a string plus a timestamp ($string + timestamp).

Writing the cache is not the problem.

The problem is that I use a foreach loop to find the cache file I want.

This is the logic I use for getting the cache:

foreach ($filenames as $filename) {
    if (strstr($filename, $cachename)) { // found a matching cache file
        if (check_timestamp($filename, time())) {
            display_cache($filename);
        }
        break;
    }
}

But reading the cache back slows the server down. Imagine I have 10,000 cache files in a folder, and I need to check every file in that cache folder.

In other words, I write cache files in the format filename_timestamp, for example cache_function_random_news_191982899010 in the folder ./cache/.

When I want to get the cache, I only pass cache_function_random_news_ and scan that folder. If I find a file name containing that needle, I display it and break.

But scanning 10,000 files in a folder is not a good thing, right?

What's the best way of doing this?

Do not store the timestamp as part of the filename, but store it in the file together with the cached content in some format that makes sense to you. For example:

File /cache/cache_function_random_news:

191982899010
stored content

The first line of the file contains the timestamp, which you can read when needed, e.g. when cleaning the cache periodically. The rest of the file contains the cached content. Another possibility would be to use serialized arrays. Either way, this makes it trivial to read the cache:

if (file_exists('cache/cache_function_random_news')) ...
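A minimal sketch of this layout (the helper names `cache_write` and `cache_get` are made up for illustration; the answer only prescribes the file format, not these functions):

```php
<?php
// Write: the first line holds the expiry timestamp, the rest is the payload.
function cache_write($dir, $name, $data, $ttl_seconds)
{
    $content = (time() + $ttl_seconds) . "\n" . serialize($data);
    return file_put_contents($dir . '/' . $name, $content, LOCK_EX) !== false;
}

// Read: one is_file() check on a fixed name instead of scanning the folder.
function cache_get($dir, $name)
{
    $path = $dir . '/' . $name;
    if (!is_file($path)) {
        return false;
    }
    // Split off the first line (timestamp) from the serialized payload.
    list($expires, $payload) = explode("\n", file_get_contents($path), 2);
    if (time() >= (int) $expires) {
        unlink($path); // expired: remove the file and report a cache miss
        return false;
    }
    return unserialize($payload);
}
```

Because the filename no longer carries the timestamp, the lookup is a single stat call, regardless of how many files sit in the cache folder.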

Browsers and web servers work around the cache maintenance issue by maintaining an 'index'. You can keep this index in either a file (binary or text) or a database.

For example:

  1. Whenever you create a new cache file, add a row/entry to the table/file.
  2. Then use the table/file to quickly check whether a cache file exists.
  3. You can also mark unnecessary/obsolete files with a flag in the record.
  4. Then periodically (with a cron job or some other technique) delete the obsolete cache files.

This approach will greatly improve the performance.
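As a sketch, the index could be a small SQLite table (the table and column names here are illustrative, and any database or flat file would work the same way; this assumes the `pdo_sqlite` extension is available):

```php
<?php
// A minimal cache index kept in SQLite (names are made up for illustration).
$db = new PDO('sqlite:' . sys_get_temp_dir() . '/cache_index.sqlite');
$db->exec('CREATE TABLE IF NOT EXISTS cache_index (
    name     TEXT PRIMARY KEY,
    filename TEXT NOT NULL,
    expires  INTEGER NOT NULL,
    obsolete INTEGER DEFAULT 0
)');

// 1. On write: record the new cache file in the index.
$stmt = $db->prepare('REPLACE INTO cache_index (name, filename, expires) VALUES (?, ?, ?)');
$stmt->execute(['cache_function_random_news', 'cache_function_random_news_191982899010', time() + 3600]);

// 2. On read: one indexed lookup instead of scanning 10,000 files.
$stmt = $db->prepare('SELECT filename FROM cache_index WHERE name = ? AND expires > ? AND obsolete = 0');
$stmt->execute(['cache_function_random_news', time()]);
$filename = $stmt->fetchColumn(); // FALSE on a miss

// 3./4. A periodic job would SELECT the expired/obsolete rows, unlink the
//       corresponding files, then DELETE the rows.
```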

function rpl_cache_get($cachename, $time = '')
{
    $path = BASEPATH.'cache/'.$cachename;

    // Direct lookup: the cache name is the file name, no directory scan.
    if ( ! is_file($path))
    {
        return FALSE;
    }

    $content = file_get_contents($path);

    // The file starts with "<expiry timestamp>TS--->".
    if ( ! preg_match("/(\d+TS--->)/", $content, $match))
    {
        return FALSE;
    }

    // Has the file expired? If so, delete it.
    if (time() >= (int) trim(str_replace('TS--->', '', $match[1])))
    {
        @unlink($path);

        log_message('debug', 'Cache file has expired. File deleted.');
        return FALSE;
    }

    // Strip the timestamp marker and return the stored data.
    return unserialize(str_replace($match[0], '', $content));
}

This caching system saves HTML blocks as a PHP serialized array. The function above reads the file, unserializes it, and returns the array of HTML strings; you just need to display them with echo or print_r.

function rpl_cache_write(&$data, $name, $timelimit)
{
    // $timelimit is in minutes; store the expiry time as a unix timestamp.
    $cache_timestamp = time() + ($timelimit * 60);

    $f = fopen(BASEPATH.'cache/'.$name, 'w');
    if ($f !== FALSE)
    {
        $content = $cache_timestamp.'TS--->'.serialize($data);
        fwrite($f, $content, strlen($content));
        fclose($f);

        return TRUE;
    }

    // TODO: throw or log an error: cannot write cache file
    return FALSE;
}