I am currently rewriting a file uploader. The application is written in PHP, and the existing parsing scripts for the different data types are Perl scripts. At the moment it allows only a single file upload; once the file is on the server, it calls the Perl script for the uploaded file's data type. We have over 20 data types.
What I have done so far is write a new system that allows multiple file uploads. It first lets you validate your attributes before upload, compresses the files using zipjs, uploads the zipped file, uncompresses it on the server, and then calls the parser for each file.
I am at the part where, for each file, I need to put the parser call in the queue. I cannot run multiple parsers at once. A rough sketch is below.
foreach ($files as $file) {
    // the job body is the command the worker should run for this file
    $job = "exec('location/to/file/parser.pl $file');";
    // using the pheanstalk library
    $this->pheanstalk->useTube('testtube')->put($job);
}
Depending on the file, parsing may take 2 minutes or 20 minutes. When I put the jobs on the queue, I need to make sure that the parser for file2 fires only after the parser for file1 finishes. How can I accomplish that? Thanks.
Beanstalk doesn't have the notion of dependencies between jobs. You seem to have two jobs: job A (the parser for file1) and job B (the parser for file2).
If you need job B to run only after job A, the most straightforward way to do this is for job A to create job B as its last action.
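A minimal sketch of that chaining idea, under some assumptions: the job body is a JSON list of the files still to be parsed (a made-up convention, not something beanstalkd or pheanstalk prescribes), `handleChainedJob` is a hypothetical helper name, and `$queue` is any object with a `put()` method, such as the object `useTube('testtube')` returns in the question's code.

```php
<?php
// Sketch only. $queue is any object with a put(string $data) method.
// The JSON payload shape {"files": [...]} is a hypothetical convention.

/**
 * Handle one reserved job: parse the first file, then enqueue the rest.
 *
 * @param object   $queue     object exposing put(string $data)
 * @param string   $payload   JSON body of the reserved job
 * @param callable $runParser must block until the parser for one file exits
 */
function handleChainedJob($queue, string $payload, callable $runParser): void
{
    $task  = json_decode($payload, true);
    $files = $task['files'];

    // Job A: parse the file at the head of the list and wait for it.
    $current = array_shift($files);
    $runParser($current);

    // Last action of job A: create job B for the remaining files, if any.
    if (count($files) > 0) {
        $queue->put(json_encode(['files' => $files]));
    }
}
```

With this layout the uploader would put a single job listing every file, and the worker would pass the reserved job's data to `handleChainedJob` and delete the job afterwards. Because the next job is only created after the parser has returned, the parser for file2 cannot start before the parser for file1 has finished.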
I have achieved what I wanted, which was to request more time when the parser takes longer than a minute. The worker is a PHP script, and I can get the process ID when I execute the "exec" command for the parser executable. I am currently using the code snippet below in my worker.
$job = $pheanstalk->watch( $tubeName )->reserve();
// do some more stuff here ... then
// while the parser is running on the server
while( file_exists( "/proc/$pid" ) )
{
    // only poll the queue server if a job was actually reserved
    if( $job ) {
        // get the time left on the queue server for the job
        $jobStats = $pheanstalk->statsJob( $job );
        // when there is not enough time, request more
        if( $jobStats['time-left'] < 5 ){
            echo "requested more time for the job at ".$jobStats['time-left']." secs left\n";
            $pheanstalk->touch( $job );
        }
    }
    // pause between checks so the loop does not spin and flood the queue server
    sleep( 1 );
}
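The snippet above assumes `$pid` is already known. As a rough alternative sketch, not the method used above, the worker could start the parser with `proc_open()` instead of `exec()` and poll `proc_get_status()`, which avoids depending on `/proc`. `runWithHeartbeat` is a hypothetical helper name, and the callback is where the `statsJob()`/`touch()` check would go.

```php
<?php
// Sketch only: uses proc_open() rather than exec(), so the worker keeps a
// handle to the child process. runWithHeartbeat() is a hypothetical name.

/**
 * Run $command and call $heartbeat about once per second until it exits.
 * Returns the command's exit code.
 */
function runWithHeartbeat(string $command, callable $heartbeat): int
{
    $pipes = [];
    // An empty descriptor spec lets the child inherit the worker's stdio.
    $process = proc_open($command, [], $pipes);
    if (!is_resource($process)) {
        throw new RuntimeException("could not start: $command");
    }

    $status = proc_get_status($process);
    while ($status['running']) {
        // Give the caller a chance to extend the job's reservation.
        $heartbeat($status['pid']);
        sleep(1);
        $status = proc_get_status($process);
    }

    // The exit code is taken from the first status that reports "not running".
    $exitCode = $status['exitcode'];
    proc_close($process);
    return $exitCode;
}
```

In the worker, the callback would read `time-left` via `$pheanstalk->statsJob( $job )` and call `$pheanstalk->touch( $job )` when it drops below the threshold, exactly as in the loop above. The one-second pause keeps the polling well inside a five-second margin.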