Is it better to put multiple files into one background job, or one file per job?

A few months ago I asked a question about the implementation of my API for processing files. It uses PHP, a command-line script that is called from PHP, and a queue. For the queue I am using beanstalkd.

The API accepts one file or a group of files (up to 5) per request. Processing one file takes 1-3 seconds, depending on its size.

My question now is whether it is better to put every file of a request into a separate job, or all of the files into one job. My processing function, which is the slow part, accepts one or multiple files. My guess is that if I put all the files of a request into one job, they will be processed by a single worker. But if I put every file into a separate background job, each one will probably be processed by its own worker, so 4 files would use 4 workers. That is what I think, but I am not sure it is correct.
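The guess above can be checked with a small back-of-the-envelope model. This is only a sketch under stated assumptions: idle workers take the next job from a shared queue in order, and every file takes a made-up 2 seconds. The function name `makespan` and the numbers are illustrative and are not part of the API described here.

```python
import heapq


def makespan(job_durations, worker_count):
    """Time until the last job finishes when idle workers pull jobs in order."""
    # free_at holds the time at which each worker next becomes idle.
    free_at = [0.0] * worker_count
    heapq.heapify(free_at)
    finish = 0.0
    for duration in job_durations:
        start = heapq.heappop(free_at)  # earliest available worker takes the job
        end = start + duration
        finish = max(finish, end)
        heapq.heappush(free_at, end)
    return finish


# Hypothetical request: 4 files, 2 seconds each.
file_seconds = [2.0, 2.0, 2.0, 2.0]

# One job containing all files: a single worker does everything.
one_job = makespan([sum(file_seconds)], worker_count=4)

# One job per file: up to 4 workers run at the same time.
per_file = makespan(file_seconds, worker_count=4)

print(one_job, per_file)
```

Under this model the split only helps when workers are actually idle. With a single worker, or when every worker is already busy with other requests, both layouts take about the same total time.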

So if my conclusion above is correct: when there are a lot of requests, is it better to process all the files of a request at once in one job, or to hand each file to a separate worker?

Thank you.

To handle more users, or more throughput per second, you need to ensure several things:

  • have more than one worker; a common starting point is to scale to 10 workers from the start
  • this way you have 10 parallel workers
  • put 10 different messages into the queue so that each worker picks up a job to tackle
  • monitor the queue, and if jobs keep accumulating, add more workers
  • monitor the machine's CPU and RAM, and if it starts to throttle at around 80% CPU, consider adding another machine that consumes jobs from the same queue
  • you could have different machines for different needs (SSD for fast I/O, high-end CPU for quick jobs, lower-end machines for transactional states, etc.)
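The points above can be sketched roughly as follows. This is a minimal illustration, not the beanstalkd setup itself: Python's in-process `queue.Queue` and threads stand in for the real queue and worker processes, and `process_file`, `run_pool`, and `scaling_hint` are hypothetical names made up for the example. The 80% CPU limit comes from the list above; the "queue depth keeps growing" rule is an assumption about how one might read the monitoring data.

```python
import queue
import threading


def process_file(name):
    """Hypothetical stand-in for the slow per-file processing step."""
    return name.upper()


def worker(jobs, results):
    """Take one job at a time from the shared queue until a sentinel arrives."""
    while True:
        name = jobs.get()
        if name is None:  # sentinel: no more work for this worker
            jobs.task_done()
            break
        results.append((threading.current_thread().name, process_file(name)))
        jobs.task_done()


def run_pool(file_names, worker_count=10):
    """Put one message per file on the queue and let the workers share them."""
    jobs = queue.Queue()
    results = []
    threads = [
        threading.Thread(target=worker, args=(jobs, results), name=f"worker-{i}")
        for i in range(worker_count)
    ]
    for thread in threads:
        thread.start()
    for name in file_names:
        jobs.put(name)
    for _ in threads:
        jobs.put(None)  # one sentinel per worker
    jobs.join()
    for thread in threads:
        thread.join()
    return results


def scaling_hint(queue_depths, cpu_percent, cpu_limit=80.0):
    """Suggest an action from recent queue-depth samples and current CPU load."""
    if cpu_percent >= cpu_limit:
        return "add_machine"  # this machine is saturated; more workers here won't help
    growing = len(queue_depths) > 1 and all(
        later > earlier for earlier, later in zip(queue_depths, queue_depths[1:])
    )
    return "add_workers" if growing else "ok"


results = run_pool([f"file{i}.txt" for i in range(10)])
print(len(results), scaling_hint([5, 9, 14], cpu_percent=40.0))
```

With a real queue server the workers would be separate processes, possibly on several machines, all reserving jobs from the same queue; the one-message-per-file pattern stays the same.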