A job for queues & events? CSV file import and processing

I need to add functionality allowing a user to upload a .csv file (+- 300 rows of data), then process the file ... the steps are:

  1. The user uploads a .csv file
  2. The .csv file is saved to S3
  3. Each row in the .csv file is validated and imported into a database table (+- 300 rows)
  4. Individual PDF reports are generated and saved to S3, one for each row of data imported into the database (i.e. +- 300 PDF reports).
  5. Zip all the PDFs into a single file and save it to S3.
  6. The user is notified when the job is done (and can download the zip).

In your experience, what would be the most efficient way to achieve this if using 3rd-party services (other than those provided by AWS) is out of the question?

I'm leaning towards ... after the .csv file is uploaded and saved to S3, queuing all the other tasks and sending the user an email when the job is fully completed. How should I handle all of this? The user should just be able to upload the .csv file to start the process, and then be notified when the job is done.

Feedback greatly appreciated.

Thanks.

AWS has some great tooling for this type of situation. To start with, use an S3 bucket trigger with AWS Lambda, which runs your code as soon as the file is uploaded. From there you can either do all of the work in that one function or use SQS (or another messaging service) to fan out the per-row work and process it in parallel. A separate process can then bundle the generated PDFs into a single zip file, and an email, SMS, or push notification can be sent using SNS or SES. A rough sketch of that pipeline is below.
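Here is a minimal sketch of that shape, assuming Python Lambdas and boto3. The bucket and queue names, the `userEmail` field, and the `is_valid()`/`render_pdf()` helpers are all placeholders you would swap for your real validation, database import, and PDF generation:

```python
"""Sketch of the S3 trigger -> Lambda -> SQS fan-out pipeline.

Bucket, queue, and field names are placeholders; is_valid() and
render_pdf() stand in for your real validation/DB import and PDF code.
"""
import csv
import io
import json
import zipfile

import boto3

s3 = boto3.client("s3")
sqs = boto3.client("sqs")
ses = boto3.client("ses")

QUEUE_URL = "https://sqs.us-east-1.amazonaws.com/123456789012/pdf-rows"  # placeholder
BUCKET = "report-pipeline"  # placeholder


def is_valid(row):
    # Placeholder: your real per-row validation + database insert go here.
    return bool(row.get("id"))


def render_pdf(row):
    # Placeholder: swap in a real PDF library (e.g. reportlab).
    return b"%PDF-1.4 stub for row " + row["id"].encode()


def on_csv_uploaded(event, context):
    """Fired by the S3 bucket trigger: validate each row, fan out to SQS."""
    rec = event["Records"][0]["s3"]
    body = s3.get_object(Bucket=rec["bucket"]["name"],
                         Key=rec["object"]["key"])["Body"].read().decode("utf-8")
    rows = [r for r in csv.DictReader(io.StringIO(body)) if is_valid(r)]

    # SQS accepts at most 10 messages per SendMessageBatch call.
    for i in range(0, len(rows), 10):
        sqs.send_message_batch(
            QueueUrl=QUEUE_URL,
            Entries=[{"Id": str(i + j), "MessageBody": json.dumps(row)}
                     for j, row in enumerate(rows[i:i + 10])])


def on_row_message(event, context):
    """SQS-triggered worker: render one PDF per row and save it to S3."""
    for record in event["Records"]:
        row = json.loads(record["body"])
        s3.put_object(Bucket=BUCKET, Key=f"reports/{row['id']}.pdf",
                      Body=render_pdf(row), ContentType="application/pdf")


def on_job_complete(event, context):
    """Runs once every row has been processed: zip the PDFs, email the user."""
    buf = io.BytesIO()
    with zipfile.ZipFile(buf, "w", zipfile.ZIP_DEFLATED) as zf:
        pages = s3.get_paginator("list_objects_v2").paginate(
            Bucket=BUCKET, Prefix="reports/")
        for page in pages:
            for obj in page.get("Contents", []):
                pdf = s3.get_object(Bucket=BUCKET, Key=obj["Key"])["Body"].read()
                zf.writestr(obj["Key"].rsplit("/", 1)[-1], pdf)

    s3.put_object(Bucket=BUCKET, Key="reports.zip", Body=buf.getvalue())
    url = s3.generate_presigned_url(
        "get_object",
        Params={"Bucket": BUCKET, "Key": "reports.zip"},
        ExpiresIn=86400)  # download link valid for 24 hours
    ses.send_email(
        Source="reports@example.com",  # must be an SES-verified sender
        Destination={"ToAddresses": [event["userEmail"]]},  # placeholder field
        Message={"Subject": {"Data": "Your reports are ready"},
                 "Body": {"Text": {"Data": "Download your zip: " + url}}})
```

The open question in this design is knowing when all the rows are finished so the zip step can run (e.g. by keeping a counter in DynamoDB), which is exactly the kind of coordination the orchestration services below handle for you.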

Both AWS SWF and Step Functions could be used to orchestrate all of the steps and the overall workflow.
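If you go the Step Functions route, a Map state replaces the hand-rolled SQS fan-out and also answers "are all the rows done yet?", because the zip step simply runs after the Map state completes. A sketch of what the state machine definition might look like, with all ARNs and function names as placeholders:

```python
import json

import boto3

# Hypothetical state machine: import the rows, fan the PDF rendering out
# with a Map state, then zip and notify. Every ARN here is a placeholder.
definition = {
    "StartAt": "ValidateAndImport",
    "States": {
        "ValidateAndImport": {
            "Type": "Task",
            "Resource": "arn:aws:lambda:us-east-1:123456789012:function:import-csv",
            "Next": "GeneratePdfs",
        },
        "GeneratePdfs": {
            "Type": "Map",          # one iteration per imported row
            "ItemsPath": "$.rows",
            "MaxConcurrency": 50,
            "Iterator": {
                "StartAt": "RenderPdf",
                "States": {
                    "RenderPdf": {
                        "Type": "Task",
                        "Resource": "arn:aws:lambda:us-east-1:123456789012:function:render-pdf",
                        "End": True,
                    }
                },
            },
            "Next": "ZipAndNotify",
        },
        "ZipAndNotify": {
            "Type": "Task",
            "Resource": "arn:aws:lambda:us-east-1:123456789012:function:zip-and-notify",
            "End": True,
        },
    },
}

boto3.client("stepfunctions").create_state_machine(
    name="csv-report-pipeline",
    definition=json.dumps(definition),
    roleArn="arn:aws:iam::123456789012:role/step-functions-role",  # placeholder
)
```

MaxConcurrency caps how many render-pdf Lambdas run at once; at +- 300 rows it hardly matters, but it keeps the design safe if the files grow.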