I want to fetch only files I haven't fetched previously from an S3 Bucket. I also need their file names so I know which files to handle in each iteration.
I've decided I don't want to implement a queue listener for now, so using bucket notifications isn't what I want.
I've considered using the downloadBucket API with "debug" turned on, and then parsing the results to find the files that were downloaded.
Does anyone know a better way, or if turning on debug impacts performance?
I'm using laravel/php to implement this.
S3's list-objects API can filter by prefix, but it has no way to filter out keys you've already handled. The best thing you can do is move each processed file to a new bucket or prefix, so you don't need to track which keys are "processed".
So you can do this (algorithm, not code):
- store each new key under the new/ prefix
- list everything under the new/ prefix
- process each key
- copy the key to the processed/ prefix
- delete the key from the new/ prefix
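The steps above can be sketched in Laravel with the Storage facade. This is only a sketch, assuming an "s3" disk is configured in config/filesystems.php; processFile() is a hypothetical placeholder for your own handling logic:

```php
<?php

use Illuminate\Support\Facades\Storage;

$disk = Storage::disk('s3');

// List every object under the new/ prefix -- these are the
// files that have not been processed yet.
foreach ($disk->files('new') as $key) {
    $contents = $disk->get($key);

    // Hypothetical placeholder: replace with your own handling.
    processFile($key, $contents);

    // move() performs the copy-then-delete from the algorithm above,
    // so the key disappears from new/ and reappears under processed/.
    $disk->move($key, 'processed/' . basename($key));
}
```

Because the listing only ever looks at the new/ prefix, each iteration sees exactly the files that still need handling, and you get the file names for free from the listing.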
Note: every 1,000 PUT/COPY/POST/LIST requests costs about $0.005 (half a cent), so use these operations sparingly.