I have several gigabytes of documents which I wish to store online somewhere in a system I can access from multiple servers using HTTP requests. These are mostly 5-200kB text documents (very few in binary formats) that are not read very often, and need to be stored in a way that all servers can access them. Cost is a big factor.
These documents do not have additional attributes, so if the files were larger I would use S3 for sure, but since they are so small I'm not sure which service would be easier to work with.
Has anyone used either of these services for this type of thing?
S3 has the massive advantage that you can access files very simply over HTTP, and it supports REST operations for creating/updating/deleting files. This makes it incredibly easy to talk to.
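To illustrate how thin that HTTP layer is, here is a minimal sketch that builds (but does not send) the three kinds of request using only the Python standard library. The bucket name `example-docs` and the key are made up for illustration. A real request to a private bucket would also need an authentication signature, which is left out here.

```python
import urllib.parse
import urllib.request

# Virtual-hosted-style endpoint; bucket and key names below are hypothetical.
S3_ENDPOINT = "s3.amazonaws.com"


def object_url(bucket, key):
    """Return the HTTPS URL of an object, percent-encoding the key."""
    return f"https://{bucket}.{S3_ENDPOINT}/{urllib.parse.quote(key)}"


def build_request(method, bucket, key, body=None):
    """Build an unsigned PUT/GET/DELETE request for one object.

    Assumption: signing is omitted, so this only works as-is against
    publicly readable or writable objects.
    """
    headers = {}
    if body is not None:
        headers["Content-Type"] = "text/plain; charset=utf-8"
    return urllib.request.Request(
        object_url(bucket, key), data=body, headers=headers, method=method
    )


# One request per operation; nothing is sent over the network here.
put_req = build_request("PUT", "example-docs", "reports/2009/q1 summary.txt", b"hello")
get_req = build_request("GET", "example-docs", "reports/2009/q1 summary.txt")
delete_req = build_request("DELETE", "example-docs", "reports/2009/q1 summary.txt")
```

Passing any of these to `urllib.request.urlopen()` would actually perform the call. In practice most people let an SDK handle the signing rather than doing it by hand.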
Combine that with the fact that S3 bills you for storage, requests and data transfer but has no machine-hour charge (with SimpleDB, you also pay for machine hours), and I'd say S3 is the best solution in this case.
I'm pretty sure the maximum row (item) size in SimpleDB is lower than 200 kB, so you'd have to use S3.
I currently have a document management system running across several servers which stores all documents in S3. The documents/files range from 1 KB to 2 GB. So far I've found S3 to be brilliant: it is very easy to communicate with from almost any language, and as a bonus it offers AES encryption.
If storage size and cost are your major deciding factors, note that SimpleDB's sizing model includes indexes and therefore costs more than S3 byte for byte. With storage starting at $0.25 to $0.34 per GB-month, it is a waste to load up SimpleDB with data and not use complex queries.
The attribute value size limit of 1 KB may affect your design and require you to chunk values. Great for JavaScript, terrible for HTML.
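As a rough idea of what that chunking involves, here is a minimal sketch that splits a document into pieces of at most 1024 UTF-8 bytes and stores them under numbered attribute names. The `body_NNNN` naming scheme and the function names are assumptions for illustration, not anything SimpleDB prescribes.

```python
LIMIT_BYTES = 1024  # per-value limit mentioned above


def chunk_text(text, limit=LIMIT_BYTES):
    """Split text into pieces whose UTF-8 encoding fits within `limit` bytes.

    Splitting happens between characters, so every chunk is a valid string.
    """
    chunks, current, size = [], [], 0
    for ch in text:
        n = len(ch.encode("utf-8"))
        if size + n > limit:
            chunks.append("".join(current))
            current, size = [], 0
        current.append(ch)
        size += n
    if current:
        chunks.append("".join(current))
    return chunks


def to_attributes(text):
    """Map chunks to zero-padded names so sorting by name restores the order."""
    return {f"body_{i:04d}": c for i, c in enumerate(chunk_text(text))}


def from_attributes(attrs):
    """Reassemble the document from its chunk attributes."""
    return "".join(attrs[name] for name in sorted(attrs))
```

A 200 kB document ends up as roughly 200 attributes on this scheme, which is a lot of bookkeeping compared with a single S3 object.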
SimpleDB is fantastic as an index to your content hosted on S3 and pre/post processed on EC2.
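A minimal sketch of that split, assuming the document body lives in S3 and SimpleDB holds only a small, queryable record pointing at it. The attribute names, bucket and key are hypothetical. Values are kept as strings, and the size is zero-padded because SimpleDB compares values lexicographically.

```python
LIMIT_BYTES = 1024  # per-value limit mentioned above


def make_index_item(bucket, key, body, title):
    """Build the attribute set for one index record pointing at an S3 object."""
    attrs = {
        "s3_bucket": bucket,
        "s3_key": key,
        "title": title,
        # Zero-padded so string comparison matches numeric order.
        "size_bytes": f"{len(body):010d}",
    }
    for name, value in attrs.items():
        if len(value.encode("utf-8")) > LIMIT_BYTES:
            raise ValueError(f"attribute {name!r} exceeds {LIMIT_BYTES} bytes")
    return attrs


item = make_index_item("example-docs", "reports/2009/q1.txt", b"x" * 5000, "Q1 report")
```

The body itself never touches SimpleDB, so its size limits stop mattering and you pay the higher per-GB rate only for the metadata.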