Scenario: 1/ A writer produces a new version of a CSV file every few hours. 2/ A reader, in another application, reads the contents of that CSV file.
We want to make sure the reader is not affected if the writer publishes a new version while the reader is still reading the latest one. The reader should be able to finish reading the version it started with. The next time the reader requests the file, it should get the new version.
Can I use AWS S3 versioning to achieve this? Is it possible to do this at a "directory" level in S3? Is it possible to stream the file contents during the read instead of downloading the full file before processing it?
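Yes, versioning covers this. With versioning enabled on the bucket, every overwrite of the CSV creates a new version with its own version ID, and a plain GET returns the latest one. Updates to a single key are atomic, so a GET that races with the writer's PUT returns either the old data or the new data, never a mix. If the reader makes more than one request per pass over the file (ranged GETs, retries), take the VersionId from the first response and pass it on the later requests so they all read the same version. When the reader next asks without a VersionId, it gets the newest version.
There is no directory-level versioning. S3 has no real directories, only key prefixes, and although versioning is switched on for the whole bucket, versions are tracked per object. If several files need to change together, a common workaround is to write each batch under a new prefix (a timestamp, for example) and have the reader look up the current prefix from a small pointer or manifest object.
Streaming is possible: get_object returns the body as a stream, so you can process it line by line without downloading the whole file first. S3 Select can run SQL over a CSV and return only the rows or columns you need, but AWS has stopped offering it to new customers, so check whether your account can still use it.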
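To make the version pinning and streaming concrete, here is a minimal sketch in Python with boto3. The helper name open_csv and the bucket/key names in the usage note are made up for illustration, and it assumes versioning is already enabled on the bucket and the file is UTF-8.

```python
import csv


def open_csv(s3, bucket, key, version_id=None):
    """Start streaming a CSV object and report which version is being read.

    `s3` is a boto3 S3 client. Without `version_id` the latest version is
    fetched; with it, that exact version is fetched (useful for retries).
    """
    params = {"Bucket": bucket, "Key": key}
    if version_id is not None:
        params["VersionId"] = version_id
    resp = s3.get_object(**params)

    # The body is a stream: iter_lines() yields one line of bytes at a time,
    # so the whole file is never held in memory.
    # Caveat: splitting on lines like this is not safe for CSV files that
    # contain newlines inside quoted fields.
    lines = (line.decode("utf-8") for line in resp["Body"].iter_lines())

    # VersionId is only present in the response when versioning is enabled.
    return resp.get("VersionId"), csv.reader(lines)
```

Usage would look roughly like this: create the client with boto3.client("s3"), call open_csv(s3, "my-bucket", "data/report.csv"), keep the returned version ID, and loop over the rows. If the connection drops halfway, call open_csv again with that same version_id so the retry reads the version you started with rather than whatever the writer has published since.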