Just two days ago, Amazon announced the general availability of Amazon S3 Batch Operations, a storage management feature that makes it easier to process millions of S3 objects. The automated feature was first previewed at AWS re:Invent 2018. Users can now set tags or access control lists (ACLs), copy objects to another bucket, initiate a restore from Glacier, or invoke an AWS Lambda function on each object.
Developers and IT administrators can now change object properties and metadata, and execute other storage management tasks, with a single API request. For example, S3 Batch Operations lets customers replace object tags, change access controls, add object retention dates, copy objects from one bucket to another, and even trigger Lambda functions against existing objects stored in S3.
S3’s existing support for inventory reports is used to drive the batch operations. With Batch Operations, users no longer need to write code, set up server fleets, or figure out how to partition the work and distribute it to a fleet. A job can be created in minutes with a couple of clicks, and S3 uses massive, behind-the-scenes parallelism to manage it. Users can create, monitor, and manage their batch jobs using the S3 CLI, the S3 Console, or the S3 APIs.
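As a rough illustration of the API route, the sketch below assembles the parameters for a job that replaces object tags, in the shape accepted by the `create_job` call on the `s3control` client in boto3 (the AWS SDK for Python). The account ID, bucket names, ETag, and IAM role ARN are placeholders, not real values.

```python
def build_tagging_job_request(account_id, manifest_arn, manifest_etag,
                              role_arn, report_bucket_arn):
    """Assemble create_job parameters for a batch job that replaces
    object tags on every object listed in the manifest."""
    return {
        "AccountId": account_id,
        "ConfirmationRequired": True,          # job waits for confirmation before running
        "Operation": {
            "S3PutObjectTagging": {            # the batch action to apply per object
                "TagSet": [{"Key": "processed", "Value": "true"}],
            }
        },
        "Manifest": {
            "Spec": {
                "Format": "S3BatchOperations_CSV_20180820",
                "Fields": ["Bucket", "Key"],   # columns present in the CSV manifest
            },
            "Location": {
                "ObjectArn": manifest_arn,     # where the manifest object lives
                "ETag": manifest_etag,         # ETag of that manifest object
            },
        },
        "Report": {                            # optional completion report
            "Bucket": report_bucket_arn,
            "Format": "Report_CSV_20180820",
            "Enabled": True,
            "Prefix": "batch-reports",
            "ReportScope": "AllTasks",
        },
        "Priority": 10,
        "RoleArn": role_arn,                   # IAM role S3 assumes for the job
    }

request = build_tagging_job_request(
    account_id="123456789012",
    manifest_arn="arn:aws:s3:::example-bucket/manifest.csv",
    manifest_etag="60e460c9d1046e73f7dde5043ac3ae85",
    role_arn="arn:aws:iam::123456789012:role/BatchOperationsRole",
    report_bucket_arn="arn:aws:s3:::example-report-bucket",
)

# Submitting the job requires valid AWS credentials and boto3 installed:
# import boto3
# client = boto3.client("s3control")
# response = client.create_job(**request)
# print(response["JobId"])
```

The same job could equally be created from the S3 Console or with the `aws s3control create-job` CLI command; the parameter structure is the same.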
Important terminologies for batch operations
S3 Inventory report
An S3 inventory report is generated each time a daily or weekly bucket inventory runs. A report can be configured to include all of the objects in a bucket, or to focus on a prefix-delimited subset.
A manifest is an inventory report or a file in CSV format that identifies the objects to be processed in the batch job.
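In its simple form, a manifest is just a CSV file with one bucket-and-key row per object to process. The snippet below builds such a file in memory; the bucket and key names are illustrative.

```python
import csv
import io

# Objects the batch job should process: (bucket, key) pairs.
# These names are made up for illustration.
objects = [
    ("example-bucket", "photos/2019/img-001.jpg"),
    ("example-bucket", "photos/2019/img-002.jpg"),
    ("example-bucket", "photos/2019/img-003.jpg"),
]

# Write one "bucket,key" row per object, the simple CSV manifest layout.
buf = io.StringIO()
writer = csv.writer(buf)
for bucket, key in objects:
    writer.writerow([bucket, key])

manifest_csv = buf.getvalue()
print(manifest_csv)
```

The finished file would then be uploaded to S3 and referenced by ARN (together with its ETag) when the job is created.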
A batch action is the desired action to perform on the objects described by a manifest.
An IAM role grants S3 permission to read the objects listed in the inventory report, perform the desired actions, and write the optional completion report.
A batch job references all of the elements above. Each job has a status and a priority; numerically higher-priority jobs take precedence over lower-priority ones.
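The priority rule can be pictured as a simple descending sort over pending jobs; the job IDs below are made up. Raising a live job's priority goes through the `update_job_priority` call on the boto3 `s3control` client (shown commented out, since it needs AWS credentials).

```python
# Hypothetical pending jobs with their priorities.
jobs = [
    {"JobId": "job-a", "Priority": 5},
    {"JobId": "job-b", "Priority": 100},
    {"JobId": "job-c", "Priority": 10},
]

# Numerically higher priority takes precedence, so sort descending.
ordered = sorted(jobs, key=lambda j: j["Priority"], reverse=True)
print([j["JobId"] for j in ordered])  # job-b runs ahead of the others

# Bumping a live job's priority with boto3 (requires credentials):
# import boto3
# boto3.client("s3control").update_job_priority(
#     AccountId="123456789012", JobId="job-b", Priority=100)
```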
Many users welcomed the news, expecting it to improve the performance of their projects. One user commented on HackerNews, “This S3 request rate performance increase removes any previous guidance to randomize object prefixes to achieve faster performance. That means you can now use logical or sequential naming patterns in S3 object naming without any performance implications.”
To learn more, check out Amazon’s blog post.