Amazon S3 vs Amazon Glacier

by Mike Sweetman
17 November 2020

The amount of data stored in cloud systems grows every day.

Some of the most important cloud file storage services are Simple Storage Service (S3) and Glacier from Amazon.

S3 lets you store and retrieve any amount of data, at any time, from anywhere on the network, which is essentially file hosting. It provides object storage: each object holds not only a file but also metadata about that file, which can be used when processing the data.
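As a rough illustration of "a file plus its metadata", the sketch below assembles the arguments for a single object upload. The bucket name, key, and metadata fields are hypothetical. The dictionary follows the keyword arguments accepted by boto3's `put_object` call, but the sketch only builds the dictionary and sends nothing to AWS.

```python
def build_put_object_args(bucket, key, body, metadata):
    """Assemble keyword arguments for an S3 object upload.

    The result could be passed to boto3 as s3.put_object(**args);
    this sketch never contacts AWS.
    """
    return {
        "Bucket": bucket,  # hypothetical bucket name
        "Key": key,        # object name inside the bucket
        "Body": body,      # the file contents as bytes
        # User-defined metadata is stored as string keys and string values.
        "Metadata": {str(k): str(v) for k, v in metadata.items()},
    }


args = build_put_object_args(
    bucket="example-reports",
    key="2020/11/sales.csv",
    body=b"region,total\nEU,1200\n",
    metadata={"department": "sales", "retention-years": 7},
)
print(args["Metadata"])
```

The metadata travels with the object, so a later processing step can read it without opening the file itself.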

Amazon S3’s scalability, reliability, and speed give you an inexpensive storage infrastructure. S3 first launched in March 2006 in the United States and in November 2007 in Europe.

Amazon S3 offers several storage classes for different use cases. These include S3 Standard, general-purpose storage for frequently accessed data; S3 Intelligent-Tiering for data with unknown or changing access patterns; and S3 Standard-Infrequent Access (S3 Standard-IA) and S3 One Zone-Infrequent Access (S3 One Zone-IA) for long-lived data that is accessed less often.

Also, there’s Amazon S3 Glacier (S3 Glacier) and Amazon S3 Glacier Deep Archive (S3 Glacier Deep Archive) for long-term storage and digital archiving of data.

In this article, we’ll give you an overview of working with Amazon S3 Standard (hereinafter S3) and Amazon S3 Glacier (hereinafter S3 Glacier). We’ll explore when it’s necessary to use S3 and when it’s better to move to S3 Glacier.

What is S3

Amazon S3 is a convenient, cloud-based object storage service. S3 offers industry-leading performance, scalability, availability, speed of access, and data security.

S3 can be used to store almost any amount of data in a variety of scenarios. Oftentimes, the storage service is used to support the operation of static web sites, mobile applications, backup and recovery, archiving, corporate applications, IoT device-generated data, application log files, and big data analysis.

Amazon S3 also offers easy-to-use administration tools. These tools, accessible through the web console, command line, or API, allow you to organize data and fine-tune access restrictions to meet project or regulatory needs.

Amazon S3 is designed for 99.999999999% (11 nines) data durability and now stores data for millions of applications.
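To get a feel for what eleven nines means, the arithmetic below reads the 99.999999999% figure as the annual probability that a single object survives (an assumption about how to interpret the number) and computes the expected losses for a hypothetical store of ten million objects. Exact fractions are used to avoid floating-point rounding.

```python
from fractions import Fraction

# Assumption: 99.999999999% is interpreted as annual per-object durability.
DURABILITY = Fraction(99_999_999_999, 100_000_000_000)


def expected_annual_losses(object_count):
    """Expected number of objects lost per year for a given object count."""
    return object_count * (1 - DURABILITY)


losses = expected_annual_losses(10_000_000)  # hypothetical object count
years_per_loss = 1 / losses

print(f"Expected losses per year: {float(losses)}")
print(f"Roughly one object lost every {int(years_per_loss)} years")
```

Under this reading, ten million objects would lose on average one object every 10,000 years.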

Image: S3 (image source)

What is Amazon Glacier

Amazon S3 Glacier is a highly affordable storage service. S3 Glacier (formerly known as Amazon Glacier) provides secure, highly reliable storage for archiving and backing up data. 

With Amazon S3 Glacier, customers can securely store data for as little as $0.004 per GB per month. Note that with S3 Glacier you are also charged for retrieving data from the cloud.
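A quick calculation shows what the quoted storage price means in practice. The sketch below uses the $0.004 per GB per month figure from the text; the 10 TB archive size is a made-up example, and retrieval and request charges are deliberately left out.

```python
# Storage price quoted in the article, in USD per GB per month.
GLACIER_STORAGE_USD_PER_GB_MONTH = 0.004


def monthly_storage_cost(size_gb, price_per_gb_month=GLACIER_STORAGE_USD_PER_GB_MONTH):
    """Return the monthly storage charge, excluding retrieval and request fees."""
    return size_gb * price_per_gb_month


cost = monthly_storage_cost(10_000)  # hypothetical 10 TB archive, as 10,000 GB
print(f"{cost:.2f} USD per month")   # 40.00 USD per month
```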

Amazon S3 Glacier enables customers to offload the administrative burden of managing and scaling archive storage to AWS. This eliminates the need for resource planning, provisioning the necessary on-premises equipment, data replication, detecting and eliminating hardware failures, and significant labor costs for moving equipment.

Data is stored in Amazon S3 Glacier as archives. At the moment, the maximum size of one archive is 40 terabytes. Amazon S3 Glacier can store an unlimited number of archives and unlimited data.

Each archive is assigned a unique identifier when it is created. The contents of an archive are immutable: once created, the archive cannot be updated.

Archives are grouped into vaults. You can set an access policy on each vault to allow or deny specific actions for specific users. You can create up to 1,000 vaults per AWS Region in each AWS account.
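Access policies of this kind are JSON documents attached to the container (a vault, in AWS terminology). The sketch below builds one that denies archive deletion for everyone. The account ID, region, and vault name in the ARN are hypothetical, and the policy is only an illustration of the format, not a recommended configuration.

```python
import json


def deny_delete_policy(vault_arn):
    """Build a vault access policy that forbids deleting archives."""
    return {
        "Version": "2012-10-17",
        "Statement": [
            {
                "Sid": "deny-archive-deletion",
                "Effect": "Deny",
                "Principal": "*",
                "Action": "glacier:DeleteArchive",
                "Resource": vault_arn,
            }
        ],
    }


# Hypothetical vault ARN used only for illustration.
policy = deny_delete_policy(
    "arn:aws:glacier:us-west-2:123456789012:vaults/example-vault"
)
print(json.dumps(policy, indent=2))
```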

The following image shows the prices for data storage in Amazon S3 Glacier. The prices are current as of the time of writing. The final price includes the price for storage, the price for data retrieval, and the price for retrieval requests. Note also that prices can vary between regions.


Image: Amazon S3 Glacier pricing (image source)

When to replace S3 with Amazon Glacier

In what cases does it become necessary to switch from regular S3 to S3 Glacier?

Let's see when it’s appropriate to make this transition:

  • When data is accumulated but quick access to it is not required.
  • When organizing an archive.
  • When organizing a backup.
  • When you store large amounts of data and need to cut costs: S3 Glacier storage is much cheaper.

Amazon S3 Glacier provides three archive retrieval options (also known as retrieval tiers) to meet different access time and cost requirements: Expedited, Standard, and Bulk.

  • Expedited retrieval, where archives are available in 1–5 minutes.
  • Standard retrieval, where archives are available in 3–5 hours.
  • Bulk retrieval for cost-effective access to large amounts of data (up to a few petabytes), which costs $0.0025 per GB and typically completes within 5–12 hours.
  • The retrieval cost differs from tier to tier.
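The tiers above map onto a single parameter when a restore is requested through the S3 API. The sketch below builds the `RestoreRequest` structure that boto3's `restore_object` call accepts and estimates a bulk retrieval charge from the $0.0025 per GB figure in the text. The number of days and the data size are hypothetical, and no request is sent to AWS.

```python
# Retrieval tiers and the waiting times quoted in the article.
RETRIEVAL_TIERS = {
    "Expedited": "1-5 minutes",
    "Standard": "3-5 hours",
    "Bulk": "lowest cost, intended for large volumes",
}

BULK_USD_PER_GB = 0.0025  # bulk retrieval price quoted in the article


def build_restore_request(tier, days_available):
    """Build the RestoreRequest argument for s3.restore_object(...)."""
    if tier not in RETRIEVAL_TIERS:
        raise ValueError(f"unknown retrieval tier: {tier}")
    return {
        "Days": days_available,  # how long the restored copy stays readable
        "GlacierJobParameters": {"Tier": tier},
    }


def bulk_retrieval_cost(size_gb):
    """Estimate the per-GB charge for a bulk retrieval (request fees excluded)."""
    return size_gb * BULK_USD_PER_GB


request = build_restore_request("Bulk", days_available=7)
print(request)
print(f"{bulk_retrieval_cost(2_000):.2f} USD")  # 5.00 USD
```

With boto3 this dictionary would be passed as `RestoreRequest=request` together with the bucket and key of the archived object.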

Amazon S3 Glacier Select lets you query data stored in Amazon S3 Glacier without retrieving the entire archive, so you can find and process only the data you need.

What are the steps to take to transition to Amazon S3 Glacier?

  • Decide how much data you are going to work with.
  • Determine how often you need to retrieve data from the backup.
  • Decide how long you can afford to wait until your backup is ready.
  • Think about whether you need to receive data through the API.

Based on this, you can calculate whether you should switch from standard S3 to Amazon S3 Glacier and what technical characteristics will be critical for your work.
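The calculation described here can be sketched as a small comparison. Only the $0.004 S3 Glacier storage price and the 3–5 hour Standard retrieval window come from the text. The S3 Standard storage price and the per-GB retrieval price are assumptions chosen for illustration, and request fees and minimum storage durations are ignored, so treat the result as a first estimate rather than a quote.

```python
from dataclasses import dataclass

GLACIER_STORAGE_USD = 0.004     # per GB-month, quoted in the article
S3_STANDARD_STORAGE_USD = 0.023 # per GB-month, ASSUMED for illustration
GLACIER_RETRIEVAL_USD = 0.01    # per GB retrieved, ASSUMED for illustration
STANDARD_TIER_WAIT_HOURS = 5    # upper end of the 3-5 hour Standard tier


@dataclass
class Workload:
    size_gb: float                 # how much data you store
    retrieved_gb_per_month: float  # how much you read back each month
    max_wait_hours: float          # how long you can wait for a restore


def monthly_costs(w):
    """Return (s3_cost, glacier_cost) in USD per month."""
    s3 = w.size_gb * S3_STANDARD_STORAGE_USD
    glacier = (w.size_gb * GLACIER_STORAGE_USD
               + w.retrieved_gb_per_month * GLACIER_RETRIEVAL_USD)
    return s3, glacier


def should_move_to_glacier(w):
    """Move only if Glacier is cheaper and the Standard tier wait is acceptable."""
    s3, glacier = monthly_costs(w)
    return glacier < s3 and w.max_wait_hours >= STANDARD_TIER_WAIT_HOURS


archive = Workload(size_gb=5_000, retrieved_gb_per_month=100, max_wait_hours=24)
print(monthly_costs(archive), should_move_to_glacier(archive))
```

A workload that must be readable within minutes, or that reads back far more data than it stores, fails the check even though the storage price alone favors S3 Glacier.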

In real projects, switching back from S3 Glacier to S3 happens extremely rarely, if ever. When archived data is needed urgently, it can still be retrieved from S3 Glacier within minutes using the Expedited tier. In addition, S3 Glacier can now be used through the same API as S3.
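In practice, "the same API" means that the storage class is just one more parameter of an ordinary S3 upload. The sketch below builds upload arguments in the shape boto3's `put_object` accepts and switches between the `STANDARD` and `GLACIER` storage classes. The bucket and key names are hypothetical, and nothing is sent to AWS.

```python
def upload_args(bucket, key, body, archive=False):
    """Build put_object arguments, choosing the storage class per object."""
    return {
        "Bucket": bucket,
        "Key": key,
        "Body": body,
        # Same call for both services: only the storage class differs.
        "StorageClass": "GLACIER" if archive else "STANDARD",
    }


hot = upload_args("example-data", "reports/current.csv", b"...")
cold = upload_args("example-data", "reports/2015.csv", b"...", archive=True)
print(hot["StorageClass"], cold["StorageClass"])  # STANDARD GLACIER
```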

                            S3                 S3 Glacier
Access speed                High               Low
API availability            Yes                Yes
Data storage cost           Relatively high    Very low
Maximum object size         Up to 5 TB         Up to 40 TB
Regional availability       Many regions       Average
Static web content          Yes                No
Versioning support          Yes                No
Number of data archives     Unlimited          Unlimited (up to 1,000 vaults per account per Region)


Customer Examples

For our AWS customers, S3 and Glacier are common components of the solutions we create. The use cases cover a wide range, take advantage of a broad set of S3 features, and often integrate with other AWS services. Some examples include:

  • Static website publishing: S3 Standard can be configured to serve its public contents (HTML or other assets) via HTTP/S. This is a very efficient way to deliver a static website or even a dynamic web application built with JavaScript frameworks such as React, Angular, or Vue.js.
  • Static website “redirect” publishing: similar to the prior use case, the configuration of a bucket as a publicly accessible website does not need to include any files. When used in combination with CloudFront, the bucket can be used to redirect traffic from one subdomain to another. 
  • Private website content publishing: objects in S3 buckets can remain private and still be available via the web. For example, we provided a customer with a solution to share custom-created podcasts, videos, and documents only with website subscribers via a private S3 bucket. The content was served through CloudFront and accessed via signed URLs.
  • Medium-term file storage: Application logs can be stored in many different ways, but the data typically has a short lifetime due to its quickly diminishing usefulness. Based on this, we configured a series of S3 buckets to receive and store application logs from a customer’s distributed system for use in problem determination and resolution. Log data that was a few weeks old was rarely if ever needed. Instead of accumulating data, the bucket was configured with a Lifecycle Configuration to transfer files to infrequent access after 30 days and delete expired data after 6 months.
  • Redundant backup storage: The AWS Backup service and its options are reliable and comprehensive for AWS managed services such as databases, compute instances, and filesystems. However, we have found that the backup formats are not as accessible or friendly for quick restoration. In this case, we add a redundant backup: zip files of filesystems or SQL dump files are created and uploaded to S3 for storage. Because they are redundant, these files are not needed for long periods of time. The buckets are typically configured with the infrequent access storage class and a lifecycle rule that deletes data after at most 30 days.
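The lifecycle rules mentioned in the last two examples can be sketched as follows. The dictionaries use the structure that boto3's `put_bucket_lifecycle_configuration` call expects for its `LifecycleConfiguration` argument. The rule IDs are hypothetical, "6 months" is approximated as 180 days, and the empty prefix (apply to every object in the bucket) is an assumption about how the buckets were set up.

```python
def log_bucket_lifecycle():
    """Move logs to infrequent access after 30 days, delete after ~6 months."""
    return {
        "Rules": [
            {
                "ID": "logs-to-ia-then-expire",  # hypothetical rule name
                "Filter": {"Prefix": ""},        # assumed: applies to all objects
                "Status": "Enabled",
                "Transitions": [{"Days": 30, "StorageClass": "STANDARD_IA"}],
                "Expiration": {"Days": 180},     # 6 months approximated as 180 days
            }
        ]
    }


def backup_bucket_lifecycle(expire_after_days=30):
    """Delete redundant backups after at most 30 days."""
    if not 1 <= expire_after_days <= 30:
        raise ValueError("expiration must be between 1 and 30 days")
    return {
        "Rules": [
            {
                "ID": "expire-redundant-backups",  # hypothetical rule name
                "Filter": {"Prefix": ""},
                "Status": "Enabled",
                "Expiration": {"Days": expire_after_days},
            }
        ]
    }


logs = log_bucket_lifecycle()
backups = backup_bucket_lifecycle()
print(logs["Rules"][0]["Transitions"], backups["Rules"][0]["Expiration"])
```

Once a rule like this is attached to a bucket, S3 applies the transitions and deletions on its own, so no cleanup job has to be written or scheduled.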

Conclusion

S3 and S3 Glacier complement each other well. Building your own facilities to store and access data reliably and on time requires a massive investment in technology and staff. In addition, getting a file storage system certified is neither easy nor inexpensive.

For these reasons, using Amazon S3 and Amazon S3 Glacier helps solve several tasks at once:

  • Obtaining certification of the required level of data storage.
  • Accessing file resources over the network.
  • Obtaining the required data access speed.
  • Saving funds when working with projects of different complexity levels.
  • Meeting the required level of network data security.
  • Integrating with projects using APIs.

When choosing the right tool to solve your data storage problems, you must systematically calculate the cost of working with these cloud services. Once your stored data reaches a certain volume, you can consider moving to S3 Glacier and migrate quickly at the API level.

Our Luneba experts have deep experience with several cloud storage systems. We can help you calculate the cost and storage performance of your project and arrange the successful transition from one storage service to another.
