Cost savings approaches in AWS

by Mike Sweetman
15 September 2020

Price is a very important consideration when using cloud computing. One of the reasons why cloud computing took off is that it can offer a very low price with a high quality of service. When serving a large number of customers, cloud service providers can give very cost-effective solutions because cloud systems are inherently scalable.

But you need to select cloud resources correctly and competently. You can't just turn on a bunch of expensive instances, leave them running constantly, and let them eat a large part of the project budget. If you approach scaling, load balancing, and the choice of service region competently, you can save a lot on data storage and processing, as well as on traffic.

Let's take a look at the main points that you need to pay attention to when setting up a cloud system in order to save your budget and serve your customers well. In addition, by wisely allocating resources, you save energy and computing power, which means you are protecting the environment.

What is AWS?

Cloud computing is a model for providing convenient network access on-demand for a certain common pool of configurable computing resources (for example, data networks, servers, storage devices, applications, and services: both together and separately) that can be quickly provided and released with minimal operational costs or calls to the provider.

Cloud computing customers can significantly reduce their IT infrastructure costs (in the short to medium term) and flexibly respond to changing computing needs by leveraging the elastic computing properties of cloud services.

Since its inception in 2006, the concept has penetrated deeply into various IT spheres and plays an increasingly significant role in practice. According to IDC, the public cloud computing market amounted to $17 billion in 2009, about 5% of the total information technology market. In 2014, the total costs to organizations for infrastructure and services related to cloud computing were estimated at almost $175 billion.

Amazon Web Services, or AWS, is a subsidiary of Amazon that provides a cloud computing platform for rent to individuals, companies, and governments on a subscription basis. There is also a free subscription, which is available for the first 12 months. The technology allows subscribers to have a full-fledged virtual cluster of computers that is always available via the Internet. AWS virtual machines have most of the attributes of a real computer, including hardware devices (processor, video card, local memory/RAM, hard disk or SSD); a choice of operating system; networking; and pre-installed applications such as a web server, database, CRM, etc. Each AWS system also virtualizes console I/O, allowing AWS users to connect to their AWS system using a browser. The browser acts as a window into a virtual machine, allowing the user to log in, configure, and use their virtual systems just like a real, physical computer. This allows them to configure the system to provide Internet-oriented services to their customers.

AWS technology is based on server clusters (farms) located around the world. The usage fee is based on a combination of the use of hardware/OS/software/network functions selected by the user, as well as requirements for availability, redundancy, security, and additional parameters. Amazon is committed to managing and updating software and hardware to meet the required security standards. AWS operates in many geographical regions, including Canada, Germany, Ireland, Singapore, Tokyo, Sydney, Beijing, London, etc.

In 2016, AWS provided more than 70 services covering a wide range of areas, including computing, data storage, networking, analytics, mobile applications, developer tools, etc. The most popular of these are Amazon Elastic Compute Cloud (EC2) and Amazon Simple Storage Service (S3). Most services are not provided directly to end-users, but instead offer functionality through APIs that developers can use in their applications. Amazon Web Services offerings are available over HTTP using the REST architecture and SOAP.

Amazon advertises AWS as a way to get computing power that scales faster and cheaper than companies building their own physical server clusters. All services are paid for depending on usage, but each service measures usage by its own method.

James Hamilton, an AWS engineer, wrote a retrospective article in 2016 that covers the ten-year history of online services between 2006 and 2016. As an early admirer and outspoken supporter of technology, he joined the AWS team of engineers in 2008.

Cost saving is important in cloud systems

Autoscaling means that the amount of resources in use follows the current level of activity, so you pay only for what a specific activity needs. Depending on the specifics of the business, two main ways of saving based on business activity can be identified:

  • seasonal savings - most relevant for retailers, when sales spikes can be predicted and depend on the season (Christmas holidays, summer holidays, etc.);
  • daily savings - when demand peaks at certain times and the rest of the time is mostly idle.
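To make the daily case concrete, here is a minimal sketch that compares a fleet sized for the peak hour with a fleet that follows demand hour by hour. The hourly rate and the load profile are invented for illustration and are not real AWS prices.

```python
# Hypothetical cost of one instance for one hour, in USD (not a real AWS price).
HOURLY_RATE = 0.10


def fixed_fleet_cost(instances_needed_per_hour, rate=HOURLY_RATE):
    """Cost of running enough instances for the peak hour all day long."""
    peak = max(instances_needed_per_hour)
    return peak * len(instances_needed_per_hour) * rate


def scaled_fleet_cost(instances_needed_per_hour, rate=HOURLY_RATE):
    """Cost when the fleet size follows demand hour by hour."""
    return sum(instances_needed_per_hour) * rate


# Assumed daily profile: 2 instances for 16 quiet hours, 10 for 8 busy hours.
daily_profile = [2] * 16 + [10] * 8

fixed = fixed_fleet_cost(daily_profile)
scaled = scaled_fleet_cost(daily_profile)
saving = 1 - scaled / fixed
print(f"fixed: ${fixed:.2f}, scaled: ${scaled:.2f}, saving: {saving:.0%}")
```

With this assumed profile, scaling roughly halves the daily bill. The same calculation works for the seasonal case if the list holds one entry per day or week instead of one per hour.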

Cloud computing reduces costs here because you don't need to purchase any hardware or software: you simply rent the capacity you need from a specialized provider. If you don't need much capacity at a given time, you don't pay for it. Rented cloud servers are also available around the world, allowing you to rent cloud storage in locations where rates are lower.

Using Reserved Instances to save on Compute Costs 

Public cloud providers in general, and AWS in particular, have long advertised flexibility as a key benefit of cloud computing, including the ability to utilize resources “on-demand” without long-term contracts or waiting for procurement, shipping, installation, etc. The ability to quickly change the characteristics of resources to increase or reduce computing capacity, memory, or storage capacity is also part of this flexibility. But this flexibility does come with some additional costs, even though it is much less expensive than on-premises. This means that there are different payment methods depending on specific needs, and some up-front planning can help in the long run.

Three options and a multitude of billing rates for the same resources 

For raw computing power in the form of virtual machines - EC2 instances - there are three primary ways to be billed for each individual resource:  on-demand, reserved, or spot.  On-demand is the basic, default billing option where the resource is billed a fixed, known fee per second of usage (e.g. while the instance is running).  AWS publishes the on-demand rates for all individual instance types (per region) and you can easily determine beforehand what an instance will cost based on the instance type, the time it will be running, and the published rate.  This is a good option for resources with a short lifetime (seconds to hours to days to weeks) and when control of the instance lifecycle is required (e.g. workload cannot tolerate interruptions from the server).
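As a rough illustration of estimating an On-Demand bill in advance, the sketch below multiplies a per-second rate by the running time. The rate table is hypothetical: the regions and instance types are only examples, and the prices are placeholders rather than published AWS rates.

```python
# Hypothetical On-Demand rates in USD per hour, keyed by (region, instance type).
# Real values come from the published AWS price list and differ from these.
ON_DEMAND_RATES = {
    ("us-east-1", "m5.large"): 0.10,
    ("eu-west-1", "m5.large"): 0.11,
}


def on_demand_cost(region, instance_type, seconds_running):
    """Estimate the On-Demand charge for one instance billed per second."""
    hourly_rate = ON_DEMAND_RATES[(region, instance_type)]
    return hourly_rate / 3600 * seconds_running


# An instance that runs for 90 minutes in the first region.
print(f"${on_demand_cost('us-east-1', 'm5.large', 90 * 60):.4f}")
```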

Reserved pricing comes in the form of Reserved Instances and is the next option for billing. A Reserved Instance is a running EC2 instance that fits the definition of a previously made reservation. The billing, or a portion of it, is attributed to the reservation instead of the On-Demand rate for the applicable period of time. The determination of whether an instance fits a reservation has become more complex over time with the addition of “compute units” and other factors, but in its simplest form, a reservation is defined as the use of a specific instance type in a specific location (region and, optionally, availability zone) for a committed term (1 or 3 years). Payment options include paying some portion upfront or paying everything upfront when the reservation is made to save even more cost. Reserved Instance rates are also published as a fixed, known cost by AWS at the instance type (and region) level.

The amount billed is not tied to the actual usage as it is in On-Demand. A reservation allocates specific resources for your use and is essentially billed at a minimum amount for the term of the reservation, whether the resources are fully utilized or not. Any usage exceeding the reservation is billed at the On-Demand rate. The idea of the reservation is to trade a commitment to this minimum usage over a period of time for a significantly reduced rate, with savings of up to 60%. For compute resources that need to be constantly available for an extended period (a year or more), this is the best option for payment. Examples include VPN gateways, directory servers, file servers, or database servers (though, for databases, similar reservations are possible in RDS).
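Because a reservation is billed for the whole term regardless of usage, it only pays off above a certain utilization. The following sketch works out that break-even point for a one-year term. Both hourly rates are assumptions chosen for easy arithmetic (a 40% discount), not published prices.

```python
HOURS_PER_YEAR = 8760


def yearly_costs(on_demand_hourly, reserved_hourly, hours_used):
    """Return (on_demand_cost, reserved_cost) for one instance over one year.

    On-Demand is charged only for the hours actually used, while the
    reservation is charged for every hour of the term.
    """
    on_demand = on_demand_hourly * hours_used
    reserved = reserved_hourly * HOURS_PER_YEAR
    return on_demand, reserved


def break_even_utilization(on_demand_hourly, reserved_hourly):
    """Fraction of the year an instance must run for the reservation to pay off."""
    return reserved_hourly / on_demand_hourly


# Hypothetical rates: $0.10/hour On-Demand, $0.06/hour effective reserved rate.
print(break_even_utilization(0.10, 0.06))
print(yearly_costs(0.10, 0.06, hours_used=4000))
```

Under these assumed rates the instance has to run about 60% of the year before the reservation becomes cheaper, which is why reservations suit always-on servers rather than occasional workloads.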

Spot billing presents the best option for the most significant savings - up to 90% in some cases over On-Demand rates - but using them involves even more significant tradeoffs to achieve these savings. The biggest difference for Spot instances is the lack of control over the lifetime of the instance.  It can come and go at any time with minimal warnings.  This is because Spot instances are allocated from the excess capacity AWS has available at any point in time, which is constantly changing.  The idea for AWS is that they can sell this excess capacity to earn some revenue instead of it being completely idle, but can also reclaim the capacity when needed for On-Demand or Reserved Instance needs. Another difference from the other rates is that Spot Rates may change at any time and they are not published as the other rates are (there are APIs available to determine current spot rates for an instance type at a particular location, however).  AWS has made efforts to control Spot Rates more recently so that the price varies less in an effort to increase Spot rate usage. 
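One way to look up current Spot rates is the EC2 `describe_spot_price_history` call. The sketch below assumes the caller passes in a boto3 EC2 client; the helper name `latest_spot_prices` is made up for this example, and only the first page of results is read.

```python
from datetime import datetime, timedelta, timezone


def latest_spot_prices(ec2_client, instance_type, product="Linux/UNIX"):
    """Return {availability_zone: latest Spot price in USD} for one instance type.

    ec2_client is expected to be a boto3 EC2 client, for example
    boto3.client("ec2", region_name="us-east-1"). Pagination is ignored
    in this sketch, so only the first page of results is considered.
    """
    response = ec2_client.describe_spot_price_history(
        InstanceTypes=[instance_type],
        ProductDescriptions=[product],
        StartTime=datetime.now(timezone.utc) - timedelta(hours=1),
    )
    latest = {}
    for entry in response["SpotPriceHistory"]:
        zone = entry["AvailabilityZone"]
        # Keep only the most recent price seen for each availability zone.
        if zone not in latest or entry["Timestamp"] > latest[zone][0]:
            latest[zone] = (entry["Timestamp"], float(entry["SpotPrice"]))
    return {zone: price for zone, (_, price) in latest.items()}
```

Taking the client as a parameter keeps the function easy to exercise with a stub, without AWS credentials.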

The key requirement for a workload to be able to take advantage of Spot rate pricing is to be completely stateless. This eliminates the sensitivity of the workload to the lifetime of the server it runs on. Examples are API servers, analytics servers, or web servers, where all of their computations or state changes are stored elsewhere (e.g. object storage, file storage on another on-demand/reserved instance, database, or caching server).

Reserved Instances in practice

We are constantly looking for ways to save expenses for our customers and reserved instances often are a part of our solution.  A fairly typical example is for a flexible workload to be set up and left running in AWS for a period of time long enough to determine the usage patterns - peak web usage times, for example - as determined by monitoring all resources. Some workloads with high seasonal variability (e.g. tax processing systems, eCommerce systems) may require up to a year of usage before all of the patterns are known, but more often only a few months are sufficient.  The goal of this initial period is not to fully predict usage to schedule all resources over an extended period, but instead to determine the minimal set of resources needed to support the workload and then make those as low-cost as possible by purchasing reserved instances for them. Establishing this baseline along with knowing the patterns or seasonality helps provide our customers with some cost predictability.  
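The baseline idea can be sketched as follows: take the observed number of running instances per hour, treat the minimum as the always-on set worth reserving, and leave everything above it to On-Demand or Spot. The sample data and function names are illustrative only, and using the plain minimum is one simple choice of baseline, not necessarily the rule applied in practice.

```python
def reservation_baseline(hourly_instance_counts):
    """Smallest number of instances that was running in every observed hour."""
    return min(hourly_instance_counts)


def split_usage(hourly_instance_counts, baseline):
    """Return (reserved_hours, burst_hours) in instance-hours.

    reserved_hours is the usage covered by reserving `baseline` instances;
    burst_hours is what remains for On-Demand or Spot capacity.
    """
    reserved_hours = baseline * len(hourly_instance_counts)
    burst_hours = sum(max(n - baseline, 0) for n in hourly_instance_counts)
    return reserved_hours, burst_hours


# Invented monitoring sample: instances running in each of ten observed hours.
observed = [3, 3, 4, 6, 9, 12, 8, 5, 3, 3]
baseline = reservation_baseline(observed)
reserved_hours, burst_hours = split_usage(observed, baseline)
print(baseline, reserved_hours, burst_hours)
```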

The most common reservation our customers have chosen is One Year, No Upfront.  This preserves some flexibility to reevaluate instance types (adopt new generation, change family, etc.) each year, but provides significant savings with no outlay of cash.

Auto-scaling groups with Spot rates

A common way to take advantage of cloud flexibility is to use Auto-Scaling Groups (ASGs) for stateless workloads to vary the number of instances based on the computing power needed to satisfy the load.  Web and application server clusters behind Application Load Balancers are an example application of this concept.  Because these are stateless by design, the opportunity to use Spot Rate instances exists.  Combining On Demand (or Reserved Instances) with Spot Instances previously required complex interactions between multiple ASG configurations until AWS incorporated the concept directly within the ASG definition itself in late 2018.  This made it possible to have a single ASG with a minimum number of On Demand (or Reserved) instances alongside a set of Spot Rate instances. It is now also possible to specify the portion of Spot instances to use during scale-out events.  

The other challenge with using Spot rates was handling the case where the price went above the On-Demand rate because of the high demand for the resources. AWS has addressed this in the new ASG definitions by providing the ability to specify a maximum price to use for Spot with the default of the On-Demand price.  Then when a Spot instance may be called for, if the price is higher than the maximum in the ASG definition (or On-Demand price if default) then an On-Demand instance will be used instead.  
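A combined On-Demand and Spot group is described through the `MixedInstancesPolicy` argument of the Auto Scaling `create_auto_scaling_group` API. The sketch below builds that argument as a plain dictionary and estimates how a desired capacity would be divided. The helper names, the launch template name, and the instance types are placeholders, and the rounding in `expected_split` is an assumption of this sketch rather than documented AWS behavior.

```python
import math


def mixed_instances_policy(template_name, instance_types, on_demand_base,
                           on_demand_pct_above_base, spot_max_price=None):
    """Build a MixedInstancesPolicy dict for create_auto_scaling_group."""
    distribution = {
        "OnDemandBaseCapacity": on_demand_base,
        "OnDemandPercentageAboveBaseCapacity": on_demand_pct_above_base,
    }
    if spot_max_price is not None:
        # The API takes the price as a string; leaving it out keeps the
        # default maximum, which is the On-Demand price.
        distribution["SpotMaxPrice"] = str(spot_max_price)
    return {
        "LaunchTemplate": {
            "LaunchTemplateSpecification": {
                "LaunchTemplateName": template_name,
                "Version": "$Latest",
            },
            "Overrides": [{"InstanceType": t} for t in instance_types],
        },
        "InstancesDistribution": distribution,
    }


def expected_split(desired, on_demand_base, on_demand_pct_above_base):
    """Approximate (on_demand, spot) instance counts for a desired capacity.

    Rounding the On-Demand share up is an assumption made here.
    """
    base = min(desired, on_demand_base)
    above_base = desired - base
    extra_on_demand = math.ceil(above_base * on_demand_pct_above_base / 100)
    on_demand = base + extra_on_demand
    return on_demand, desired - on_demand


policy = mixed_instances_policy("web-template", ["m5.large", "m5a.large"],
                                on_demand_base=2, on_demand_pct_above_base=25,
                                spot_max_price=0.05)
print(expected_split(10, 2, 25))
```

The resulting dictionary would be passed as `MixedInstancesPolicy=policy` alongside the usual group name, size limits, and subnets.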

Applying these concepts to our customers’ ASG configurations has provided significant savings without any further changes. One example is a highly varying ASG cluster serving a popular website that uses 100% Spot Instances and saves over 75% compared with the previous combination of a small number of Reserved Instances and On-Demand instances for its frequent scale-out events.

Savings Plans

AWS introduced another way to pay for compute resources with AWS Savings Plans in late 2019. This extended the idea of Reserved Instances beyond EC2 to other compute services like AWS Lambda (“serverless functions”) and AWS Fargate (“serverless containers”). The concept is similar to Reserved Instances, in that a customer predicts and commits to paying for a specific amount of computing power (AWS measures the commitment as spend per hour over a one- or three-year term), but it greatly increases flexibility by allowing that commitment to be applied across services, regions, instance types, and more. We are still in the process of evaluating how to apply Savings Plans instead of Reserved Instances and will begin applying this as our customers’ reservations expire.


In conclusion, we strongly recommend reading this article on Amazon.

There you will find a lot of interesting and simple changes that can save significant amounts on cloud services. Many of the methods listed in this article will work on other cloud systems as well.

You can also package all your projects into Docker containers. Because a Docker container is universal, you can run it regardless of the chosen cloud system, use several cloud systems at once, choose where each service is cheapest to launch, and quickly move to another cloud provider.

Our Luneba experts have extensive experience in configuring cloud systems and choosing the best cloud billing system. We have the necessary knowledge for many types of projects to set up cloud services both in Amazon and in other cloud systems.

You can contact us about developing a project from scratch for a cloud system, or on the other hand, you can come with a finished project and we will see if there are options to reduce the cost of cloud services. This will help make the project more efficient and save resources of both the company itself and the end-users of the information system.

We will always respond to your requirements and help with setting up and developing cloud projects in various cloud providers.
