Solving the cost savings conundrum in Cloud

Gartner estimates that end-user spending on cloud will reach $679B in 2024 and exceed $1 trillion in 2027. Surveys indicate that enterprises waste nearly 30% of their cloud spend, representing a $204B opportunity for FinOps!

Understanding this waste is often laborious and requires close attention to how cloud is actually consumed. Cloud service providers (CSPs) are quick to position commitment programs as the panacea for cost savings, and it is not uncommon to see organisations defer technical optimization after migrating to cloud while buying into commitment programs to lower spend. The dilemma is whether this is the right approach to deliver savings on cloud in the long run.

As a FinOps practitioner, I strongly believe and advocate that cloud usage should be inherently cost-efficient to maximize the savings potential for enterprises. Enterprises can achieve this through a simple approach. Start by identifying and avoiding wasted spend. Follow this up by architecting applications to run on modern cloud infrastructure using microservices. Together, these steps economize your cloud spend and establish a realistic view of steady-state cloud usage. The central cloud teams can then commit spend in return for discounts without worrying about leaving savings on the table. In this article, I elaborate on the three steps that will help your organization confidently solve the cost savings conundrum!


  1. Avoid Cloud waste: In the cloud, any resource that is left running while unused, or that was launched inadvertently, incurs a charge. As your cloud footprint expands, so does the number of cloud environments and provisioned resources, so the ability to identify and shut down ‘wasted’ resources is paramount. The usual suspects are non-production virtual machines running 24x7, inactive or under-utilized compute instances, unattached or idle disks and storage volumes, old backup snapshots, unattached elastic IPs, and spend flagged by anomaly detection. The best practice here is to set up resource configuration rules and audit your cloud resources for waste using CSP services such as AWS Config or Azure Policy. Highlight the waste to cloud operations teams using resource-level reports so that they can act promptly. This step requires only infrastructure setup changes and no changes to your application architecture.
  2. Make applications cloud ready: Large enterprises often prioritize the speed of deploying their applications in the cloud over the cost of doing so. A ‘lift and shift’ approach to cloud migration usually means replacing an on-premises server with a like-for-like virtual machine on cloud to avoid any impact on application performance. Because the cloud infrastructure remains over-provisioned after migration, this leads to low CPU or memory utilization (for VMs) or low capacity utilization (for disks or storage volumes). These workloads remain well funded through the migration phase by internal project (Capex) funding along with any migration credits offered by partners or cloud providers. Once the migration is completed, the funding source shifts to BAU (Opex) and the operations teams are left with the onerous task of decommissioning unused environments and resizing the cloud infrastructure. It takes at least three months to implement these changes, pending approval of internal change requests and funding and the onboarding of cloud engineers, not to mention the additional 6-9 months FinOps teams spend building detailed insights on cloud over-spend and convincing management to act on them. This cost of delay can be avoided by implementing a DevOps pipeline in a sandpit environment, whether to test application workloads on a new processor or to run workloads on Spot instances. This would inherently make migration efforts more efficient and help cloud teams arrive at a true view of steady-state cloud usage quickly.
  3. Maximize savings with commitments: CSPs offer discounts of up to 70% on their on-demand prices in return for committed spend on cloud infrastructure services. Most enterprises follow a rule of thumb of covering 80% of their spend through these commitments, leaving room for variability in consumption. However, without steps (1) and (2), I remain unsure whether I am covering the ‘right’ cloud spend through commitments (a bit like advertising spend). Commitments are designed to maximize savings, so the most “wasteful” cloud spend inherently attracts the maximum savings. For example, if the commitments apply largely to un-optimized non-production environments with transient workloads, the production workloads may not benefit from the preferential rates. Further, when these non-production environments are shut down, enterprises are saddled with the cost of under-utilized commitments. Another case in point is that certain commitments (e.g. AWS RDS reservations) are relatively inflexible, and one might get locked in to a particular instance type to service the commitment. It is OK to allow for some short-term savings through reservations. However, when enterprises buy savings plans while deferring usage optimization, they risk leaving technical teams with insufficient incentive to rightsize and architect their applications for cloud. Remember that optimization projects can fund themselves and deliver durable savings.
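
To make the audit in step (1) concrete, here is a minimal Python sketch that flags the usual suspects in a resource inventory. It is an illustration only: the `Resource` fields, the thresholds and the function names are hypothetical, and it assumes the inventory has already been exported from your CSP tooling rather than calling AWS Config or Azure Policy directly.

```python
from dataclasses import dataclass


@dataclass
class Resource:
    """One row of a hypothetical resource inventory export."""
    resource_id: str
    kind: str            # "vm", "disk", "snapshot" or "ip"
    environment: str     # "prod" or "nonprod"
    attached: bool       # False for an orphaned disk or IP
    avg_cpu_pct: float   # average CPU over the review window (VMs only)
    always_on: bool      # True if the VM runs 24x7 with no schedule
    age_days: int        # age of the resource (used for snapshots)


# Illustrative thresholds (assumptions, not recommendations).
IDLE_CPU_PCT = 5.0
MAX_SNAPSHOT_AGE_DAYS = 90


def waste_reasons(r: Resource) -> list[str]:
    """Return the reasons, if any, why a resource looks like wasted spend."""
    reasons = []
    if r.kind == "vm":
        if r.environment == "nonprod" and r.always_on:
            reasons.append("non-production VM running 24x7")
        if r.avg_cpu_pct < IDLE_CPU_PCT:
            reasons.append("idle or under-utilized compute")
    elif r.kind in ("disk", "ip") and not r.attached:
        reasons.append(f"unattached {r.kind}")
    elif r.kind == "snapshot" and r.age_days > MAX_SNAPSHOT_AGE_DAYS:
        reasons.append("old snapshot")
    return reasons


def waste_report(inventory: list[Resource]) -> dict[str, list[str]]:
    """Map each flagged resource ID to its reasons; clean resources are omitted."""
    report = {}
    for r in inventory:
        reasons = waste_reasons(r)
        if reasons:
            report[r.resource_id] = reasons
    return report
```

Passing a list of `Resource` rows to `waste_report` yields the kind of resource-level report that can be handed to cloud operations teams. In practice the thresholds would be tuned per workload, and the rules themselves would more likely live in the CSP's own policy service.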

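The risk of under-utilized commitments in step (3) can be put into numbers with a small Python sketch. It is a deliberately simplified model with hypothetical function names: it assumes a single commitment bought at a flat discount and ignores upfront payments, term length and partial coverage.

```python
def net_savings(on_demand_equiv: float, discount: float, utilization: float) -> float:
    """Savings versus paying on-demand for the usage actually consumed.

    on_demand_equiv: on-demand cost of the committed capacity if fully used
    discount: commitment discount as a fraction of the on-demand price
    utilization: fraction of the committed capacity that is actually used
    """
    committed_cost = on_demand_equiv * (1 - discount)  # paid whether used or not
    on_demand_cost = on_demand_equiv * utilization     # same usage at on-demand rates
    return on_demand_cost - committed_cost


def break_even_utilization(discount: float) -> float:
    """Utilization below which the commitment costs more than on-demand."""
    return 1 - discount
```

For an illustrative 30% discount, the break-even utilization is 70%. If the environments backing a commitment are shut down and utilization falls to 60%, `net_savings` turns negative: the enterprise pays more than it would have paid on demand for the same usage.
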
Usage optimization on cloud is hard, and the savings it delivers are difficult to measure and report. However, enterprises that embrace this culture will be more successful not only in taming their cloud bills but also in becoming self-learning and cost-efficient in the long run.


Bio: 

Arvind Shastry is a certified FinOps Practitioner and leads the FinOps practice at Transport for NSW based in Sydney. He has 15+ years of experience as a Business Strategy and Operations expert across organizations such as Tata Group, InMobi and Amazon. He is also a FinOps content creator and a member of the FinOps Foundation. 

Gireesh Ramji

DAAS - Decisions are a Science!

7mo

Thanks for sharing Arvind Shastry - aligned on most of your views, but just wonder whether it is actually easy or practical to prioritise tech optimization for cloud without aligning incentive structures within the org to long-term rather than short-term cost savings?
