Back-of-envelope Calculations

You need to have a basic sense of scalability to effectively conduct the back-of-the-envelope estimations. I have covered the basics of scalability and one basic practical example in my previous articles. Do check them out before this if you have no idea about scalability at all; this will help you connect with the topic, going forward from here.

Now, let us take one step back to understand what System design is -

System Design is the process of defining the elements such as the architecture, modules, and components, the different interfaces of those components, and the data that goes through the system. It is meant to satisfy the specific needs and requirements of a business or organization through the engineering of a coherent and well-running system.

What is the back-of-the-envelope calculation?

The back-of-the-envelope calculation is an approximate calculation that we either do in our minds or on a piece of paper, but the steps are not 100% accurate. Nonetheless, the results aim to be more accurate than wild guesses and less accurate than formal calculations.

That is what we expect from these calculations. So, what is an approximation?

For example, what would be the value of (11 x 9.667), worked out verbally and with no calculator around?

Personally, my answer would be either 100 or 110, and I would simply choose one hundred for convenience if I had to derive another approximation from this value.

Although the exact answer is 106.337, I would avoid wasting time trying to get an exact calculation again as the entire discussion around this topic will be based on assumptions itself.

There is a cheat sheet at the end of this article.

Now, back-of-the-envelope calculations are integral in system designs and help choose the right configurations and technologies for the system, and yes, you need an idea of how scalability works in technology to get an approximation value closer to what is actually expected.

This includes estimating the request and response sizes, database size, cache size, and the number of microservices, load balancers, etc. An estimate of the network bandwidth requirement is also needed for designing the system.

It is always a clever idea to estimate the scale of the system before starting any high-level design (HLD) or low-level design (LLD). I will write separate articles on those. For now, let us work through a few examples to build a basic understanding of how this estimation works and what it looks like.

In my view, a system design has four major areas where back-of-the-envelope calculations are required, and I will only talk about those: load estimates, database storage, cache estimates, and bandwidth estimates.

Understanding the type of system helps in making good estimates. In particular, knowing if the system is read-heavy or write-heavy is a good starting point for the estimations. For example, a tiny URL service is read-heavy, whereas web crawlers are write-heavy systems.

Although this is more helpful in choosing the technologies and database types, it can also help with decisions about caching. Caching is typically used to improve performance and is useful for read-heavy systems rather than write-heavy systems.

Always remember a short word, BKMGTP.

B: Byte: the base unit
K: Kilo: Thousand: 1,000
M: Mega: Million: 1,000,000
G: Giga: Billion: 1,000,000,000
T: Tera: Trillion: 1,000,000,000,000
P: Peta: Quadrillion: 1,000,000,000,000,000

K*K = M | M*K = G | G*K = T | T*K = P
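To make the BKMGTP ladder concrete, here is a minimal Python sketch that walks a raw byte count up the ladder in steps of 1,000. The helper name humanize and the decimal (powers of 1,000) convention are my assumptions for illustration, matching the table above rather than any particular library.

```python
# Units in BKMGTP order; each step up is a factor of 1,000 (decimal convention).
UNITS = ["B", "KB", "MB", "GB", "TB", "PB"]


def humanize(num_bytes: float) -> str:
    """Express a byte count in the largest unit that keeps the value below 1,000."""
    value = float(num_bytes)
    for unit in UNITS:
        # Stop at the first unit where the value fits, or at PB (the last unit).
        if value < 1000 or unit == UNITS[-1]:
            return f"{value:g} {unit}"
        value /= 1000


print(humanize(1000 * 1000))      # K*K = M  -> 1 MB
print(humanize(10**9))            # 1 GB
print(humanize(3600 * 10**9))     # 3.6 TB
```

Once you are comfortable with this ladder, most of the estimates in this article reduce to multiplying two numbers and moving up or down a unit.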

Let us start with load estimates, which is the traffic estimation of your system. The load estimate helps identify the volume of requests the system is going to process. You can express it as requests per second, requests per minute, requests per hour, or even daily active users (DAU). It is usually measured per second and can be scaled to per day if required.

Here is an example of this in back-of-the-envelope format.

Let us assume a system has a write-to-read ratio of 1:100, and there are one million write calls per day.

Please try to figure out what the write requests per second would be.

With the assumptions given, the write requests per second would be:

1M / (24 hours * 3600 seconds) ~= 12 requests/sec.

Then the read requests per second, derived with basic math from the ratio, would be ~1200 requests/sec.
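If you want to check the mental math, the same load estimate can be sketched in a few lines of Python. The variable names and the per_second helper are my own and purely illustrative; the inputs are the assumptions from the example.

```python
SECONDS_PER_DAY = 24 * 3600  # 86,400 seconds in a day


def per_second(requests_per_day: float) -> float:
    """Convert a daily request count into an average per-second rate."""
    return requests_per_day / SECONDS_PER_DAY


writes_per_day = 1_000_000  # assumption from the example
read_multiplier = 100       # write-to-read ratio of 1:100

write_qps = per_second(writes_per_day)
read_qps = write_qps * read_multiplier

print(f"write requests/sec ~ {write_qps:.0f}")  # about 12
print(f"read requests/sec  ~ {read_qps:.0f}")   # about 1157, i.e. roughly 1200
```

The exact read figure is about 1157 per second. Rounding the write rate to 12 first and then multiplying gives 1200, which is close enough for this kind of estimate.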

Let us move on -

Now, database estimations. So, these are especially important as the database is an integral component of any application. Sometimes architects and engineers spend most of the time on business logic and services that need to be built and little time on database selection and estimates.

Note - application response time directly depends on the data source and underlying database response time. Database selection and database modeling are important, and applications can use RDBMS, key-value stores, structured data, etc., to improve performance. Structured data is usually stored in an RDBMS, and unstructured data in NoSQL stores such as MongoDB or Cassandra. The database type should always be decided based on the use case, without following a one-size-fits-all approach.

You may be asked questions around these, for example:

How much space would it take to store the contents of one hundred million write requests?
How much space would it take to store the content of one million webpages if we are designing a web crawler?
What if we substitute each word with an integer index?
How many machines with 512 GB of storage would be needed?
Which database would be the best fit?

Let us take the same question as before, with a twist: the write-to-read ratio is 1:100, there are one million write calls per day, and say we must store the data for 10 years.

Now let us solve this one step at a time.

We can assume the write request is 1 KB.

Then, the storage required per day will be equal to 1M * 1 KB = 1 GB (gigabyte)

Storage per year = 1 GB * 365 days = 365 GB, which we round to ~360 GB

Storage for 10 years = 360 GB * 10 years = ~3.6 TB (terabytes)

Finally, we cannot forget to have some storage for audit, user, security, etc.

Let us assume all of this data will take up another 0.4 TB in storage.

So, in total for 10 years, we roughly need 3.6 TB + 0.4 TB = 4 TB.

And that is it. Using only assumptions, we have just worked out the back-of-the-envelope calculation for the storage requirement over 10 years.
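The storage steps above can be sketched in Python as well. This is only an illustration under the same assumptions (1 KB per write, 10 years of retention, 0.4 TB of extra data for audit, user, and security records); the names are mine, and I use decimal units throughout.

```python
KB = 1_000
GB = 1_000_000_000
TB = 1_000 * GB

writes_per_day = 1_000_000   # assumption from the example
request_size = 1 * KB        # assumed size of one write request
retention_years = 10
overhead = 400 * GB          # assumed 0.4 TB for audit, user, security data

per_day = writes_per_day * request_size       # 1 GB
per_year = per_day * 365                      # 365 GB
total = per_year * retention_years + overhead

print(f"per day : {per_day / GB:g} GB")
print(f"per year: {per_year / GB:g} GB")
print(f"10 years: {total / TB:.2f} TB")  # 4.05 TB without any rounding
```

Without rounding 365 GB down to 360 GB, the total comes to 4.05 TB. The difference is well inside the error of the 1 KB assumption, which is why rounding early is fine here.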

If you come across these kinds of questions, try to figure out one part of the ratio, and the next part will be a mathematical breeze.

If you are wondering how I figured out that this write request would take 1 KB, that was my assumption. You are as good as your assumptions and your overall knowledge of scalability.

In system design, the better you understand how systems work, the better you will be at architecting one. If you do not know the basic principles of what a service does, what it does not do, and what its limitations are, your estimates will suffer. It all boils down to your understanding of overall systems.

Another interesting but sometimes off-putting topic is the cache estimate! There is no hard-defined rule for cache requirements. Some applications do not maintain caches at all; others size the cache at 10% to 30% of the database storage, and some go with 20% to 30% of the currently most frequently accessed data. After researching common practice, I found that an 80/20 or 70/30 split between database and cache is often preferred, with the cache holding the 20-30% of data that is accessed most frequently. That improves performance. Let us do some back-of-the-envelope calculations for the cache.

Assume that the storage per day is 1 GB.

Then, the cache requirement would be 20% of total storage per day i.e.

(20*1GB)/100 = 200 MB

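Seems like simple math, but no, the above cache calculation is incorrect. It sizes the cache from the daily write volume, whereas a cache serves reads, so it should be sized from the read volume.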

Write-to-read ratio: 1:100
Write requests per day: 1M (assumed)
Read requests per day: 100M (from the ratio)

Per request size: 1 KB

Total read request size: 100M * 1 KB = 100 GB per day.

Cache estimate = 20% of read volume = (20 * 100 GB)/100 = 20 GB.

We can say the cache estimate, with these assumptions, is 20% of the daily read volume.
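Here is the read-based cache estimate as a small Python sketch. The 20% figure, the 1 KB request size, and the variable names are assumptions carried over from the example, not a rule.

```python
GB = 1_000_000_000

reads_per_day = 100_000_000  # 1M writes per day at a 1:100 write-to-read ratio
request_size = 1_000         # assumed 1 KB per request, in bytes
cache_percent = 20           # assumed share of daily read volume to cache

read_volume = reads_per_day * request_size          # bytes read per day
cache_bytes = read_volume * cache_percent // 100    # 20% of the read volume

print(f"daily read volume: {read_volume // GB} GB")  # 100 GB
print(f"cache estimate   : {cache_bytes // GB} GB")  # 20 GB
```

Changing cache_percent to 30 gives the upper end of the 20-30% range mentioned earlier, which is a quick way to see how sensitive the budget is to that one assumption.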

And let us not forget that caches are very costly.

Overall, good estimates help in better budgeting.

A quick memorizing tip: if you remember that ten million requests per day at 1 KB each is 10 GB per day, you can quickly calculate the other part of the ratio.

Finally, network bandwidth estimates. The kind of questions you may get around this are:

What upstream and downstream speed is required to have the system function as expected?
What internet speed would be required at peak time?

Let us assume the same example for the ratio as the previous one and figure this out.

Write-to-read ratio: 1:100
Write requests per day: 1M (assumed)
Read requests per day: 100M (from the ratio)

Per request size: 1 KB

Total read request size: 100M * 1 KB = 100 GB per day.

Bandwidth (per second) = 100 GB per day / (24*3600) = 1157.40741 KB/sec = ~1 MB per second.

Writes are 1% of read requests, so 1 MB/s / 100 = 0.01 MB/s.

As write requests are only 1/100 of read requests, we can say that ~1 MB/s is the required bandwidth estimate we are looking for in this software system.
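The bandwidth numbers can be double-checked with a short Python sketch. As before, the names are illustrative and the inputs are the assumptions from the example (1 KB per request, decimal units).

```python
SECONDS_PER_DAY = 24 * 3600
MB = 1_000_000

request_size = 1_000          # assumed 1 KB per request, in bytes
writes_per_day = 1_000_000
reads_per_day = 100_000_000   # from the 1:100 write-to-read ratio

read_bw = reads_per_day * request_size / SECONDS_PER_DAY    # bytes per second
write_bw = writes_per_day * request_size / SECONDS_PER_DAY  # bytes per second

print(f"read bandwidth : {read_bw / MB:.2f} MB/s")   # about 1.16 MB/s
print(f"write bandwidth: {write_bw / MB:.3f} MB/s")  # about 0.012 MB/s
```

These are daily averages. Real traffic is rarely flat, so it is common to multiply the average by an assumed peak factor before sizing the network.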

Now we have an idea about the concepts, and we have also figured out how to calculate them individually, so let us take one more example covering them all.

Let us suppose you build a clone of Twitter, and you have the following assumptions.

Three hundred million monthly active users.
Fifty percent of the users use Twitter daily.
Users tweet two times per day on average.
Ten percent of tweets contain media.
Data is stored for five years.

Now, let us understand the estimations we can derive from these assumptions.

Query per second (QPS) estimate:

Daily active users (DAU) = 300 million * 50% = 150 million
Tweets QPS = 150 million * 2 tweets / 24 hours / 3600 seconds = ~3500
Peak QPS = 2 * QPS = ~7000

We will only estimate media storage here.

Average tweet size:
tweet_id 64 bytes
text 140 bytes
media 1 MB
Media storage: 150 million * 2 * 10% * 1 MB = 30 TB per day
5-year media storage: 30 TB * 365 * 5 = ~55 PB
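Putting the whole Twitter-clone estimate into one Python sketch makes it easy to change an assumption and see the effect. The peak factor of 2 and the 1 MB media size come from the figures above; the variable names are mine.

```python
SECONDS_PER_DAY = 24 * 3600
MB = 10**6
TB = 10**12
PB = 10**15

monthly_active_users = 300_000_000
daily_active_percent = 50     # share of monthly users active each day
tweets_per_user_per_day = 2
media_percent = 10            # share of tweets that carry media
media_size = 1 * MB           # assumed average media size
retention_years = 5

dau = monthly_active_users * daily_active_percent // 100
tweets_per_day = dau * tweets_per_user_per_day
qps = tweets_per_day / SECONDS_PER_DAY
peak_qps = 2 * qps            # assumed peak factor of 2

media_per_day = tweets_per_day * media_percent // 100 * media_size
media_total = media_per_day * 365 * retention_years

print(f"DAU          : {dau:,}")                      # 150,000,000
print(f"tweets QPS   : {qps:.0f}")                    # about 3472, i.e. ~3500
print(f"peak QPS     : {peak_qps:.0f}")               # about 6944, i.e. ~7000
print(f"media per day: {media_per_day // TB} TB")     # 30 TB
print(f"media 5 years: {media_total / PB:.2f} PB")    # 54.75 PB, i.e. ~55 PB
```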

Feel free to derive whatever you can from your own assumptions, and start searching for mock questions to practice.

Here is the cheat sheet -

Quick Summary

Load Estimations: Identify the volume of requests the system is going to process. Measure in requests per second, per minute, per hour, or per day. Example: If the system has a write-to-read ratio of 1:100 and there are 1 million write calls per day, approximate the write requests per second as 1,000,000 / (24 hours × 3600 seconds).
Database Storage Estimations: Consider the write-to-read ratio and the number of requests per day. Calculate storage per day, per year, and for the entire duration data needs to be stored. Example: For a 1 KB write request, estimate the total storage needed for 10 years as 1 million × 1 KB × 365 × 10.
Cache Size Estimations: Determine the percentage of the database size to be used for caching. Cache is temporary storage for frequently accessed data. Example: If storage per day is 1 GB, and you decide to use 20% for caching, the cache requirement is (20 × 1 GB) / 100.
Network Bandwidth Estimations: Consider the read and write ratios, and request sizes, and ensure bandwidth does not become a bottleneck. Calculate bandwidth per second required for smooth system functioning. Example: If the system has a write-to-read ratio of 1:100, and there are 1 million write calls and 100 million read calls per day, estimate bandwidth as 100 GB per day / (24 × 3600 seconds).
Basic Math Tips: Learn to work with ratios and proportions. Approximate values for quick calculations, e.g., 5 * 9.667 can be rounded to fifty for convenience. Memorize key ratios for quick reference.
Assumptions: Make informed assumptions to fill in missing values. Assumptions should be reasonable and based on context. Assumptions function as indicators for understanding and problem-solving.
Practice: Regular practice is key to mastering back-of-the-envelope estimations. Revisit and refine estimations based on feedback and learning.

Remember, back-of-the-envelope calculations are meant to be quick and approximate, providing an estimate to guide decision-making in the pilot stages of system design.

Thanks for reading. As always, please reach out with any questions or feedback.

Shankhya Chatterjee

Techno-Functional Consultant | Bridging the Gap Between Business Needs and Technical Solutions | Certified in ETRM | CTRM
