Back-of-envelope Calculations
You need to have a basic sense of scalability to effectively conduct the back-of-the-envelope estimations. I have covered the basics of scalability and one basic practical example in my previous articles. Do check them out before this if you have no idea about scalability at all; this will help you connect with the topic, going forward from here.
Now, let us take one step back to understand what system design is.
System Design is the process of defining the elements such as the architecture, modules, and components, the different interfaces of those components, and the data that goes through the system. It is meant to satisfy the specific needs and requirements of a business or organization through the engineering of a coherent and well-running system.
What is the back-of-the-envelope calculation?
The back-of-the-envelope calculation is an approximate calculation that we either do in our minds or on a piece of paper, but the steps are not 100% accurate. Nonetheless, the results aim to be more accurate than wild guesses and less accurate than formal calculations.
That is what we expect from them. So, what does an approximation look like?
For example, what is the value of 11 x 9.667, worked out in your head with no calculator around?
Personally, my answer would be either 100 or 110, and I would simply choose 100 for convenience if I had to derive another approximation from this value.
Although the exact answer is 106.337, I would not spend time on an exact calculation, because the entire discussion around this topic is based on assumptions anyway.
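To make the idea of "round first, then multiply" concrete, here is a minimal sketch in Python. The helper name round_sig is my own for illustration, and rounding to one significant figure is just one assumed way of picking "convenient" numbers.

```python
import math

def round_sig(x: float, sig: int = 1) -> float:
    """Round x to `sig` significant figures (hypothetical helper)."""
    if x == 0:
        return 0.0
    # Number of decimal places needed to keep `sig` significant figures.
    digits = sig - 1 - math.floor(math.log10(abs(x)))
    return round(x, digits)

exact = 11 * 9.667
# Round each factor to one significant figure before multiplying.
approx = round_sig(11) * round_sig(9.667)
error_pct = abs(exact - approx) / exact * 100

print(f"exact={exact:.3f}, approx={approx:g}, error={error_pct:.1f}%")
```

With these inputs both factors round to 10, so the approximation is 100, about 6% below the exact value. That is close enough for an estimate built on assumptions.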
I have included a cheat sheet near the end of this article.
Back-of-the-envelope calculations are integral to system design and help you choose the right configurations and technologies for the system. And yes, you need an idea of how scalability works to get an approximation that is close to what is actually expected.
This includes estimating request and response sizes, database size, cache size, the number of microservices, load balancers, and so on. Estimating the network bandwidth requirement is also part of designing the system.
It is always a good idea to estimate the scale of the system before starting any high-level or low-level design (HLD and LLD); I will cover those in separate articles. For now, let us work through a few examples to understand how this estimation works and what it looks like.
In my view, a system design has four major areas where back-of-the-envelope calculations are required, and I will only talk about those: load estimates, database storage, cache estimates, and bandwidth estimates.
Understanding the type of system helps you make good estimates. In particular, knowing whether the system is read-heavy or write-heavy is a good starting point. For example, a tiny URL service is read-heavy, whereas a web crawler is write-heavy.
Although this is more helpful in choosing the technologies and database types, it can also guide some caching decisions. Caching is typically used to improve performance and is useful for read-heavy systems, much less so for write-heavy ones.
Always remember a short word, BKMGTP.
B: Byte: base unit: 1
K: Kilo: Thousand: 1 000
M: Mega: Million: 1 000 000
G: Giga: Billion: 1 000 000 000
T: Tera: Trillion: 1 000 000 000 000
P: Peta: Quadrillion: 1 000 000 000 000 000
K*K = M | M*K = G | G*K = T | T*K = P
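The mnemonic above can be written down as a few constants. This is a small sketch that assumes decimal units (powers of 1000), as the article does; the function name human_readable is my own, not part of any library.

```python
# Decimal (power-of-1000) units, following the BKMGTP mnemonic.
KB = 1000
MB = KB * KB   # K*K = M
GB = MB * KB   # M*K = G
TB = GB * KB   # G*K = T
PB = TB * KB   # T*K = P

def human_readable(num_bytes: float) -> str:
    """Format a byte count using the largest unit that keeps the value >= 1."""
    for name, size in (("PB", PB), ("TB", TB), ("GB", GB), ("MB", MB), ("KB", KB)):
        if num_bytes >= size:
            return f"{num_bytes / size:g} {name}"
    return f"{num_bytes:g} B"

# One million requests of 1 KB each:
print(human_readable(1_000_000 * KB))  # prints "1 GB"
```

Keeping the units as named constants makes the later estimates read almost like the handwritten version.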
Let us start with load estimates, that is, the traffic estimation of your system. The load estimate identifies the volume of requests the system is going to process. You can express it as requests per second, per minute, or per hour, or even as daily active users (DAU). It is usually measured per second and can be scaled to per day if required.
Here is an example in back-of-the-envelope format.
Let us assume a system has a write-to-read ratio of 1:100, and there are one million write calls per day.
Try to figure out what the write requests per second would be.
From the daily volume given, the write requests per second work out to:
1M / (24 hours * 3600 seconds) ~= 12 requests/sec.
The read requests per second then follow from the ratio with basic math: 12 * 100 = ~1200 requests/sec.
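The load estimate above can be sketched in a few lines of Python. The variable and function names are my own; the inputs (1M writes per day, 1:100 write-to-read ratio) are the assumptions from the example.

```python
SECONDS_PER_DAY = 24 * 3600  # 86,400

def per_second(requests_per_day: float) -> float:
    """Convert a daily request count into an average per-second rate."""
    return requests_per_day / SECONDS_PER_DAY

writes_per_day = 1_000_000   # assumption from the example
reads_per_write = 100        # write-to-read ratio of 1:100

write_qps = per_second(writes_per_day)
read_qps = write_qps * reads_per_write

print(f"writes: ~{write_qps:.0f}/sec, reads: ~{read_qps:.0f}/sec")
# prints "writes: ~12/sec, reads: ~1157/sec"
```

The unrounded read rate is about 1157 per second; rounding the write rate to 12 first and then multiplying gives the ~1200 quoted above. Either figure is fine for an estimate, and these are averages, not peak rates.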
Let us move on -
Next come database estimates. These are especially important because the database is an integral component of any application. Architects and engineers sometimes spend most of their time on the business logic and services to be built, and little time on database selection and estimates.
Note: application response time depends directly on the response time of the data source and the underlying database. Database selection and data modeling are important, and applications can use an RDBMS, key-value stores, and other stores to improve performance. Structured data typically goes into an RDBMS, and unstructured data into a NoSQL store such as MongoDB or Cassandra. The database type should always be chosen based on the use case rather than by following a one-size-fits-all approach.
You may be asked questions along these lines; for example:
Take the previous question with a twist: the write-to-read ratio is 1:100, there are one million write calls per day,
and say we must store the data for 10 years.
Now let us solve this one step at a time.
We can assume each write request is 1 KB.
Storage per day = 1M * 1 KB = 1 GB
Storage per year = 1 GB * 365 days = 365 GB, or roughly 360 GB
Storage for 10 years = 360 GB * 10 years = 3.6 TB
Finally, we cannot forget to set aside some storage for audit, user, and security data, etc.
Let us assume all of this data takes up another 0.4 TB.
So, in total for 10 years, we need roughly 3.6 TB + 0.4 TB = 4 TB.
And that is it: using only assumptions, we have worked out a back-of-the-envelope storage requirement for 10 years.
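Here is the same storage arithmetic as a small Python sketch. The function name and parameters are hypothetical; the 1 KB request size and the 0.4 TB of audit, user, and security overhead are the assumptions made above.

```python
KB = 1000
GB = KB ** 3
TB = KB ** 4

def storage_estimate_bytes(writes_per_day, bytes_per_write, years, overhead_bytes=0):
    """Total bytes written over `years`, plus a flat overhead allowance."""
    per_day = writes_per_day * bytes_per_write
    return per_day * 365 * years + overhead_bytes

total = storage_estimate_bytes(
    writes_per_day=1_000_000,   # assumption from the example
    bytes_per_write=1 * KB,     # assumed request size
    years=10,
    overhead_bytes=0.4 * TB,    # assumed audit/user/security data
)

print(f"{total / TB:.2f} TB")  # prints "4.05 TB"
```

The code uses 365 days without rounding down to 360, so it lands at 4.05 TB, which rounds to roughly 4 TB. Note that this sketch ignores replication, indexes, and growth in traffic, none of which are part of the example's assumptions.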
If you encounter these kinds of questions, work out one side of the ratio first; the other side is then a mathematical breeze.
If you are wondering how I figured out that a write request takes 1 KB: that was my assumption. Your estimates are only as good as your assumptions and your overall knowledge of scalability.
In system design, the better you understand how systems are built, the better you will be at architecting one. If you do not know the basic principles of what a service does, what it cannot do, and where its limitations lie, your estimates will suffer; it all boils down to your understanding of the overall system.
Another interesting but sometimes off-putting topic is the cache estimate. There is no hard rule for cache requirements. Some applications do not maintain a cache at all; others size the cache at 10% to 30% of the database storage, and some at 20% to 30% of the data that is currently accessed most frequently. From the norms I have researched, the preferred approach is an 80/20 or 70/30 split, that is, caching the 20% to 30% of data that is accessed most often, which improves performance. Let us do some back-of-the-envelope calculations for the cache.
Assume that the storage per day is 1 GB.
Then the cache requirement would be 20% of the storage per day, i.e.
(20 * 1 GB) / 100 = 200 MB
Seems like simple math, but no: this calculation is incorrect. It is based on the data written per day, whereas a cache serves reads, so it should be sized from the read volume.
Write-to-read ratio: 1:100
Write requests per day: 1M (assumed)
Read requests per day: 100M
Per request size: 1 KB
Total read request size: 100M * 1 KB = 100 GB per day.
Cache estimate = 20% of the read volume = (20 * 100 GB) / 100 = 20 GB.
So, with these assumptions, the cache estimate is 20% of the daily read volume.
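The corrected cache calculation can be sketched like this in Python. The function name and the hot_fraction parameter are my own labels; the 20% figure and the 1:100 ratio are the assumptions used above.

```python
KB = 1000
GB = KB ** 3

def cache_estimate_bytes(reads_per_day, bytes_per_read, hot_fraction=0.20):
    """Size the cache as a fraction of the daily read volume."""
    return reads_per_day * bytes_per_read * hot_fraction

writes_per_day = 1_000_000            # assumption from the example
reads_per_day = writes_per_day * 100  # write-to-read ratio of 1:100

cache = cache_estimate_bytes(reads_per_day, 1 * KB)

print(f"{cache / GB:.0f} GB")  # prints "20 GB"
```

Passing hot_fraction=0.30 instead gives the 70/30 variant of the same rule of thumb, which works out to 30 GB under these assumptions.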
And let us not forget that caches are costly.
Overall, good estimates help in better budgeting.
A quick tip to memorize: ten million requests per day at 1 KB each is 10 GB per day. From that anchor you can quickly scale to the other side of the ratio.
Finally, network bandwidth estimates. Bandwidth here means the amount of data the network must carry per second. The kind of question you may get looks like this:
Let us assume the same ratio as in the previous example and work it out.
Write-to-read ratio: 1:100
Write requests per day: 1M (assumed)
Read requests per day: 100M
Per request size: 1KB
Total Read Requests Size: 100M * 1 KB = 100 GB per day.
Read bandwidth (per second) = 100 GB per day / (24 * 3600) = 1157.4 KB/sec = ~1 MB per second.
Writes are 1% of read requests (1:100), so write bandwidth = 1 MB/s / 100 = 0.01 MB/s.
Because write traffic is only 1/100 of read traffic, we can say that ~1 MB/s is the bandwidth estimate we are looking for in this system.
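As a sketch, the bandwidth estimate looks like this in Python. The function name is hypothetical, and the request counts and the 1 KB size are the same assumptions as in the example.

```python
KB = 1000
MB = KB ** 2
SECONDS_PER_DAY = 24 * 3600

def bandwidth_bytes_per_sec(requests_per_day, bytes_per_request):
    """Average bytes per second needed to carry a day's worth of requests."""
    return requests_per_day * bytes_per_request / SECONDS_PER_DAY

read_bw = bandwidth_bytes_per_sec(100_000_000, 1 * KB)  # 100M reads/day
write_bw = bandwidth_bytes_per_sec(1_000_000, 1 * KB)   # 1M writes/day

print(f"read: {read_bw / MB:.2f} MB/s, write: {write_bw / MB:.3f} MB/s")
# prints "read: 1.16 MB/s, write: 0.012 MB/s"
```

The write side is two orders of magnitude smaller than the read side, which is why the estimate is dominated by reads. These are daily averages; sizing for peak traffic would need an additional assumption about peak-to-average load.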
Now we have an idea of the concepts and have worked out how to calculate each of them individually, so let us take one more example that covers them all.
Let us suppose you build a clone of Twitter, and you have the following assumptions.
Now, let us understand the estimations we can derive from these assumptions.
Query per second (QPS) estimate:
We will only estimate media storage here.
Feel free to work out all the derivations you can from your own assumptions, and start searching for mock questions to practice.
I will insert the cheat sheet here -
Quick Summary
Remember, back-of-the-envelope calculations are meant to be quick and approximate, providing an estimate to guide decision-making in the pilot stages of system design.
Thanks for reading. As always, please reach out with any questions or feedback.