System Design - Simplified

System Design is such an amorphous topic that you can read many articles about it and come away with very little practical material. If you want to learn why System Design is so important and also get a practical process for doing it well and efficiently - bear with me.

I like to start everything with why, not what. Every solution, process or tool was created to solve a certain problem. Let's explore an example that shows us why System Design is so important.

Today, we want to deliver content in real time. To do so, we need to process the data as quickly as possible, and not let it "starve". There are two main approaches to processing that data - batch and stream. While stream processing has lately gained much popularity over batch processing, a deeper look shows that the choice is use-case dependent, and sometimes batch can serve us better, even for low-latency systems. Sounds shocking? Let me demonstrate using a children's song my niece likes called "The Five Little Monkeys".

In this song, there are five monkeys jumping on a bed. One falls off the bed, and his mom calls the doctor for advice. Immediately after, another monkey falls off, and again the mother calls the doctor. This goes on and on until no monkeys are left on the bed. Obviously a software engineering problem.

The mother is extremely inefficient in I/O - she could have turned five calls (streaming the monkey falls) into a single call if she had waited for them all to fall (batching the monkey falls). Of course, there are pros and cons to each approach. Sometimes we can't afford to let our monkeys wait, and sometimes we can't even tell whether we are waiting for nothing - are more monkeys even going to fall? See my post for a full list of batch and stream processing pros and cons. Nevertheless, a better solution here might be to wait a minute to see if more monkeys are going to fall, and only then call the doctor. This is a trade-off the mother should consider.
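To make the trade-off concrete, here is a minimal sketch that counts doctor calls under both approaches. The function names, the fall timestamps and the waiting windows are all made up for illustration; the batching rule (open a window at the first unreported fall, then report everything inside it with one call) is just one simple way to batch.

```python
from typing import List


def calls_when_streaming(fall_times: List[float]) -> int:
    """One doctor call per fall: lowest latency, most I/O."""
    return len(fall_times)


def calls_when_batching(fall_times: List[float], wait: float) -> int:
    """Open a window of `wait` seconds at the first unreported fall,
    then report every fall inside that window with a single call."""
    calls = 0
    window_end = None
    for t in sorted(fall_times):
        if window_end is None or t > window_end:
            calls += 1              # a new window means one more call
            window_end = t + wait
    return calls


# Hypothetical fall times, in seconds since the first monkey fell.
falls = [0, 20, 35, 50, 58]

print(calls_when_streaming(falls))          # 5
print(calls_when_batching(falls, wait=60))  # 1
print(calls_when_batching(falls, wait=30))  # 2
```

The price of the single call is latency: with a 60 second window, the first monkey waits a full minute before the doctor hears about it.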

Another example is throttling requests. Suppose we want to limit each user to x requests per period y, and we handle thousands of requests each second. Instead of updating our DB on every request, we might accumulate the request counts in memory over a period z and write only the sums to the DB. This of course means that our counts are only accurate to within z, but imagine the difference in I/O efficiency.
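A rough sketch of that idea, assuming a single process and an injected `write_to_db` callback that stands in for whatever storage we actually use. The class name, the callback and the fake clock are hypothetical; a real throttler would also need thread safety and the limit check itself.

```python
import time
from collections import defaultdict
from typing import Callable, Dict


class BufferedRequestCounter:
    """Sums request counts in memory and writes them out every `flush_interval` (our z)."""

    def __init__(
        self,
        flush_interval: float,
        write_to_db: Callable[[Dict[str, int]], None],
        clock: Callable[[], float] = time.monotonic,
    ) -> None:
        self._flush_interval = flush_interval
        self._write_to_db = write_to_db  # hypothetical storage callback
        self._clock = clock
        self._pending: Dict[str, int] = defaultdict(int)
        self._last_flush = clock()

    def record(self, user_id: str) -> None:
        self._pending[user_id] += 1
        if self._clock() - self._last_flush >= self._flush_interval:
            self.flush()

    def flush(self) -> None:
        if self._pending:
            self._write_to_db(dict(self._pending))  # one write for many requests
            self._pending.clear()
        self._last_flush = self._clock()


# Simulate 1000 requests spread over about two seconds, using a fake clock.
db_writes = []
now = [0.0]
counter = BufferedRequestCounter(1.0, db_writes.append, clock=lambda: now[0])
for i in range(1000):
    now[0] = i / 500
    counter.record("user-1")
counter.flush()

print(len(db_writes))                           # 2 writes instead of 1000
print(sum(w["user-1"] for w in db_writes))      # 1000
```

Between flushes the DB lags behind reality by up to z, which is exactly the accuracy we traded away.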

Now that we have seen that most problems (even the ones that look rather simple) can lead to many decisions and choices that will eventually affect the quality of the final product, we can move on to the 'what' part.

"System Design is the process of defining the different components and how they interact with each other so that it meets the user-end requirements".

Looking at the broader picture, taking all the factors into consideration and making the best decisions for our business is what System Design is all about. Here's how I do it: a process I've learned (and adjusted here and there to simplify) from experienced engineers throughout my military service and professional career.

We start with the Specification. Before we start designing a solution, we must fully understand the problem: all the workflows, behaviors and functionalities of the final product. The result of this step should be a doc that covers all of the above, and everything in that doc should be fully agreed upon by the stakeholders. It should not be technical at all. I find that stating the obvious here can do magic. What I usually do in this step is write in order to think - first I persist my thoughts, and then I index them. For that purpose I like FocusWriter, a clutter-free writing application, but you can use whatever text editor you're comfortable with. In many companies, this step is done in collaboration with Product Managers. Here's a simplified flow of the specification process:

[Image: simplified flow of the specification process]

Next up is the Design Overview. This is a technical representation of the specification doc. The way I structure it:

  1. Background. Short brief of why we are all gathered here.
  2. Related works - if any exist, and why they are not sufficient.
  3. Detailed design - mainly including each component's API, communication between components, data validation, logging and monitoring. The idea here is to reference the specification doc as much as possible. Every choice we make must serve an end-user requirement. We shouldn't make things up; everything was agreed upon. Sometimes we are tempted to choose the newest technologies, the ones with the most 'hype', even if they make our solution more complicated and require further adjustments. We must control our emotions here and put the business first. Some items, such as logging and health checks, will not link back to the specification doc because they don't come from direct user input.
  4. Alternative approaches, and why they were not chosen.
  5. Roll-out plan - A/B testing, feature flags. Often we want to release changes only to a subset of users to test usage.
  6. Time estimations. This part is one of the most challenging for me, and I would gladly skip it. The thing is, every user requirement comes with a deadline. The later we deliver, the less relevant our solution becomes. We could discuss how to estimate correctly all day long, and many books have been written on the topic, but in my opinion this is the kind of thing you get better at with experience. A basic technique I use is to split the solution into smaller stages and estimate the development time of each. In many cases, time estimations affect the actual solution we choose. There's always a trade-off between building the best solution and shipping faster, and we must balance it together with the stakeholders.
  7. Future work. Here we discuss the vision for our solution and what should happen with it as we go along. This can look years ahead, depending on how the stakeholders value the product. It can include added components to make it more robust, changes in communication between components, etc.
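As an illustration of the roll-out plan item, here is a minimal sketch of a percentage-based feature flag. It assumes we identify users by a string ID and want the same user to always get the same answer; the function name, feature name and user IDs are hypothetical, and hashing into 100 buckets is just one common way to pick a stable subset.

```python
import hashlib


def is_enabled(feature: str, user_id: str, rollout_percent: int) -> bool:
    """Deterministically place each user in a bucket from 0 to 99 and
    enable the feature for buckets below `rollout_percent`."""
    digest = hashlib.sha256(f"{feature}:{user_id}".encode("utf-8")).digest()
    bucket = int.from_bytes(digest[:4], "big") % 100
    return bucket < rollout_percent


# Hypothetical feature and users.
users = [f"user-{n}" for n in range(1000)]
enabled = [u for u in users if is_enabled("new-checkout", u, 10)]
print(f"{len(enabled)} of {len(users)} users see the new feature")
```

Because the bucket depends only on the feature name and the user ID, raising the percentage from 10 to 50 keeps the original 10% enabled and simply adds more users.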

Once the Design Overview is finished, we go into detail on each component in the Component Overview. Here, we explain the component's structure (e.g. for a REST API component, we explain the layer separation, which services the BL and DAL will contain, etc.). We also detail the internal implementation of each API that the component should supply to other components (as mentioned in the design overview). For example, in a database component overview, we can say "to allow fast fetching of contacts by name field we will create a BTree index on the name column in contacts table". One last thing is to mention the third parties the component will depend on, like frameworks and other components. Each component mentioned in the design overview, whether it's a new component or one that is being changed, should have a component overview.
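The contacts example above can be sketched with Python's built-in sqlite3 module, whose ordinary indexes are B-trees. This is only an illustration under assumptions: the class names, the index name and the in-memory database are made up, and a real component would sit behind a REST layer and a production database.

```python
import sqlite3
from typing import List, Tuple


class ContactsDal:
    """Data access layer: the only place that knows about SQL."""

    def __init__(self, conn: sqlite3.Connection) -> None:
        self._conn = conn
        conn.execute(
            "CREATE TABLE IF NOT EXISTS contacts "
            "(id INTEGER PRIMARY KEY, name TEXT NOT NULL, phone TEXT)"
        )
        # The index from the component overview: fast lookups by name.
        conn.execute(
            "CREATE INDEX IF NOT EXISTS idx_contacts_name ON contacts (name)"
        )

    def add(self, name: str, phone: str) -> None:
        self._conn.execute(
            "INSERT INTO contacts (name, phone) VALUES (?, ?)", (name, phone)
        )

    def find_by_name(self, name: str) -> List[Tuple[str, str]]:
        cursor = self._conn.execute(
            "SELECT name, phone FROM contacts WHERE name = ?", (name,)
        )
        return cursor.fetchall()

    def explain_find_by_name(self, name: str) -> str:
        """Return SQLite's query plan so we can check the index is used."""
        rows = self._conn.execute(
            "EXPLAIN QUERY PLAN SELECT name, phone FROM contacts WHERE name = ?",
            (name,),
        ).fetchall()
        return " ".join(row[-1] for row in rows)


class ContactsService:
    """Business logic layer: validation and rules, no SQL."""

    def __init__(self, dal: ContactsDal) -> None:
        self._dal = dal

    def lookup(self, name: str) -> List[Tuple[str, str]]:
        cleaned = name.strip()
        if not cleaned:
            raise ValueError("name must not be empty")
        return self._dal.find_by_name(cleaned)


dal = ContactsDal(sqlite3.connect(":memory:"))
dal.add("Dana", "555-0100")
service = ContactsService(dal)
print(service.lookup("  Dana "))
print(dal.explain_find_by_name("Dana"))
```

The printed query plan should mention `idx_contacts_name`, which is a cheap way to confirm that the decision written in the component overview actually holds in the implementation.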

You've probably noticed we haven't drawn any diagrams yet. That is because the Build Diagrams step comes last in the System Design process. Boxes with names and arrows don't mean a whole lot if there's nothing behind them to explain how they actually communicate and how they should be implemented. This step should be a recap and a visualisation of our technical solution. I use simple boxes and arrows; I don't like, and can't remember, the formal vocabularies of tech diagrams. I use Google Drawings, but draw.io and many other applications will do the trick.

That's it. The whole process, hopefully simplified and practical. A question I often get is: when should we go through this whole process? For every change in our system? That sounds like overkill. My answer is to use your common sense to decide whether a new feature, problem or requirement needs to go through this design process. I would say that if a change touches two or more components, it probably needs designing. Remember - the smaller the problem, the shorter the process will take. We should aspire to design for every problem, even the smaller ones. What seems small now might be hard to change in the future as we gather more data from users and as our code base grows.

On a final note, at the beginning of this article we discussed one trade-off that is rather common in the Software Engineering world, but there are many more. We probably encounter one almost every day: availability vs consistency (heard of the CAP Theorem?), indexing (sacrifice write speed for faster reads?), caching (faster results, but our system becomes more complex and often inconsistent), a robust framework (which often leads to unnecessary boilerplate and bloated code) vs flexible vanilla code (which might result in longer development time), polling (stateless) vs long polling (stateful), and many, many more.
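To pick one of these, the caching trade-off can be shown in a few lines. This is a minimal sketch of a time-to-live cache, assuming a `load` function that stands in for the slow source of truth; the class name, the fake clock and the price data are all hypothetical.

```python
import time
from typing import Any, Callable, Dict, Tuple


class TtlCache:
    """Serves a cached value for up to `ttl` seconds before reloading it."""

    def __init__(
        self,
        load: Callable[[str], Any],
        ttl: float,
        clock: Callable[[], float] = time.monotonic,
    ) -> None:
        self._load = load  # hypothetical slow read from the source of truth
        self._ttl = ttl
        self._clock = clock
        self._entries: Dict[str, Tuple[Any, float]] = {}

    def get(self, key: str) -> Any:
        now = self._clock()
        entry = self._entries.get(key)
        if entry is not None and now - entry[1] < self._ttl:
            return entry[0]           # fast path, possibly stale
        value = self._load(key)       # slow path, always fresh
        self._entries[key] = (value, now)
        return value


source = {"price": 10}
now = [0.0]
cache = TtlCache(source.__getitem__, ttl=30, clock=lambda: now[0])

first = cache.get("price")   # loaded from the source: 10
source["price"] = 12         # the source of truth changes
now[0] = 10.0
stale = cache.get("price")   # still 10: fast, but inconsistent
now[0] = 31.0
fresh = cache.get("price")   # entry expired, reloaded: 12
print(first, stale, fresh)
```

The `ttl` value is where the decision lives: a longer TTL means fewer slow reads and a longer window in which users can see outdated data.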

I hope that after reading this article, you'll be able to design better systems, and make better decisions for your business :)

For this article I took inspiration from many articles and engineers, and I specifically recommend Hussein Nasser's YouTube channel for quality backend engineering content.

Yair Zaslavsky

Senior Software Engineer at Cash App by Block (formerly Square), owner and senior consultant at LIDA group

2y

Great example, though in Australia they sing about 5 little joeys jumping on the bed. 5 little joeys jumping on the bed, One fell off and bumped its head Mummy called the Doctor, and the Doctor said - no more joeys jumping on the bed (Pay attention to the spelling of mum - the correct Australian way ;) )

Yossi Melamed

Fast Lane Israel CEO | Mamram Alumni Association Chairman | Peter Paltchik CEO & Manager | Entrepreneur

2y

Thanks for sharing Matan! You had a great lecture at our last conference. #IdfTech3D
