How to Work With a Large Legacy Codebase Like a Pro?
Learning to code is hard, but understanding a legacy codebase is another level of hard, even for experienced software developers. So today, I will show you how to understand one like a professional.
When you join a new company, you have a chance to help the company improve some of its processes tremendously with your fresh perspective. But in my experience, it rarely happens like that. To give yourself the best chance of helping out and being effective, it is important to know how to navigate a large legacy codebase and work with others on it.
Code is a Human By-Product
The legacy codebase you're working on is the result of decisions made by the business, engineering leaders, and developers in your company. It is called a "legacy codebase" for those reasons, and that means you have to be careful while dealing with it. It is a common saying that "the code you wrote is not an extension of you". But the truth is we often still feel prickly whenever people talk about our code unfavorably. And we subconsciously don't like to face the consequences of another person's decisions in the form of a legacy codebase. That is why you need to be careful when you join a company.
If you join a company that values process, you'll probably be guided by docs or by colleagues who understand the context of the codebase. What if you join a company that has yet to prioritize such processes? Whether they guide you or not, here's what you'll need to do.
Be Curious, Don't Be Critical
It is part of your job to understand the legacy codebase, and being critical might make colleagues (developers and managers) think you're rebuking them. In reality, most experienced professional developers have written legacy code themselves. So be curious, don't be critical. Be empathetic. Instead of saying something like "This code is crap" or complaining in any other way, be willing to learn the stories behind the codebase. I know, it is easier said than done, but you have to do it anyway to do well at your job. So find out why they did it that way. Ask your colleagues to explain things to you. Watch them while they're working with the code. Try to understand how it works.
Don't Code Yet – Use the Platform
It is tempting to rush into coding, but no, try to explore the platform first. Check everything about the platform, from speed to UI/UX.
Why is that important? Your job is to build for the users, and you can't understand what they feel if you're not in their shoes. So put yourself in their shoes first. Use the platform to feel what the end users feel.
Why is it your job to build for users? Well, those who hire you want you to deliver solutions based on the plans they have at hand. But indirectly, they're doing everything for the end users. Your understanding of the users' pain points might help you transform their ideas into products, or improve their decisions on what to build and how to build it.
See, it is easy to explain things away with knowledge, exposure, and experience. But it has been shown time and time again that it is better to check out the platform and see how it feels than to go with your assumptions.
The experience you gather from using the platform will also help you connect the codebase with the features of the platform and give you a better understanding of the codebase. That is another reason why you should use the platform before diving deep into code.
Read the Most Important Part of the Codebase
The Pareto principle (otherwise known as the 80/20 rule) works almost everywhere. It can help you navigate a codebase, too.
Instead of tinkering with the codebase randomly, ask colleagues with deep experience of the codebase which files and folders they use almost all the time. Focus on these files and folders first, and from there move on to others as required by the tasks you're given. Then check the other critical files that help you understand how the codebase is glued together, such as the config files.
It is important to read these parts of the codebase because they reveal important operations within it. Reading the config files and others can be really boring, so you don't have to understand them all at once. You can always revisit them later. That is the key.
Study the Workflows in the Codebase
Take your time to understand the workflow of the most important parts of the codebase.
Learn how this connects to that. Check what happens if you connect and disconnect this and that. By tracing the flow of operations within a codebase, you learn more about it. This experience will help you act with precision when you're implementing features or fixing bugs. Oh, wait! Are you wondering how to do this? Okay, you can start with a function. Read it, then read and understand the other functions or components that use it. You can repeat that process with modules, classes, and others until you have a solid understanding of the codebase. You can also trace how the codebase handles requests and responses, if applicable. Above all, find out how everything is connected.
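If the codebase happens to be in Python, the "start with a function, then read what uses it" step can be partly automated. The sketch below is only an illustration: `find_callers`, `SAMPLE`, and every function name inside `SAMPLE` are made up for this example. It uses the standard `ast` module to list the functions that call a given function by name.

```python
import ast


def find_callers(source: str, target: str) -> list[str]:
    """Return the names of functions in `source` whose bodies call `target`."""
    tree = ast.parse(source)
    callers = []
    for node in ast.walk(tree):
        if not isinstance(node, (ast.FunctionDef, ast.AsyncFunctionDef)):
            continue
        # Look at every call expression inside this function.
        for inner in ast.walk(node):
            if isinstance(inner, ast.Call):
                func = inner.func
                # Plain calls have a Name (foo()); method calls have an attr (obj.foo()).
                name = func.id if isinstance(func, ast.Name) else getattr(func, "attr", None)
                if name == target:
                    callers.append(node.name)
                    break
    return callers


# Hypothetical module contents, used only to demonstrate the helper.
SAMPLE = '''
def load_user(user_id):
    return {"id": user_id}

def render_profile(user_id):
    user = load_user(user_id)
    return "Profile " + str(user["id"])

def send_welcome(user_id):
    return "Hi " + str(load_user(user_id)["id"])

def unrelated():
    return 42
'''

print(sorted(find_callers(SAMPLE, "load_user")))
```

This matches calls by name only, so it can report false positives when two unrelated functions share a name. Treat the result as a reading list, not as proof of how things are connected.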
Research the Libraries and Frameworks
You're likely going to find hardcoded code, libraries, and frameworks (internal or external) within a legacy codebase. The libraries and frameworks might not be mainstream. So you'll need to research them and figure out how they're used, especially in the way your codebase requires. You can do this by googling the specific versions your codebase uses.
Sometimes, the libraries were designed within your organization. In that case, you'll need to seek support from colleagues who understand the context of those frameworks and libraries.
Honestly, it can be hard to ask for help if the helpers are now your subordinates. But the truth is, asking them for help doesn't mean you're not competent enough to get things done. Remember, they have the context of the codebase. They wrote the code, so they're in the best position to help you understand it and why they made their decisions. You don't have to feel bad about seeking help in this case. And even if you do feel bad about it, it is still okay to ask.
Understand the Hardcoded Code
By now you will have heard, or experienced for yourself, that some code can't be touched even though it seems to do nothing. From experience, I have learned that this kind of code is basically control code or mathematical expressions. This is what I mean: a piece of code may do nothing by itself, but another part of the code checks whether it is present before making a decision. So what happens if you remove the code that the other part checks for? Well, things break or you get unexpected results. The rest of the time, hardcoded code is just mathematical expressions that the developers dealing with the code are not familiar with.
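Here is a small made-up illustration of that kind of control code, written in Python. The classes `Order` and `OrderAfterCleanup`, the attribute `supports_partial_refund`, and the function `refund` are all hypothetical. The attribute looks unused where it is defined, yet another function changes its behavior depending on whether the attribute exists.

```python
class Order:
    # Nothing in this class reads the attribute, so it looks like dead code.
    supports_partial_refund = True

    def __init__(self, total: float) -> None:
        self.total = total


class OrderAfterCleanup:
    """The same class after someone removed the 'unused' attribute."""

    def __init__(self, total: float) -> None:
        self.total = total


def refund(order, amount: float) -> str:
    # The decision depends only on whether the attribute is present.
    if hasattr(order, "supports_partial_refund"):
        return f"partial refund of {amount:.2f}"
    return f"full refund of {order.total:.2f}"


print(refund(Order(100.0), 30.0))              # partial refund of 30.00
print(refund(OrderAfterCleanup(100.0), 30.0))  # full refund of 100.00
```

Nothing crashes after the "cleanup", which is exactly why this kind of change is dangerous: the behavior silently shifts from a partial refund to a full one.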
That reminds me of what happened to me recently. I was building a JavaScript package to convert GitHub to a Serverless Database. The structure of the data I had at hand dictated that I should use nested for loops, but I couldn’t do so because browsers don’t have the capacity to run complex operations the way a server does. So I decided to come up with some mathematical expressions that made it possible to achieve all I wanted without nested loops. I knew the expressions would appear hardcoded to other developers, but they got the job done without losing speed.
Anyway, I added context to it: I explained what it does, how it does it, and why I chose to do it that way. All I am saying, in essence, is that most hardcoded code is either control flow or mathematical expressions unfamiliar to the developers working on a codebase. Knowing this will set you on the right track whenever you have to deal with hardcoded code.
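To make the idea concrete, here is a minimal sketch in Python. It is not the expression from my package; `cell_positions` is a hypothetical helper that visits every cell of a grid with a single loop and index arithmetic instead of a loop inside a loop. The comment carries the what, how, and why, so the expression does not look like magic to the next developer.

```python
def cell_positions(rows: int, cols: int) -> list[tuple[int, int]]:
    """List every (row, col) position of a rows x cols grid using one loop."""
    positions = []
    for i in range(rows * cols):
        # What: turn the flat index i into a (row, col) pair.
        # How: row = i // cols and col = i % cols, which divmod computes together.
        # Why: one loop over a flat index replaces a loop nested inside a loop.
        positions.append(divmod(i, cols))
    return positions


print(cell_positions(2, 3))
# [(0, 0), (0, 1), (0, 2), (1, 0), (1, 1), (1, 2)]
```

Note that this still performs rows x cols iterations. It flattens the structure of the loop; it does not reduce the amount of work.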
Extend First, and Refactor Slowly
The first instinct we have as developers when we see a legacy codebase is to rewrite or refactor it. But we always forget that extending it should come first, because that keeps the business going and serves the interests of business leaders.
By extending a legacy codebase, I mean using its APIs to build new features. But we have to make sure whatever features we add don’t have the bad traits we see in legacy codebases. Yes, I know, it is easier said than done. Sometimes, circumstances will force you to repeat the very bad trait you hate. Yolo! You’re not alone. Also, you need to refactor slowly. By this, I mean you shouldn't rush to refactor a legacy codebase. Be patient until you understand the codebase and its context. Extend first, and refactor slowly.
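A minimal sketch of what "extend first" can look like, in Python. Both `legacy_price` and `price_with_discount` are hypothetical names invented for this example. The legacy function is treated as read-only, and the new feature is a small, validated function that calls it.

```python
# Hypothetical legacy function: left untouched, quirks and all.
def legacy_price(item: dict) -> float:
    return item["base"] * 1.2 if item.get("taxed") else item["base"]


def price_with_discount(item: dict, discount_rate: float) -> float:
    """New feature built on top of the legacy API instead of editing it."""
    # Validate input up front so the new code does not inherit loose habits.
    if not 0 <= discount_rate <= 1:
        raise ValueError("discount_rate must be between 0 and 1")
    return round(legacy_price(item) * (1 - discount_rate), 2)


print(price_with_discount({"base": 100.0, "taxed": False}, 0.25))  # 75.0
```

Because the legacy function is only called and never edited, existing callers keep working, and the new behavior lives in one place that can be refactored later once the context is understood.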
Document Your Journey to Understand a Legacy Codebase
If your organization appreciates process and empathy, it is good to document your journey as you begin to understand the codebase – from setting it up to working through every part of it.
You might improve your company’s onboarding process if the path to setting up your codebases and understanding them is clearly documented. At the same time, it will make life easier for the people coming after you, and even help the people who came before you, as well as your future self. Document everything, including possible challenges and how to fix them. Don’t forget to encourage others to improve your documentation, so that they make things easier for the next person just as it was done for them.
Doing this may present you as a leader and get you some leadership opportunities. Anyway, don’t force it. Do so only if it is allowed in your organization or you know how to help them adopt it.
Work Effectively on a Large Codebase
As a codebase becomes larger, it gets harder to understand everything. I have spent a tremendous amount of time on both open source projects and proprietary codebases. Oh sorry, I don’t mean “spent”, I mean wasted. So the point of this doc is to give you some of my thoughts on how to explore a codebase.
Curiosity killed the cat. I am not saying that you shouldn’t be curious when exploring a codebase. On the contrary, you definitely should. But I want to warn you that an excessive amount of curiosity can easily destroy your productivity.
Keeping this in mind helped me a lot in reshaping how I read code. My old habit of reading code was really bad: I was the type of person who would try to understand everything, and I always felt uncomfortable if I used code that I hadn’t read yet. I spent more time reading code than working on my tasks. My output was unsatisfactory, and I blamed myself for not knowing enough of the code. I was totally wrong!
When I started to think about how many lines of code have a direct impact on my work, I realized that the value of reading lots of code was not as significant as I had originally expected. Being comfortable working with APIs without worrying about their underlying implementations dramatically improves my productivity: it helps me focus on the things I want to build and reduces the time I spend reading code that is irrelevant to my work (even though some of that code is used in my work).
I am not discouraging you from reading code. I am encouraging you to set the right expectation of how reading code will help you. Bad code-reading habits are extremely dangerous, but if you are able to cultivate a good set of habits, they will take you a really long way. So what behaviors are good in my mind?
A good understanding of a codebase, including all its quirks and pitfalls, will definitely help you advance your impact, skills, and career in the long run: your code will be more consistent with the codebase, you will debug issues more quickly, your code will contain fewer bugs, you will find more opportunities to build impactful projects by taking advantage of in-house technologies, and so on. It just requires a little more time and a little more patience.