Caching is not always in-memory
Caching is one of the most useful techniques in software development. Whenever we think about caching, our minds immediately picture Redis or Memcached, because the industry has made these the go-to standard tools for caching.
But is Caching always in-memory?
Because we talk so much about Redis, Memcached, and similar tools, we tend to assume that caching is an in-memory technique, where we use RAM to store data for fast access. But is that really the case?
What actually is Caching?
Caching is simply a technique for fetching data quickly. The emphasis is on the word quickly; nowhere does the definition say the data needs to live in memory.
A cache can live anywhere we store a duplicate of the data that is faster to access than the original source.
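That definition fits in a few lines of code. Here is a minimal read-through cache sketch: the "faster duplicate" is just a Python dict, and the slow source of truth is a stand-in function (both names are illustrative, not from any real SDK). Nothing here requires RAM-resident infrastructure like Redis; the dict could equally be a table in a database or a file on disk.

```python
import time

cache = {}  # the "faster duplicate" -- could be RAM, a local DB, or a file


def fetch_from_source(key):
    """Hypothetical slow source of truth (a remote API, a disk read, etc.)."""
    time.sleep(0.01)  # simulate network / disk latency
    return f"value-for-{key}"


def get(key):
    if key in cache:                     # cache hit: serve the duplicate
        return cache[key]
    value = fetch_from_source(key)       # cache miss: go to the slower source
    cache[key] = value                   # keep a duplicate for next time
    return value


get("AAPL")   # slow: goes to the source
get("AAPL")   # fast: served from the duplicate
```

The only property that makes this a cache is that the lookup in `cache` is cheaper than calling `fetch_from_source` again.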
Let's see some examples of caching in real life -
My recent use-case
Recently, while building a FinTech product on Cosmocloud, we had to fetch Stock Market data from the Stock Exchange to power the whole application -- stock prices, trades, financial data, etc.
Now, if we called the Stock Exchange API on every request, we could run into rate limiting with the Exchange, especially for things like Balance Sheets and Income Statements, which are not updated very frequently.
We could very well sync that data into our own primary database, and then power our application with the data we already hold. This technique brings several advantages: fewer calls to the Data Partner, lower latency, and better uptime guarantees.
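As a sketch of that sync idea, the snippet below uses a plain dict to stand in for our primary database and a stub function to stand in for the Data Partner's API (all names here are hypothetical). The application only ever reads the local copy; the expensive partner call happens in a scheduled sync job, not on the request path.

```python
def fetch_balance_sheet(symbol):
    """Stands in for the rate-limited Data Partner API (illustrative data)."""
    return {"symbol": symbol, "assets": 1000, "liabilities": 400}


primary_db = {}  # stands in for our primary database: symbol -> document


def sync_symbol(symbol):
    doc = fetch_balance_sheet(symbol)   # one rate-limited partner call
    primary_db[symbol] = doc            # upsert into our own store


def get_balance_sheet(symbol):
    return primary_db.get(symbol)       # the app reads only the local copy


# Run on a schedule (e.g. nightly), not once per user request.
for s in ["AAPL", "MSFT"]:
    sync_symbol(s)
```

With this split, the partner's rate limits constrain the sync job alone, while application reads scale with our own database.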
Do we need Redis / In-Memory Cache for this?
Let's look at it from a few different perspectives...
The size of the data
The size of this data can be huge if we are syncing 30+ years of Stock Market data from our Data Partner. Will all of that fit in memory?
In-Memory Retention and Failover
What happens if our Redis server goes down? What would be the performance impact of fetching all the data from the Data Partner (the source) again?
Do we really need millisecond speed or better uptime?
Scaling and maintaining Redis (and similar in-memory databases) has a huge impact on the cost of an application: the number of developers and DevOps engineers needed, the price of RAM in the cloud, and the effort of partitioning and maintaining large clusters all add up.
The main issues we had to solve were data sync, the latency of calling a third party, uptime guarantees, and reducing API costs from the Data Partner.
If we can meet our minimum requirements using a primary database, MongoDB for example, then that database itself becomes a cache layer, holding a copy of the data coming from the Source of Truth.
What other examples of caching can you share? Do comment if an interesting use-case comes to your mind.
I write on interesting system design and software architectures, primarily focusing on Databases, Platform and Backend. Subscribe to this newsletter to stay updated 💜
SDE-2 at Atlassian | Ex-Microsoft · 10mo
Just a basic question about this problem. Pulling historical data from the partner and syncing it to our data layer is a one-time setup. What about real-time data? Does the Data Partner provide a callback-based system for it, or will the FinTech platform have a job-based mechanism to pull the latest data at regular intervals? Callback-based systems look more efficient here to me! I have tried searching for callbacks from stock exchanges but didn't find any; not sure if they exist.
Backend Developer | NodeJS | Rust · 10mo
Even Redis has a file-based persistence mechanism that provides both durability and speed.
Director of Engineering - Entrata · 10mo
Very true and a nice read. We can cache the majority of static data, but what about real-time data, e.g. a stock price changing every second?
Software Engineer @Qualimatrix | Migrating your web apps to Next.js@14 · 10mo
Thanks, I learnt a lot of things from this read, Shrey 🙌