Caching is not always in-memory

Caching is one of the most elegant techniques in software development. Whenever we think about caching, our mind immediately pictures Redis or Memcached, because these have become the go-to technologies for caching in our industry.

But is Caching always in-memory?

Whenever we talk about Redis, Memcached, or similar tools used for caching, we tend to assume that caching is an in-memory technique, where we use RAM to store data for fast access. But is that really so?

What actually is Caching?

Caching is simply a technique to fetch data quickly. The emphasis is on the word quickly, and nowhere does it say the data needs to be in-memory.

Caching can happen anywhere we store a duplicate of the data that is faster to reach than the original source of that data.

Let's see some examples of caching in real life -

  • We cache a user's name and other info in Redis, rather than querying the primary database, to speed up reads.
  • We cache Twitter Tweets data in our primary database rather than calling the Twitter API repeatedly.
  • We cache Stock Market data in our primary database rather than calling the Stock Exchange API repeatedly.
  • We cache partially pre-computed values in a separate table/collection in our database, rather than redoing the maths on every query. (For example, storing a running total sum and record count separately to compute an average -- see the sketch after this list.)
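
The last example is worth a tiny sketch. Here is a minimal illustration of that idea, assuming MongoDB and collection/field names of my own choosing: instead of scanning every record to compute an average, we keep a running sum and count as a cached pre-computation and read them in O(1).

```python
# A minimal sketch: cache a partial pre-computation (sum + count) in the
# database so the average is an O(1) read instead of a full scan.
# Collection and field names are illustrative assumptions.
from pymongo import MongoClient

db = MongoClient()["analytics"]

def record_trade(symbol: str, price: float) -> None:
    # Write the raw record, and bump the cached aggregates alongside it.
    db.trades.insert_one({"symbol": symbol, "price": price})
    db.trade_stats.update_one(
        {"_id": symbol},
        {"$inc": {"total_sum": price, "total_count": 1}},
        upsert=True,
    )

def average_price(symbol: str) -> float:
    # Read the cached aggregates -- no need to touch the raw trades.
    stats = db.trade_stats.find_one({"_id": symbol})
    return stats["total_sum"] / stats["total_count"] if stats else 0.0
```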

My recent use-case

Recently, while building a FinTech product on Cosmocloud, we had to fetch the Stock Market data from the Stock Exchange in order to power the whole application -- Stock price, trades, financial data, etc.

Now, if we call the Stock Exchange API repeatedly, we could run into rate limiting from the Exchange, especially for things like Balance Sheets, Income Statements, etc., which are not updated that frequently anyway.

We could very well sync that data into our own primary database and then power our application from the data we already have (see the sketch after this list). This technique has these advantages -

  • No extra third-party network call -- lower latency
  • No dependency on third-party uptime and SLA
  • No rate-limiting issues (we are no longer hammering their API like a DoS)
  • Savings on pricing if the API is billed per usage.
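
As a concrete illustration, here is a minimal sketch of such a sync job, assuming a hypothetical exchange endpoint and MongoDB as the primary database. The URL, field names and schedule are my own assumptions, not the actual Cosmocloud implementation.

```python
# A minimal sketch of syncing slow-moving exchange data (e.g. balance sheets)
# into our own primary database. Run this on a schedule (say, a daily cron),
# so that application reads never hit the exchange directly.
# The endpoint and field names are assumptions for illustration.
import requests
from pymongo import MongoClient

db = MongoClient()["fintech"]

def sync_balance_sheets(symbols: list[str]) -> None:
    for symbol in symbols:
        resp = requests.get(
            "https://api.example-exchange.com/v1/balance-sheet",  # hypothetical URL
            params={"symbol": symbol},
            timeout=10,
        )
        resp.raise_for_status()
        # Upsert our local copy -- this is the "cache" the application reads from.
        db.balance_sheets.update_one(
            {"symbol": symbol},
            {"$set": {**resp.json(), "symbol": symbol}},
            upsert=True,
        )
```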

Do we need Redis / In-Memory Cache for this?

Let's see the different perspectives...

The size of the data

The size of this data can be huge if we are syncing 30+ years of Stock Market data from our Data Partner. Will this fit in memory?
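
A rough back-of-envelope makes the point; every number below is an assumption purely for illustration.

```python
# Can 30+ years of market data live comfortably in RAM?
# All figures are rough assumptions for a back-of-envelope estimate.
TICKERS = 5_000              # assumed number of symbols we track
TRADING_DAYS = 252 * 30      # ~30 years of trading days
BYTES_PER_BAR = 100          # rough size of one OHLCV record plus metadata

daily_bars = TICKERS * TRADING_DAYS
minute_bars = daily_bars * 390          # ~390 trading minutes per day

print(f"Daily bars:  {daily_bars * BYTES_PER_BAR / 1e9:.1f} GB")    # ~3.8 GB
print(f"Minute bars: {minute_bars * BYTES_PER_BAR / 1e12:.2f} TB")  # ~1.5 TB
```

Daily bars alone might fit, but the moment we add intraday data, fundamentals and multiple markets, keeping it all in RAM stops being economical.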

In-Memory Retention and Failover

What happens if our Redis server goes down? What would be the performance impact of fetching all the data from the Data Partner (the source) all over again?

Do we really need millisecond speed, or do we need better uptime?

Scaling and maintaining Redis (and similar in-memory databases) adds a huge cost to the application: the number of developers / DevOps engineers needed, the price of RAM on the cloud, and the effort of creating partitions and maintaining large clusters is just a very big deal.

Our main issues to solve were data sync, the latency of calling a third party, uptime guarantees, and decreasing API costs from the Data Partner.

If we are able to solve this minimum requirement using a primary database, MongoDB for example, then that MongoDB database itself becomes a cache layer which holds a copy of the data coming from the Source of Truth.
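
Here is a minimal read-through sketch of what that looks like in practice, assuming MongoDB and a hypothetical fetch_from_exchange() call (names, TTL and schema are my assumptions): read our copy first, and only fall back to the Source of Truth when the copy is missing or stale.

```python
# A minimal read-through sketch: MongoDB acts as the cache layer in front of
# the exchange API. Names, TTL and schema are illustrative assumptions.
from datetime import datetime, timedelta, timezone
from pymongo import MongoClient

db = MongoClient(tz_aware=True)["fintech"]   # keep stored datetimes tz-aware
MAX_AGE = timedelta(days=1)                  # fundamentals rarely change

def fetch_from_exchange(symbol: str) -> dict:
    ...  # hypothetical call to the Data Partner's API

def get_income_statement(symbol: str) -> dict:
    doc = db.income_statements.find_one({"symbol": symbol})
    now = datetime.now(timezone.utc)

    if doc and now - doc["synced_at"] < MAX_AGE:
        return doc  # served from our own database -- no third-party call

    fresh = fetch_from_exchange(symbol)
    fresh.update({"symbol": symbol, "synced_at": now})
    db.income_statements.update_one({"symbol": symbol}, {"$set": fresh}, upsert=True)
    return fresh
```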

What other examples of caching can you share? Do comment if any interesting use-case comes to your mind.

I write on interesting system design and software architectures, primarily focusing on Databases, Platform and Backend. Subscribe to this newsletter to stay updated 💜
Pratyush Prateek

SDE-2 at Atlassian | Ex-Microsoft

10mo

Just a basic question about this problem. Pulling historical data from the partner and syncing it to our data layer is a one-time setup. What about real-time data? Does the data partner provide a callback-based system for it? Or will the fintech platform have a job-based mechanism to pull the latest data at regular intervals? Callback-based systems look more efficient here to me! I have tried searching for callbacks from stock exchanges, but didn't find any, not sure if they exist.

thành trần

Backend Developer | NodeJS | Rust

10mo

Even Redis has a file-based persistence mechanism that ensures both durability and speed.

Shriram Bhandari

Director of Engineering - Entrata

10mo

Very true and a nice read. We can cache the majority of static data, but what about real-time data, e.g. stock prices changing every second?

Vivek Singh Bhadouria

Software Engineer @Qualimatrix | Migrating your web apps to Next.js@14

10mo

Thanks, learnt a lot of things from this read Shrey 🙌
