CockroachDB vs. PostgreSQL: Why I'm Switching Sides.

While working on the system and infrastructure design for a project, I decided to research the database (DB) options I had besides the Postgres I usually use. I found a couple of candidates, but I ended up settling on CockroachDB, and I’m liking that decision. Let's look at the reasons why I made this decision and why you should do the same.


But why am I moving from PostgreSQL? Is there a problem with it? PostgreSQL is fantastic and it will suffice for most people. But PostgreSQL is quite old, dating back to 1996, and it was not designed to scale out across today's cloud infrastructure. CockroachDB, on the other hand, is built for that infrastructure and excels at horizontal scaling.


Let's look a bit deeper at some of the advantages that CockroachDB has to see why I prefer this option.


The first is architecture. PostgreSQL is built on the traditional single-primary architecture, which means a single node is responsible for all the write operations and, unless you add read replicas, all the reads too. This is not bad per se, but it does not scale out well. CockroachDB, on the other hand, is built on a distributed architecture where data is spread across multiple nodes and each node handles a portion of the workload, which is perfect for scaling horizontally.
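
To make "each node handles a portion of the workload" a bit more concrete, here is a toy Python sketch of key-range partitioning: the key space is cut into contiguous ranges and the ranges are spread across nodes. The node names, split points, and round-robin placement rule are all made up for illustration; this is not how CockroachDB actually places or rebalances its data, only the general idea.

```python
from bisect import bisect_right

# Toy model only: node names and split points are hypothetical.
NODES = ["node-1", "node-2", "node-3"]
SPLIT_POINTS = ["g", "n", "t"]  # ranges: [..g), [g..n), [n..t), [t..]


def range_index(key: str) -> int:
    """Return the index of the contiguous key range that holds `key`."""
    return bisect_right(SPLIT_POINTS, key)


def node_for(key: str) -> str:
    """Place ranges on nodes round-robin (an illustrative rule, not a real one)."""
    return NODES[range_index(key) % len(NODES)]


if __name__ == "__main__":
    for key in ["alice", "henry", "oscar", "zoe"]:
        print(key, "-> range", range_index(key), "on", node_for(key))
```

The point of the sketch is that a read or write for "henry" only touches the node holding that range, so adding nodes adds capacity instead of piling more work on one machine.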


The second is scalability, which every system designer needs to take into consideration, especially if you are already dealing with large datasets or eventually will be. A wrong decision here could bottleneck your system pretty fast.


PostgreSQL scales vertically, by adding more resources to a single node: more CPU, more RAM, or more storage. If you are a cloud or DevOps engineer, you have probably seen how this becomes a problem. The cost of scaling vertically can get out of hand over time, as cloud providers charge significantly more the more CPU and RAM a virtual machine (EC2 or Compute Engine) has. Scaling only vertically can also create a bottleneck, since everything still runs through one machine: the incoming load is not distributed, and you may not get the speed-up you expected.

Let's explain this better with an illustration.

Think of a lumberjack, a person who cuts down trees. Obviously, if he switches from a machete to an axe, his cutting speed will improve, and if he switches from an axe to a chainsaw, it improves further, and so on. But the reality is that there is only so much a single person can do, no matter how good or high-tech his tool is. This is basically what vertical scaling is, as well as its bottleneck.


Let's go back to the illustration, and this time use a horizontal scaling approach. Instead of one man, we have 10 men. Even though these men are using axes, they will cut down more trees than a single man with a chainsaw, and if we give them all chainsaws, they will do 10x the work of a single man in the same amount of time. But the 10 men with axes are already doing more work, so you don't need to invest in chainsaws for all of them; just save that money and put it where it is needed more. This is basically what horizontal scaling is, along with its major benefit.


Two heads are better than one, and four hands are better than two: all these sayings capture the idea of horizontal scaling.

With horizontal scaling, you can scale your CockroachDB cluster to match your growing data load, and you can do it at a lower cost than vertical scaling, because a few low-end servers will be cheaper than one big, high-end server.

Let's look at some numbers. Below are two comparisons of lower-end EC2 instances against higher-end ones.


First Set ---------------------

These are not-so-low-end vs. entry-level high-end instances.

t4g.xlarge: 4 vCPU + 16 GB RAM - $0.1344 / hr

t3.micro: 2 vCPU + 1 GB RAM - $0.0104 / hr


Monthly price comparison (hourly rate x 720 hours):

Config one: 1x t4g.xlarge - $97 / month

Config two: 6x t3.micro - $45 / month

Config three: 10x t3.micro - $75 / month


Second Set ---------------------

These are lowest-end vs. mid-level high-end instances.

t4g.nano: 2 vCPU + 0.5 GB RAM - $0.0042 / hr

t4g.2xlarge: 8 vCPU + 32 GB RAM - $0.2688 / hr


Monthly price comparison:

Config one: 1x t4g.2xlarge - $194 / month

Config two: 6x t4g.nano - $18 / month

Config three: 10x t4g.nano - $30 / month

Config four: 20x t4g.nano - $60 / month

Config five: 50x t4g.nano - $151 / month
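
If you want to check the second-set figures yourself, the arithmetic is just the hourly rate times hours in a month times the number of servers. The short Python sketch below uses the two hourly rates listed above and assumes a 720-hour month (30 days x 24 hours), which is my assumption for the monthly figures; the function name is only for illustration.

```python
HOURS_PER_MONTH = 720  # assumption: 30 days x 24 hours

# Hourly on-demand rates (USD) as quoted in the second set.
HOURLY_RATE = {
    "t4g.nano": 0.0042,
    "t4g.2xlarge": 0.2688,
}


def monthly_cost(instance: str, count: int = 1) -> int:
    """Monthly cost in whole dollars for `count` instances of one type."""
    return round(HOURLY_RATE[instance] * HOURS_PER_MONTH * count)


if __name__ == "__main__":
    print("1x t4g.2xlarge:", monthly_cost("t4g.2xlarge"))
    for count in (6, 10, 20, 50):
        print(f"{count}x t4g.nano:", monthly_cost("t4g.nano", count))
```

Swap in the current rates for your own region before relying on the output, since on-demand prices vary by region and change over time.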


Even if you are as buff as John Cena, you probably won't be able to take down 50 vicious teenagers 😂.

NOTE: I didn't even use the highest-end servers in this comparison because the price difference would just be ridiculous. You would start seeing $500+, $800+, and even more per month for a single server.


Let's look at the first set. You can have up to 10 servers and distribute your DB workload across them using CockroachDB, and it will still cost you less than PostgreSQL on a single high-end server. The higher the spec of the server, the higher the monthly cost. So why lock yourself into a DB that can only scale vertically and costs more, when you can opt for a DB that scales horizontally, gives you a whole squad of servers sharing the load, and still costs less?


To wrap up, there are other benefits to CockroachDB, like built-in replication and fault tolerance mechanisms. Although there are a lot of pros to using CockroachDB, if I were to rank them, my number one is scalability. Even if you are a start-up, you may be surprised how quickly your single-node PostgreSQL can become a bottleneck, not to mention the sky-high bills you will face every month.


Other info:

AWS EC2 Prices

Host your own CockroachDB for Free
