Important Factors to Reduce SQL Database Downtime and Maintain Database Continuity

Important Factors to Reduce SQL Database Downtime and Maintain Database Continuity

Today, we will talk about why business continuity is important. I just give you some data. About half of the small businesses do not reopen their business following a disaster. And let me tell you that this is a remarkably high percentage.

So, what differentiates a small business from a large company? Well, for sure, the presence of a disaster recovery plan. A large company has no doubts that its data is the main asset and therefore it must always be kept safe and always accessible.

A large company needs to know how long it can afford to go without data. This means how long the system can be idle before it is restored to an operational state. This time is called Recovery Time Objective (RTO).

So how much is your company's RTO? How much is the RTO of your largest customer?

We know this well. Nowadays, we are living in a society that is driven and fed by data. So, data is very important. It is also essential to reduce downtime.

Another concept we hear a lot about is that of high availability (HA).

When is a database highly available?

Well, the definition is by no means trivial. “A database is highly available when users can access data without worrying if the server is crashing or losing data.”

Over time, many technologies have been developed to create both HA and DR solutions.

Basically, all these technologies aim to have a copy of the data available on another server, which is the concept of redundancy. If we place this server at a location other than the main server, we also create a DR or disaster-proof solution. However, the two concepts, as we will see, often overlap each other.

Now let's see what techniques are present in SQL Server that can ensure business continuity.

Disaster Recovery Techniques supported by SQL Server

Let’s analyze disaster recovery techniques supported by SQL Server from the point of view of the Recovery Time Objective (RTO).

  • Backup and Restore

The first technique is obviously the Backup and Restore. It's the simplest approach to getting an HA (High Availability) / DR (Disaster Recovery) solution and it is free.

You need a secondary server ready to restore the full database backup of each database. But this can’t be a real-time process. You will need to try many times to evaluate if the duration of the entire process meets your RTO. Restoring many huge databases require a lot of time.

Since this method is completely manual, it is even more important to define a disaster recovery plan that can be followed by any IT.

The next step in automating disaster recovery procedures is to have the data that is continuously kept aligned. You will then have your database on the main server and another database on a secondary server.

SQL Server allows you to do this in three diverse ways: Log Shipping, SQL Server Replication, and Mirroring.

In this way, we are certainly going to decrease the RTO.

  • Log Shipping

The log shipping functionality was introduced in SQL Server 2000, but only in the Enterprise (and developer) editions. Starting from SQL Server 2017, it can also be used in the Standard edition. This feature allows you to send the transaction database log backup from your server to a secondary server.

These are the logics:

  • The backup is performed on the primary database and a copy of the log backup is made on the secondary server.
  • On the secondary server, the copy of the transaction log is restored.

No alt text provided for this image

Now, if this solution meets your RTO, it has the merits of being inexpensive.

On the other hand, it is a manual HA/DR system. In fact, you have to manually point your system and applications to the secondary database. This is the main limitation, along with the fact that the duration of the operation depends on the size of the log file.

Depending on where we send the transaction log, log shipping can be considered an HA or DR solution. If both systems are in the same datacenter, we speak of HA. If the systems are in different datacenters, we speak of a DR solution.

SQL Server Transactional Replication

SQL Server Transactional Replication is primarily a DR (disaster recovery) solution. It is the first solution that we meet today where data changes that occur are provided as they occur or precisely in near real-time.

Conceptually, the term “Replication” refers to copying and distributing database objects from one database to another in a synchronized way to maintain data consistency.

Personally, I appreciate this technology more for having the data on another server other than the primary one. Querying a secondary server can lighten the workload of the primary server. With that in mind, you get a kind of workload balancing.

SQL Server supports several types of replications: merge replication, snapshot replication, peer-to-peer replication, bi-directional replication, updatable subscriptions, and transactional replicas. However, the changes that occur in the publisher are delivered to the subscriber as soon as they occur and are also applied to the subscriber in the same order they occurred in the publisher.

Of the “jargons” used in the replicas, some terms are defined that have become standard. These are:

The publisher is a database instance that makes data available to other locations through replication. A Publisher can have one or more publications, each defining a logically related set of objects and data to replicate.

A distributor is a database instance that acts as a store for replication specific data, associated with one or more Publishers. Each Publisher is associated with a single database (known as a distribution database) at the Distributor. The distribution database stores replication status data, metadata about the publication, and, in some cases, acts as a queue for data moving from the Publisher to the Subscribers. In many cases, a single database server instance acts as both the Publisher and the Distributor. This is known as a local Distributor. When the Publisher and the Distributor are configured on separate database server instances, the Distributor is known as a remote Distributor.

Subscriber is a database instance that receives replicated data. A Subscriber can receive data from multiple Publishers and publications. Depending on the type of replication chosen, the Subscriber can also pass data changes back to the Publisher or republish the data to other Subscribers.

An article identifies a database object that is included in a publication. A publication can hold diverse types of articles, including tables, views, stored procedures, and other objects. When tables are published as articles, filters can be used to restrict the columns and rows of the data sent to Subscribers.

A publication is a collection of one or more articles from one database. The grouping of multiple articles into a publication makes it easier to specify a logically related set of database objects and data that are replicated as a unit.

Subscription is a request for a copy of a publication to be delivered to a Subscriber. The subscription defines what publication will be received, where, and when. There are two types of subscriptions: push and pull.

No alt text provided for this image

Database Mirroring

Database mirroring is an HA solution where data is kept aligned between a Principal Server and a Mirroring server. In case of problems, an optional third server called Witness Server can start automatic failover but only if the mirroring type is in High Safety mode.

For completeness, speaking of mirroring, it must be said that there are two different ways of implementation:

  • A so-called "high-security mode" or synchronous mode. Here, any changes made to the main database are first pushed to the mirrored database. When the changes are committed on the mirrored server, they will also be committed on the main server.
  • A so called “high performance mode” or asynchronous mode. Here, any modification made to the principal database will be committed first in the principal database, then committed changes will be sent to the mirrored database.

While the high-security mode is an HA solution, the high-performance mode is a DR solution if the mirroring the database is on multiple data centers.

However, it cannot be said that this commonly used technology will no longer be supported in new versions of SQL Server. Instead, Microsoft recommends moving towards Always on Availability Groups technology.

Always on Availability Groups

As mentioned in the previous paragraph, the mirroring feature has been deprecated. In its place, Microsoft has introduced a new feature called Always on Availability Group.

SQL Server Always-On Availability Group provides high availability and disaster recovery solution and supports Automatic Failover.

Once a failover occurs, the failover target (current secondary) takes over the primary role and brings the database online to accept connections.

We can have one primary replica and multiple secondary replicas. These multiple replicas can be in synchronous or asynchronous data commit mode.

In synchronous data commit mode, the primary receives the acknowledgment from the secondary replica and then it commits transaction on the primary. Whereas in asynchronous data commit mode, the primary commits the transactions without waiting for any confirmation from the secondary replica.

To use the automatic failover feature, both primary and secondary replicas must use the synchronous commit mode and both replicas should be synchronized.

Load Balancing: We can also use this technology to improve performance and perhaps balance the workload.

In fact, having the same data on multiple servers, we can perform part of the queries on each of the servers. And this is a great plus from multiple performance standpoints. We have SQL Server database engines and we might as well use them.

Reduce SQL Database Downtime

What can we say after seeing the HA and DR (disaster recovery) techniques supported by SQL Server?

A company must know the value of its RTO. From this, you must choose the best technology to use by evaluating costs and benefits. Clearly, real-time data alignment is more than desirable, and even more so is automatic failover. Obviously, all this has an economic cost and a constructive complexity.

So, figure out what your RTO is, choose the right technology, and try.

Do you want to know a trick to reduce downtime?

The trick is amazingly simple: familiarize yourself with the procedures!

This is essential because it will allow you to be faster and safer when the failure really happens. Simulate periodically a breakdown and the time it takes to be back online. Indeed, I recommend you use these tests to fine-tune the last missing aspect.

Disaster Recovery Plan Document

Disaster Recovery Plan Document is an essential document that will help you cut down on downtime and remind you of all the steps to follow. On the internet, you will find many pre-packaged ones but I suggest you create your own. Within this document, we must put all the information we may need in an emergency. Yes, even the cell phone numbers of the technicians who must be available.

Within this document, we can put everything we believe to be useful. For example:

  • Step-by-step checklist of the recovery procedure
  • Hardware and software system configuration and password
  • Telephone numbers of all interested technicians, DBAs, network managers, and supervisors

I hope that these indications are useful and that you have found the reading interesting!

To view or add a comment, sign in

Insights from the community

Others also viewed

Explore topics