Overview of high availability and replication options

SQL Server has a variety of options when it comes to automatically creating "copies" of databases on other SQL Servers. This may be done for a variety of reasons:

a) high availability - to ensure a database never goes down

b) for reporting - to offload to another SQL Server a copy of the data

c) to handle remote locations - if 20 locations all need to be able to view and/or update data in a shared database

d) for performance - to offload to another SQL Server a copy of the data to spread the workload

SQL Server has a variety of options:

  • Server clustering

  • Database mirroring:

  • Log shipping

  • Snapshot replication

  • Merge replication

  • Transactional replication (with no updating allowed by subscribers)

  • Transactional replication (with updating allowed by subscribers)

  • Peer-to-peer replication (SQL Server 2008 Enterprise only)

It is sometimes hard to keep track of which of these approaches is best for the situation.


Server Clustering

Key thoughts:

  1. Used to guarantee high availability.

  2. A "virtual SQL server" is created. Windows Cluster Administrator controls which actual SQL Server is used when a client application connects to the virtual SQL Server.

  3. SQL Server (the service) is installed on every physical server

  4. The SQL Server databases are stored on a network hard drive that can be shared between all the servers.

  5. Windows server clustering automatically rolls-over "active" server to another server if server goes down. This is called a "failover".

  6. SQL Server Standard supports 2 node clustering.

  7. SQL Server Enterprise supports up to 16 node clustering.

  8. SQL Server setup is used to add nodes to a cluster.

Failback policies:

By default, groups are set not to failback. Unless you manually configure your group to failback after failover, it continues to run on the alternate node after the failed node comes back online.

When you configure a group to automatically failback to the preferred node, you specify whether you want the group to failback as soon as the preferred node is available or to failback only during specific hours that you define. This option is useful if you want the failback to occur after peak business hours, or if you want to make sure the preferred node is able to support the group when it does come back online.

When setting the "Allow failback", you have two options, Immediate and Failback Between. When setting it to Immediately, ailback will occur as soon as the Cluster Service detects that the preferred owner is now online. When setting the Failback Between, it is hours on a 24 hour clock (i.e. 1 is 1:00am, 15 is 3:00pm, etc). So if you were to set this to Failback between 1 and 2, it means between 1:00am and 2:00am. This is used to prevent interruption of the availability of the cluster resources in that group during peak hours.If you were to set this to between 1 and 1, it means the current 24-hours and is the same setting as Immediate.

Database Mirroring

3 configuration options:

  1. High protection: 2 servers and safety turned on for maximum data redundancy ("synchronous mode") - changes are committed at both the principal and the mirror

  2. High performance: 2 servers and safety turned off for maximum performance ("asynchronous mode") - changes are committed at the principal only

  3. High availability: With automatic failover (requires a 3rd witness server and safety turned on)

Key thoughts:

  1. Updates cannot be made on the secondary server.

  2. Secondary server cannot be queried.

  3. A "mirrorendpoint" is the Windows port that the mirror is listening on.

  4. You can pause mirroring.

  5. You can stop mirroring.

  6. When you stop the mirroring process, you need to run the "restore with recovery" command on the mirrored copy because the mirrored copy is left in the "restoring state".

To apply patches in a mirroring setup:

1. Patch the witness first.

2. Pause the mirroring.

3. Patch the mirror and re-boot if necessary.

4. Resume mirroring.

5. Let synchronization on the mirror catch up so the mirror is current.

6. Fail over to the mirror.

7. Pause mirroring.

8. Patch the principal and re-boot if necessary.

9. Resume mirroring.

10. Let synchronization on the principal catch up so the principal is current.

11. Fail over to the principal.

Log Shipping

Key thoughts:

  1. One primary server and 1 or more secondary servers.

  2. You can have a server act as the monitor server

  3. Secondary servers can be queried

  4. Considered to be easier to setup than replication.

  5. 4 jobs are created: a) backup job on primary server; b) copy job on secondary servers that copies backup file; c) restore job on secondary servers that restores backups; d) an alert job on the option monitor server.

  6. The secondary servers are read-only. No updates are allowed.


Replication basics:

3 databases are involved: Publisher, Distributor and Subscriber(s). The Distributor can be on the same database server as the Publisher, or it can be remote.

Publications consist of multiple articles. Subscriber servers subscribe to a publication not an article.

Transactional Replication

Key thoughts:

  1. Can allow updates by subscribers, or not.

  2. Faster than merge replication.

  3. See this for a good summary of how to create: http://www.sql-server-performance.com/2010/transactional-replication-2008-r2/

  4. See http://msdn.microsoft.com/en-us/library/ms151794.aspx for a good description of how transactional replication works when you want to allow the subscribers to update the data.

Merge Replication

Key thoughts:

  1. Updates can occur at any level.

  2. Slower than transaction replication.

  3. By default, a log of conflicts is created at both publisher and subscriber level but this can be changed when you create the publication.

  4. SQL has default rules to handle conflicts. The default resolution mechanism behaves differently, depending on whether a subscription is a client subscription or a server subscription. You assign priority values to individual Subscribers that use server subscriptions; changes made at the node with the highest priority win any conflicts. For client subscriptions, the first change written to the Publisher wins the conflict.

  5. In merge replication, conflict resolution takes place at the article level. For publications composed of several articles, you can have different conflict resolvers serving different articles, or the same conflict resolver serving one article, several articles, or all the articles comprising a publication.

  6. You can tailor your own conflict resolution: use column based, or row based, or your own set of more complicated rules.