Thursday, July 26, 2012

SQL Server 2008 Failover Clustering Best Practices


SQL Server 2008 Failover Clustering Best Practices

■ Before installing SQL Server failover clustering, understand the prerequisites and verify that the clustering hardware is supported and certified by both the hardware vendor and Microsoft.

■ Leverage the Cluster Validation Tool included with Windows Server 2008 to ensure that all nodes within the cluster meet the prerequisites for deploying failover clustering.

■ Use identical hardware for all nodes in the cluster. This includes processor, memory, and firmware.

■ Configure Failover Clustering Feature from a Windows Server 2008 perspective prior to SQL Server 2008 Failover Clustering.

■ Ensure disk drive letters are identical on all nodes within the cluster.

■ Avoid having the quorum resource and other cluster resources from sharing the same disk.

■ Ensure that the public network adapter is the first adapter in the network binding list.

■ Disable NETBIOS on the private/heartbeat network adapters.

■ Disable write-back caching on host controllers.

■ Do not configure dynamic disks because clustering supports only basic disk configurations.

■ Determine whether a single-instance or multiple-instance configuration will be implemented. Plan the disk layout accordingly while taking future growth into account.

■ Identify which SQL Server features will be installed.

■ Do not use the same service account for Windows and SQL Server failover clustering.

■ Configure dependencies for shared disks so that they can be recognized and leveraged by SQL Server 2008.

■ Before using multiple instances, understand the impact of multipleinstance configurations and performance degradation on the surviving node if a failover occurs.

■ Change the service accounts only via SQL Server Configuration Manager.

■ Do not configure DTC resources within the same service or application as the SQL Server virtual instances.

■ Use the SQL Server Installation Center to modify a SQL Server failover cluster installation.

■ Ensure that each virtual server name is unique on the Active Directory domain.

■ Understand the benefit and impact associated with the different quorum models included in Windows Server 2008.

■ For advanced SQL Server 2008 failover clustering installations, use the planning and deployment tools included with the SQL Server Installation Center.

Note from book "Microsoft SQL Server 2008 Management and Administration".

SQL Server 2008 Failover Clustering Terminology


Failover Clustering Terminology

■ SQL Server virtual server—A SQL Server virtual server is, in fact, a cluster-configured resource group that contains all resources necessary for SQL Server to operate on the cluster. This includes the NetBIOS name of the virtual server, a TCP/IP address for the virtual server and all disk drives, and vital SQL Server services required to operate in a clustered configuration. In a multiple instance, two or more node clusters and one SQL Server virtual server are created per node, whereas the NetBIOS name and TCP/IP address of the cluster form the virtual server. When failover occurs in this configuration, the entire SQL Server virtual server fails over to the surviving node in the cluster dynamically.

■ Heartbeat—A single User Datagram Protocol (UDP) packet is sent every 500 milliseconds between nodes in the cluster across the internal private network. This packet relays health information about the cluster nodes as well as health information about the clustered application. If there is no response during a heartbeat to indicate that the node is alive, the cluster begins the failover process. In SQL Server 2008, this interval can be changed. This capability is useful when you are using a geographically dispersed cluster.

■ Failover—Failover is the process of one node in the cluster changing states from offline to online, resulting in the node taking over responsibility of the SQL Server virtual server. The Cluster Service fails over a group in the event that node becomes unavailable or one of the resources in the group fails.

■ Failback—Failback is the process of moving a SQL Server virtual server that failed over in the cluster back to the original online node.

■ Quorum resource—The quorum resource, also referred to as the witness disk in Windows Server 2008, is the shared disk that holds the cluster server’s configuration information. All servers must be able to contact the quorum resource to become part of a SQL Server 2008 cluster. In Windows Server 2003, the entire cluster’s health and vitality depends on the quorum drive being available. With Windows Server 2008, a majority quorum model can be leveraged to eliminate this single point of failure.

■ Resource group—A resource group is a collection of cluster resources such as the SQL Server NetBIOS name, TCP/IP address, and the services belonging to the SQL Server cluster. A resource group also defines the items that fail over to the surviving nodes during failover. These items also include cluster resource items such as a cluster disk. It is also worth noting that a resource group is owned by only one node in the cluster at a time.

■ Cluster resource—Cluster resources contain vital information for the SQL Server virtual server and include its network TCP/IP addresses, NetBIOS name, disks, and SQL Server services, such as the System Attendant. These cluster resources are added to services or applications when the virtual server is created to form SQL Server virtual servers. With Windows Server 2008, a clustered resource is known as an application and a service.

■ Dependency—A dependency is specified when creating cluster resources. Similar to a dependency on SQL Server services, a cluster resource identified as a dependency indicates that a mandatory relationship exists between resources. Before a cluster resource is brought online, the resource defined as a dependent must be brought online first. For instance, the virtual server NetBIOS name is dependent on the TCP/IP address; therefore, the TCP/IP address of the virtual server must be brought online before the NetBIOS name is brought online.

■ Majority Node cluster—In this configuration, each node is responsible for contributing one local disk to the quorum disk set that is used as storage disks. This configuration limits the majority node resource to one owner at a time. Because the quorum does not require a shared disk, the solution is typically used for geographically dispersed clusters.

■ Failover Cluster Management—Formerly known as Cluster Administrator, Failover Cluster Manager in Windows Server 2008 is a tool used by cluster and database administrators for accessing, creating, and administering Windows clusters. The Failover Cluster Administrator console is included in Windows Server 2008 and can be launched from any active node within the cluster. Additional administration and management tasks include viewing, creating, and deleting services or applications, cluster resources, and nodes.

■ Cluster witness disk or file share—The cluster witness or the witness file share are used to store the cluster configuration information, and to help determine the state of the cluster when some if not all of the cluster nodes cannot be contacted.

■ LUNs—LUN stands for Logical Unit Number. An LUN is used to identify a disk or a disk volume that is presented to a host server or multiple hosts by the shared storage device. Of course, there are shared storage controllers, firmware, drivers, and physical connections between the server and the shared storage, but the concept is that an LUN or set of LUNs is presented to the server for use as a local disk. LUNs provided by shared storage must meet many requirements before they can be used with failover clusters, but when they do, all active nodes in the cluster must have exclusive access to these LUNs.

Note from book "Microsoft SQL Server 2008 Management and Administration".