The Windows Server 2008 Failover Clustering feature provides high availability for services and applications. To keep those applications and services highly available, the cluster service running on each node must function as reliably as possible. Redundant and reliable communications connectivity among all the nodes in a cluster plays a large role in keeping the cluster functioning smoothly. Properly configured connectivity within a failover cluster not only gives clients access to highly available services but also provides the connectivity the cluster requires for its own internal communications. The sections that follow discuss Windows Server 2008 Failover Clustering networking features, functionality, and recommended processes for the proper configuration and implementation of network connectivity within a cluster.
The following sections provide the information needed to understand failover cluster networking and to properly implement it.
Windows Server 2008 Failover Cluster networking features
Windows Server 2008 Failover Clustering introduces new networking capabilities that are a major shift away from the way things have been done in legacy clusters (Windows 2000\2003 and NT 4.0). Some of these take advantage of the new networking features that are included as part of the operating system and others are a result of feedback that has been received from customers. The new features include:
New cluster network driver architecture
The legacy cluster network driver (clusnet.sys) has been replaced with a new NDIS level driver called the Microsoft Failover Cluster Virtual Adapter (netft.sys). Whereas the legacy cluster network driver was listed as a Non-Plug and Play Driver, the new fault tolerant adapter actually appears as a network adapter when hidden devices are displayed in the Device Manager snap-in (Figure 1).
Figure 1: Device Manager Snap-in
The driver information is shown in Figure 2.
Figure 2: Microsoft Failover Cluster Virtual Adapter driver
The cluster adapter is also listed in the output of an ipconfig /all command on each node (Figure 3).
Figure 3: Microsoft Failover Cluster Virtual Adapter configuration information
The Failover Cluster Virtual Adapter is assigned a Media Access Control (MAC) address that is based on the MAC address of the first enumerated (by NDIS) physical NIC in the cluster node (Figure 4) and uses an APIPA (Automatic Private Internet Protocol Addressing) address.
Figure 4: Microsoft Failover Cluster Virtual Adapter MAC address
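As a quick illustration of the APIPA point above, the following Python sketch checks whether an address falls in the IPv4 link-local (APIPA) range 169.254.0.0/16. The sample addresses and the helper name `is_apipa` are made up for illustration; they are not taken from the figures.

```python
import ipaddress

# The APIPA (IPv4 link-local) range.
APIPA_RANGE = ipaddress.ip_network("169.254.0.0/16")


def is_apipa(address: str) -> bool:
    """Return True when an IPv4 address lies in the APIPA range."""
    ip = ipaddress.ip_address(address)
    return ip.version == 4 and ip in APIPA_RANGE


if __name__ == "__main__":
    # Hypothetical sample addresses.
    for addr in ("169.254.1.27", "192.168.0.31"):
        print(addr, is_apipa(addr))
```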
The goal of the new driver model is to sustain TCP/IP connectivity between two or more systems despite the failure of any component in the network path. This goal can be achieved provided at least one alternate physical path is available. In other words, a network component failure (NIC, router, switch, hub, etc…) should not cause inter-node cluster communications to break down, and communication should continue making progress in a timely manner (i.e. it may have a slower response but it will still exist) as long as an alternate physical route (link) is still available. If cluster communications cannot proceed on one network, the switchover to another cluster-enabled network is automatic. This is one of the primary reasons that each cluster node must have multiple network adapters available to support cluster communications and each one should be connected to different switches.
The failover cluster virtual adapter is implemented as an NDIS miniport adapter that pairs an internally constructed virtual route with each network found in a cluster node. The physical network adapters are exposed at the IP layer on each node. The NETFT driver transfers packets (cluster communications) on the virtual adapter by tunneling through the best available route in its internal routing table (Figure 5).
Figure 5: NetFT traffic flow diagram
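The idea of tunneling through the best available route, and switching automatically when a route fails, can be sketched in a few lines of Python. This is only a conceptual model: the `Route` type, the `metric` field, and the rule that the lowest metric wins are assumptions for illustration and do not describe how NETFT actually ranks routes.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class Route:
    network: str   # label for a cluster network (hypothetical)
    local: str     # local endpoint address
    remote: str    # remote endpoint address
    metric: int    # lower is preferred (assumption for this sketch)
    up: bool = True


def best_route(routes: List[Route]) -> Optional[Route]:
    """Pick the preferred route among those that are still up."""
    candidates = [r for r in routes if r.up]
    return min(candidates, key=lambda r: r.metric) if candidates else None


if __name__ == "__main__":
    routes = [
        Route("network-1", "192.168.0.31", "192.168.0.32", metric=10),
        Route("network-2", "172.16.0.31", "172.16.0.32", metric=20),
    ]
    print(best_route(routes).network)
    routes[0].up = False               # simulate a failed NIC or switch
    print(best_route(routes).network)  # traffic moves to the other network
```

As long as one route stays up, the sketch keeps returning a usable path, which is the reason each node should have more than one cluster-enabled network.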
Here is an example to illustrate this concept. A 2-Node cluster is connected to three networks that each node has in common (Public, Cluster and iSCSI). The output of an ipconfig /all command from one of the nodes is shown in Figure 6.
Figure 6: Example Cluster Node IP configuration
Note: Do not be concerned with the name ‘Microsoft Virtual Machine Bus Network Adapter’ as these examples were derived from cluster nodes running as Guests in Hyper-V.
The Microsoft Failover Cluster Virtual Adapter configuration information for each node is shown in Figure 7. Keep in mind that the default port for cluster communication is still TCP/UDP 3343.
Figure 7: Node Failover Cluster Virtual Adapter configuration information
When the cluster service starts, and a node either Forms or Joins a cluster, NETFT, along with other components, is responsible for determining the node’s network configuration and connectivity with other nodes in the cluster. One of the first actions is establishing connectivity with the Microsoft Failover Cluster Virtual Adapter on all nodes in the cluster. Figure 8 shows an example of this in the cluster log.
Figure 8: Microsoft Failover Cluster Virtual Adapter information exchange
Note: You can see in Figure 8 that the endpoint pairs consist of both IPv4 and IPv6 addresses. The NETFT adapter prefers to use IPv6 and therefore will choose the IPv6 addresses for each end point to use.
As the cluster service startup continues, and the node either Forms or Joins a cluster, routing information is added to NETFT. Using the three networks mentioned previously, Figure 9 shows each route being added to a cluster.
Route between 1.0.0.31 and 1.0.0.32
Route between 192.168.0.31 and 192.168.0.32
Route between 172.16.0.31 and 172.16.0.32
Figure 9: Routes discovered and added to NETFT
Each ‘real’ route is added to the ‘virtual’ routes associated with the virtual adapter (NETFT). Again, note the preference for NETFT to use IPv6 as the protocol of choice.
The capability to place cluster nodes on different, routed networks in support of Multi-Site Clusters
Beginning with Windows Server 2008 failover clustering, individual cluster nodes can be located on separate, routed networks. This requires that resources that depend on IP Address resources (e.g. Network Name resources) implement an OR logic, since it is unlikely that every cluster node will have a direct local connection to every network the cluster is aware of. This allows IP Address resources, and hence Network Name resources, to come online when services\applications fail over to remote nodes. Here is an example (Figure 10) of the dependencies for the cluster name on a machine connected to two different networks.
Figure 10: Cluster Network Name resource with an OR dependency
Only the IP addresses associated with a Network Name resource that actually come online are dynamically registered in DNS (if configured for dynamic updates). This is the default behavior. If the preferred behavior is to register all IP addresses that a Network Name depends on, then a private property of the Network Name resource must be modified. This private property is called RegisterAllProvidersIP (Figure 11). If this property is set equal to 1, all IP addresses will be registered in DNS and the DNS server will return the list of IP addresses associated with the A-Record to the client.
Figure 11: Parameters for a Network Name resource
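The OR dependency and the effect of RegisterAllProvidersIP can be modeled with a short Python sketch. It is a simplified illustration under stated assumptions: the function names and the sample addresses are hypothetical, and the code only mirrors the behavior described in the text rather than calling any cluster API.

```python
from typing import Dict, List


def name_can_come_online(ip_online: Dict[str, bool]) -> bool:
    """OR dependency: one online IP Address resource is enough."""
    return any(ip_online.values())


def addresses_to_register(ip_online: Dict[str, bool],
                          register_all_providers_ip: int = 0) -> List[str]:
    """Return the addresses that would be registered in DNS.

    With the default (0) only online addresses are registered;
    with 1 every address the Network Name depends on is registered.
    """
    if register_all_providers_ip == 1:
        return list(ip_online)
    return [ip for ip, online in ip_online.items() if online]


if __name__ == "__main__":
    # Hypothetical IP Address resources on two routed networks.
    state = {"192.168.0.50": True, "172.16.0.50": False}
    print(name_can_come_online(state))
    print(addresses_to_register(state))
    print(addresses_to_register(state, register_all_providers_ip=1))
```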
Since cluster nodes can be located on different, routed networks, and the communication mechanisms have been changed to use reliable session protocols implemented over UDP (unicast), the networking requirements for Geographically Dispersed (Multi-Site) Clusters have changed. In previous versions of Microsoft clustering, all cluster nodes had to be located on the same network. This required that ‘stretched’ VLANs be implemented when configuring multi-site clusters. Beginning with Windows Server 2008, stretched VLANs are no longer required in every scenario.
Support for DHCP assigned IP addresses
Beginning with Windows Server 2008 Failover Clustering, cluster IP address resources can obtain their addressing from DHCP servers as well as via static entries. If the cluster nodes themselves have at least one NIC that is configured to obtain an IP address from a DHCP server, then the default behavior will be to obtain an IP address automatically for all cluster IP address resources. The new ‘wizard-based’ processes in Failover Clustering understand the network configuration and will only ask for static addressing information when required. If the cluster node has statically assigned IP addresses, the cluster IP address resources will have to be configured with static IP addresses as well. Cluster IP address resource IP assignment follows the configuration of the physical node and each specific interface on the node. Even if the nodes are configured to obtain their IP addresses from a DHCP server, individual IP address resources can be changed to static addresses (Figure 12).
Figure 12: Changing DHCP assigned to Static IP address
Improvements to the cluster ‘heartbeat’ mechanism
The cluster ‘heartbeat’, or health checking mechanism, has changed in Windows Server 2008. While still using port 3343, it is no longer a broadcast communication. It is now unicast in nature and uses a Request-Reply type process. This provides for higher security and more reliable packet accountability. Using the Microsoft Network Monitor protocol analyzer to capture communications between nodes in a cluster, the ‘heartbeat’ mechanism can be seen (Figure 13).
Figure 13: Network Monitor capture
A typical frame is shown in Figure 14.
Figure 14: Heartbeat frame from a Network Monitor capture
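The unicast request-reply pattern can be illustrated with a small Python sketch that runs entirely on the loopback interface. It is not the cluster's wire format: the payload layout, the function names, and the use of an ephemeral port instead of 3343 are all assumptions made for the demo.

```python
import socket
import threading


def responder(sock: socket.socket, count: int) -> None:
    """Answer up to `count` requests, echoing each sequence number."""
    for _ in range(count):
        try:
            payload, sender = sock.recvfrom(64)
        except socket.timeout:
            return
        sock.sendto(b"REPLY " + payload, sender)


def send_heartbeats(peer, count: int, timeout: float = 1.0) -> int:
    """Send `count` unicast requests to `peer`; return how many were answered."""
    answered = 0
    with socket.socket(socket.AF_INET, socket.SOCK_DGRAM) as s:
        s.settimeout(timeout)
        for seq in range(count):
            request = str(seq).encode()
            s.sendto(request, peer)
            try:
                reply, _ = s.recvfrom(64)
            except socket.timeout:
                continue  # a missed reply; the caller can count these
            if reply == b"REPLY " + request:
                answered += 1
    return answered


def demo(count: int = 3) -> int:
    """Run a responder and a sender on 127.0.0.1 and count the replies."""
    server = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)
    server.settimeout(2.0)
    server.bind(("127.0.0.1", 0))  # ephemeral port for the demo, not 3343
    worker = threading.Thread(target=responder, args=(server, count), daemon=True)
    worker.start()
    try:
        return send_heartbeats(server.getsockname(), count)
    finally:
        worker.join(timeout=3)
        server.close()


if __name__ == "__main__":
    print("replies received:", demo())
```

Because every request expects a matching reply, the sender can account for each packet individually, which is the property the text refers to as more reliable packet accountability.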
There are properties of the cluster that address the heartbeat mechanism; these include SameSubnetDelay, CrossSubnetDelay, SameSubnetThreshold, and CrossSubnetThreshold (Figure 16).
Figure 16: Properties affecting the cluster heartbeat mechanism
The default configuration (shown here) means the cluster service sends one heartbeat every second and waits for five missed heartbeats (5.0 seconds) before considering a cluster node unreachable and regrouping to update its view of the cluster. The limits on these settings are shown in Figure 17. Make changes to the appropriate settings depending on the scenario. The CrossSubnetDelay and CrossSubnetThreshold settings are typically used in multi-site scenarios where WAN links may exhibit higher than normal latency.
Figure 17: Heartbeat Configuration Settings
These settings allow for the heartbeat mechanism to be more ‘tolerant’ of networking delays. Modifying these settings, while a worthwhile test as part of a troubleshooting procedure (discussed later), should not be used as a substitute for identifying and correcting network connection delays.
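To make the arithmetic concrete, here is a minimal Python sketch that multiplies a delay by a threshold to get the time before a node is considered unreachable. It assumes the delay is expressed in milliseconds; the same-subnet values match the defaults described above, while the cross-subnet values are hypothetical and chosen only to show the effect of more tolerant settings.

```python
def detection_time_seconds(delay_ms: int, threshold: int) -> float:
    """Time without heartbeats before a node is treated as unreachable."""
    return delay_ms * threshold / 1000.0


if __name__ == "__main__":
    # Defaults described in the text: one heartbeat per second, five misses.
    print(detection_time_seconds(delay_ms=1000, threshold=5))
    # Hypothetical, more tolerant cross-subnet settings for a slow WAN link.
    print(detection_time_seconds(delay_ms=2000, threshold=10))
```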
Support for IPv6
Since the Windows Server 2008 OS supports IPv6, the cluster service needs to support this functionality as well. This includes being able to support IPv6 IP Address resources and IPv4 IP Address resources either alone or in combination in a cluster. Clustering also supports IPv6 Tunnel Addresses. As previously noted, inter-node cluster communications by default use IPv6. For more information on IPv6, please review the following:
Microsoft Internet Protocol Version 6
Implementing networks in support of Failover Clusters
The main consideration when designing Failover Cluster networks is to ensure there is built-in redundancy for cluster communications. This is typically accomplished by having a minimum of two physical Network Interface Cards (NICs) installed in each node that will be part of the cluster. These cards must be supported by two separate and distinct buses (e.g. two PCI NICs). Many people think a single multi-port NIC meets this requirement. It does not, because this configuration creates a single point of failure for all cluster communications. The best configuration would be two multi-port NICs running on separate buses, with fault tolerance implemented by way of NIC Teaming software (provided by third-party vendors) and with physical connections to separate network switches.
Note: NIC Teaming is not supported on iSCSI connections. Please review the iSCSI Cluster Support: Frequently Asked Questions. The appropriate fault-tolerant mechanism for iSCSI connectivity would be multi-path software. Please review the Microsoft Multi-path I/O: Frequently Asked Questions.
There are two primary design scenarios when planning for Failover Cluster network connectivity. In the first scenario (and the most common), all nodes in the cluster are located on the same networks. In the second scenario, nodes in the cluster are located on separate and distinct routed networks (this is very common in multi-site cluster implementations). Figure 18 shows an example of the second scenario.
Figure 18: Multi-site cluster (network connectivity only)
Note: Although cluster nodes can now be located on separate, routed networks, connecting the nodes in a multi-site cluster using stretched Virtual Local Area Networks (VLANs) is still supported. This configuration places the nodes on the same network(s).
It is important in any cluster that no two NICs on the same node are configured to be on the same subnet. This is because the cluster network driver uses the subnet to identify networks and will use the first one detected and ignore any other NICs configured on the same subnet on the same node. The cluster validation process will register a Warning if any network interfaces in a cluster node are configured to be on the same network. The only possible exception to this would be for iSCSI (Internet Small Computer System Interface) connections. If iSCSI is implemented in a cluster, and MPIO (Multi-Path Input/Output) is being used for fault-tolerant connections to iSCSI Storage, then it is possible that the network interfaces could be on the same network. In this configuration, the iSCSI network in the Failover Cluster Manager should be configured so that the cluster does not use it for any cluster communications.
Note: Please consult the iSCSI Cluster Support: Frequently Asked Questions.
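A check for NICs that share a subnet on one node can be sketched in Python with the standard `ipaddress` module. The interface names, addresses, and prefix lengths below are hypothetical, and the helper only mimics the kind of condition the validation warning describes; it is not the validation test itself.

```python
import ipaddress
from collections import defaultdict
from typing import Dict, List


def nics_sharing_a_subnet(interfaces: Dict[str, str]) -> Dict[str, List[str]]:
    """Group NICs by subnet and return the subnets used by more than one NIC.

    `interfaces` maps a NIC name to an 'address/prefix' string.
    """
    by_subnet = defaultdict(list)
    for name, cidr in interfaces.items():
        subnet = ipaddress.ip_interface(cidr).network
        by_subnet[subnet].append(name)
    return {str(net): names for net, names in by_subnet.items() if len(names) > 1}


if __name__ == "__main__":
    # Hypothetical node configuration with two NICs on 192.168.0.0/24.
    node = {
        "Public": "192.168.0.31/24",
        "Cluster": "172.16.0.31/16",
        "Extra": "192.168.0.41/24",
    }
    print(nics_sharing_a_subnet(node))
```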
As previously mentioned, Windows Server 2008 accommodates cluster nodes being located on separate, routed networks by including a new logic, called an OR logic, when it comes to IP Address resources. Figure 19 illustrates this.
Figure 19: IP Address Resource OR logic
When a Network Name resource is configured with an OR dependency on more than one IP Address resource, at least one of the IP Address resources must be able to come Online before the Network Name resource can come Online. Since a Network Name resource can be associated with more than one IP Address, there is a property of a Network Name resource that can be modified so DNS registrations will occur for all of the IP Addresses. The property is called RegisterAllProvidersIP (see Figure 20).
Figure 20: Network Name resource properties
Note: In Figure 20 above, Failover Cluster PowerShell cmdlets were used to access cluster configuration information. This is new in Windows Server 2008 R2. For more information, review the TechNet Cmdlet Reference.
The default registration behavior is to register only the IP Address that can come Online on the node. Changing this behavior by setting the property to 1 can assist name resolution in a multi-site cluster scenario.
Note: Please review KB 947048 for other things to consider when deploying failover cluster nodes on different, routed subnets (multi-site cluster scenario).
While Failover Clusters require a minimum of two NICs to provide reliable cluster communications, there are scenarios where more NICs may be desired and\or required based on the services or applications that are running in the cluster. One such scenario has already been mentioned: iSCSI connectivity to storage. The other scenario involves Microsoft’s virtualization technology, Hyper-V.
The integration of Failover Clustering with Hyper-V was introduced in Windows Server 2008 (RTM) in the form of making Virtual Machines highly available in a cluster by being able to move (Failover) the Virtual Machines between the nodes in the cluster using a process called Quick Migration. In Windows Server 2008 R2, additional capabilities were introduced including Live Migration and Cluster Shared Volumes (CSV). These features improved the high availability story for Virtual Machines, but also introduced new networking requirements. The inner workings of Hyper-V networking will not be discussed here. For more information, please download this whitepaper (http://www.microsoft.com/downloads/details.aspx?displaylang=en&FamilyID=3fac6d40-d6b5-4658-bc54-62b925ed7eea).
The networking requirements in a Hyper-V Cluster supporting Live Migration and using Cluster Shared Volumes (CSV) can add up quickly as illustrated in Figure 21.
Figure 21: Hypothetical Networking Requirements
For more information on Live Migration and Cluster Shared Volumes in Windows Server 2008 R2, visit the Microsoft TechNet site.
Using Cluster Shared Volumes in a Failover Cluster in Windows Server 2008 R2
Hyper-V: Using Live Migration with Cluster Shared Volumes in Windows Server 2008 R2
Troubleshooting cluster networking issues
As previously stated, it is important that redundant and reliable cluster communications connectivity exist between all nodes in a cluster. However, there may be times when communications connectivity within a cluster gets disrupted either because of actual network failures or because of misconfiguration of network connectivity. A loss of communications connectivity with a node in a cluster can result in the node being removed from cluster membership. When a node is removed from cluster membership, it will terminate its cluster service to avoid problems or conflicts as other nodes in the cluster take over the services or applications and resources that were hosted on the node that was removed. The node will attempt to rejoin the cluster when the cluster service restarts. This problem can also have broader effects because the loss of a node in a cluster affects ‘quorum’. Should the number of nodes participating in a cluster fall below a majority, all highly available services will be taken Offline until ‘quorum’ is re-established (the No Majority: Disk Only quorum model is the one exception; however, this model is not recommended).
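The majority rule can be expressed as a one-line check. The Python sketch below is a deliberately simplified model: it assumes each node carries exactly one vote and that there is no disk or file share witness, and the function name is made up for illustration.

```python
def has_quorum(total_votes: int, votes_online: int) -> bool:
    """Return True when strictly more than half of the votes are online."""
    return votes_online > total_votes // 2


if __name__ == "__main__":
    # A hypothetical five-node cluster losing nodes one at a time.
    for online in (5, 4, 3, 2):
        print(online, "of 5 online ->", has_quorum(5, online))
```

In this model a five-node cluster survives the loss of two nodes, while a four-node cluster survives the loss of only one, since two of four votes is not a majority.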
Here are some recommended troubleshooting procedures for cluster connectivity issues:
1. Examine the system log on each cluster node and identify any errors reporting a loss of communications connectivity in the cluster or even broader network related issues. Here are some example cluster related error messages you may encounter:
Figure 22: Cluster Network Connectivity error messages
Source: http://technet.microsoft.com/en-us/library/cc773562(WS.10).aspx
Figure 23: Network Connectivity and Configuration error messages
Source: http://technet.microsoft.com/en-us/library/cc773417(WS.10).aspx
2. If the system logs provide insufficient detail, generate the cluster logs and inspect the contents for more detailed information concerning the loss of network connectivity.
Note: Generate the cluster logs by running this PowerShell cmdlet –
3. Verify the configuration of all networks in the cluster.
4. Verify the configuration of network connectivity devices such as Ethernet switches.
5. Run an abbreviated cluster validation process by selecting only the Network tests.
The tests that are executed are shown here:
The desired end result is this:
As an example, here is the section in the validation report that shows the results for the List Network Binding Order test –
Some of the common issues seen with respect to the network validation tests include, but may not be limited to:
· Multiple NICs on a cluster node configured to be on the same subnet.
· Excessive latency (usually > 2 seconds) in ping tests between interfaces on cluster nodes.
· Warning that the firewall has been disabled on one or more nodes.
6. Conduct simple networking tests, such as a ‘ping’ test, across all networks enabled for cluster communications to verify connectivity between the nodes. Use network monitoring tools such as Microsoft’s Network Monitor to analyze network traffic between the nodes in the cluster (refer to Figures 13 and 14).
7. Evaluate hardware failures related to networking devices such as Network Interface Cards (NICs), network cabling, or network connectivity devices such as switches and routers as needed.
8. Review the change management log (if one exists in your organization) to determine what, if any, changes were made to the nodes in the cluster that may be related to the disruption in communications connectivity.
9. Consider opening a support incident with Microsoft. If a node is removed from cluster membership, there were no networks configured on that node that could be used to communicate with other nodes in the cluster. If there are multiple networks configured for cluster use, as recommended, then cluster membership loss indicates a problem that affects all the networks or the system’s ability to send or receive heartbeat messages.
Note: For additional information on troubleshooting Windows Server 2008, consult TechNet.
Hopefully, the information provided in this three part blog was helpful and will assist in properly configuring network connectivity in Windows Server 2008 Failover Clusters.
The Windows Server 2008 Failover Clustering: Networking three-part blog series has been out for a little while now. Hopefully, it has been helpful. Little did I know there would be an opportunity to write another part. This segment will be short as it covers a very specific scenario, one that we rarely see but have encountered often enough that I felt it was worth writing about.
There are applications written to access resources that are being hosted in Microsoft clusters running on Windows Server 2008 (RTM + R2). The resource could be a File Server, a SQL database, or anything else. The point is that the required resource is being hosted in a Failover Cluster. It is hoped that applications that need to function in this manner are written properly to locate the required resource being hosted in a cluster. By that I mean I would expect an application to first query a name server (DNS server) and then use the information obtained to make a proper connection to the required cluster resource. In a Failover Cluster, that connection point is known as a Client Access Point (CAP). A CAP consists of a Network Name (NetBIOS) resource and one or more IP Address resources. The default behavior in a Windows Server 2008 cluster is to dynamically register CAP information in a DNS server, provided it is configured to support Dynamic Updates. This occurs when the CAP is brought Online in the cluster.
There are applications that are not written in this manner. Some applications are written in such a way that they make a local connection on a cluster node by binding to the first network adapter and then use the IP address configured for that adapter. The problem is that in a cluster, the first connection listed in the binding order is by default the Microsoft Failover Cluster Virtual Adapter. This adapter uses an IP address drawn from the APIPA (Automatic Private IP Addressing) address range, which is non-routable and not registered in DNS.
To help these types of applications work better, we can use a utility that has been released for public download on the Microsoft MSDN site. The utility is called ‘nvspbind’. So, the first step is to download and install the utility on each cluster node. The options we will be using are shown in Figure 1.
Figure 1: Options for nvspbind
First, we need to identify the adapter that is the Microsoft Failover Cluster Virtual Adapter by using the nvspbind /n command (Figure 2). The adapter is ‘Local area connection* 9’.
Figure 2: Identify the Microsoft Failover Cluster Virtual Adapter
Next, we use ‘nvspbind /o ms_tcpip’ to determine the binding order for IPv4 (Figure 3).
Figure 3: Listing the bindings for IPv4
We can see here that the adapter is listed at the top of the binding order for IPv4, which is what causes the problem for some applications. We need to move the adapter down in the binding order, so we will use the following command to accomplish that:
C:\nvspbind /- "local area connection* 9" ms_tcpip (Figure 4).
Figure 4: Moving the adapter down in the binding order for IPv4
Note: The adapter can be moved further down by using /-- if desired.
Once the adapter has been positioned correctly in the binding order, the application can be tested to see if it now works as desired.
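The effect of the two options on the binding order can be pictured with a small list model in Python. This sketch does not call nvspbind or touch the system; the function name is hypothetical, and the adapter names other than ‘Local area connection* 9’ are made up. It treats /- as moving the adapter down one position and /-- as moving it to the bottom, which is an assumption based on the note above.

```python
from typing import List


def move_down(order: List[str], name: str, to_bottom: bool = False) -> List[str]:
    """Return a new binding order with `name` moved down.

    to_bottom=False models a single step down; to_bottom=True models
    moving the adapter to the end of the list.
    """
    items = list(order)
    index = items.index(name)
    if to_bottom:
        items.append(items.pop(index))
    elif index < len(items) - 1:
        items[index], items[index + 1] = items[index + 1], items[index]
    return items


if __name__ == "__main__":
    # Hypothetical binding order with the cluster virtual adapter on top.
    order = ["Local area connection* 9", "Public", "Cluster"]
    print(move_down(order, "Local area connection* 9"))
    print(move_down(order, "Local area connection* 9", to_bottom=True))
```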
To further highlight the effect of this utility, we can inspect the registry. First, we need to locate some information for the Microsoft Failover Cluster Virtual Adapter. Navigate to the following registry key (Figure 5) and locate the adapter:
HKEY_LOCAL_MACHINE\SYSTEM\CurrentControlSet\Control\Class\{4D36E972-E325-11CE-BFC1-08002BE10318}
Figure 5: Microsoft Failover Cluster Virtual Adapter NetCfgInstanceId
The same information shown in Figure 5 is also displayed in Figure 2.
With the information in hand, navigate to the following registry key (Figure 6) to verify the adapter is no longer listed at the top of the binding order.
Figure 6: HKLM\SYSTEM\CurrentControlSet\services\Tcpip\Linkage
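The manual check can also be expressed as a short Python sketch that looks for an adapter's NetCfgInstanceId in a list of Bind entries. It assumes the Bind value holds entries of the form \Device\{GUID} and that the list has already been read from the registry (on Windows this could be done with the standard `winreg` module). The GUIDs and the function name are made up for illustration.

```python
from typing import List


def bind_position(bind_entries: List[str], net_cfg_instance_id: str) -> int:
    """Return the zero-based position of an adapter in the Bind list, or -1."""
    target = net_cfg_instance_id.strip("{}").lower()
    for index, entry in enumerate(bind_entries):
        # Keep only the GUID that follows the last backslash.
        guid = entry.rsplit("\\", 1)[-1].strip("{}").lower()
        if guid == target:
            return index
    return -1


if __name__ == "__main__":
    # Hypothetical Bind entries; the second one stands in for the
    # Microsoft Failover Cluster Virtual Adapter after it was moved down.
    bind = [
        r"\Device\{11111111-2222-3333-4444-555555555555}",
        r"\Device\{AAAAAAAA-BBBB-CCCC-DDDD-EEEEEEEEEEEE}",
    ]
    print(bind_position(bind, "{AAAAAAAA-BBBB-CCCC-DDDD-EEEEEEEEEEEE}"))
```

A result of 0 would mean the adapter is still at the top of the binding order.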