Linux Ethernet Bonding Driver

Author: myguangzhou (Technical Manager, Uniwise), 2011-10-08

Applies to:

Linux OS - Version: Enterprise Linux 3.0 to Oracle Linux 6.1 with Unbreakable Enterprise Kernel [2.6.32] - Release: RHEL3 to OL6U1
Generic Linux

Purpose

This document describes how to configure ethernet device bonding for Linux for high availability.

Scope and Application

This document applies to almost all architectures with 2.4 and 2.6 kernels. This is mostly taken from the bonding HowTo by Thomas Davis, Willy Tarreau, Constantine Gavrilov, Chad N. Tindel, Janice Girouard and Jay Vosburgh.

This document is specifically applicable to Enterprise Linux.

Linux Ethernet Bonding Driver

Introduction

The Linux bonding driver provides a method for aggregating multiple network interfaces into a single logical "bonded" interface. The behavior of the bonded interfaces depends upon the mode; generally speaking, modes provide either hot standby or load balancing services. Additionally, link integrity monitoring may be performed.

The latest version of the bonding driver can be found in the latest version of the Linux kernel, available from http://kernel.org

The latest version of the driver and its complete documentation can be found either in the latest kernel source (as Documentation/networking/bonding.txt) or on the bonding sourceforge site http://www.sourceforge.net/projects/bonding

Configuration

In Enterprise Linux the system does not automatically load the network adapter driver unless the ethX device is configured with an IP address. Because of this constraint, users must manually configure a network-script file for all physical adapters that will be members of a bondX link. Network script files are located in the directory:

/etc/sysconfig/network-scripts

The file name must be prefixed with "ifcfg-eth" and suffixed with the adapter's number. For example, the script for eth0 would be named /etc/sysconfig/network-scripts/ifcfg-eth0. Place the following text in the file:

DEVICE=eth0
USERCTL=no
ONBOOT=yes
MASTER=bond0
SLAVE=yes
BOOTPROTO=none

The DEVICE= line will be different for every ethX device and must correspond to the name of the file; ifcfg-eth1, for example, must have a device line of DEVICE=eth1. The setting of the MASTER= line will also depend on the final bonding interface name chosen for your bond. As with other network devices, these typically start at 0 and go up by one for each device, i.e., the first bonding instance is bond0, the second is bond1, and so on.
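
For illustration, a matching ifcfg-eth1 (assuming eth1 is also a member of the same bond0) would differ from the eth0 example above only in its DEVICE= line:

DEVICE=eth1
USERCTL=no
ONBOOT=yes
MASTER=bond0
SLAVE=yes
BOOTPROTO=none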

Next, create a bond network script. The file name for this script will be /etc/sysconfig/network-scripts/ifcfg-bondX where X is the number of the bond. For bond0 the file is named "ifcfg-bond0", for bond1 it is named "ifcfg-bond1", and so on. Within that file, place the following text:

DEVICE=bond0
IPADDR=192.168.1.1
NETMASK=255.255.255.0
NETWORK=192.168.1.0
BROADCAST=192.168.1.255
ONBOOT=yes
BOOTPROTO=none
USERCTL=no

Be sure to change the networking specific lines (IPADDR, NETMASK, NETWORK and BROADCAST) to match your network configuration.

Next, it is necessary to edit /etc/modules.conf (or /etc/modprobe.conf, depending upon your distribution) to load the bonding module with your desired options when the bond0 interface is brought up. The following lines in /etc/modules.conf (or modprobe.conf) will load the bonding module and select its options:

alias bond0 bonding
options bond0 mode=balance-alb miimon=100

Replace the sample parameters with the appropriate set of options for your configuration.

Finally, run "/etc/rc.d/init.d/network restart" as root. This will restart the networking subsystem, and your bond link should now be up and running.
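
As a rough sketch, assuming the files above are in place and using bond0 and ifconfig purely as examples, the restart and a quick sanity check might look like this when run as root:

# restart the networking subsystem so the bond comes up
/etc/rc.d/init.d/network restart

# confirm that bond0 is up and carries the configured address
ifconfig bond0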

Querying Bonding Configuration

Each bonding device has a read-only file residing in the /proc/net/bonding directory. The file contents include information about the bonding configuration, options and state of each slave.
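
To query a bond, simply read the file for the device in question; for the first bond, for example:

cat /proc/net/bonding/bond0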

For example, the contents of /proc/net/bonding/bond0 after the driver is loaded with parameters of mode=0 and miimon=1000 is generally as follows:

Ethernet Channel Bonding Driver: 2.6.1 (October 29, 2004)
Bonding Mode: load balancing (round-robin)
Currently Active Slave: eth0
MII Status: up
MII Polling Interval (ms): 1000
Up Delay (ms): 0
Down Delay (ms): 0

Slave Interface: eth1
MII Status: up
Link Failure Count: 1

Slave Interface: eth0
MII Status: up
Link Failure Count: 1

The precise format and contents will change depending upon the bonding configuration, state, and version of the bonding driver.

Configuring Bonding for High Availability

High Availability refers to configurations that provide maximum network availability by having redundant or backup devices, links or switches between the host and the rest of the world. The goal is to provide the maximum availability of network connectivity (i.e., the network always works), even though other configurations could provide higher throughput.

High Availability in a Single Switch Topology

If two hosts (or a host and a single switch) are directly connected via multiple physical links, then there is no availability penalty to optimizing for maximum bandwidth. In this case, there is only one switch (or peer), so if it fails, there is no alternative access to fail over to. Additionally, the bonding load balance modes support link monitoring of their members, so if individual links fail, the load will be rebalanced across the remaining devices.

Bonding Modes

Apart from the active-backup (1) and broadcast (3) modes, which are intended for multiple switch topologies (see below), you can use the following modes for single switch topologies:

  • balance-rr (0): Round-robin policy: Transmit packets in sequential order from the first available slave through the last. This mode provides load balancing and fault tolerance. It is also the default mode, so if no mode is stated in /etc/modprobe.conf the bonding driver will work in balance-rr mode; it is best practice, however, to state the mode in /etc/modprobe.conf as shown above.
  • balance-xor (2): XOR policy: Transmit based on the selected transmit hash policy. The default policy hashes the source and destination MAC addresses with an XOR operation to select the outgoing slave. This mode provides load balancing and fault tolerance.
  • 802.3ad (4): IEEE 802.3ad Dynamic link aggregation. Creates aggregation groups that share the same speed and duplex settings. Utilizes all slaves in the active aggregator according to the 802.3ad specification. Prerequisites (see the sketch after this list):
    1. Ethtool support in the base drivers for retrieving the speed and duplex of each slave.
    2. A switch that supports IEEE 802.3ad Dynamic link aggregation.
    3. Most switches will require some type of configuration to enable 802.3ad mode.
  • balance-tlb (5): Adaptive transmit load balancing: channel bonding that does not require any special switch support. The outgoing traffic is distributed according to the current load (computed relative to the speed) on each slave. Incoming traffic is received by the current slave. If the receiving slave fails, another slave takes over the MAC address of the failed receiving slave. Prerequisite:
    1. Ethtool support in the base drivers for retrieving the speed of each slave.
  • balance-alb (6): Adaptive load balancing: includes balance-tlb plus receive load balancing (rlb) for IPv4 traffic, and does not require any special switch support. Receive traffic from connections created by the server is also balanced. When a link is reconnected or a new slave joins the bond, the receive traffic is redistributed among all active slaves in the bond. Prerequisites:
    1. Ethtool support in the base drivers for retrieving the speed of each slave.
    2. Base driver support for setting the hardware address of a device while it is open.
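
As a rough sketch of checking the ethtool prerequisites listed above (eth0 and eth1 are only example names), the following commands show whether the base driver reports speed and duplex, and the modprobe.conf lines show how 802.3ad mode could be selected in the same style as the earlier example:

# each slave's base driver should report Speed and Duplex via ethtool
ethtool eth0
ethtool eth1

# example modprobe.conf entries selecting 802.3ad mode; the switch ports
# must also be configured for 802.3ad link aggregation
alias bond0 bonding
options bond0 mode=802.3ad miimon=100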

An Issue with balance-alb (6)

Mode 6 bonding has an inherent race condition with floating IP addresses. No other mode should have this problem; mode 5 (balance-tlb), for example, should be fine. (Thanks to Ed Hoek for this analysis.)

Race Condition
Every time a mode 6 bond sends a normal ARP reply, the peers on the subnet assume that the source of that ARP reply is the MAC address they should send to for that IP address. To correct this, the mode 6 bond must follow the ARP reply with a gratuitous ARP to every host on the subnet that it knows about, re-assigning each of them to their respective slaves. This operation interrupts other network processing on the bonded host.

If node A sends out the gratuitous ARP to reclaim the interface while node B is in the middle of sending out gratuitous ARP requests to all the other hosts, the router will get a gratuitous ARP from node A, reclaiming it, and then another one from node B a few seconds later, completing the mode 6 re-balancing. Since the ARP from node B came last, that is the one the router will use. In all likelihood, the failback operation itself is triggering an ARP reply from node B, so it is simply a matter of how far down the list of hosts to re-ARP to the router happens to be that determines the probability of a favorable or unfavorable race.

Since none of the other modes do this sort of active ARP mangling, they will not have this problem. Mode 5 would be the closest one without this race condition.
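
If you need to observe this behavior on a live system, one rough way (assuming tcpdump is available and bond0 is the bonded interface) is to watch the ARP replies and gratuitous ARPs during a failback:

# capture ARP traffic on the bonded interface during a failback
tcpdump -n -i bond0 arp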

High Availability in a Multiple Switch Topology

With multiple switches, the configuration of bonding and the network changes dramatically. In multiple switch topologies, there is a trade off between network availability and usable bandwidth.

Below is a sample network, configured to maximize the availability of the network:



                |                                     |
                |port3                           port3|
          +-----+----+                          +-----+----+
          |          |port2       ISL      port2|          |
          | switch A +--------------------------+ switch B |
          |          |                          |          |
          +-----+----+                          +-----++---+
                |port1                           port1|
                |             +-------+               |
                +-------------+ host1 +---------------+
                         eth0 +-------+ eth1


In this configuration, there is a link between the two switches (ISL, or inter switch link), and multiple ports connecting to the outside world ("port3" on each switch). There is no technical reason that this could not be extended to a third switch.

Bonding Modes

In a topology such as the example above, the active-backup and broadcast modes are the only useful bonding modes when optimizing for availability; the other modes require all links to terminate on the same peer for them to behave rationally.

  • active-backup (or 1): This is generally the preferred mode, particularly if the switches have an ISL and play together well. If the network configuration is such that one switch is specifically a backup switch (e.g., it has lower capacity or higher cost), then the primary option can be used to ensure that the preferred link is always used when it is available (see the example after this list).
  • broadcast (or 3): This mode is really a special purpose mode, suitable only for very specific needs: for example, if the two switches are not connected (no ISL) and the networks beyond them are totally independent. In this case, if it is necessary for some specific one-way traffic to reach both independent networks, then the broadcast mode may be suitable.
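
As an illustration of the primary option mentioned above (interface name and monitoring interval are only placeholders), the module options for an active-backup bond that should always prefer eth0 could look like this:

alias bond0 bonding
options bond0 mode=active-backup miimon=100 primary=eth0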

 


Comments

feidaodao (systems operations engineer), 2011-11-10: The load-balancing modes are not very practical, since they require the links to connect to the same switch; the failover (active-backup) mode is the one most commonly used.
myguangzhou, 2011-11-11: Have you actually used it?
feidaodao, 2011-11-11: I have looked into it.
