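An Introduction to Dynamic Load-Balancing DNS
One way to relieve network overload is to add a dynamic load-balancing feature to the existing DNS.
As computer networks are used ever more widely, the load on the Internet grows increasingly congested. This often leaves servers unable to respond normally and causes some applications to crash. The phenomenon is also dynamic in nature. One way to address it is to build more powerful servers; another is to spread client requests across multiple servers. The latter is an elegant approach, essentially an art of balancing, that avoids situations in which some servers are overloaded while others sit idle. Distributing requests across servers has therefore become an important topic in networking.
Consider two points. First, each TCP session consumes about 32 bytes of memory, so a server with 32MB of RAM can theoretically support one million connections. Second, given several servers with identical content, users tend to pick a lightly loaded server based on their own experience (or on monitoring data); GetRight, for example, can choose a better server for an FTP download. However, requests can also be distributed automatically by monitoring the servers at regular intervals and directing each request to the best server. This technique of dynamically directing requests among multiple servers according to server load is called dynamic load balancing. The feature can be added to the Domain Name Service (DNS), because the name server already bears the main responsibility for resolving client requests. A DNS with this feature is called dlbDNS (dynamic load balance DNS). Here, the best server is the one that a rating algorithm ranks best.
Here we explain the benefits that dlbDNS brings as an extension to DNS. First, consider the requirements the dlbDNS design should meet: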
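(1) The new design must be compatible with existing DNS applications.
(2) The design must be easy to configure.
(3) Load balancing must be fast and effective.
(4) A host may belong to more than one group or cluster.
(5) The response to a request should be generated dynamically.
(6) Server monitoring should be carried out by separate processes.
(7) The TTL value should be set to the minimum to prevent other name servers from caching responses.
(8) The final design should be a general-purpose name server that can handle simple, reverse and dynamic requests alike.
(9) Errors should be handled.
(10) Load balancing should be transparent to the user.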
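Load-Balancing Models
Four load-balancing models are available. First, RFC 1794 describes a load-balancing method that uses a special zone transfer agent to obtain its information from external sources, after which the new zone is loaded by the name server. The problem with this method is that, between zone transfers, the weighting information is essentially static or is simply handed out in round-robin fashion. The method also does not support virtual/dynamic domains, in which a response is created dynamically based on the name being queried.
The second model uses a dedicated load-balancing server that intercepts requests and directs them to the best server. In this design the load-balancing server uses virtual IP addresses internally. The drawback is that another server must be added to the monitored cluster instead of using existing resources.
The third model uses a remote monitoring system to watch the performance of the servers and feed the results back to DNS. This design helps detect problems that cannot be observed directly and gives users a truer measure of access time. Its drawback is the dependence on a remote network to monitor and deliver the data.
The last model uses an internal monitoring system to watch server performance and feed the results back to DNS. Its main advantages are ease of maintenance and administration, and it raises no security concerns. dlbDNS uses this model.
Load-Balancing Algorithms
Originally, load balancing was intended only to let DNS agents support the concept of machine clusters, in which all machines are functionally similar or identical. It did not particularly matter which machine was chosen, as long as the load was spread evenly across a series of physically distinct hosts. Because machines differ in configuration and capacity, more sophisticated algorithms are needed.
“Round-robin algorithm A” distributes requests evenly among the servers in rotation. Although requests are handled dynamically, the algorithm's weakness is that it ignores the servers' differing performance characteristics.
“Load-average algorithm A” distributes requests according to server load. The design is very simple and fairly inexpensive, but it cannot cope with servers that differ in configuration and potential.
“Rating algorithm A” is based on the number of users and the load average, as listed below. The algorithm is reasonable because it ranks as best the host with the fewest unique logins and the lowest load average. dlbDNS uses this algorithm to determine the best server.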
WT_PER_USER = 100
USER_PER_LOAD_UNIT = 3
FUDGE = (TOT_USER - UNIQ_USER) * (WT_PER_USER/5)
WEIGHT = (UNIQ_USER * WT_PER_USER) + (USER_PER_LOAD_UNIT * LOAD) + FUDGE
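The variables in this listing have the following meanings:
TOT_USER: total number of users logged in
UNIQ_USER: number of distinct users logged in (for example, user a and user b count as two distinct users no matter how many times each has logged in)
LOAD: load average over the last minute, multiplied by 100
WT_PER_USER: weight per user
FUDGE: correction factor for users logged in more than once
WEIGHT: rating of the server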
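Using dlbDNS
First, download BIND 8.1.2 from the Internet Software Consortium (http://www.isc.org/bind.html); dlbDNS is implemented by extending the BIND 8.1.2 code. In this example DNS is installed on dydns.clinux.org and tested on a stand-alone Linux workstation.
In the configuration, a new attribute called DNAME is added to distinguish the hosts that take part in dynamic load balancing. back1.dydns.clinux.org, back2.dydns.clinux.org and b.dydns.clinux.org share the dynamic load for www1.dydns.clinux.org, while hack1.dydns.clinux.org, hack2.dydns.clinux.org and h.dydns.clinux.org share the dynamic load for www2.dydns.clinux.org. The listing follows: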
named.hosts.clinux
;
;
;named.hosts.wsu
; http://www1.cs.twsu.edu
www1 IN DNAME back1.dydns.clinux.org.
www1 IN DNAME back2.dydns.clinux.org.
www1 IN DNAME b.dydns.clinux.org.
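Server-Side Algorithm
The following is the algorithm in dlbDNS. If a request to the server is of type DNAME, the server performs these steps:
1. Determine the set of servers participating in this service.
2. Obtain the rating of each participating server by querying all of them concurrently over a connectionless (UDP) exchange.
3. Determine the best server from the ratings returned.
4. Handle error conditions.
Rating Service Algorithm
A rating service runs on every server that takes part in dynamic load balancing. Its algorithm is:
1. Receive rating requests from dlbDNS.
2. Calculate the host rating once every minute rather than when a request arrives, because response time is a very important factor.
3. Ensure that the host rating is updated every minute.
4. Handle error conditions, such as dlbDNS closing its UDP socket without waiting for the host's reply.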
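Model
The figure shows how dlbDNS works. The path marked C shows the rating service updating the server ratings. The path marked B shows the communication between dlbDNS and the rating services that determines the best server. The path marked A shows the path of a user request. In the figure, HOST 1 has a better rating than the other two servers, so the request is directed to HOST 1.
Benefits of dlbDNS
The benefits are straightforward. Besides making full use of resources, load balancing is implemented through DNS, so programs such as FTP and TELNET can use dlbDNS as well.
Future Directions
At present, the gethostbyname system call does not work properly within the BIND code. The problem can be worked around with a configuration file listing hosts and IP addresses, although we hope to find a better solution.
Second, the rating algorithm is not yet complete: it does not consider the number of processors, and taking CPU and memory into account would make it more effective.
Third, the rating algorithm uses files in the /proc file structure, which ties the dynamic load-balancing scheme to Linux servers; a more robust design is still needed.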
Dynamic Load-Balancing DNS: dlbDNS
An attempt to solve the problem of network traffic congestion by adding a dynamic load-balancing feature to the existing DNS.
by Harish V.C. and Brad J. Owens
The rapid growth of computer literacy has led to a dramatic rise in the number of people using computers today. This rise has resulted in the development of intense computation-oriented and resource-sharing applications. These factors together play a prominent role in increasing the load across the Internet, causing severe network traffic congestion. This phenomenon, though dynamic in nature, causes a lot of user frustration in the form of slow response times and repeated crashing of applications.
Developing servers with more capacity and capability of handling this traffic is one way to solve the problem; another is to distribute client requests across multiple servers. This second method is an elegant way of handling this problem, since it uses existing resources and avoids scenarios in which some servers are overloaded while the rest of them are idle. The need for distributing requests across servers is further strengthened, considering:
Each TCP session eats up 32 bytes of memory (a general rule of thumb), so a server that has 32MB of RAM can theoretically support one million simultaneous connections (see Resources 2).
Given a number of servers, users always log in to their favorite server while overlooking the load on that server.
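The one-million figure in the first point can be checked with a line of arithmetic. The sketch below assumes 1MB means 2**20 bytes and takes the 32-byte rule of thumb at face value; the constant names are illustrative only.

```python
BYTES_PER_SESSION = 32            # rule-of-thumb figure quoted above
RAM_BYTES = 32 * 1024 * 1024      # 32MB, assuming 1MB = 2**20 bytes

# Upper bound on sessions if memory were the only limit.
max_sessions = RAM_BYTES // BYTES_PER_SESSION
print(max_sessions)               # 1048576, i.e. roughly one million
```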
Distributing a request across servers can be implemented by monitoring the servers regularly and directing the request dynamically to the best server. This way of dynamically directing a request across multiple servers based on the server load is called dynamic load balancing. This feature can be added to the pre-existing Domain Name Service (DNS), as it already plays a prominent role in resolving client requests and can be configured to direct client requests across multiple servers in an effort to avoid network traffic congestion. Here, best server refers to the server with the best rating based on a rating algorithm to be explained later.
We will explain the design, implementation and benefits of a dynamic load-balancing DNS, dlbDNS, which extends DNS.
Minimum Requirements for dlbDNS
Load-Balancing Models
Four load-balancing models are available. First, RFC 1794 (see Resources 1) describes a load-balancing method using a special zone transfer agent that obtains its information from external sources. The new zone then gets loaded by the name server. One problem with this method is that between zone transfers, the weighted information is essentially static or possibly handed out in a round-robin fashion. This method also doesn't allow a virtual/dynamic domain where a response is created dynamically based on the name being queried (see Resources 4).
The second model is a dedicated load-balancing server which intercepts incoming requests and directs them to the best server. This design employs virtual IP addresses for internal use by the load-balancing server. One problem with this is it adds another server to the existing cluster of servers to be monitored, instead of utilizing the available resources.
A third model is a remote monitoring system that monitors the performance of different servers and provides feedback to the DNS. This design helps detect problems not visible internally, and provides truer access time measurements and easy detection of configuration errors that affect external users. The major problem here is the dependency on the remote network to monitor and deliver data (see Resources 5).
Last is an internal monitoring system that monitors the performance of the servers and provides feedback to the DNS. Its major advantages are easy maintainability and administration, closeness to the source of addressable problems and no security hazards (see Resources 5). This design is implemented in dlbDNS.
Load-Balancing Algorithms
Initially, load-balancing was intended to permit DNS agents to support the concept of machine clusters (derived from the VMS usage) where all machines were functionally similar or the same. It didn't particularly matter which machine was picked, as long as the processing load was reasonably well-distributed across a series of actual different hosts. With servers of different configurations and capacities, there is a need for more sophisticated algorithms (see Resources 1).
``Round-robin algorithm A'' can distribute requests in a round-robin fashion evenly across servers. Although the requests are handled dynamically, the problem is that it completely ignores the servers' differing performance characteristics.
``Load-average algorithm A'' can distribute requests across servers based on the server load. This design is very simple and fairly inexpensive, but fails miserably if servers vary in configuration and potential.
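The two baseline strategies just described can be sketched in a few lines of Python. This is only an illustration: the function names are hypothetical, the load figures are made up, and the host names are borrowed from elsewhere in this article.

```python
from itertools import cycle


def round_robin(hosts):
    """Return an iterator that hands out hosts in fixed rotation.

    The rotation never looks at load or capacity, which is the
    weakness noted above.
    """
    return cycle(hosts)


def least_loaded(load_by_host):
    """Return the host with the lowest reported load average.

    Load alone says nothing about how powerful each host is, so a weak
    machine with a low load can still be chosen over a stronger one.
    """
    return min(load_by_host, key=load_by_host.get)


rotation = round_robin(["kira", "sisko", "q"])
first_four = [next(rotation) for _ in range(4)]

# Hypothetical one-minute load averages.
choice = least_loaded({"kira": 1.50, "sisko": 0.20, "q": 0.75})
```

Here `first_four` wraps around to the first host after three requests, and `choice` is simply the host reporting the smallest number.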
``Rating algorithm A'' is based on the number of users and load-average shown below. This algorithm is reasonable, as its rating favors hosts with the smallest number of unique logins and lower load averages (see Resources 4). This rating algorithm is implemented in dlbDNS to determine the best server.
WT_PER_USER = 100
USER_PER_LOAD_UNIT = 3
FUDGE = (TOT_USER - UNIQ_USER) * (WT_PER_USER/5)
WEIGHT = (UNIQ_USER * WT_PER_USER) + (USER_PER_LOAD_UNIT * LOAD) + FUDGE
where the variables are
TOT_USER: total number of users logged in
UNIQ_USER: unique number of users logged in
LOAD: load average over the last minute, multiplied by 100
WT_PER_USER: pseudo-weight per user
FUDGE: fudge factor for users logged in more than once
WEIGHT: rating of the server
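As a worked example, the formula above can be written as a small Python function. This is a sketch, not the dlbDNS source: the names server_weight and best_server are illustrative, and treating the lowest WEIGHT as the best rating is an inference from the statement that the rating favors fewer unique logins and lower load averages.

```python
WT_PER_USER = 100          # pseudo-weight per user (from the listing above)
USER_PER_LOAD_UNIT = 3     # multiplier applied to LOAD (from the listing above)


def server_weight(tot_user, uniq_user, load_avg_1min):
    """Compute WEIGHT for one host from its login counts and load average."""
    load = load_avg_1min * 100                          # LOAD: 1-minute average x 100
    fudge = (tot_user - uniq_user) * (WT_PER_USER / 5)  # repeat logins count for less
    return uniq_user * WT_PER_USER + USER_PER_LOAD_UNIT * load + fudge


def best_server(stats):
    """Return the host with the smallest WEIGHT (assumed to mean best).

    `stats` maps host name -> (TOT_USER, UNIQ_USER, 1-minute load average).
    Ties go to the alphabetically first host, which is an assumption.
    """
    return min(sorted(stats), key=lambda host: server_weight(*stats[host]))


# Hypothetical figures: 5 logins by 3 distinct users at load 0.50,
# versus 2 logins by 2 distinct users at load 1.00.
w1 = server_weight(5, 3, 0.50)   # 300 + 150 + 40 = 490
w2 = server_weight(2, 2, 1.00)   # 200 + 300 + 0  = 500
```

With these made-up figures the first host wins narrowly: its extra user costs more than the second host's higher load saves.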
dlbDNS Implementation
To get started, we downloaded BIND 8.1.2 from the Internet Software Consortium (http://www.isc.org/bind.html). Initially, time was spent installing and understanding DNS. DNS was installed on odie.cs.twsu.edu, a stand-alone Linux workstation.
Listing 1. named.hosts.wsu
During configuration, a new attribute called DNAME was added to distinguish the hosts taking part in dynamic load-balancing. Listing 1 is a snapshot from named.hosts.wsu, containing information on all hosts in a particular zone. In this listing, the set of hosts kira.cs.twsu.edu, sisko.cs.twsu.edu and q.cs.twsu.edu take part in dynamic load-balancing for http://www1.cs.twsu.edu/. The set of hosts kira.cs.twsu.edu, mccoy.cs.twsu.edu and emcity.cs.twsu.edu take part in dynamic load-balancing for http://www2.cs.twsu.edu/. The set of hosts kira.cs.twsu.edu, sisko.cs.twsu.edu and deanna.cs.twsu.edu take part in dynamic load-balancing for http://www3.cs.twsu.edu/. Hosts kira.cs.twsu.edu and sisko.cs.twsu.edu belong to multiple groups.
Server-Side Algorithm
Here is the algorithm we added to the pre-existing DNS feature. If the service requested is of type DNAME, do the following:
1. Determine the set of participating servers for this service.
2. Request ratings from all participating servers concurrently, sending each one a connectionless (UDP) datagram.
3. Using the ratings returned, determine the best server.
4. Handle error conditions such as ``server is too busy to return the rating within the time frame'', ``the rating returned by the server gets lost on its way back to the dlbDNS'', ``all servers have same rating'' and ``a server is down''.
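The steps above might look roughly like the following Python sketch. It is not the code added to BIND. The request payload, the ASCII reply format, the timeout value and the fallback to the first configured server are all assumptions made for illustration, and servers are assumed to be given as numeric (IP, port) pairs.

```python
import socket
import time


def collect_ratings(servers, timeout=0.5):
    """Query every (ip, port) in `servers` over UDP and gather the replies.

    Returns {(ip, port): rating} for the servers that answered in time.
    Busy servers, lost replies and dead hosts all simply go missing.
    """
    ratings = {}
    with socket.socket(socket.AF_INET, socket.SOCK_DGRAM) as sock:
        for addr in servers:
            try:
                sock.sendto(b"RATING?", addr)   # hypothetical request payload
            except OSError:
                continue                        # cannot send: treat host as down
        deadline = time.monotonic() + timeout
        while len(ratings) < len(servers):
            remaining = deadline - time.monotonic()
            if remaining <= 0:
                break                           # time frame is over
            sock.settimeout(remaining)
            try:
                data, addr = sock.recvfrom(64)
            except socket.timeout:
                break
            except OSError:
                continue                        # transient error: keep waiting
            if addr in servers:
                try:
                    ratings[addr] = float(data.decode("ascii"))
                except ValueError:
                    pass                        # malformed reply is ignored
    return ratings


def pick_best(servers, ratings):
    """Choose the server with the lowest rating (assumed to mean best).

    Equal ratings keep configuration order; with no replies at all the
    first configured server is returned (an assumed fallback policy).
    """
    answered = [s for s in servers if s in ratings]
    if not answered:
        return servers[0]
    return min(answered, key=lambda s: ratings[s])
```

Sending all datagrams first and then waiting on a single deadline keeps the total delay bounded by one timeout, however many servers are in the group.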
Rating Daemon Algorithm
A rating daemon runs on each server taking part in dynamic load balancing. Here is the algorithm:
1. Receive request for rating from dlbDNS and respond by returning the host rating.
2. Calculate the host rating once every minute rather than calculating it at the time of request, as quick response time is a most important feature.
3. Ensure the host rating is updated every minute, independent of the dlbDNS request.
4. Handle error conditions such as dlbDNS closing the UDP sockets without waiting for host response.
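A daemon following these steps could be structured as in the sketch below, which caches the rating and refreshes it on a timer so that replies never wait on a calculation. The class name, the bind address, the reply format and the `compute` callable (standing in for the real rating calculation) are all hypothetical.

```python
import socket
import threading


class RatingDaemon:
    """Answer UDP rating requests from a cached value refreshed on a timer."""

    def __init__(self, compute, addr=("127.0.0.1", 0), interval=60.0):
        self._compute = compute            # caller-supplied rating function
        self._interval = interval          # seconds between recalculations
        self._rating = compute()           # have a value ready before serving
        self._lock = threading.Lock()
        self._stop = threading.Event()
        self.sock = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)
        self.sock.bind(addr)
        self.sock.settimeout(0.2)          # lets serve() notice stop() promptly

    def _refresh(self):
        # Event.wait returns False on timeout, so this runs once per interval
        # regardless of whether any request arrives.
        while not self._stop.wait(self._interval):
            value = self._compute()
            with self._lock:
                self._rating = value

    def serve(self):
        threading.Thread(target=self._refresh, daemon=True).start()
        while not self._stop.is_set():
            try:
                _, peer = self.sock.recvfrom(64)
            except OSError:
                continue                   # timeout or transient error
            with self._lock:
                reply = str(self._rating).encode("ascii")
            try:
                self.sock.sendto(reply, peer)
            except OSError:
                pass                       # requester has gone away; ignore
        self.sock.close()

    def stop(self):
        self._stop.set()
```

A caller would typically run `serve()` in the main thread of the daemon process and pass in a function that reads the host's login counts and load average.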
Figure 1. dlbDNS.gif
Model
Figure 1 shows the functionality of dlbDNS. The path traced by C indicates the process of updating the server rating by the rating daemons. The path traced by B indicates the communication between dlbDNS and the rating daemons to determine the best server. The path traced by A indicates the path traced by the user request. HOST 1 has a better rating than the other two hosts, so the user request gets directed to HOST 1.
dlbDNS Benefits
Implementing dlbDNS provides efficient utilization of system resources and ensures that facilities newly added to the existing network will be utilized. Since DNS is used, applications such as FTP and TELNET will also utilize dlbDNS.
dlbDNS Current Implementation
Uneven distribution of load across servers has been a major problem in the Computer Science department of Wichita State University. bugs.cs.twsu.edu, kira.cs.twsu.edu, roger.cs.twsu.edu and sisko.cs.twsu.edu are four Linux servers available for students in the department. These servers vary in potential and configuration.
dlbDNS was installed in December 1998 to effectively utilize the servers. lion.cs.twsu.edu, the actual DNS server, was made to direct DNAME requests toward odie.cs.twsu.edu where dlbDNS was installed. The lines added to the configuration file were:
;
bestlinux IN DNAME bugs.cs.twsu.edu.
bestlinux IN DNAME kira.cs.twsu.edu.
bestlinux IN DNAME roger.cs.twsu.edu.
bestlinux IN DNAME sisko.cs.twsu.edu.
;
Here, the bestlinux attribute was added to handle non-web requests from applications such as TELNET and FTP.
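To show how records like these define a group of participating hosts, the sketch below collects DNAME lines into a mapping from service name to host list. It is a simplified illustration, not BIND's zone parser: the function name is hypothetical and only the four-field form shown above is handled. DNAME here is the attribute this article adds, not the standard DNAME resource record.

```python
from collections import defaultdict


def dname_groups(zone_text):
    """Map each service name to the hosts listed in its DNAME records."""
    groups = defaultdict(list)
    for line in zone_text.splitlines():
        line = line.split(";", 1)[0].strip()     # drop zone-file comments
        fields = line.split()
        if (len(fields) == 4
                and fields[1].upper() == "IN"
                and fields[2].upper() == "DNAME"):
            # Strip the trailing dot of the fully qualified name.
            groups[fields[0]].append(fields[3].rstrip("."))
    return dict(groups)


zone = """\
;
bestlinux IN DNAME bugs.cs.twsu.edu.
bestlinux IN DNAME kira.cs.twsu.edu.
bestlinux IN DNAME roger.cs.twsu.edu.
bestlinux IN DNAME sisko.cs.twsu.edu.
;
"""
groups = dname_groups(zone)
```

Because a host may appear under several service names, the same parser also covers hosts that belong to multiple groups.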
Future Work
Currently, the gethostbyname system call fails within the BIND code. This problem is avoided by using a configuration file with a list of host and IP addresses. We'd like to find a better solution.
The rating algorithm is still not complete. An algorithm that takes into account the number of processors, CPU and memory utilization would make the rating algorithm more efficient.
At this time, only Linux servers can take part in the dynamic load-balancing scheme, as the rating algorithm uses files in the /proc file structure. A more extensible design is needed.
Harish V.C. (harish@acm.org) is a graduate student in the Computer Science department of Wichita State University. His research interests include computer and Internet security, networking and operating systems. He is currently working as an intern at IBM.
Brad J. Owens (bjowens@cs.twsu.edu) is a faculty member in the Computer Science department at Wichita State University. His research interests include computer and Internet security, high-speed networking, parallel and distributed programming.