“Building our own Super Computer with Linux Cluster”

A dissertation submitted to the Islamic University in partial fulfillment of the requirements for the degree of B.Sc. (Hons.) in Information & Communication Engineering.

Submitted To-

Dr. Paresh Chandra Barman

Associate Professor

Department of Information & Communication Engineering

Islamic University, Kushtia.

Submitted By-

Md. Abdul Matin

Session- 2004-2005

Roll No- 0418020

Reg. No-1186

Department of Information & Communication Engineering,

Islamic University, Kushtia.

Department of Information & Communication Engineering.

Islamic University, Kushtia, Bangladesh.

Problem Statement

Building our own Supercomputer with Red Hat “Linux Cluster”.

Purpose of the Project:

High-availability clusters (also known as HA Clusters or Failover Clusters) are computer clusters that are implemented primarily for the purpose of providing high availability of services which the cluster provides. They operate by having redundant computers or nodes which are then used to provide service when system components fail. Normally, if a server with a particular application crashes, the application will be unavailable until someone fixes the crashed server. HA clustering remedies this situation by detecting hardware/software faults, and immediately restarting the application on another system without requiring administrative intervention, a process known as Failover.


Chapter 1: Cluster Basics

1.1 Introduction

This document provides information about installing, configuring and managing Red Hat Cluster components. Red Hat Cluster components are part of Red Hat Cluster Suite and allow us to connect a group of computers (called nodes or members) to work together as a cluster. As part of failover, the clustering software may configure the node before starting the application on it.

HA clusters are often used for critical databases, file sharing on a network, business applications, and customer services such as electronic commerce websites. HA cluster implementations attempt to build redundancy into a cluster to eliminate single points of failure, including multiple network connections and data storage that is redundantly connected via storage area networks.

1.2. Cluster Definition and Type:

A cluster is two or more computers (called nodes or members) that work together to perform a task. Clustering is sharing resources with multiple machines.

There are four major types of clusters:

· Storage

· High availability

· Load balancing

· High performance

1.2.1. Storage

Storage clusters provide a consistent file system image across servers in a cluster, allowing the servers to simultaneously read and write to a single shared file system. A storage cluster simplifies storage administration by limiting the installation and patching of applications to one file system. Also, with a cluster-wide file system, a storage cluster eliminates the need for redundant copies of application data and simplifies backup and disaster recovery. Red Hat Cluster Suite provides storage clustering through Red Hat GFS.

1.2.2. High availability

High-availability clusters provide continuous availability of services by eliminating single points of failure and by failing over services from one cluster node to another in case a node becomes inoperative. Typically, services in a high-availability cluster read and write data (via read-write mounted file systems). Therefore, a high-availability cluster must maintain data integrity as one cluster node takes over control of a service from another cluster node. Node failures in a high-availability cluster are not visible from clients outside the cluster. (High-availability clusters are sometimes referred to as failover clusters.) Red Hat Cluster Suite provides high-availability clustering through its High-availability Service Management component.
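The failover decision described above can be pictured as choosing the first healthy node from an ordered preference list. The following is only a minimal shell sketch of that idea, not the logic that Red Hat Cluster Suite actually runs; the function name pick_owner and the way node health is passed in are assumptions made for illustration.

```shell
#!/bin/sh
# Illustrative sketch only: choose the node on which a service should run.
# usage: pick_owner "ONLINE_NODES" PREFERRED_NODE...
#   ONLINE_NODES    space-separated list of nodes that are currently healthy
#   PREFERRED_NODE  candidate nodes, most preferred first
pick_owner() {
    online=" $1 "            # pad with spaces so whole names can be matched
    shift
    for node in "$@"; do
        case "$online" in
            *" $node "*)     # this preferred node is online: use it
                echo "$node"
                return 0
                ;;
        esac
    done
    return 1                 # no preferred node is online
}

# Both nodes healthy: the service stays on the most preferred node.
pick_owner "node1 node2" node1 node2

# node1 has failed: the service fails over to node2.
pick_owner "node2" node1 node2
```

The first call prints node1 and the second prints node2, which mirrors a service moving to the surviving node without administrator intervention.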

1.2.3. Load-balancing clusters

Load-balancing clusters dispatch network service requests to multiple cluster nodes to balance the request load among the cluster nodes. Load balancing provides cost-effective scalability because we can match the number of nodes according to load requirements. If a node in a load-balancing cluster becomes inoperative, the load-balancing software detects the failure and redirects requests to other cluster nodes. Node failures in a load-balancing cluster are not visible from clients outside the cluster. Red Hat Cluster Suite provides load-balancing through LVS (Linux Virtual Server).
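One simple way to spread requests over nodes is round-robin scheduling, where each new request goes to the next node in the list. The sketch below is not LVS itself; it only illustrates the dispatching idea in shell, and the function name pick_backend and the node names nodeA, nodeB and nodeC are hypothetical.

```shell
#!/bin/sh
# Illustrative sketch only: round-robin choice of a backend node.
# usage: pick_backend REQUEST_NUMBER NODE...
pick_backend() {
    request_no=$1
    shift                                # remaining arguments are the nodes
    shift $(( $request_no % $# ))        # skip ahead to this request's node
    echo "$1"
}

# Dispatch six requests across three healthy nodes.
for i in 0 1 2 3 4 5; do
    echo "request $i -> $(pick_backend "$i" nodeA nodeB nodeC)"
done

# If nodeB becomes inoperative, it is simply left out of the list,
# so later requests are redirected to the remaining nodes.
pick_backend 1 nodeA nodeC
```

Removing a failed node from the argument list is the shell equivalent of the load balancer redirecting requests to the other cluster nodes.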

1.2.4. High-performance

High-performance clusters use cluster nodes to perform concurrent calculations. A high-performance cluster allows applications to work in parallel, therefore enhancing the performance of the applications. (High performance clusters are also referred to as computational clusters or grid computing.)

1.3. Red Hat Cluster Suite Introduction

Red Hat Cluster Suite (RHCS) is an integrated set of software components that can be deployed in a variety of configurations to suit our needs for performance, high-availability, load balancing, scalability, file sharing, and economy.

RHCS consists of the following major components (refer to Figure (a), “Red Hat Cluster Suite Introduction”):

Cluster infrastructure — Provides fundamental functions for nodes to work together as a cluster: configuration-file management, membership management, lock management, and fencing.

High-availability Service Management — Provides failover of services from one cluster node to another in case a node becomes inoperative.

Cluster administration tools — Configuration and management tools for setting up, configuring, and managing a Red Hat cluster. The tools are for use with the Cluster Infrastructure components, the High-availability and Service Management components, and storage.

Chapter 2: Cluster Configuration Management

2.1. Setting up Hardware:

Setting up hardware consists of connecting cluster nodes to other hardware required to run a Red Hat Cluster. The amount and type of hardware varies according to the purpose and availability requirements of the cluster. Typically, an enterprise-level cluster requires the following type of hardware (refer to Figure (a)). For considerations about hardware and other cluster configuration concerns, refer to

• Cluster nodes — Computers that are capable of running Red Hat Enterprise Linux 5 software, with at least 1GB of RAM.

• Ethernet cable with connectors — This is required to connect the nodes.

• Ethernet switch or hub for public network — This is required for client access to the cluster.

• Ethernet switch or hub for private network — This is required for communication among the cluster nodes and other cluster hardware such as network power switches and Fiber Channel switches.

• Network power switch — A network power switch is recommended to perform fencing in an enterprise-level cluster.

• Fiber Channel switch — A Fiber Channel switch provides access to Fiber Channel storage.

Picture of primary/backup with a network dispatcher model.

Fig (a): Cluster architecture

2.2. Project Scenario

We configure the following prerequisites:

1. Static hostname resolution.

2. Web server configuration.

2.3. Basic Configuration

To set up a cluster, we must connect the nodes to certain cluster hardware and configure the nodes into the cluster environment. Here we use manual configuration.

Node 1 hostname:

[root@localhost]#vim /etc/sysconfig/network

NETWORKING=yes

NETWORKING_IPV6=no

HOSTNAME=node1

:x

[Save]

[root@localhost]#hostname node1

[root@localhost]#logout

Node 1 hosts file:

[root@node1 ~]# vim /etc/hosts

# Do not remove the following line, or various programs

# that require network functionality will fail.

127.0.0.1           localhost.localdomain localhost

192.168.0.1         node1

192.168.0.2         node2

:x

Node 1 IP address:

[root@node1 ~]# system-config-network

Name eth0

Device eth0

Use DHCP [ ]

Static IP 192.168.0.1

Netmask 255.255.255.0

Ok

Node 1 IP Info:

[root@node1 ~]# ip a

1: lo: <LOOPBACK,UP,LOWER_UP> mtu 16436 qdisc noqueue

link/loopback 00:00:00:00:00:00 brd 00:00:00:00:00:00

inet 127.0.0.1/8 scope host lo

inet6 ::1/128 scope host

valid_lft forever preferred_lft forever

2: eth0: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc pfifo_fast qlen 1000

link/ether 00:0c:29:85:30:81 brd ff:ff:ff:ff:ff:ff

inet 192.168.100.254/24 brd 192.168.100.255 scope global eth0

inet 192.168.0.1/24 scope global secondary eth0

inet6 fe80::20c:29ff:fe85:3081/64 scope link

valid_lft forever preferred_lft forever

3: sit0: <NOARP> mtu 1480 qdisc noop

link/sit 0.0.0.0 brd 0.0.0.0

inet6 fe80::20c:29ff:fe79:56dd/64 scope link

valid_lft forever preferred_lft forever

Node 2 hostname:

[root@localhost]#vim /etc/sysconfig/network

NETWORKING=yes

NETWORKING_IPV6=no

HOSTNAME=node2

:x

[root@localhost]#hostname node2

[root@localhost]#logout

Node 2 hosts file:

[root@node2 ~]# vim /etc/hosts

# Do not remove the following line, or various programs

# that require network functionality will fail.

127.0.0.1           localhost.localdomain localhost

192.168.0.1         node1

192.168.0.2         node2

:x

Node 2 IP address:

[root@node2 ~]# system-config-network

Name eth0

Device eth0

Use DHCP [ ]

Static IP 192.168.0.2

Netmask 255.255.255.0

Ok

Node 2 IP Info:

[root@node2 ~]# ip a

1: lo: <LOOPBACK,UP,LOWER_UP> mtu 16436 qdisc noqueue

link/loopback 00:00:00:00:00:00 brd 00:00:00:00:00:00

inet 127.0.0.1/8 scope host lo

inet6 ::1/128 scope host

valid_lft forever preferred_lft forever

2: eth0: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc pfifo_fast qlen 1000

link/ether 00:0c:29:79:56:dd brd ff:ff:ff:ff:ff:ff

inet 192.168.0.2/24 brd 192.168.100.255 scope global eth0
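Because the cluster depends on static hostname resolution, it can help to confirm that both node names map to the expected addresses before going further. The following is a small sketch of such a check in shell; the function name resolve_from_hosts is hypothetical, and the sample file is modelled on the /etc/hosts entries used in this chapter rather than read from a live node.

```shell
#!/bin/sh
# Illustrative sketch only: look a name up in a hosts-format file.
# usage: resolve_from_hosts FILE NAME   (prints the address, if any)
resolve_from_hosts() {
    awk -v name="$2" '
        /^[ \t]*#/ { next }                  # ignore comment lines
        {
            for (i = 2; i <= NF; i++) {
                if ($i ~ /^#/) break         # rest of the line is a comment
                if ($i == name) { print $1; exit }
            }
        }
    ' "$1"
}

# Sample file modelled on the hosts entries of this chapter.
sample=$(mktemp)
cat > "$sample" <<'EOF'
127.0.0.1       localhost.localdomain localhost
192.168.0.1     node1
192.168.0.2     node2
EOF

for n in node1 node2; do
    echo "$n -> $(resolve_from_hosts "$sample" "$n")"
done
rm -f "$sample"
```

On a real node the same function can be pointed at /etc/hosts instead of the sample file; an empty result means the name is not defined there.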

Chapter 3: Configuring Red Hat Cluster Software

3.1. Required Software (rpm)

rgmanager-2.0.46-1.el5.i386.rpm

cman-2.0.98-1.el5.i386.rpm

system-config-cluster-1.0.55-1.0.noarch.rpm

Cluster_Administration-en-US-5.2-1.noarch.rpm

3.2. Installing Red Hat Cluster software:

Cluster-Configuration for Node1:

First, we copy the contents of the RHEL 5.3 DVD into /mnt/a.

[root@node1]# cd /etc/yum.repos.d

[root@yum.repos.d]# vim rhel-debuginfo.repo

[rhel-debuginfo]

name=Red Hat Enterprise Linux $releasever - $basearch - Debug

baseurl=file:///mnt/a/Server

gpgcheck=1

gpgkey=file:///etc/pki/rpm-gpg/RPM-GPG-KEY-redhat-release

:x

[root@node1]# cd /mnt/a/Server

[root@Server]# rpm -ivh createrepo-0.4.11-3.el5.noarch.rpm

[root@Server]# createrepo -v /mnt/a

[root@Server]# yum clean all

[root@Server]# yum install openais*

[root@Server]# yum install cman-2.0.98-1.el5.i386.rpm

[root@Server]# cd /mnt/a/Cluster

[root@Cluster]# yum install rgmanager-2.0.46-1.el5.i386.rpm

[root@Cluster]# rpm -ivh system-config-cluster-1.0.55-1.0.noarch.rpm

[root@Cluster]# yum install Cluster_Administration-en-US-5.2-1.noarch.rpm

Cluster-Configuration for Node2:

First, we copy the contents of the RHEL 5.3 DVD into /mnt/a.

[root@node2]# cd /etc/yum.repos.d

[root@yum.repos.d]# vim rhel-debuginfo.repo

[rhel-debuginfo]

name=Red Hat Enterprise Linux $releasever - $basearch - Debug

baseurl=file:///mnt/a/Server

gpgcheck=1

gpgkey=file:///etc/pki/rpm-gpg/RPM-GPG-KEY-redhat-release

:x

[root@node2]# cd /mnt/a/Server

[root@Server]# rpm -ivh createrepo-0.4.11-3.el5.noarch.rpm

[root@Server]# createrepo -v /mnt/a

[root@Server]# yum clean all

[root@Server]# yum install openais*

[root@Server]# yum install cman-2.0.98-1.el5.i386.rpm

[root@Server]# cd /mnt/a/Cluster

[root@Cluster]# yum install rgmanager-2.0.46-1.el5.i386.rpm

[root@Cluster]# rpm -ivh system-config-cluster-1.0.55-1.0.noarch.rpm

[root@Cluster]# yum install Cluster_Administration-en-US-5.2-1.noarch.rpm

3.3. Manual configuration of the cluster.conf file on both nodes:

[root@node1]# vim /etc/cluster/cluster.conf

<cluster config_version="2" name="new_cluster">
  <fence_daemon post_fail_delay="0" post_join_delay="3"/>
  <clusternodes>
    <clusternode name="node1" nodeid="1" votes="1">
      <fence>
        <method name="1">
          <device name="manual-fencing" nodename="node1"/>
        </method>
      </fence>
    </clusternode>
    <clusternode name="node2" nodeid="2" votes="1">
      <fence>
        <method name="1">
          <device name="manual-fencing" nodename="node2"/>
        </method>
      </fence>
    </clusternode>
  </clusternodes>
  <cman expected_votes="1" two_node="1"/>
  <fencedevices>
    <fencedevice agent="fence_manual" name="manual-fencing"/>
  </fencedevices>
  <rm>
    <failoverdomains>
      <failoverdomain name="failover-domains" ordered="1" restricted="0">
        <failoverdomainnode name="node1" priority="1"/>
        <failoverdomainnode name="node2" priority="2"/>
      </failoverdomain>
    </failoverdomains>
    <resources>
      <ip address="192.168.0.10" monitor_link="1"/>
      <script file="/etc/init.d/httpd" name="webserver"/>
    </resources>
    <service autostart="1" domain="failover-domains" name="www-server">
      <ip ref="192.168.0.10"/>
      <script ref="webserver"/>
    </service>
  </rm>
</cluster>
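Since this file is edited by hand on both nodes, a quick consistency check can catch simple mistakes, for example a failover domain that names a node which is not declared as a cluster node. The sketch below is a rough text-based check using grep and sed, not a real XML parser and not a Red Hat tool; the function names list_names and check_failover_nodes are hypothetical, and the sample file is a shortened fragment of the configuration above. It assumes attribute values are written with plain double quotes.

```shell
#!/bin/sh
# Illustrative sketch only: rough consistency check for a cluster.conf.
# usage: list_names FILE ELEMENT   (prints the name="..." of each ELEMENT)
list_names() {
    grep -o "<$2 [^>]*" "$1" | sed -n 's/.* name="\([^"]*\)".*/\1/p'
}

# usage: check_failover_nodes FILE
# Fails if a failover domain lists a node that is not a declared clusternode.
check_failover_nodes() {
    rc=0
    for n in $(list_names "$1" failoverdomainnode); do
        if list_names "$1" clusternode | grep -qxF "$n"; then
            echo "ok: $n"
        else
            echo "undeclared node in failover domain: $n"
            rc=1
        fi
    done
    return $rc
}

# Shortened sample based on the configuration in this section.
conf=$(mktemp)
cat > "$conf" <<'EOF'
<cluster config_version="2" name="new_cluster">
  <clusternodes>
    <clusternode name="node1" nodeid="1" votes="1"/>
    <clusternode name="node2" nodeid="2" votes="1"/>
  </clusternodes>
  <rm>
    <failoverdomains>
      <failoverdomain name="failover-domains" ordered="1" restricted="0">
        <failoverdomainnode name="node1" priority="1"/>
        <failoverdomainnode name="node2" priority="2"/>
      </failoverdomain>
    </failoverdomains>
  </rm>
</cluster>
EOF
check_failover_nodes "$conf"
rm -f "$conf"
```

On a node, the same check could be run against /etc/cluster/cluster.conf. If the file was pasted from a web page with typographic quotes, the check finds no names at all, which is itself a useful warning.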

3.4. Cluster Configuration Tool

We can access the Cluster Configuration Tool ( Figure 3.4: “ Cluster Configuration Tool ” ) through the Cluster Configuration tab in the Cluster Administration GUI.

# system-config-cluster

Figure 3.4: Cluster Configuration Tool

The Cluster Configuration Tool represents cluster configuration components in the configuration file (/etc/cluster/cluster.conf) with a hierarchical graphical display in the left panel. A triangle icon to the left of a component name indicates that the component has one or more subordinate components assigned to it. Clicking the triangle icon expands and collapses the portion of the tree below a component. The components displayed in the GUI are summarized as follows:

• Cluster Nodes — Displays cluster nodes. Nodes are represented by name as subordinate elements under Cluster Nodes. Using configuration buttons at the bottom of the right frame (below Properties), we can add nodes, delete nodes, edit node properties, and configure fencing methods for each node.

• Fence Devices — Displays fence devices. Fence devices are represented as subordinate elements under Fence Devices. Using configuration buttons at the bottom of the right frame (below Properties), we can add fence devices, delete fence devices, and edit fence-device properties. Fence devices must be defined before we can configure fencing (with the Manage Fencing for This Node button) for each node.

• Managed Resources — Displays failover domains, resources, and services.

• Failover Domains — for configuring one or more subsets of cluster nodes used to run a high-availability service in the event of a node failure. Failover domains are represented as subordinate elements under Failover Domains. Using configuration buttons at the bottom of the right frame (below Properties), we can create failover domains (when Failover Domains is selected) or edit failover domain properties (when a failover domain is selected).

• Resources — For configuring shared resources to be used by high-availability services. Shared resources consist of file systems, IP addresses, NFS mounts and exports, and user-created scripts that are available to any high-availability service in the cluster. Resources are represented as subordinate elements under Resources. Using configuration buttons at the bottom of the right frame (below Properties), we can create resources (when Resources is selected) or edit resource properties (when a resource is selected).

Note

The Cluster Configuration tool provides the capability to configure private resources, also. A private resource is a resource that is configured for use with only one service. We can configure a private resource within a Service component in the GUI.

• Services — For creating and configuring high-availability services. A service is configured by assigning resources (shared or private), assigning a failover domain, and defining a recovery policy for the service. Services are represented as subordinate elements under


Chapter 4: Before Configuring a Red Hat Cluster

4.1. Compatible Hardware:

Before configuring Red Hat Cluster software, make sure that our cluster uses appropriate hardware (for example, supported fence devices, storage devices, and Fiber Channel switches).

4.2. Enabling IP Ports on Cluster Nodes

To allow Red Hat Cluster nodes to communicate with each other, we must enable the IP ports assigned to certain Red Hat Cluster components.

[root@node2]#ping 192.168.0.1 (node1)

[root@node1]#ping 192.168.0.2 (node2)

[root@node1]#ping 192.168.0.110 (client PC)

[Client pc may be any operating System]

4.3. Configuring the web server (httpd) on both nodes:

Install the RPM packages:

[root@node1]# cd /mnt/a/Server

[root@Server]# rpm -ivh httpd-2.2.3-6.el5.rpm

[root@Server]# yum install httpd-devel-2.2.3-6.el5.rpm

[root@Server]# rpm -ivh httpd-manual-2.2.3-6.el5.rpm

[root@node1]# chkconfig httpd on

[root@node1]# service httpd restart

[root@node1]# pgrep httpd

[root@node1]# cd /etc/httpd/conf

[root@node1]# vim httpd.conf

Go to line 251

[:251<┘]

ServerAdmin root@node1 [remove the # sign at the beginning of the line]

Go to line 265

[:265<┘]

ServerName node1 [remove the # sign at the beginning of the line]

[root@node1]# service httpd restart

[root@node1]# cd /var/www/html

[root@html]# vim index.html

Write the HTML code:

<html>

<head>

<title>My first cluster site page</title>

</head>

<body bgcolor="red" text="black">

<h1>This is Cluster site page</h1>

</body>

</html>

Save and exit: [:x<┘]

To check the web server, open a browser and enter http://node1 or http://192.168.0.1 in the address bar; our HTML page is displayed.

We can also check it from the command line:

[root@node1]# links node1

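4.4. Before Starting the Red Hat Cluster Nodes

Before we start both cluster nodes, we set the cman and rgmanager services to start permanently at boot, but we turn httpd (the web server) off at boot, because the cluster configuration starts it through the /etc/init.d/httpd script resource. The command for turning httpd off on each node is: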

For Node1

[root@node1]#chkconfig httpd off

For Node2

[root@node2]#chkconfig httpd off


Chapter 5: Cluster Administration tools

5.1. Now we have to start cman services on both nodes simultaneously:

[root@node1 ~]# /etc/init.d/cman start

Starting cluster:

Loading modules… done

Mounting configfs… done

Starting ccsd… done

Starting cman… done

Starting daemons… done

Starting fencing… done

[  OK  ]

[root@node2 ~]# /etc/init.d/cman start

Starting cluster:

Loading modules… done

Mounting configfs… done

Starting ccsd… done

Starting cman… done

Starting daemons… done

Starting fencing… done

[  OK  ]

5.2. After the cman service has started successfully, we start the rgmanager service on both nodes:

[root@node1 ~]# /etc/init.d/rgmanager start

Starting Cluster Service Manager:                          [  OK  ]

[root@node2 ~]# /etc/init.d/rgmanager start

Starting Cluster Service Manager:                          [  OK  ]

Node 1 Cluster Status:

[root@node1 ~]# clustat

Cluster Status for node-cluster @ Mon Aug 31 00:15:09 2009

Member Status: Quorate

Member Name        ID   Status
------ ----        ---- ------
node2              1    Online, rgmanager
node1              2    Online, Local, rgmanager

Service Name              Owner (Last)        State
------- ----              ----- ------        -----
service:ftp-server        node1               started

Node 2 Cluster Status:

[root@node2 ~]# clustat

Cluster Status for node-cluster @ Mon Aug 31 00:16:35 2009

Member Status: Quorate

Member Name        ID   Status
------ ----        ---- ------
node1              1    Online, Local, rgmanager
node2              2    Online, rgmanager

Service Name              Owner (Last)        State
------- ----              ----- ------        -----
service:ftp-server        node1               started

5.3. To Move resource from Node 1 to Node 2 manually:

[root@node1 ~]# clusvcadm -r ftp-server -m  node2

Trying to relocate service:ftp-server to node2…Success

service:ftp-server is now running on node2

[root@node1 ~]# clustat

Cluster Status for node-cluster @ Mon Aug 31 00:17:53 2009

Member Status: Quorate

Member Name        ID   Status
------ ----        ---- ------
node2              1    Online, rgmanager
node1              2    Online, Local, rgmanager

Service Name              Owner (Last)        State
------- ----              ----- ------        -----
service:ftp-server        node2               started

To Move resource from Node 2 to Node 1 manually:

[root@node1 ~]# clusvcadm -r ftp-server -m  node1

Trying to relocate service:ftp-server to node1…Success

service:ftp-server is now running on node1

[root@node1 ~]# clustat

Cluster Status for node-cluster @ Mon Aug 31 00:18:38 2009

Member Status: Quorate

Member Name        ID   Status
------ ----        ---- ------
node2              1    Online, rgmanager
node1              2    Online, Local, rgmanager

Service Name              Owner (Last)        State
------- ----              ----- ------        -----
service:ftp-server        node1               started
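When relocations are scripted, it is convenient to read the current owner of a service out of the clustat text instead of checking it by eye. The following shell sketch shows one possible way to do that with awk; the function name service_owner is hypothetical, and it assumes the service line keeps the column order shown in the output above (service name, owner, state).

```shell
#!/bin/sh
# Illustrative sketch only: read clustat-style text on standard input and
# print the owner of one service.
# usage: clustat | service_owner SERVICE_NAME
service_owner() {
    awk -v svc="service:$1" '$1 == svc { print $2; exit }'
}

# Sample line copied from the clustat output in this section.
printf '%s\n' 'service:ftp-server       node1 started' | service_owner ftp-server
```

With the sample line this prints node1. On a cluster node the function would be fed from clustat directly, as shown in the usage comment.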

5.4. Turning off the power of Node 1:

[root@node2 ~]# clustat

Cluster Status for node-cluster @ Mon Aug 31 00:19:52 2009

Member Status: Quorate

Member Name        ID   Status
------ ----        ---- ------
node2              1    Online, Local, rgmanager
node1              2    Offline

Service Name              Owner (Last)        State
------- ----              ----- ------        -----
service:ftp-server        node1               started

To move the resource from the offline Node 1 to Node 2 using manual fencing, we acknowledge on Node 2 that Node 1 has been fenced:

[root@node2 ~]# fence_ack_manual -n node1 -O

done

[root@node2 ~]# clustat

Cluster Status for node-cluster @ Mon Aug 31 00:21:56 2009

Member Status: Quorate

Member Name        ID   Status
------ ----        ---- ------
node2              1    Online, Local, rgmanager
node1              2    Offline

Service Name              Owner (Last)        State
------- ----              ----- ------        -----
service:ftp-server        node2               started

5.5. Other useful status commands:

[root@node1 ~]# ccs_tool lsnode

Cluster name: node-cluster, config_version: 18

Nodename                        Votes Nodeid Fencetype

node2 1    1    manualfence

node1 1    2    manualfence

[root@node1 ~]# ccs_tool lsfence

Name             Agent

manualfence      fence_manual

manual-fence     fence_ack_manual

[root@node1 ~]# cman_tool nodes

Node  Sts   Inc   Joined               Name

1   M    528   2009-08-31 20:40:42  node2

2   M    524   2009-08-31 20:40:40  node1

[root@node1 ~]# cman_tool services

type             level name       id       state

fence            0     default    00010002 none

[1 2]

dlm              1     rgmanager  00010001 none

[1 2]

[root@node1 ~]# cman_tool status

Version: 6.1.0

Config Version: 18

Cluster Name: node-cluster

Cluster Id: 29876

Cluster Member: Yes

Cluster Generation: 528

Membership state: Cluster-Member

Nodes: 2

Expected votes: 3

Total votes: 2

Quorum: 1

Active subsystems: 8

Flags: 2node Dirty

Ports Bound: 0 177

Node name: node1

Node ID: 2

Multicast addresses: 192.168.0.10

Node addresses: 192.168.0.2

[root@node1 ~]# group_tool ls

type             level name       id       state

fence            0     default    00010002 none

[1 2]

dlm              1     rgmanager  00010001 none

[1 2]
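The Quorum value and the 2node flag reported by cman_tool status can be related to the cluster.conf settings with a little arithmetic. Under a simple majority rule, a cluster needs more than half of the expected votes, so two nodes with one vote each would need both nodes online; setting two_node="1" lets a single node keep quorum. The sketch below only illustrates this rule in shell and is not taken from the cman source; the function names quorum_needed and has_quorum are hypothetical.

```shell
#!/bin/sh
# Illustrative sketch only: simple-majority quorum arithmetic.
# usage: quorum_needed EXPECTED_VOTES [TWO_NODE]
quorum_needed() {
    if [ "${2:-0}" -eq 1 ]; then
        echo 1                        # two-node special case
    else
        echo $(( $1 / 2 + 1 ))        # more than half of the expected votes
    fi
}

# usage: has_quorum TOTAL_VOTES QUORUM   (succeeds when the cluster is quorate)
has_quorum() {
    [ "$1" -ge "$2" ]
}

quorum_needed 3        # three nodes with one vote each
quorum_needed 2        # two nodes, plain majority rule
quorum_needed 2 1      # two nodes with the two-node special case

# One surviving node with one vote, in two-node mode:
if has_quorum 1 "$(quorum_needed 2 1)"; then
    echo "quorate"
else
    echo "inquorate"
fi
```

This is why the surviving node in section 5.4 still reports Member Status: Quorate after the other node is powered off.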


Chapter 6: Discussion and Conclusion

6.1. Discussion

The current century is the age of real-time communication and resource sharing over networks. Online or distributed resources are shared globally, and the performance of such systems may be increased by using clustering technology.

A cluster is a collection of complete systems that work together to provide a single, unified computing capability. Clustered systems provide reliability, scalability, and availability to critical production services. Using Red Hat Cluster Suite, we can create a cluster to suit our needs for performance, high availability, load balancing, scalability, file sharing, and economy.

Everyone wants to do his/her work automatically and efficiently.

6.2. Conclusion

We configured and designed a high-availability cluster. Clustering shares resources among multiple machines and offers simplicity and familiarity, rapid development features, and a lack of complexity. What still needs to be looked into is syncing and failover mechanisms, so that two web servers at different locations are mirrored copies and, if one goes down, DNS is routed to the other server. It would be useful to consult an HA specialist, as trial and error can be costly.
