Archives

Archive for June 2007

clustat and required packages

June 28, 2007

To monitor the cluster and display status at specific time intervals from a shell prompt, invoke clustat with the “-i” time option, where time specifies the number of seconds between status snapshots. The following example causes the clustat utility to display cluster status every 10 seconds:
# clustat -i 10

The Cluster Configuration Tool automatically retains backup copies of the three most recently used configuration files (besides the currently used configuration file). Retaining the backup copies is useful if the cluster does not function correctly because of misconfiguration and you need to return to a previous working configuration.

rgmanager: manages cluster services and resources.
system-config-cluster: contains the Cluster Configuration Tool.
ccsd: contains the cluster configuration service daemon.
magma: contains an interface library for cluster lock management.
magma-plugins: contains plugins for the magma library.
cman: manages cluster membership, messaging, and notification.
cman-kernel: contains the required cman kernel modules.
dlm: contains the distributed lock manager.
dlm-kernel: contains the required DLM kernel modules.
fence: the cluster I/O fencing system that allows cluster nodes to connect to a variety of network power switches, Fibre Channel switches, and integrated power management interfaces.
gulm: GULM lock management tools and libraries.
iddev: contains libraries used to identify the file system (or volume manager) with which a device is formatted.

GFS: The RedHat GFS module
GFS-kernel: GFS kernel module
gnbd: The GFS network block device module
gnbd-kernel: Kernel module for the GFS network block device
lvm2-cluster: cluster extensions for the logical volume manager.
gnbd-kernelheader: gnbd kernel header files.
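
A quick way to confirm these packages are present on a node is to query the RPM database. A minimal sketch, using the package names as listed above (some distributions name a few of them slightly differently, e.g. ccs instead of ccsd, so adjust the list to what your release actually ships):

#!/bin/sh
# Report any of the cluster packages listed above that are not installed on this node.
for pkg in rgmanager system-config-cluster ccsd magma magma-plugins \
           cman cman-kernel dlm dlm-kernel fence gulm iddev \
           GFS GFS-kernel gnbd gnbd-kernel lvm2-cluster; do
    rpm -q "$pkg" >/dev/null 2>&1 || echo "missing: $pkg"
done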


Start/Stop Cluster

June 28, 2007

Start the cluster software in the following order (a combined start/stop sketch follows the two lists below):
# service ccsd start
# service cman start
# service fenced start (DLM cluster only)
# service clvmd start
# service gfs start
# service rgmanager start

Stop the cluster software in the following order:
# service rgmanager stop
# service gfs stop
# service clvmd stop
# service fenced stop
# service cman stop
# service ccsd stop
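
A minimal wrapper sketch that runs both sequences in the documented order (service names as listed above; the fenced step applies to DLM-based clusters only):

#!/bin/sh
# cluster-ctl: start or stop the Red Hat Cluster Suite services in the order shown above.
case "$1" in
  start)
    for svc in ccsd cman fenced clvmd gfs rgmanager; do
        service "$svc" start
    done
    ;;
  stop)
    for svc in rgmanager gfs clvmd fenced cman ccsd; do
        service "$svc" stop
    done
    ;;
  *)
    echo "Usage: $0 {start|stop}"
    ;;
esac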


/etc/cluster/cluster.conf

June 28, 2007

Cluster name: should be descriptive enough to distinguish it from other clusters and systems on your network.

The config version (optional) is set to 1 by default and is automatically incremented each time you save your cluster configuration. However, if you need to set it to another value, you can specify it in the config_version box.

Specify the Fence Daemon Properties parameters: Post-Join Delay and Post-Fail Delay.
The Post-Join Delay parameter is the number of seconds the fence daemon (fenced) waits before fencing a node after the node joins the fence domain. The Post-Join Delay default value is 3. A typical setting for Post-Join Delay is between 10 and 30 seconds, but it can vary according to cluster and network performance.
The Post-Fail Delay parameter is the number of seconds the fence daemon (fenced) waits before fencing a node (a member of the fence domain) after the node has failed. The Post-Fail Delay default value is 0. Its value may be varied to suit cluster and network performance. Refer to the fenced(8) man page.
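
Both delays, together with the config version described above, are stored as attributes in /etc/cluster/cluster.conf, so a quick grep shows what the GUI actually wrote. In a typical file the attributes are config_version on the cluster element and post_join_delay/post_fail_delay on the fence_daemon element (stated here as an assumption; check your own file and the fenced(8) man page):

# grep -E 'config_version|fence_daemon' /etc/cluster/cluster.conf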

A failover domain is a named subset of cluster nodes that are eligible to run a cluster service in the event of a node failure. A failover domain can have the following characteristics:
1. Unrestricted: Allows you to specify that a subset of members are preferred, but that a cluster service assigned to this domain can run on any available member.
2. Restricted: Allows you to restrict the members that can run a particular cluster service. If none of the members in a restricted failover domain are available, the cluster service cannot be started (either manually or by the cluster software).
3. Unordered: When a cluster service is assigned to an unordered failover domain, the member on which the service runs is chosen from the available members with no priority ordering. (In an ordered domain, by contrast, the member at the top of the list is the most preferred, followed by the second member in the list, and so on.)
By default, failover domains are unrestricted and unordered.

To configure a preferred member, you can create an unrestricted failover domain comprising only one cluster member. Doing that causes a cluster service to run on that cluster member primarily (the preferred member), but allows the cluster service to fail over to any of the other members.

The failover domain name should be descriptive enough to distinguish its purpose relative to other names used in the cluster.


Red Hat Cluster Suite Study Note 2

June 28, 2007

Cluster administration GUI (system-config-cluster): the Cluster Configuration Tool creates and edits “/etc/cluster/cluster.conf”; the Cluster Status Tool manages high-availability services.

Command-line tools
1. ccs_tool: the Cluster Configuration System tool; makes online updates to the cluster configuration file. It provides the capability to create and modify cluster infrastructure components (create a cluster, add or remove nodes).
2. cman_tool: the cluster management tool; manages the CMAN cluster manager. It provides the capability to join a cluster, leave a cluster, kill a node, or change the expected quorum votes of a node in the cluster.
3. fence_tool: used to join or leave the default fence domain; specifically, it starts the fence daemon (fenced) to join the domain and kills fenced to leave the domain.
4. clustat: the cluster status utility; displays the status of the cluster.
5. clusvcadm: the cluster user service administration utility; allows you to enable, disable, relocate, and restart high-availability services.
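
A few hedged usage sketches for the status and service-administration utilities above; the service name webby and node name node2 are made-up examples:

# clustat                        (one-shot cluster status)
# clustat -i 10                  (refresh the status every 10 seconds)
# clusvcadm -e webby             (enable/start the service)
# clusvcadm -d webby             (disable/stop the service)
# clusvcadm -r webby -m node2    (relocate the service to node2)
# clusvcadm -R webby             (restart the service in place)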

cman.ko: the kernel module for CMAN
libcman.so.1.0.0: Library for programs that need to interact with cman.ko
DLM: dlm.ko and libdlm.so.1.0.0

gfs.ko: kernel module
gfs_fsck: repairs an unmounted GFS file system
gfs_grow: grows a mounted GFS file system
gfs_jadd: adds journals to a mounted GFS file system
gfs_mkfs: creates a GFS file system on a storage device
gfs_quota: manages quotas on a mounted GFS file system
gfs_tool: configures or tunes a GFS file system
lock_harness.ko: allows a variety of locking mechanisms
lock_dlm.ko: A lock module that implements DLM locking for GFS
lock_nolock.ko: A lock module for use when GFS is used as a local file system only.
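
A hedged sketch of creating and mounting a DLM-locked GFS file system with the tools above. The cluster name alpha, file system name gfs01, journal count, and volume path /dev/vg01/lv_gfs are assumptions; the -t value must match the cluster name in cluster.conf:

# gfs_mkfs -p lock_dlm -t alpha:gfs01 -j 2 /dev/vg01/lv_gfs
# mount -t gfs /dev/vg01/lv_gfs /mnt/gfs01
# gfs_jadd -j 1 /mnt/gfs01    (add one more journal to the mounted file system)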

The foundation of a cluster is an advanced host membership algorithm. This algorithm ensures that the cluster maintains complete data integrity by using the following methods of inter-node communication:
1. Network connections between the cluster systems
2. A Cluster Configuration System daemon (ccsd) that synchronizes configuration between cluster nodes

No-single-point-of-failure hardware configuration:
Clusters can include a dual-controller RAID array, multiple bonded network channels, multiple paths between cluster members and storage, and redundant uninterruptible power supply (UPS) systems to ensure that no single failure results in application downtime or loss of data.

For Red Hat Cluster Suite 4, node health is monitored through a cluster network heartbeat. In previous versions of Red Hat Cluster Suite, node health was monitored on a shared disk. A shared disk is not required for node-health monitoring in RHCS 4.

To improve availability, protect against component failure, and ensure data integrity under all failure conditions, more hardware is required (a minimal bonding sketch follows this list):
Disk failure: hardware RAID to replicate data across multiple disks.
RAID controller failure: dual RAID controllers to provide redundant access to disk data.
Network interface failure: Ethernet channel bonding and failover.
Power source failure: redundant uninterruptible power supply (UPS) systems.
Machine failure: power switch.
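
A minimal Ethernet channel bonding sketch for the network interface item above, assuming RHEL 4-style network scripts, two slave NICs eth0/eth1, and a made-up address 192.168.3.10:

# --- /etc/modprobe.conf ---
alias bond0 bonding
options bond0 miimon=100 mode=1

# --- /etc/sysconfig/network-scripts/ifcfg-bond0 ---
DEVICE=bond0
IPADDR=192.168.3.10
NETMASK=255.255.255.0
ONBOOT=yes
BOOTPROTO=none

# --- /etc/sysconfig/network-scripts/ifcfg-eth0 (ifcfg-eth1 is the same with DEVICE=eth1) ---
DEVICE=eth0
MASTER=bond0
SLAVE=yes
ONBOOT=yes
BOOTPROTO=none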

APC: network-attached Power Switch

Because attached storage devices must have the same device special file on each node, it is recommended that the nodes have symmetric I/O subsystems; in other words, the machines should have the same configuration.

Do not include any file system used as a resource for a cluster service in the node’s local /etc/fstab, because the cluster software must control the mounting and unmounting of service file systems. For optimal performance of shared file systems, make sure to specify a 4 KB block size with the mke2fs -b command. A smaller block size can cause long fsck times.
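
For example, a hedged mke2fs invocation with the 4 KB block size (the device path is an assumption; -j adds an ext3 journal):

# mke2fs -j -b 4096 /dev/vg01/lv_data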


RedHat Cluster Suite Study Note 1

June 27, 2007

There are two cluster managers: CMAN and GULM. CMAN is an abbreviation for Cluster Manager; GULM stands for Grand Unified Lock Manager. CMAN uses DLM (the Distributed Lock Manager). The difference is that CMAN is a distributed cluster manager, while GULM is a client-server cluster manager.

The cluster manager keeps track of cluster quorum by monitoring the count of cluster nodes that run the cluster manager. In a CMAN cluster, all cluster nodes run the cluster manager; in a GULM cluster, only the GULM servers run the cluster manager. If more than half the nodes that run the cluster manager are active, the cluster has quorum. If half the nodes that run the cluster manager (or fewer) are active, the cluster does not have quorum and all cluster activity is stopped.

Cluster quorum prevents the occurrence of a “split-brain” condition, a condition in which two instances of the same cluster are running. A split-brain condition would allow each cluster instance to access cluster resources without knowledge of the other cluster instance, resulting in corrupted cluster integrity.

In a CMAN cluster, quorum is determined by communication of heartbeats among cluster nodes via Ethernet. Optionally, quorum can be determined by a combination of communicating heartbeats via Ethernet and through a quorum disk. For quorum via Ethernet, quorum consists of 50 percent of the node votes plus 1. For quorum via quorum disk, quorum consists of user-specified conditions. For example, in a five-node cluster with one vote per node, quorum is three votes (5/2 + 1, using integer division); if only two nodes remain active, the cluster loses quorum and all cluster activity stops.

The cluster manager keeps track of membership by monitoring heartbeat messages from other cluster nodes.

If a cluster node does not transmit a heartbeat message within a prescribed amount of time, the cluster manager removes the node from the cluster and communicates to the other cluster infrastructure components that the node is not a member. Again, the other cluster infrastructure components determine what actions to take upon notification that the node is no longer a cluster member. For example, fencing would fence the node that is no longer a member.

Lock management is a common cluster infrastructure service that provides a mechanism for other cluster infrastructure components to synchronize their access to shared resources.

GFS and CLVM use locks from the lock manager. GFS uses locks from the lock manager to synchronize access to file system metadata on shared storage. CLVM uses locks from the lock manager to synchronize updates to LVM volumes and volume groups (also on shared storage).

Fencing is the disconnecting of a node from the cluster’s shared storage. Fencing cuts off I/O from shared storage, thus ensuring data integrity.

The cluster infrastructure performs fencing through one of the following programs, according to the type of cluster manager and lock manager that is configured:
1. Configured with CMAN/DLM: fenced, the fence daemon, performs fencing.
2. Configured with GULM servers: GULM performs fencing.

When the cluster manager determines that a node has failed, it communicates to the other cluster infrastructure components that the node has failed. The fencing program, when notified of the failure, fences the failed node, and the other cluster infrastructure components determine what actions to take; that is, they perform any recovery that needs to be done. For example, DLM and GFS (in a cluster configured with CMAN/DLM), when notified of a node failure, suspend activity until they detect that the fencing program has completed fencing the failed node. Upon confirmation that the failed node is fenced, DLM and GFS perform recovery: DLM releases the locks of the failed node; GFS recovers the journal of the failed node.

Fencing methods: power fencing, Fibre Channel switch fencing, GNBD fencing, and other fencing.
Specifying a fencing method consists of editing the cluster configuration file to assign a fencing-method name, the fence agent, and the fencing device for each node in the cluster.
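
Once a method, agent, and device have been assigned in the configuration file, fencing can be exercised by hand to verify the setup; a hedged sketch (the node name node2.example.com is made up; fence_node asks the configured agent to fence that node):

# fence_node node2.example.com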

When you configure a node for multiple fencing methods, the fencing methods are cascaded from one fencing method to another according to the order of the fencing methods specified in the cluster configuration file.
1. If a node fails, it is fenced using the first fencing method specified in the cluster configuration file for that node.
2. If the first fencing method is not successful, the next fencing method specified for that node is used.
3. If none of the fencing methods is successful, fencing starts again with the first fencing method specified and continues looping through the fencing methods in the order specified in the cluster configuration file until the node has been fenced.

CCS, the Cluster Configuration System. “/etc/cluster/cluster.conf” is an XML file containing the cluster name and cluster settings.

A failover domain is a subset of cluster nodes that are eligible to run a particular cluster service. A cluster service can run on only one cluster node at a time to maintain data integrity. A failover domain can specify a failover priority; the priority level determines the failover order, that is, which node a cluster service should fail over to. If you do not specify a failover priority, a cluster service can fail over to any node in its failover domain.

Also, you can specify whether a cluster service is restricted to run only on nodes of its associated failover domain.

GFS is a cluster file system that allows a cluster of nodes to simultaneously access a block device shared among the nodes. GFS is a native file system that interfaces directly with the VFS layer of the Linux kernel file system interface. GFS employs distributed metadata and multiple journals for optimal operation in a cluster. To maintain file system integrity, GFS uses a lock manager to coordinate I/O. When one node changes data on a GFS file system, that change is immediately visible to the other cluster nodes using that file system.

The Cluster Logical Volume Manager (CLVM) provides a cluster-wide version of LVM2. CLVM provides the same capabilities as LVM2 on a single node, but makes the volumes available to all nodes in a Red Hat cluster. The key component in CLVM is clvmd, a daemon that provides clustering extensions to the standard LVM2 tool set and allows LVM2 commands to manage shared storage. clvmd runs on each cluster node and distributes LVM metadata updates in the cluster, thereby presenting each cluster node with the same view of the logical volumes. See /etc/lvm/lvm.conf.
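
A hedged sketch of creating a clustered logical volume for use by GFS, with clvmd running on every node and lvm.conf already set to the clustered locking type (device, volume group, and volume names are assumptions):

# pvcreate /dev/sdb
# vgcreate -c y vg_cluster /dev/sdb
# lvcreate -L 100G -n lv_gfs vg_cluster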

GNBD (Global Network Block Device) provides block-device access to Red Hat GFS over TCP/IP. GNBD is similar in concept to NBD. GNBD is useful when the need for more robust technologies such as Fibre Channel or single-initiator SCSI is not necessary or is cost-prohibitive.

GNBD consists of two major components: GNBD clients and GNBD servers. A GNBD client runs on a node with GFS and imports a block device exported by a GNBD server. A GNBD server runs on another node and exports block-level storage from its local storage (either directly attached storage or SAN storage). Multiple GNBD clients can access a device exported by a GNBD server, making GNBD suitable for use by a group of nodes running GFS.
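
A hedged export/import sketch, assuming a server node exporting /dev/sdb1 under the name gfsdisk and clients importing from a host called gnbdserver (device, export name, and host name are all assumptions):

# gnbd_export -d /dev/sdb1 -e gfsdisk    (on the GNBD server node)
# gnbd_import -i gnbdserver              (on each GNBD client node)

The imported device should then appear on the clients under /dev/gnbd/ and can be used like any other block device for GFS.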

There are many ways to synchronize data among real servers. For example, you can use shell scripts to post updated web pages to the real servers simultaneously. Also, you can use a program such as rsync to replicate changed data across all nodes at a set interval.
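
A hedged example of the rsync approach, run from the source node (the document root path and node name are assumptions); it can be scheduled from cron at the desired interval:

# rsync -avz --delete /var/www/html/ node2:/var/www/html/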


Following up on the CCTV testing halt

June 18, 2007

Test Environment:
2 x nodes: CentOS 4.4 + Cluster Suite + GFS
192.168.3.249
192.168.3.52
2 x load sources:
SuperMicro 812 box running Windows Server 2003
SuperMicro 812 box running Windows XP Professional

Test steps:
1. Build the cluster, create a GFS share across the nodes, and export it via the Samba protocol.
2. Install Sobie (developed specifically for CCTV testing) on the Windows boxes.
3. Map the Samba share as a Windows local directory.
4. Run the Sobie server/client (40 processes) on the Windows boxes.

Observe the output of the following command:
# dstat -N eth0,eth1,eth2
dstat is a tool for monitoring system resource statistics; the -N option selects which network interfaces to report.

When Sobie was run, the logical volume went missing! I finally found the underlying reason: the web GUI settings changed my BA880 settings, including the existing logical volumes, which had also been re-formatted.

It took me nearly two hours to fix it and recreate fresh physical volumes, volume groups, and logical volumes. The test is now running and I will check it tomorrow morning.

——————————-
Update on July 30th, 2007:
This issue has been proven to be related to Samba when reproduced in the Greatwall lab.


UT issue resolved

June 11, 2007

Last week one of the Tyan platforms failed after running for 15 hours. Our engineer investigated this issue and reported that we have to replace and re-configure it. As Zhufeng went to Beijing, Gavin told me to fix this issue and repair the relationship with the UT customer. At UT, I added the new nodes. There are two things we should take care of in the future:

1. Configure the DNS reverse setting. To verify the current setting, execute the following commands to resolve the hosts; in my case:
# host 172.23.5.6
# host 172.23.5.7
The above two commands return the string “ba880.nas_ba880.uit”.
# host ba880.nas_ba880.uit
It should return the correct IP addresses, 172.23.5.6 and 172.23.5.7.

2. Configure /etc/iscsi.conf and /etc/initial.conf. Make sure the /etc/initial.conf lines are the same as the key in the BS3000 (storage device). Because the existing node has the right settings and can find the iSCSI devices (# iscsi-ls), I just copied its /etc/iscsi.conf and /etc/initial.conf to the newly added nodes, then restarted the iscsi service.
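
A minimal sketch of that copy-and-restart step, assuming the working node is reachable as nas1 over SSH (the host name and key-based root access are assumptions):

# scp nas1:/etc/iscsi.conf nas1:/etc/initial.conf /etc/
# service iscsi restart
# iscsi-ls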


SLED 10.0 Installation

June 6, 2007

There is no DVD-RW drive to burn the DVD image downloaded from the Internet. This article describes how to install SLED 10.0 over the network.

NFS server/Image directory: 192.168.3.246:/x86/suse.iso
Target machine: AMD x86-64 platform, with CentOS5.5 already installed.

First, log on to the CentOS5.5 system and mount suse.iso to a local directory:
# mount -o loop suse.iso /c
# cd /c/boot/x86-64/loader

Copy the following two files to /boot:
# cp linux /boot
# cp initrd /boot

Modify “/etc/grub.conf” and append the lines below:

title SuSE
root (hd0,0)
kernel /boot/linux
initrd /boot/initrd

Reboot the system with the SUSE kernel and load the network modules. In this case, the “xforce-series” modules must be loaded. The remaining steps are very similar to RHEL: configure the local IP and identify the NFS path. Note that the NFS path should be the full path including the “.iso” name, for example “/vm/suse.iso”, while RHEL only needs the NFS share path, e.g. “/vm”.
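
Alternatively, the NFS source can be passed on the kernel command line so the installer does not prompt for it. A hedged variant of the grub stanza above; the install=nfs:// syntax is an assumption about the SUSE linuxrc boot options:

title SuSE (NFS install)
root (hd0,0)
kernel /boot/linux install=nfs://192.168.3.246/x86/suse.iso
initrd /boot/initrd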

This machine will be used for compiling the source code.
