Archive

Author Archive

Alexandre Pato To Milan!

August 3, 2007

From www.acmilan.com 8/2/2007
MILANO – The great talent Alexandre Pato will train with Milan and will be able to play in friendly matches starting from the 3rd of September 2007. From the 3rd of January 2008 onwards he will also be able to play in official games.


Today of last year

July 31, 2007

One year ago today, I started working at Plasmon, Zhuhai.
Those eight months of experience were very meaningful for my own development.


Monthly Summary

July 30, 2007

I studied the new features of CentOS 4.5 (Cluster Suite and Global File System) from the beginning of July, and finally submitted a detailed feasibility study report on July 20th, the day before my birthday. This research document, which describes how to configure failover NFS (both active/passive and active/active) and how to tune GFS performance, can greatly help our team with the next-generation software. However, Samba failover across the cluster has not been solved yet.

Last week, I led three engineers (Lin Peng, Hou Xianghua, Pang Rangsheng) in preparing the machines for the UT customer. I learned a lot working with this new group, and the most important lesson is that I must make sure every member of my team knows his responsibility. I acted as the director, standing in the center to help the team accomplish the work. Second, make clear decisions, whether to continue or cancel a test, and always focus on the target: the machines had to be shipped to the customer on Friday. Always encourage the team, calmly analyze the problems we meet, and try to resolve them bravely. Respect everyone, and communicate well. In effect, a clear mind, strong decisions, and bravery are the three key points.

In my spare time going forward, I think I should improve my programming ability and deepen my knowledge of computer networking. I plan to review C programming at home, and if I have free time in the office, I’d like to read the book named “Computer Networking”. This study plan is scheduled to be completed in August.

Phillip, you are 28 years old now. Keep studying, never give up.


Install Java Plug-In on Mozilla Firefox

July 10, 2007

The BS3000/3000e WebGUI requires Java support, but the default Mozilla Firefox on my Mandriva2K7 does not ship with this plug-in. I fixed the issue with the following steps.

1. Download the Java(TM) 2 RE(j2re-1.4.2-03-linux-i586.bin)
# wget ftp://ftp.informatik.hu-berlin.de/pub/Java/Linux/JDK-1.4.2/i386/03/j2re-1.4.2-03-linux-i586.bin

2. Copy the installer to “/usr/local” and change to that directory
# cp j2re-1.4.2-03-linux-i586.bin /usr/local
# cd /usr/local
# chmod 777 j2re-1.4.2-03-linux-i586.bin

3. Extract the contents of the Java 2 RE by running the shell script
# ./j2re-1.4.2-03-linux-i586.bin

4. Add j2re1.4.2/bin to the PATH:
# export PATH=/usr/local/j2re1.4.2/bin:$PATH

5. Link the Java Plug-in into Firefox:
# ln -s /usr/local/j2re1.4.2/plugin/i386/mozilla/libjavaplugin_oji.so /usr/lib/mozilla-firefox-1.5.0.7/plugins/

Note: It has to be a symbolic link; copying libjavaplugin_oji.so will NOT work. Firefox is now able to run Java applets.
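
To confirm the new JRE is picked up from the PATH before restarting Firefox, a quick sanity check:
# java -version
After restarting Firefox, the Java(TM) Plug-in should also show up on the about:plugins page.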


LVM2

July 7, 2007

Logical volumes provide the following advantages over using physical storage directly:
1. Flexible capacity: file systems can extend across multiple disks, since you can aggregate disks and partitions into a single logical volume.
2. Resizable storage pools: extend or reduce logical volumes.
3. Online data relocation.
4. Convenient device naming.
5. Disk striping: stripes data across two or more disks, which can dramatically increase throughput.
6. Mirrored volumes: logical volumes provide a convenient way to configure a mirror for your data.
7. Volume snapshots: take device snapshots for consistent backups or to test the effect of changes without affecting the real data.
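
As a minimal sketch of the basic workflow (the device names /dev/sdb1 and /dev/sdc1 and the volume names are hypothetical):
# pvcreate /dev/sdb1 /dev/sdc1         # initialize both partitions as physical volumes
# vgcreate vg00 /dev/sdb1 /dev/sdc1    # aggregate them into one volume group
# lvcreate -L 20G -n lvdata vg00       # carve a 20GB logical volume from the pool
# mkfs.ext3 /dev/vg00/lvdata           # put a file system on it
# lvextend -L +10G /dev/vg00/lvdata    # later, grow the volume by another 10GB
After lvextend, the file system itself still has to be resized (ext2online or resize2fs on this vintage of Red Hat).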

For the RHEL4 release, the original LVM1 became LVM2. LVM2 is backward compatible with LVM1, with the exception of snapshot and cluster support. Convert a volume group from LVM1 to LVM2 with the vgconvert command.
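
For example (vg00 is a hypothetical volume group name):
# vgconvert -M2 vg00    # convert the metadata of vg00 from LVM1 to LVM2 format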

The clustered logical volume manager (CLVM) is a set of clustering extensions to LVM. These extensions allow a cluster of computers to manage shared storage using LVM. The clvmd daemon runs on each cluster node, giving every node the same view of the logical volumes.

CLVM allows a user to configure logical volumes on shared storage by locking access to physical storage while a logical volume is being configured. CLVM uses the locking service provided by the high-availability symmetric infrastructure.

LVM logical volumes:
1. linear volumes
2. striped logical volumes
3. mirrored logical volumes
4. snapshot volumes

Initializing a block device as a physical volume places a label near the start of the device. By default, the LVM label is placed in the second 512-byte sector. You can override this default by placing the label on any of the first 4 sectors.
The LVM label identifies the device as an LVM physical volume. It contains a random unique identifier (the UUID) for the physical volume, stores the size of the block device in bytes, and records where the LVM metadata will be stored on the device.

LVM metadata is small and stored as ASCII.

Currently LVM allows you to store 0, 1, or 2 identical copies of its metadata on each physical volume; the default is 1 copy. Once you configure the number of metadata copies on a physical volume, you cannot change it later. The first copy is stored at the start of the device, shortly after the label. If there is a second copy, it is placed at the end of the device. If you accidentally overwrite the beginning of your disk by writing to a different disk than you intended, the second copy of the metadata at the end of the device allows you to recover it.
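
The number of copies is chosen when the physical volume is initialized; a hedged example (option spelling per the LVM2 of that era, device name hypothetical):
# pvcreate --metadatacopies 2 /dev/sdd1    # store a metadata copy at each end of the device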

In the Linux kernel, sectors are considered to be 512 bytes in size.

Striping performance:
LVM cannot tell that two PVs are on the same physical disk. If you create a striped logical volume when two PVs are on the same physical disk, the stripes could be on different partitions of the same disk, which would result in a decrease in performance rather than an increase.

Striped logical volumes:
The file system lays the data out across the underlying physical volumes; you can control the way the data is written to the physical volumes by creating a striped logical volume. For large sequential reads and writes, this can improve the efficiency of the data I/O.

Striping enhances performance by writing data to a predetermined number of physical volumes in round-robin fashion. With striping, I/O can be done in parallel. In some situations, this can result in a near-linear performance gain for each additional physical volume in the stripe.

If you have a two-way stripe that uses up an entire volume group, adding a single physical volume to the volume group will not enable you to extend the stripe. Instead, you must add at least two PVs to the VG.
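
A minimal sketch of creating and later extending a two-way stripe (the volume group vg00 and the device names are hypothetical; syntax per the LVM2 of this era):
# lvcreate -i 2 -I 64 -L 10G -n lvstripe vg00    # 2 stripes, 64KB stripe size
# vgextend vg00 /dev/sde1 /dev/sdf1              # add two PVs so the stripe can grow
# lvextend -i 2 -L +10G /dev/vg00/lvstripe       # extend across the new pair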


clustat and required packages

June 28, 2007

To monitor the cluster and display status at specific time intervals from a shell prompt, invoke clustat with the “-i” time option, where time specifies the number of seconds between status snapshots. The following example causes the clustat utility to display cluster status every 10 seconds:
# clustat -i 10

The Cluster Configuration Tool automatically retains backup copies of the three most recently used configuration files (besides the currently used configuration file). Retaining the backup copies is useful if the cluster does not function correctly because of misconfiguration and you need to return to a previous working configuration.

rgmanager: manages cluster services and resources.
system-config-cluster: contains the Cluster Configuration Tool.
ccsd: contains the cluster configuration service daemon.
magma: contains an interface library for cluster lock management.
magma-plugins: contains plugins for the magma library.
cman: manages cluster membership, messaging, and notification.
cman-kernel: contains the required CMAN kernel modules.
dlm: contains distributed lock management tools.
dlm-kernel: contains the required DLM kernel modules.
fence: the cluster I/O fencing system that allows cluster nodes to connect to a variety of network power switches, Fibre Channel switches, and integrated power management interfaces.
gulm: GULM lock management tools and libraries.
iddev: contains libraries used to identify the file system (or volume manager) with which a device is formatted.

GFS: the Red Hat GFS module.
GFS-kernel: the GFS kernel module.
gnbd: the GFS network block device module.
gnbd-kernel: the kernel module for the GFS network block device.
lvm2-cluster: cluster extensions for the logical volume manager.
gnbd-kernelheader: GNBD kernel header files.


Start/Stop Cluster

June 28, 2007

Start the cluster software in the following order:
# service ccsd start
# service cman start
# service fenced start (DLM cluster only)
# service clvmd start
# service gfs start
# service rgmanager start

Stop the cluster software in the following order:
# service rgmanager stop
# service gfs stop
# service clvmd stop
# service fenced stop
# service cman stop
# service ccsd stop
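
To bring the stack up automatically at boot, the same init scripts can be enabled with chkconfig; a sketch, enabling only the services your configuration actually uses:
# chkconfig ccsd on
# chkconfig cman on
# chkconfig fenced on
# chkconfig clvmd on
# chkconfig gfs on
# chkconfig rgmanager on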


/etc/cluster/cluster.conf

June 28, 2007

Cluster name: should be descriptive enough to distinguish it from other clusters and systems on your network.

The config version (optional) is set to 1 by default and is automatically incremented each time you save your cluster configuration. However, if you need to set it to another value, you can specify it in the Config Version box.

Specify the Fence Daemon Properties parameters: Post-Join Delay and Post-Fail Delay.
The Post-Join Delay parameter is the number of seconds the fence daemon (fenced) waits before fencing a node after the node joins the fence domain. The Post-Join Delay default value is 3. A typical setting for Post-Join Delay is between 10 and 30 seconds, but it can vary according to cluster and network performance.
The Post-Fail Delay parameter is the number of seconds the fence daemon (fenced) waits before fencing a node (a member of the fence domain) after the node has failed. The Post-Fail Delay default value is 0. Its value may be varied to suit cluster and network performance. Refer to the fenced(8) man page.
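
In /etc/cluster/cluster.conf, these properties map to attributes of the fence_daemon tag; a minimal hedged fragment (the cluster name and values are placeholders):
<cluster name="mycluster" config_version="1">
  <fence_daemon post_join_delay="3" post_fail_delay="0"/>
  <!-- clusternodes and fencedevices sections omitted -->
</cluster>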

A failover domain is a named subset of cluster nodes that are eligible to run a cluster service in the event of a node failure. A failover domain can have the following characteristics:
1. Unrestricted: allows you to specify that a subset of members are preferred, but that a cluster service assigned to this domain can run on any available member.
2. Restricted: allows you to restrict the members that can run a particular cluster service. If none of the members in a restricted failover domain are available, the cluster service cannot be started (either manually or by the cluster software).
3. Ordered: allows you to specify a preference order among the members of a failover domain. The member at the top of the list is the most preferred, followed by the second member in the list, and so on.
By default, failover domains are unrestricted and unordered.

To configure a preferred member, you can create an unrestricted failover domain comprising only one cluster member. Doing that causes a cluster service to run on that cluster member primarily (the preferred member), but allows the cluster service to fail over to any of the other members, as sketched below.
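
In cluster.conf, such a single-member unrestricted domain looks roughly like this (all names are hypothetical):
<failoverdomains>
  <failoverdomain name="prefer_node1" ordered="0" restricted="0">
    <failoverdomainnode name="node1" priority="1"/>
  </failoverdomain>
</failoverdomains>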

The failover domain name should be descriptive enough to distinguish its purpose from the other names used in your cluster.


Red Hat Cluster Suite Study Note 2

June 28, 2007

Cluster administration GUI (system-config-cluster): the Cluster Configuration Tool creates and edits /etc/cluster/cluster.conf; the Cluster Status Tool manages high-availability services.

Command-line tools (a few usage examples follow this list):
1. ccs_tool: the Cluster Configuration System tool; makes online updates to the cluster configuration file. It provides the capability to create and modify cluster infrastructure components (create a cluster, add or remove nodes).
2. cman_tool: the cluster management tool; manages the CMAN cluster manager. It provides the capability to join a cluster, leave a cluster, kill a node, or change the expected quorum votes of a node in the cluster.
3. fence_tool: used to join or leave the default fence domain; specifically, it starts fenced to join the domain and kills fenced to leave it.
4. clustat: the cluster status utility; displays the status of the cluster.
5. clusvcadm: the cluster user service administration utility; allows you to enable, disable, relocate, and restart high-availability services in a domain.
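
A few hedged usage examples (the service and node names are hypothetical):
# cman_tool status                    # show cluster name, quorum and membership
# clustat                             # one-shot cluster status
# clusvcadm -r webservice -m node2    # relocate the service "webservice" to node2
# clusvcadm -d webservice             # disable (stop) the service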

cman.ko: the kernel module for CMAN
libcman.so.1.0.0: Library for programs that need to interact with cman.ko
DLM: dlm.ko and libdlm.so.1.0.0

gfs.ko: the GFS kernel module
gfs_fsck: repairs an unmounted GFS file system
gfs_grow: grows a mounted GFS file system
gfs_jadd: adds journals to a mounted GFS file system
gfs_mkfs: creates a GFS file system on a storage device
gfs_quota: manages quotas on a mounted GFS file system
gfs_tool: configures or tunes a GFS file system
lock_harness.ko: allows a variety of locking mechanisms to be plugged in
lock_dlm.ko: a lock module that implements DLM locking for GFS
lock_nolock.ko: a lock module for use when GFS is used as a local file system only
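
For example, creating a GFS file system that uses DLM locking (the cluster name, file system name, journal count, and device are hypothetical; one journal is needed per node that mounts the file system):
# gfs_mkfs -p lock_dlm -t mycluster:gfs01 -j 2 /dev/vg00/lvgfs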

The foundation of a cluster is an advanced host membership algorithm. This algorithm ensures that the cluster maintains complete data integrity by using the following methods of inter-node communication:
1. Network connections between the cluster systems
2. A cluster configuration system daemon (ccsd) that synchronizes the configuration between cluster nodes

No-single-point-of-failure hardware configuration:
A cluster can include a dual-controller RAID array, multiple bonded network channels, multiple paths between cluster members and storage, and redundant uninterruptible power supply (UPS) systems to ensure that no single failure results in application downtime or loss of data.

For Red Hat Cluster Suite 4, node health is monitored through a cluster network heartbeat. In previous versions of Red Hat Cluster Suite, node health was monitored on a shared disk. A shared disk is not required for node-health monitoring in RHCS4.

To improve availability, protect against component failure and ensure data integrity under all failure conditions, more hardware is required.
Disk failure: Hardware RAID to replicate data across multiple disks.
RAID controller failure: Dual RAID controllers to provide redundant access to disk data.
Network Interface failure: Ethernet channel bonding and failover.
Power Source Failure: Redundant uninterruptible Power Supply(UPS) systems
Machine failure: Power Switch

APC: network-attached Power Switch

Because attached storage devices must have the same device special file on each node, it is recommended that the nodes have symmetric I/O subsystems; in other words, the machines should have the same configuration.

Do not include any file system used as a resource for a cluster service in the node’s local /etc/fstab, because the cluster software must control the mounting and unmounting of service file systems. For optimal performance of shared file systems, make sure to specify a 4KB block size with the mke2fs -b command; a smaller block size can cause long fsck times.
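
For example (the device name is hypothetical):
# mke2fs -j -b 4096 /dev/sdb1    # ext3 with a 4KB block size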


RedHat Cluster Suite Study Note 1

June 27, 2007

There are two cluster managers: CMAN and GULM. CMAN is an abbreviation for Cluster Manager; GULM stands for Grand Unified Lock Manager. CMAN uses DLM (the Distributed Lock Manager). The difference: CMAN is a distributed cluster manager, while GULM is a client-server cluster manager.

The cluster manager keeps track of cluster quorum by monitoring the count of cluster nodes that run the cluster manager. In a CMAN cluster, all cluster nodes run the cluster manager; in a GULM cluster, only the GULM servers run it. If more than half the nodes that run the cluster manager are active, the cluster has quorum. If half of those nodes (or fewer) are active, the cluster does not have quorum and all cluster activity is stopped.

Cluster quorum prevents the occurrence of a “split-brain” condition, a condition in which two instances of the same cluster are running. A split-brain condition would allow each cluster instance to access cluster resources without knowledge of the other cluster instance, resulting in corrupted cluster integrity.

In a CMAN cluster, quorum is determined by communicating heartbeats among the cluster nodes via Ethernet. Optionally, quorum can be determined by a combination of heartbeats via Ethernet and a quorum disk. For quorum via Ethernet, quorum consists of 50 percent of the node votes plus 1; for example, with one vote per node, a five-node cluster has quorum as long as at least three nodes are active. For quorum via quorum disk, quorum consists of user-specified conditions.

The cluster manager keeps track of membership by monitoring heartbeat messages from the other cluster nodes.

If a cluster node does not transmit a heartbeat message within a prescribed amount of time, the cluster manager removes the node from the cluster and communicates to the other cluster infrastructure components that the node is no longer a member. Those components then determine what actions to take upon the notification; for example, fencing would fence the node that is no longer a member.

Lock management is a common cluster infrastructure service that provides a mechanism for the other cluster infrastructure components to synchronize their access to shared resources.

GFS and CLVM use locks from the lock manager. GFS uses locks from the lock manager to synchronize access to file system metadata on shared storage. CLVM uses locks from the lock manager to synchronize updates to LVM volumes and volume groups (also on shared storage).

Fencing is the disconnection of a node from the cluster’s shared storage. Fencing cuts off I/O from shared storage, thus ensuring data integrity.

The cluster infrastructure performs fencing through one of the following programs, according to the type of cluster manager and lock manager that is configured:
1. Configured with CMAN/DLM: fenced, the fence daemon, performs fencing.
2. Configured with GULM servers: GULM performs fencing.

When the cluster manager determines that a node has failed, it communicates to the other cluster infrastructure components that the node has failed. The fencing program, when notified of the failure, fences the failed node; the other cluster infrastructure components determine what actions to take, that is, they perform any recovery that needs to be done. For example, DLM and GFS (in a cluster configured with CMAN/DLM), when notified of a node failure, suspend activity until they detect that the fencing program has completed fencing the failed node. Upon confirmation that the failed node is fenced, DLM and GFS perform recovery: DLM releases the locks of the failed node, and GFS recovers the journal of the failed node.

Fencing methods: power fencing, Fibre Channel switch fencing, GNBD fencing, and other fencing.
Specifying a fencing method consists of editing the cluster configuration file to assign a fencing-method name, the fence agent, and the fencing device for each node in the cluster.

When you configure a node for multiple fencing methods, the fencing methods are cascaded from one to another according to the order in which they are specified in the cluster configuration file (see the fragment after this list).
1. If a node fails, it is fenced using the first fencing method specified in the cluster configuration file for that node.
2. If the first fencing method is not successful, the next fencing method specified for that node is used.
3. If none of the fencing methods is successful, fencing starts again with the first fencing method specified and continues looping through the fencing methods in the order specified in the cluster configuration file until the node has been fenced.
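
In cluster.conf, a node’s fencing methods are listed in order inside its fence section; a hedged fragment (the device, method, and node names are hypothetical):
<clusternode name="node1" votes="1">
  <fence>
    <method name="1">
      <device name="apc_switch" port="1"/>
    </method>
    <method name="2">
      <device name="backup_fence" nodename="node1"/>
    </method>
  </fence>
</clusternode>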

CCS: the Cluster Configuration System. /etc/cluster/cluster.conf is an XML file containing the cluster name and the cluster settings.

A failover domain is a subset of cluster nodes that are eligible to run a particular cluster service. A cluster service can run on only one cluster node at a time, to maintain data integrity. Within a failover domain you can specify a failover priority: the priority level determines the failover order, that is, which node a cluster service should fail over to. If you do not specify a failover priority, a cluster service can fail over to any node in its failover domain.

Also, you can specify whether a cluster service is restricted to run only on nodes of its associated failover domain.

GFS is a cluster file system that allows a cluster of nodes to simultaneously access a block device shared among the nodes. GFS is a native file system that interfaces directly with the VFS layer of the Linux kernel’s file system interface. GFS employs distributed metadata and multiple journals for optimal operation in a cluster. To maintain file system integrity, GFS uses a lock manager to coordinate I/O. When one node changes data on a GFS file system, that change is immediately visible to the other cluster nodes using the file system.

The Cluster Logical Volume Manager (CLVM) provides a cluster-wide version of LVM2. CLVM provides the same capabilities as LVM2 on a single node, but makes the volumes available to all nodes in a Red Hat cluster. The key component in CLVM is clvmd, a daemon that provides clustering extensions to the standard LVM2 tool set and allows LVM2 commands to manage shared storage. clvmd runs on each cluster node and distributes LVM metadata updates within the cluster, thereby presenting each cluster node with the same view of the logical volumes. See /etc/lvm/lvm.conf.

GNBD (Global Network Block Device) provides block-device access to Red Hat GFS over TCP/IP. GNBD is similar in concept to NBD. GNBD is useful when more robust technologies such as FC or single-initiator SCSI are not necessary or are cost-prohibitive.

GNBD consists of two major components: GNBD clients and servers. A GNBD client runs in a node with GFS and imports a block device exported by a GNBD server. A GNBD server runs in another node and exports block-level storage from its local storage (either directly attached storage or SAN storage). Multiple GNBD clients can access a device exported by a GNBD server, thus making GNBD suitable for use by a group of nodes running GFS.
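
A hedged sketch of the export/import pair (the device, export name, and server host name are hypothetical). On the GNBD server:
# gnbd_export -d /dev/sdc1 -e gfs_disk    # export local storage under the name "gfs_disk"
On each GNBD client (a GFS node; imported devices appear under /dev/gnbd/):
# gnbd_import -i gnbdserver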

There are many ways to synchronize data among real servers. For example, you can use shell scripts to post updated web pages to the real servers simultaneously. Also, you can use a program such as rsync to replicate changed data across all nodes at a set interval.
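
For instance, a cron-driven rsync (the paths and host name are hypothetical):
# rsync -av --delete /var/www/html/ node2:/var/www/html/    # mirror the docroot to node2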
