Archives

Archive for July 2005

MPI library

July 1, 2005 · No comments

In the training class, Jiang said that compiling with Intel's MPI compiler could improve performance by as much as 700% over a general compiler such as gcc. 411 is the secure information service Rocks uses to distribute configuration files across the cluster nodes.

A useful Linux command: after modifying fstab or mounting a file system, the following command should be issued:

# service autofs restart

If “make” failed and we want to recompile, we should first execute the following command to delete the leftover object files from the previous build, which are no longer useful:

# make clean

Rocks has its own database, which records all the actions of the cluster. The manual is the “NPACI Rocks Cluster Users Guide”. The current release of Rocks is 3.3.

All of the above was covered in yesterday's lesson; I forgot to write it down here yesterday. What follows is today's material.

Parallel program design methodology has four stages: partitioning, communication, agglomeration (bringing the pieces back together), and mapping. Partitioning cuts the workload into many strips, like a grid; the strips communicate with one another, and their results are combined once computation completes. Finally, tasks are assigned (mapped) to processors so as to maximize processor utilization and minimize interprocessor communication.

Distributed memory and shared memory are the two main parallel programming models. Distributed-memory nodes are independent, each with its own memory; in the shared-memory model, processors share a common memory. We can refer to “Designing and Building Parallel Programs”, written by Ian Foster; Mr. Jiang said this book can perhaps be downloaded from the Internet.

Computations can be partitioned by function or by domain.
– Functional decomposition: divide the computation, then associate the data. Focusing on the computations can reveal structure in a problem.
– Domain decomposition: divide the data into pieces, then associate computation with each piece. Focus on the largest or most frequently accessed data structure.

The most difficult thing is finding the data parallelism. In some cases parallelism is not suitable; for example, inherently serial steps such as reading input and writing output cannot be done in parallel, and if we try, the result will be wrong. In a database, data must be committed before it is written to the hard disk.

In cluster computing, races happen all the time. Programmers use debugging frequently to find errors in their code, but races are easily missed during a debugging session even though they really exist on the cluster, which makes parallel computing annoying. We had better present the Itanium platform to customers together with Intel's compiler, which can get the most out of the Itanium processor. Tip: the SPEC figures indicate the performance of a processor (CPU).

Gaussian, used for chemistry computations, does not release its source code in China, so we cannot improve its performance with Intel's compiler; optimizing the hardware configuration yields only a little advantage.

The most common MPI implementation is MPICH. Over the years, some customers have often complained that the hardware platform ran slowly when the program was written by the customers themselves or was uncertified. In fact, it was the unoptimized program code that caused the problem. We should explain this to customers early, with great patience.
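For reference, a minimal MPICH program looks like the following. It uses only the standard MPI calls, but it must be built with mpicc and launched with mpirun, so it will not compile without an MPI installation:

```c
#include <stdio.h>
#include <mpi.h>

int main(int argc, char **argv) {
    int rank, size;

    MPI_Init(&argc, &argv);                 /* start the MPI runtime     */
    MPI_Comm_rank(MPI_COMM_WORLD, &rank);   /* this process's id         */
    MPI_Comm_size(MPI_COMM_WORLD, &size);   /* total number of processes */

    printf("hello from rank %d of %d\n", rank, size);

    MPI_Finalize();                         /* shut the runtime down     */
    return 0;
}
```

# mpicc hello.c -o hello
# mpirun -np 4 ./hello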

In a cluster computation, if one node halts, the whole computation fails. We can only check the halted node and restart the computation after making sure every node is healthy. Does that make the cluster unstable? For example, consider a job that needs one week of computing time on one machine. Now imagine running the job on a cluster of 4 nodes: the time decreases sharply, ideally to about 42 hours (168 hours divided by 4). If one of the four nodes fails during the run, we restart the job and lose at most those hours, instead of waiting another week. So the example shows that the cluster improves throughput greatly even though it has this little weakness.

Categories: Technology Tags: