Cluster Computing



A computer cluster is a group of linked computers, working together closely so that in many respects they form a single computer. The components of a cluster are commonly, but not always, connected to each other through fast local area networks. Clusters are usually deployed to improve performance and/or availability over that of a single computer, while typically being much more cost-effective than single computers of comparable speed or availability.

High Performance Computing (HPC) allows scientists and engineers to tackle very complex problems using fast computer hardware and specialized software. Because these problems often require hundreds or even thousands of processor hours to complete, they have traditionally been run on supercomputers. The recent tremendous increase in the speed of PC-type computers has opened up a relatively cheap and scalable alternative for HPC based on cluster technologies. Conventional MPP (Massively Parallel Processing) supercomputers are aimed at the very high end of performance. As a result, they are relatively expensive and require specialized, and also expensive, maintenance support. A better understanding of applications and algorithms, together with significant improvements in communication network technologies and processor speeds, led to the emergence of a new class of systems, called clusters of SMP (symmetric multiprocessor) nodes or networks of workstations (NOW), which can compete in performance with MPPs and offer excellent price/performance ratios for particular types of applications.

A cluster is a group of independent computers working together as a single system to ensure that mission-critical applications and resources are as highly available as possible. The group is managed as a single system, shares a common namespace, and is specifically designed to tolerate component failures and to support the addition or removal of components in a way that is transparent to users.

What is cluster computing?



The development of new materials and production processes based on high technologies requires the solution of increasingly complex computational problems. However, even as computer power, data storage, and communication speed continue to improve exponentially, the available computational resources often fail to keep up with what users demand of them. High-performance computing (HPC) infrastructure has therefore become a critical resource for research and development as well as for many business applications. Traditionally, HPC applications were oriented toward high-end computer systems, the so-called "supercomputers". Before considering the progress in this field, some attention should be paid to the classification of existing computer architectures.
SISD (Single Instruction stream, Single Data stream) type computers. These are the conventional systems that contain one central processing unit (CPU) and hence can accommodate one instruction stream that is executed serially. Nowadays many large mainframes may have more than one CPU, but each of these executes instruction streams that are unrelated. Such systems should therefore still be regarded as a set of SISD machines acting on different data spaces. Examples of SISD machines include most workstations, such as those of DEC, IBM, Hewlett-Packard, and Sun Microsystems, as well as most personal computers.
SIMD (Single Instruction stream, Multiple Data stream) type computers. Such systems often have a large number of processing units that may all execute the same instruction on different data in lock-step. Thus, a single instruction manipulates many data items in parallel. Examples of SIMD machines are the CPP DAP Gamma II and the Alenia Quadrics.
Shared memory (SM) systems. These have multiple CPUs, all of which share the same address space. This means that the user does not need to know where data is stored, because there is only one memory, accessed by all CPUs on an equal basis. Shared memory systems can be either SIMD or MIMD (Multiple Instruction stream, Multiple Data stream). Single-CPU vector processors can be regarded as an example of the former, while the multi-CPU models of these machines are examples of the latter.
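The shared address space can be illustrated with a small Python sketch. It is only an analogy: threads inside one process stand in for the CPUs, and the function name and worker count are assumptions made for the example. The point it shows is that every worker reads the same input and writes into the same result list without any data being copied or sent.

```python
import threading

def shared_memory_sum(values, num_workers=4):
    """Sum `values` using threads that all work in one shared address space."""
    partial = [0] * num_workers  # visible to every thread
    chunk = (len(values) + num_workers - 1) // num_workers

    def worker(rank):
        start = rank * chunk
        # The thread reads the shared input directly; nothing is transferred.
        partial[rank] = sum(values[start:start + chunk])

    threads = [threading.Thread(target=worker, args=(r,)) for r in range(num_workers)]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return sum(partial)

print(shared_memory_sum(list(range(100))))  # prints 4950
```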
Distributed memory (DM) systems. In this case, each CPU has its own associated memory. The CPUs are connected by some network and may exchange data between their respective memories when required. In contrast to shared memory machines, the user must be aware of the location of the data in the local memories and has to move or distribute these data explicitly when needed. Again, distributed memory systems may be either SIMD or MIMD.
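The explicit data movement described above can be sketched in Python as follows. This is a single-process simulation, not real cluster code: each "node" is a thread that receives a private copy of its slice, and a queue stands in for the interconnect. The function and variable names are hypothetical.

```python
import queue
import threading

def distributed_sum(values, num_nodes=4):
    """Simulate distributed memory: nodes own private data and exchange messages."""
    chunk = (len(values) + num_nodes - 1) // num_nodes
    inbox = queue.Queue()  # stands in for the network link to node 0

    def node(rank, local_data):
        # local_data is this node's private copy; other nodes cannot read it.
        inbox.put((rank, sum(local_data)))  # explicit "send" of the partial result

    nodes = []
    for rank in range(num_nodes):
        # Explicit distribution: the data is copied out to each node.
        local = list(values[rank * chunk:(rank + 1) * chunk])
        t = threading.Thread(target=node, args=(rank, local))
        nodes.append(t)
        t.start()
    for t in nodes:
        t.join()

    total = 0
    for _ in range(num_nodes):
        _rank, part = inbox.get()  # explicit "receive" on node 0
        total += part
    return total

print(distributed_sum(list(range(100))))  # prints 4950
```

On a real cluster the copy and the send/receive steps would go over the network, typically through a message-passing library, which is why data placement matters for performance.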
Design:
Before attempting to build a cluster of any kind, think about the type of problems you are trying to solve. Different kinds of applications will actually run at different levels of performance on different kinds of clusters. Beyond the brute force characteristics of memory speed, I/O bandwidth, disk seek/latency time and bus speed on the individual nodes of your cluster, the way you connect your cluster together can have a great impact on its efficiency.
Homogeneous and Heterogeneous Clusters. A cluster can be built either from homogeneous machines, which all have the same hardware and software configuration, or as a heterogeneous cluster from machines with different configurations. Heterogeneous clusters face two additional problems: the nodes have different performance profiles, and their software configurations are harder to manage.
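One way to cope with different performance profiles is to give faster nodes a proportionally larger share of the work. The sketch below is a minimal static partitioning scheme, assuming each node has a known relative speed rating expressed as a positive integer. The node names and ratings are made up for illustration.

```python
def split_by_speed(num_tasks, node_speeds):
    """Assign tasks to nodes in proportion to their integer speed ratings."""
    total = sum(node_speeds.values())
    shares = {name: (num_tasks * speed) // total
              for name, speed in node_speeds.items()}
    # Integer division may leave a few tasks unassigned; give them to the
    # fastest nodes first, one task each.
    leftover = num_tasks - sum(shares.values())
    fastest_first = sorted(node_speeds, key=node_speeds.get, reverse=True)
    for name in fastest_first[:leftover]:
        shares[name] += 1
    return shares

print(split_by_speed(10, {"fast": 3, "slow": 1}))  # {'fast': 8, 'slow': 2}
```

In a homogeneous cluster all ratings are equal and the same function reduces to an even split.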

Diskless Versus Diskful Configurations

This decision strongly influences what kind of networking system is used. Diskless systems are by their very nature slower performers than machines that have local disks. This is because, no matter how fast the CPU is, the limiting factor on performance is how fast a program can be loaded over the network.
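As a rough illustration of why the network becomes the bottleneck, load time can be estimated as program size divided by available bandwidth. The figures used below (a 200 MB program image, 12.5 MB/s as the theoretical peak of 100 Mbit/s Ethernet, and 50 MB/s for a local disk) are assumptions chosen for the arithmetic, not measurements.

```python
def load_time_seconds(program_mb, bandwidth_mb_per_s):
    """Estimate the time to load a program, ignoring latency and protocol overhead."""
    return program_mb / bandwidth_mb_per_s

PROGRAM_MB = 200            # assumed size of the program image
NETWORK_MB_PER_S = 12.5     # 100 Mbit/s Ethernet, theoretical peak
LOCAL_DISK_MB_PER_S = 50.0  # assumed local disk throughput

print(load_time_seconds(PROGRAM_MB, NETWORK_MB_PER_S))     # 16.0
print(load_time_seconds(PROGRAM_MB, LOCAL_DISK_MB_PER_S))  # 4.0
```

Under these assumptions the diskless node waits four times longer before the CPU does any useful work, regardless of how fast that CPU is.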
Network Selection. Speed should be the main criterion for selecting the network. Channel bonding, a software technique that ties multiple network connections together so that they act as one faster link, can be used to increase the performance of Ethernet networks.
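The idea behind channel bonding can be sketched as striping outgoing frames across the available links. The example below uses a simple round-robin policy, which is one common choice. It is a conceptual model only, not how any particular driver is implemented, and the interface names `eth0` and `eth1` are placeholders.

```python
from itertools import cycle

def stripe_round_robin(frames, links):
    """Distribute frames across links in round-robin order."""
    assignment = {link: [] for link in links}
    for frame, link in zip(frames, cycle(links)):
        assignment[link].append(frame)
    return assignment

print(stripe_round_robin(range(6), ["eth0", "eth1"]))
# {'eth0': [0, 2, 4], 'eth1': [1, 3, 5]}
```

With two links each interface carries half of the frames, so the aggregate throughput can approach twice that of a single link when the traffic is large enough to keep both busy.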
In our environment, the VPN plays a significant role in combining high-performance Linux computational clusters located on separate private networks into one large cluster. With its ability to transparently join two private networks across an existing open network, the VPN enabled us to seamlessly connect two unrelated clusters in different physical locations. The VPN connection creates a tunnel between gateways that allows hosts on two different subnets (e.g., 192.168.1.0/24 and 192.168.5.0/24) to see each other as if they were on the same network. Thus, we were able to operate critical network services such as NFS, NIS, rsh and the queue system across two different private networks without compromising security over the open network. Furthermore, the VPN encrypts all data passing through the established tunnel, making the network more secure and less prone to malicious exploits.
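The routing decision a gateway makes can be sketched with Python's standard `ipaddress` module. The two subnets are the ones given as examples above. The site labels, function names, and host addresses are hypothetical and only serve to show which traffic would have to cross the tunnel.

```python
import ipaddress

# The two example private subnets joined by the VPN tunnel.
SITE_A = ipaddress.ip_network("192.168.1.0/24")
SITE_B = ipaddress.ip_network("192.168.5.0/24")

def site_of(host):
    """Return 'A' or 'B' for a host in one of the cluster subnets, else None."""
    addr = ipaddress.ip_address(host)
    if addr in SITE_A:
        return "A"
    if addr in SITE_B:
        return "B"
    return None

def needs_tunnel(src, dst):
    """True when both hosts are cluster nodes that sit on different subnets."""
    a, b = site_of(src), site_of(dst)
    return a is not None and b is not None and a != b

print(needs_tunnel("192.168.1.10", "192.168.5.20"))  # True
print(needs_tunnel("192.168.1.10", "192.168.1.11"))  # False
```

Traffic between nodes on the same subnet stays local, while traffic between the two sites is the part that the gateways encrypt and forward through the tunnel.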
The VPN not only solved the previously discussed security problems but also opened a new door for scalability. Since all the cluster nodes can reside in private networks and operate through the VPN, the entire infrastructure can be better organized and the IP addresses can be managed efficiently, resulting in a more scalable and much cleaner network. Before VPNs, every node in the cluster had to be assigned a public IP address, which limited the maximum number of nodes that could be added. Now, with a VPN, our cluster can grow much larger and scale in an organized manner. As can be seen, we have successfully integrated VPN technology into our networks and have addressed important issues of scalability, accessibility and security in cluster computing.
