Chapter 2 - Background On Distributed Systems

2.1 Characteristics of Distributed Systems

Tanenbaum discussed four characteristics of a distributed system:

"There needs to be one global method of communication so that process can operate together." 1

Each part of the system needs to be in constant communication with the others. In DGCC the common method of communication is provided by CORBA, which serves as the pathway for both data and commands. It was such a natural choice for the project that no alternatives were considered. Because of the short time frame for developing this project, off-the-shelf solutions had to be incorporated. Incorporating existing technologies and software is covered under the next aim of the project.

"Process management must also be the same everywhere. How processes are created, destroyed, started, and stopped must not vary from machine to machine. Not only must there be a single set of system calls available on all machines, but these calls must be designed so that they make sense in a distributed environment." 2

While performing compilations, each node will create, destroy, start and stop a variety of processes. The methods for controlling these processes cannot vary widely between nodes. To keep this simple, the nodes should share the same operating system, the same process management library, or both.
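A uniform process interface of this kind can be sketched as follows. This is only an illustration under stated assumptions: the class name ManagedProcess and its methods are hypothetical and are not part of DGCC. They are built on Python's standard subprocess module, so every node that runs the same wrapper creates, starts and stops processes in the same way.

```python
import subprocess
import sys


class ManagedProcess:
    """Hypothetical wrapper giving every node the same process-control calls."""

    def __init__(self, argv):
        # Create: record the command line; nothing runs yet.
        self.argv = list(argv)
        self._proc = None

    def start(self):
        # Start: launch the command as a child process.
        self._proc = subprocess.Popen(self.argv)

    def is_running(self):
        # True only while a started process has not yet exited.
        return self._proc is not None and self._proc.poll() is None

    def wait(self, timeout=None):
        # Block until the process exits and return its exit code.
        return self._proc.wait(timeout=timeout)

    def stop(self, timeout=5.0):
        # Stop/destroy: request termination first, then force it if needed.
        if self._proc is None:
            return None
        if self._proc.poll() is None:
            self._proc.terminate()
            try:
                self._proc.wait(timeout=timeout)
            except subprocess.TimeoutExpired:
                self._proc.kill()
                self._proc.wait()
        return self._proc.returncode


if __name__ == "__main__":
    # Stand-in for a compilation job; a real node would launch a compiler.
    job = ManagedProcess([sys.executable, "-c", "print('compiling...')"])
    job.start()
    print("exit code:", job.wait())
```

The exit code reported after a forced stop differs between operating systems, which is one small instance of the variation the quotation warns about.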

"The file system must look the same everywhere. Also, every file should be visible at every location, subject to protection and security constraints, of course." 3

Using a common file system structure greatly simplifies programming a system like DGCC. Files can be manipulated by sending a text string containing the name of the file from a 'sender' object to a 'receiver' object. The objects need not be concerned with an actual file transfer, such as FTP; they simply open and close the file as if it existed on their local file system.
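The idea of passing only a file name between objects can be sketched as below. The Sender and Receiver classes are hypothetical, and the direct method call stands in for what would be a CORBA invocation in DGCC. The sketch assumes that both objects see the same path, as they would on a shared file system.

```python
class Receiver:
    """Hypothetical receiving object: it is given a name, never file contents."""

    def handle(self, file_name: str) -> int:
        # Open the named file as if it were local and count its lines.
        with open(file_name, "r", encoding="utf-8") as fh:
            return sum(1 for _ in fh)


class Sender:
    """Hypothetical sending object that hands work to a receiver."""

    def __init__(self, receiver: Receiver):
        self.receiver = receiver

    def submit(self, path) -> int:
        # Only the text string naming the file crosses the object boundary.
        return self.receiver.handle(str(path))
```

Because the message is only a short string, its size does not depend on the size of the file. The cost of moving the data is left to the shared file system.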

"As a logical consequence of having the same system call interface everywhere, it is normal that identical kernels run on all the CPUs in the system. Doing so makes it easier to co-ordinate activities that must be global." 4

This matters because not all operating system kernels are the same. Differences due to obsolescence or new innovations can change the system calls a kernel supports. DGCC aims to use a single operating system to meet this requirement. The requirement could be relaxed if the various kernels in use support the required system calls in the same manner, as might be the case with the different variants of Linux.
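One rough way for a node to check whether it offers the calls a system like DGCC would need is sketched below. The list of required calls is an assumption made for illustration and is not taken from the project. Python's os module exposes a function such as os.fork only on platforms that provide it, so its presence is a coarse proxy for platform support. It does not test the kernel directly.

```python
import os

# Hypothetical list of process-control calls a node might be required to offer.
REQUIRED_CALLS = ("fork", "kill", "waitpid", "execv")


def missing_calls(required=REQUIRED_CALLS):
    """Return the names in `required` that this platform's os module lacks."""
    return [name for name in required if not callable(getattr(os, name, None))]


if __name__ == "__main__":
    missing = missing_calls()
    if missing:
        print("node lacks:", ", ".join(missing))
    else:
        print("node offers all required calls")
```

A node that reports missing calls could be excluded from the pool. This check confirms only that a call exists, not that it behaves in the same manner on every kernel.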

Tanenbaum thus describes the characteristics that a typical distributed system should exhibit. That is useful to know, but what does a distributed system provide that a typical single workstation cannot? In answer, four items encompass the advantages of a distributed system:

"By connecting machines together information can be shared amongst the systems easily." 5

Machines cooperating in a large company can transfer documents and other data with relative ease. This is typical of the computer networks in today's offices, and a distributed system is no different. Each participating machine needs to transfer work in the form of files, data streams and the like to other machines. Without an underlying structure connecting the machines, the task might take an extremely long time or be impossible. DGCC requires that a network connect each participating computer to promote this information sharing.

"Also spreading processes over a variety of machines reduces costs in comparison to buying one single system (supercomputer)." 6

This is the first critical characteristic of a distributed system that DGCC is striving to prove correct. Single supercomputers can typically run into the millions of pounds to build and maintain. A server farm of powerful workstations, or a whole office building of typical workstations, could be used at a much lower cost of ownership. Upgrading these systems could also be much easier if they are cheaper or leased from an equipment provider.

"Promotes communication amongst developers and users of the system through various forms of communications (i.e. E-mail, Chat, video conferencing, etc.)." 7

This project will not attempt to provide any form of communication amongst developers. It is simply not an aim of this project.

"Finally, the system provides more flexibility than any single system." 8

This is the second critical characteristic that this project aims to demonstrate. Being able to carry out work on the proposed system is an excellent way of sharing resources. How those resources are distributed depends upon the work being performed. One or more projects could be completed at the same time, depending upon how the participating machines are configured. In addition, faults due to network or equipment failures do not cause the loss of the whole system.

Despite the many advantages of distributed systems, there are a few disadvantages:

"Distributed systems are still in their early stages of development. A lot of work will be required to overcome some of the present difficulties of communication (i.e. transfer speed, memory, etc)." 9

DGCC is also in its early stages of development. The system developed to meet the project aims is an initial proof of concept. Distributed.net's cipher cracking project and the SETI@Home project are examples of early distributed systems in operation. DGCC aims to overcome this disadvantage through continuous development by the open source community.

"A second potential problem is due to the communication network. Once the system comes to depend on the network, its loss or saturation can negate most of the advantages the distributed system was built to achieve." 10

At present the system is proposed to run on a network separate from typical network traffic. With the introduction of IPv6 and Quality of Service (QoS), DGCC could coexist with existing traffic. It was never an aim of this project to test the degradation of bandwidth caused by its operation. In future, this system and others like it will need to be tested on existing networks to understand their effects.

"By providing sharing of resources (i.e. CPUs, memory and files) we open a potential security problem with users accessing information that they are not allowed to view. Security in a distributed system will have to be established to provide a hierarchy of permissions." 11

At present DGCC can be executed as any user in the system. The only security measures present are those of the operating system and the distributed file system used in this project. The distributed file system will be explained in the next chapter. It was not an aim of this project at this stage to provide any security beyond that already mentioned. Chapter 4 covers how security could potentially be incorporated into the DGCC project in the future.

