Distributed systems are everywhere. A distributed system is one in which components located at networked computers communicate and coordinate their actions only by passing messages.
A distributed system is a collection of autonomous computers linked by a computer network and equipped with distributed system software. Distributed system software enables computers to coordinate their activities and to share system resources such as hardware, software, and data.
We define a distributed system as one in which hardware or software components located at networked computers communicate and coordinate their actions only by passing messages. It is a collection of independent computers that appears to its users as a single coherent system.
A distributed system is a system with multiple components located on different machines that communicate and coordinate actions in order to appear as a single coherent system to the end user.
It is a system whose components are located on different networked computers, which communicate and coordinate their actions by passing messages to one another. The components interact with one another in order to achieve a common goal.
We can also define a distributed system as "a collection of separate and independent software and hardware components, called nodes, that are networked and work together coherently by coordinating and communicating through message passing or events to fulfill one goal".
A distributed system is a collection of independent components located on different machines that share messages with each other in order to achieve a common goal. As such, the distributed system will appear as if it is a single interface or computer to the end user.
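The message-passing idea in these definitions can be made concrete with a short sketch. The following Python fragment (standard library only; the host and port are illustrative assumptions) shows two components that coordinate purely by exchanging messages over a TCP connection, with the server run in one process and the client in another:

    # Minimal message-passing sketch: two components coordinate only
    # by exchanging messages over TCP (host/port are assumed values).
    import socket

    HOST, PORT = "localhost", 9000  # illustrative address

    def server():
        # One component: waits for a message and replies with an ack.
        with socket.socket(socket.AF_INET, socket.SOCK_STREAM) as s:
            s.bind((HOST, PORT))
            s.listen(1)
            conn, _ = s.accept()
            with conn:
                request = conn.recv(1024).decode()
                conn.sendall(f"ack: {request}".encode())

    def client():
        # Another component: coordinates with the server only via messages.
        with socket.socket(socket.AF_INET, socket.SOCK_STREAM) as s:
            s.connect((HOST, PORT))
            s.sendall(b"hello")
            print(s.recv(1024).decode())  # -> "ack: hello"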
Examples of Distributed Systems:
Examples of distributed systems are based on familiar and widely used computer networks: the Internet and the associated World Wide Web, web search, online gaming, email, social networks, e-commerce, and so on.
The Internet:
The Internet is the most widely known example of a distributed system. The Internet facilitates the connection of many different computer systems in different geographical locations to share information and resources.
The Internet is a very large distributed system. It enables users, wherever they are, to make use of services such as the World Wide Web, email and file transfer. Multimedia services are available on the Internet, enabling users to access audio and video data including music, radio and TV channels, and to hold phone and video conferences.
Financial Trading:
As an example, we look at how distributed systems support financial trading markets. The financial industry has long been at the cutting edge of distributed systems technology with its need, in particular, for real-time access to a wide range of information sources (for example, current share prices and trends, economic and political developments). The industry employs automated monitoring and trading applications.
Telecommunication Networks:
Telephone and cellular networks are examples of distributed networks. Telephone networks have been around for over a century, and they started as an early example of peer-to-peer networks. Cellular networks are distributed networks with base stations physically distributed in areas called cells. As telephone networks have evolved to VoIP (voice over IP), they continue to grow in complexity as distributed networks.
Real-time Systems:
Many industries use real-time systems distributed in various areas, locally and globally. For example, airlines use real-time flight control systems, and manufacturing plants use automation control systems. Taxi companies also employ a dispatch system, and e-commerce and logistics companies typically use real-time package tracking systems when customers order a product.
Distributed Database System:
A distributed database has data spread across multiple servers, physical locations, or both. Users can replicate or duplicate the data across systems. Many popular applications use distributed databases and are aware of the heterogeneous or homogeneous properties of the distributed database system. A heterogeneous distributed database allows for multiple data models and different database management systems. An end user can use a gateway to translate the data between nodes, typically because of merging systems and applications.
A homogeneous distributed database is a system that has the same database management system and data model at every node. This makes it easier to scale performance and to manage the system when adding new locations and nodes.
Characteristics of Distributed Systems (Features of Distributed Systems):
Resource Sharing: The main motivating factor for constructing and using distributed systems is resource sharing. Resources such as printers, files, web pages or database records are managed by servers of the appropriate type. For example, web servers manage web pages and other web resources. Resources are accessed by clients – for example, the clients of web servers are generally called browsers.
Openness: Openness is concerned with the extension and improvement of a distributed system.
Concurrency: Multiple activities are executed at the same time.
Scalability: In principle, distributed systems should also be relatively easy to expand or scale. We scale a distributed system by adding more computers to the network.
Fault Tolerance: Fault tolerance concerns the reliability of the system; the system should continue to work correctly even when some of its components fail.
Transparency: Transparency hides the complexity of the distributed system from users and application programs. Differences between the various computers and the ways in which they communicate are mostly hidden from users. The same holds for the internal organization of the distributed system.
Heterogeneity: In distributed systems, components can vary and differ in networks, computer hardware, operating systems, programming languages, and implementations by different developers.
Goals of Distributed System:
A distributed system should make resources easily accessible; it should reasonably hide the fact that resources are distributed across a network; it should be open; and it should be scalable.
Making resources accessible:
The main goal of a distributed system is to make it easy for the users (and applications) to access remote resources, and to share them in a controlled and efficient way. Resources can be just about anything, but typical examples include things like printers, computers, storage facilities, data, files, Web pages, and networks, to name just a few. There are many reasons for wanting to share resources. One obvious reason is that of economics. For example, it is cheaper to let a printer be shared by several users in a small office than having to buy and maintain a separate printer for each user. Likewise, it makes economic sense to share costly resources such as supercomputers, high-performance storage systems and other expensive peripherals.
Connecting users and resources also makes it easier to collaborate and exchange information, as is clearly illustrated by the success of the Internet with its simple protocols for exchanging files, mail, documents, audio, and video.
Distribution transparency:
An important goal of a distributed system is to hide the fact that its processes and resources are physically distributed across multiple computers. A distributed system that is able to present itself to users and applications as if it were only a single computer system is said to be transparent.
Types of Transparency:
Access transparency: Access transparency deals with hiding differences in data representation and the way that resources can be accessed by users. At a basic level, we wish to hide differences in machine architectures, but more important is that we reach agreement on how data is to be represented by different machines and operating systems.
Location transparency: Location transparency refers to the fact that users cannot tell where a resource is physically located in the system. Naming plays an important role in achieving location transparency. In particular, location transparency can be achieved by assigning only logical names to resources, that is, names in which the location of a resource is not secretly encoded.
Migration transparency: Distributed systems in which resources can be moved without affecting how those resources can be accessed are said to provide migration transparency.
Relocation transparency: Even stronger is the situation in which resources can be relocated while they are being accessed without the user or application noticing anything. In such cases, the system is said to support relocation transparency.
Replication transparency: Replication plays a very important role in distributed systems. For example, resources may be replicated to increase availability or to improve performance by placing a copy close to the place where it is accessed. Replication transparency deals with hiding the fact that several copies of a resource exist. To hide replication from users, it is necessary that all replicas have the same name. Consequently, a system that supports replication transparency should generally support location transparency as well, because it would otherwise be impossible to refer to replicas at different locations.
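As a rough illustration of how a single logical name can hide both the location and the number of replicas, consider the following Python sketch; the name "fileserver" and the replica addresses are hypothetical:

    # Sketch: one logical name, several replicas (hypothetical addresses).
    # Clients use only the logical name, so location and replication
    # stay hidden; the resolver picks any replica on their behalf.
    import random

    REPLICAS = {
        # logical name -> physical locations (illustrative values)
        "fileserver": ["10.0.0.5:7000", "10.0.1.9:7000", "10.0.2.3:7000"],
    }

    def resolve(logical_name: str) -> str:
        """Return the address of some replica; callers never see which."""
        return random.choice(REPLICAS[logical_name])

    print(resolve("fileserver"))  # e.g. "10.0.1.9:7000"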
Concurrency transparency: An important goal of distributed systems is to allow sharing of resources. In many cases, sharing resources is done in a cooperative way, as in the case of communication. However, there are also many examples of competitive sharing of resources. For example, two independent users may each have stored their files on the same file server or may be accessing the same tables in a shared database. In such cases, it is important that each user does not notice that the other is making use of the same resource. This phenomenon is called concurrency transparency. An important issue is that concurrent access to a shared resource leaves that resource in a consistent state. Consistency can be achieved through locking mechanisms, by which users are, in turn, given exclusive access to the desired resource.
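A minimal Python sketch of this locking idea is shown below; two threads update a shared balance, and the lock guarantees that each read-modify-write step happens with exclusive access:

    # Sketch: two threads sharing one record; a lock serializes updates
    # so concurrent access leaves the resource in a consistent state.
    import threading

    balance = 0
    lock = threading.Lock()

    def deposit(amount: int, times: int) -> None:
        global balance
        for _ in range(times):
            with lock:              # exclusive access, in turn
                balance += amount   # read-modify-write is now atomic

    t1 = threading.Thread(target=deposit, args=(1, 100_000))
    t2 = threading.Thread(target=deposit, args=(1, 100_000))
    t1.start(); t2.start(); t1.join(); t2.join()
    print(balance)  # always 200000 with the lock; unpredictable without it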
Failure transparency: Making a distributed system failure transparent means that a user does not notice that a resource (he has possibly never heard of) fails to work properly, and that the system subsequently recovers from that failure.
Openness: Another important goal of distributed systems is openness. An open distributed system is a system that offers services according to standard rules that describe the syntax and semantics of those services. For example, in computer networks, standard rules govern the format, contents, and meaning of messages sent and received.
Another important goal for an open distributed system is that it should be easy to configure the system out of different components (possibly from different developers). Also, it should be easy to add new components or replace existing ones without affecting those components that stay in place. In other words, an open distributed system should also be extensible. For example, in an extensible system, it should be relatively easy to add parts that run on a different operating system, or even to replace an entire file system. As many of us know from daily practice, attaining such flexibility is easier said than done.
Scalability: Scalability of a system can be measured along at least three different dimensions (Neuman, 1994). First, a system can be scalable with respect to its size, meaning that we can easily add more users and resources to the system. Second, a geographically scalable system is one in which the users and resources may lie far apart. Third, a system can be administratively scalable, meaning that it can still be easy to manage even if it spans many independent administrative organizations.
Challenges of Distributed System:
The challenges arising from the construction of distributed systems are the heterogeneity of their components, openness (which allows components to be added or replaced), security, scalability – the ability to work well when the load or the number of users increases – failure handling, concurrency of components, transparency and providing quality of service.
Heterogeneity:
Distributed systems must be constructed from a variety of different networks, operating systems, computer hardware, and programming languages. The Internet communication protocols mask the differences in networks, and middleware can deal with the other differences.
The Internet enables users to access services and run applications over a heterogeneous collection of computers and networks. Although the Internet consists of many different sorts of network, their differences are masked by the fact that all of the computers attached to them use the Internet protocols to communicate with one another. Different programming languages use different representations for characters and data structures such as arrays and records. These differences must be addressed if programs written in different languages are to be able to communicate with one another. Programs written by different developers cannot communicate with one another unless they use common standards.
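One common standard for bridging such differences is a language-neutral wire format. The sketch below uses JSON (via Python's standard json module) as an illustrative choice; any language with a JSON parser can read the same representation:

    # Sketch: a language-neutral wire format (JSON) masks differences in
    # how programs represent records and arrays internally.
    import json

    record = {"name": "alice", "scores": [70, 85, 92]}  # Python dict/list
    wire = json.dumps(record)   # standard text form for the network
    print(wire)                 # '{"name": "alice", "scores": [70, 85, 92]}'
    # A program in any other language can parse the same standard form:
    restored = json.loads(wire)
    assert restored == record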
Middleware:
In order to support heterogeneous computers and networks while offering a single-system view, distributed systems are often organized by means of a layer of software that is logically placed between a higher-level layer consisting of users and applications and a layer underneath consisting of operating systems and basic communication facilities.
The term middleware applies to a software layer that provides a programming abstraction as well as masking the heterogeneity of the underlying networks, hardware, operating systems and programming languages.
The Common Object Request Broker Architecture (CORBA) is an example. Some middleware, such as Java Remote Method Invocation (RMI), supports only a single programming language. Most middleware is implemented over the Internet protocols, which themselves mask the differences between the underlying networks, but all middleware must deal with differences in operating systems and hardware.
In addition to solving the problems of heterogeneity, middleware provides a uniform computational model for use by the programmers of servers and distributed applications. Possible models include remote object invocation, remote event notification, remote SQL access and distributed transaction processing. For example, CORBA provides remote object invocation, which allows an object in a program running on one computer to invoke a method of an object in a program running on another computer. Its implementation hides the fact that messages are passed over a network in order to send the invocation request and its reply.
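CORBA itself is not shown here, but the same idea can be sketched with Python's standard xmlrpc module as a stand-in for remote invocation middleware; the host, port, and the add() function are illustrative assumptions. The caller invokes a remote function as if it were local, and the request and reply messages are hidden by the middleware layer:

    # Sketch of remote invocation using Python's standard xmlrpc module
    # (a stand-in for middleware such as CORBA or Java RMI).
    from xmlrpc.server import SimpleXMLRPCServer

    def add(a, b):
        return a + b

    server = SimpleXMLRPCServer(("localhost", 8000), allow_none=True)
    server.register_function(add)   # expose add() to remote callers
    server.serve_forever()

    # --- in another process, the client invokes add() like a local call ---
    # import xmlrpc.client
    # proxy = xmlrpc.client.ServerProxy("http://localhost:8000/")
    # print(proxy.add(2, 3))  # -> 5; request/reply messages are hidden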
Heterogeneity and mobile code:
The term mobile code is used to refer to program code that can be transferred from one computer to another and run at the destination – Java applets are an example. Code suitable for running on one computer is not necessarily suitable for running on another because executable programs are normally specific both to the instruction set and to the host operating system. The virtual machine approach provides a way of making code executable on a variety of host computers: the compiler for a particular language generates code for a virtual machine instead of a particular hardware order code. For example, the Java compiler produces code for a Java virtual machine, which executes it by interpretation. The Java virtual machine needs to be implemented once for each type of computer to enable Java programs to run. Today, the most commonly used form of mobile code is the inclusion of JavaScript programs in some web pages loaded into client browsers.
Openness:
Distributed systems should be extensible. The openness of a computer system is the characteristic that determines whether the system can be extended and re-implemented in various ways. The openness of distributed systems is determined primarily by the degree to which new resource-sharing services can be added and be made available for use by a variety of client programs.
Security:
Many of the information resources that are made available and maintained in distributed systems have a high intrinsic value to their users. Their security is therefore of considerable importance. Security for information resources has three components: confidentiality (protection against disclosure to unauthorized individuals), integrity (protection against alteration or corruption), and availability (protection against interference with the means to access the resources).
Encryption can be used to provide adequate protection of shared resources and to keep sensitive information secret when it is transmitted in messages over a network.
Although the Internet allows a program in one computer to communicate with a program in another computer irrespective of its location, security risks are associated with allowing free access to all of the resources in an intranet. Although a firewall can be used to form a barrier around an intranet, restricting the traffic that can enter and leave, this does not deal with ensuring the appropriate use of resources by users within an intranet, or with the appropriate use of resources in the Internet, that are not protected by firewalls.
Scalability:
A distributed system is scalable if the cost of adding a user is a constant amount in terms of the resources that must be added. Distributed systems operate effectively and efficiently at many different scales, ranging from a small intranet to the Internet. A system is described as scalable if it will remain effective when there is a significant increase in the number of resources and the number of users. The number of computers and servers in the Internet has increased dramatically.
Failure Handling:
Computer systems sometimes fail. When faults occur in hardware or software, programs may produce incorrect results or may stop before they have completed the intended computation.
Failures in a distributed system are partial – that is, some components fail while others continue to function. Therefore the handling of failures is particularly difficult. The following techniques for dealing with failures are discussed here:
Detecting failures: Some failures can be detected. For example, checksums can be used to detect corrupted data in a message or a file. It is difficult or even impossible to detect some other failures, such as a remote crashed server in the Internet. The challenge is to manage in the presence of failures that cannot be detected but may be suspected.
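For instance, a checksum computed by the sender and re-computed by the receiver exposes corruption in transit. A minimal sketch using SHA-256 from Python's standard hashlib (the messages are illustrative):

    # Sketch: detecting a corrupted message with a checksum (SHA-256).
    import hashlib

    def with_checksum(payload: bytes) -> tuple[bytes, str]:
        return payload, hashlib.sha256(payload).hexdigest()

    def is_intact(payload: bytes, checksum: str) -> bool:
        return hashlib.sha256(payload).hexdigest() == checksum

    msg, chk = with_checksum(b"transfer 100 to account 42")
    corrupted = b"transfer 900 to account 42"   # bit flip in transit
    print(is_intact(msg, chk))        # True
    print(is_intact(corrupted, chk))  # False: corruption detected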
Masking failures: Some failures that have been detected can be hidden or made less severe. Two examples of hiding failures (a retransmission sketch follows the list):
1. Messages can be retransmitted when they fail to arrive.
2. File data can be written to a pair of disks so that if one is corrupted, the other may still be correct.
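A sketch of the first technique, retransmission, is given below; send_once() is a hypothetical unreliable send that raises TimeoutError when no acknowledgement arrives:

    # Sketch: masking lost messages by retransmission. send_once() is a
    # hypothetical unreliable send; the wrapper simply retransmits until
    # an acknowledgement arrives or the attempts run out.
    def send_with_retry(send_once, message, attempts=3):
        for _ in range(attempts):
            try:
                return send_once(message)   # ack received: failure masked
            except TimeoutError:
                continue                    # ack missing: retransmit
        raise ConnectionError(f"no acknowledgement after {attempts} attempts")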
Tolerating failures: Most of the services in the Internet do exhibit failures – it would not be practical for them to attempt to detect and hide all of the failures that might occur in such a large network with so many components. Their clients can be designed to tolerate failures, which generally involves the users tolerating them as well.
Recovery from failures: Recovery involves the design of software so that the state of permanent data can be recovered or ‘rolled back’ after a server has crashed. In general, the computations performed by some programs will be incomplete when a fault occurs, and the permanent data that they update (files and other material stored in permanent storage) may not be in a consistent state.
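One common way to keep permanent data consistent across crashes is to make each update atomic. The following Python sketch (the file name and contents are illustrative) writes the new state to a temporary file and renames it into place, so a crash leaves either the old state or the new state on disk, never a half-written file:

    # Sketch: keeping permanent data in a recoverable, consistent state.
    import json, os, tempfile

    def save_state(path: str, state: dict) -> None:
        directory = os.path.dirname(os.path.abspath(path))
        fd, tmp = tempfile.mkstemp(dir=directory)
        with os.fdopen(fd, "w") as f:
            json.dump(state, f)
            f.flush()
            os.fsync(f.fileno())   # force the data to permanent storage
        os.replace(tmp, path)      # atomic rename: old state or new, never half

    save_state("accounts.json", {"alice": 100, "bob": 50})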
Concurrency:
The presence of multiple users in a distributed system is a source of concurrent requests to its resources. Both services and applications provide resources that can be shared by clients in a distributed system. There is therefore a possibility that several clients will attempt to access a shared resource at the same time. For example, a data structure that records bids for an auction may be accessed very frequently when it gets close to the deadline time. The process that manages a shared resource could take one client request at a time. But that approach limits throughput. Therefore services and applications generally allow multiple client requests to be processed concurrently.
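The contrast between one-at-a-time and concurrent processing can be sketched as follows; the auction-style handle_bid() function and the pool size are illustrative, and a lock keeps the shared record consistent while many requests are served at once:

    # Sketch: serving client requests concurrently instead of one at a
    # time. A pool of worker threads raises throughput; the lock keeps
    # the shared bid record consistent.
    from concurrent.futures import ThreadPoolExecutor
    import threading

    highest_bid = 0
    lock = threading.Lock()

    def handle_bid(amount: int) -> int:
        global highest_bid
        with lock:                    # consistent concurrent updates
            if amount > highest_bid:
                highest_bid = amount
            return highest_bid

    with ThreadPoolExecutor(max_workers=8) as pool:  # many requests at once
        list(pool.map(handle_bid, [120, 90, 150, 110]))
    print(highest_bid)  # 150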
Transparency:
Transparency is defined as the concealment from the user and the application programmer of the separation of components in a distributed system, so that the system is perceived as a whole rather than as a collection of independent components. The implications of transparency are a major influence on the design of the system software.
The aim is to make certain aspects of distribution invisible to the application programmer so that they need only be concerned with the design of their particular application. For example, they need not be concerned with its location or the details of how its operations are accessed by other components, or whether it will be replicated or migrated. Even failures of networks and processes can be presented to application programmers in the form of exceptions – but they must be handled.
Access transparency enables local and remote resources to be accessed using identical operations.
Location transparency enables resources to be accessed without knowledge of their physical or network location (for example, which building or IP address).
Concurrency transparency enables several processes to operate concurrently using shared resources without interference between them.
Replication transparency enables multiple instances of resources to be used to increase reliability and performance without knowledge of the replicas by users or application programmers.
Failure transparency enables the concealment of faults, allowing users and application programs to complete their tasks despite the failure of hardware or software components.
Mobility transparency allows the movement of resources and clients within a system without affecting the operation of users or programs.
Performance transparency allows the system to be reconfigured to improve performance as loads vary.
Scaling transparency allows the system and applications to expand in scale without change to the system structure or the application algorithms.
Distributed System Models:
Distributed system models are also called distributed system designs. The common properties and design issues of distributed systems are explained by these models.
Systems that are intended for use in real-world environments should be designed to function correctly in the widest possible range of circumstances and in the face of many possible difficulties and threats. Distributed systems of different types share important underlying properties and give rise to common design problems.
Each type of model is intended to provide an abstract, simplified but consistent description of a relevant aspect of distributed system design.
Distributed system models are divided into two types:
1. Architectural model
2. Fundamental Model
Architectural Models:
The architectural model deals with the organization of components and their interrelationships. The architecture of a system is its structure in terms of separately specified components and their interrelationships.
An architectural model of a distributed system is concerned with the placement of its components and the relationships between them. The overall goal of an architectural model is to ensure that the structure will meet present and likely future demands on it.
Major concerns are to make the system reliable, manageable, adaptable and cost-effective. The architectural design of a building has similar aspects – it determines not only its appearance but also its general structure and architectural style and provides a consistent frame of reference for the design.
Architectural models describe a system in terms of the computational and communication tasks performed by its computational elements; the computational elements being individual computers or aggregates of them supported by appropriate network interconnections.
The two most commonly used forms of architectural model for distributed systems are:
- Client-Server Model
- Peer-to-Peer Model
Client-Server Model: It is the most important and the most widely used architectural model. In the client-server model, client processes interact with individual server processes in separate host computers in order to access the shared resources that they manage.
Servers may in turn be clients of other servers. For example, a web server is often a client of a local file server that manages the files in which the web pages are stored.
Web servers and most other Internet services are clients of the DNS service, which translates Internet domain names to network addresses. Another web-related example concerns search engines, which enable users to look up summaries of information available on web pages at sites throughout the Internet. These summaries are made by programs called web crawlers, which run in the background at a search engine site using HTTP requests to access web servers throughout the Internet. Thus a search engine is both a server and a client: it responds to queries from browser clients and it runs web crawlers that act as clients of other web servers.
Peer-to-Peer Model: In this architecture all of the processes involved in a task or activity play similar roles, interacting cooperatively as peers without any distinction between client and server processes or the computers on which they run. In practical terms, all participating processes run the same program and offer the same set of interfaces to each other.
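A minimal sketch of this symmetry: every peer runs the same Python program below, playing the server role on its own port and the client role towards other peers (the ports are illustrative):

    # Sketch: every peer runs this same program, acting as both client
    # and server with no distinguished roles (ports are illustrative).
    import socket, threading

    def serve(my_port: int) -> None:
        # Server role: answer pings from other peers.
        with socket.socket(socket.AF_INET, socket.SOCK_STREAM) as s:
            s.bind(("localhost", my_port))
            s.listen()
            while True:
                conn, _ = s.accept()
                with conn:
                    conn.sendall(b"pong from %d" % my_port)

    def ping(peer_port: int) -> str:
        # Client role: contact another peer running the same program.
        with socket.socket(socket.AF_INET, socket.SOCK_STREAM) as s:
            s.connect(("localhost", peer_port))
            return s.recv(1024).decode()

    # Each peer might start its server role in the background and then
    # ping its neighbours, e.g.:
    # threading.Thread(target=serve, args=(9001,), daemon=True).start()
    # print(ping(9002))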
Fundamental Models:
Fundamental models are based on some fundamental properties such as performance, reliability and security. All of the models share the design requirements of achieving the performance and reliability characteristics of processes and networks and ensuring the security of the resources in the system.
Fundamental models are based on fundamental properties that allow us to be more specific about the characteristics of distributed systems and about the failures and security risks they might exhibit.
Fundamental models are concerned with properties that are common to all of the architectural models. The fundamental model is addressed by three models:
Interaction Model:
The Interaction Model is concerned with the performance of processes and communication channels and the absence of a global clock. It identifies a synchronous system as one in which known bounds may be placed on process execution time, message delivery time and clock drift. It identifies an asynchronous system as one in which no bounds may be placed on process execution time, message delivery time and clock drift – which is a description of the behaviour of the Internet.
Failure Model:
The Failure model attempts to give a precise specification of the faults that can be exhibited by processes and communication channels. It defines reliable communication and correct processes. The failure model classifies the failures of processes and basic communication channels in a distributed system.
Security Model:
The Security Model identifies the possible threats to processes and communication channels. It introduces the concept of a secure channel, which is secure against those threats. Some of those threats relate to integrity: malicious users may tamper with messages or replay them. Others threaten their privacy. Another security issue is the authentication of the principal (user or server) on whose behalf a message was sent. Secure channels use cryptographic techniques to ensure the integrity and privacy of messages and to authenticate pairs of communicating principals.
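As a small illustration of integrity and authentication (only one ingredient of a real secure channel such as TLS), the following Python sketch uses an HMAC over a shared secret key; the key and the messages are illustrative assumptions:

    # Sketch: protecting message integrity and authenticating the sender
    # with an HMAC over a shared secret key.
    import hmac, hashlib

    KEY = b"shared-secret-key"   # assumed known only to the two principals

    def sign(message: bytes) -> str:
        return hmac.new(KEY, message, hashlib.sha256).hexdigest()

    def verify(message: bytes, tag: str) -> bool:
        expected = hmac.new(KEY, message, hashlib.sha256).hexdigest()
        return hmac.compare_digest(expected, tag)

    msg = b"pay bob 10"
    tag = sign(msg)
    print(verify(msg, tag))            # True: intact and authenticated
    print(verify(b"pay eve 10", tag))  # False: tampering detected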
Variations of Client Server Architecture:
· Proxy Server
· Thin Client
· Mobile Agent
Proxy Server:
Proxy Server refers to a server that acts as an intermediary between client and server. The proxy server is a computer on the Internet that accepts incoming requests from clients and forwards those requests to the destination server. It has its own IP address. It separates the client program and the web server from the global network.
The purpose of proxy servers is to increase the availability and performance of the service by reducing the load on the wide area network and web servers. Proxy servers can take on other roles; for example, they may be used to access remote web servers through a firewall.
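A drastically simplified forwarding proxy can be sketched with Python's standard http.server module; caching (which is what actually reduces load on servers and the wide area network), error handling, and other essentials are omitted, and the port is an illustrative assumption:

    # Sketch of a minimal forwarding proxy. A browser configured to use
    # it sends full URLs in the request line, which the handler fetches
    # from the origin server and relays back to the client.
    from http.server import BaseHTTPRequestHandler, HTTPServer
    import urllib.request

    class ProxyHandler(BaseHTTPRequestHandler):
        def do_GET(self):
            # self.path holds the full URL when we act as a proxy.
            with urllib.request.urlopen(self.path) as upstream:
                body = upstream.read()
            self.send_response(200)
            self.send_header("Content-Length", str(len(body)))
            self.end_headers()
            self.wfile.write(body)   # relay the origin server's reply

    HTTPServer(("localhost", 8888), ProxyHandler).serve_forever()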
Thin Client:
A thin client is a computer that runs from resources stored on a central server instead of a local hard drive. It is a virtual desktop computing model in which applications run on resources stored on a central server.
Thin clients are small, silent devices that communicate with a central server, giving a computing experience that is largely identical to that of a PC.
Mobile Agent:
Mobile Agents are transportable programs that can move over the network and act on behalf of the user. A mobile agent is a running program (including both code and data) that travels from one computer to another in a network carrying out a task on someone’s behalf, such as collecting information, and eventually returning with the results. A mobile agent may make many invocations to local resources at each site it visits – for example, accessing individual database entries.
Mobile agents might be used to install and maintain software on the computers within an organization or to compare the prices of products from a number of vendors by visiting each vendor’s site and performing a series of database operations.
Mobile agents (like mobile code) are a potential security threat to the resources in computers that they visit. The environment receiving a mobile agent should decide which of the local resources it should be allowed to use, based on the identity of the user on whose behalf the agent is acting – their identity must be included in a secure way with the code and data of the mobile agent. In addition, mobile agents can themselves be vulnerable – they may not be able to complete their task if they are refused access to the information they need. The tasks performed by mobile agents can be performed by other means. For example, web crawlers that need to access resources at web servers throughout the Internet work quite successfully by making remote invocations to server processes. For these reasons, the applicability of mobile agents may be limited.