System Model. A distributed system is composed of a set of agents with well-defined roles that cooperate to achieve a common goal. In practice, an agent can be implemented by a process or collection of them, by a processor, or by any computation-enabled entity. Moreover, any single entity that implements one agent could also implement multiple of them. Reasoning in terms of agents allows us to specify problems and algorithms more concisely and in terms of heterogeneous agents. Distributed systems can be classified along different axes according to the way agents exchange information, the way they fail and recover, and the relative speeds at which they perform computation. In this work we address asynchronous distributed systems in which agents can crash and recover, and use unreliable communication channels to exchange messages. In asynchronous distributed systems there are no bounds on the time it takes an agent to execute any action or for a message to be transmitted. We show that if such bounds exist, then the protocols we present in this thesis ensure some liveness properties, provided the number of failures can be limited in time. Our liveness proofs require the bounds to exist but do not require them to be known by any agent. Even though we assume that agents may recover, they are not obliged to do so once they have failed. For simplicity, an agent is considered to be nonfaulty iff it never fails. Agents are assumed to have access to local stable storage which they can use to keep their state between failures. State not kept in stable storage is reset after a crash. Lastly, we assume that agents do not execute any arbitrary step, i.e., we do not consider Byzantine failures. Although channels are unreliable, we assume that if agents keep retransmitting their messages, then they eventually succeed in communicating with each other. We also assume that messages are not duplicated and cannot be undetectably corrupted.
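The crash-recovery assumption can be illustrated with a minimal sketch. It is not taken from the source: the `Agent` class, its `persist` and `crash_and_recover` methods, and the JSON file standing in for stable storage are all hypothetical names chosen for illustration. The sketch only shows the distinction the model draws between state kept in stable storage, which survives a crash, and all other state, which is reset.

```python
import json
import os
import tempfile


class Agent:
    """Hypothetical crash-recovery agent; every name here is an assumption."""

    def __init__(self, storage_path):
        self.storage_path = storage_path
        self.volatile = {}          # state that is reset by a crash
        self.stable = self._load()  # state that survives a crash

    def _load(self):
        # Read the last persisted state, or start empty on first boot.
        if os.path.exists(self.storage_path):
            with open(self.storage_path) as f:
                return json.load(f)
        return {}

    def persist(self, key, value):
        # Write to a temporary file and rename it over the old one, so a
        # crash in the middle of a write leaves the previous state intact.
        self.stable[key] = value
        tmp_path = self.storage_path + ".tmp"
        with open(tmp_path, "w") as f:
            json.dump(self.stable, f)
            f.flush()
            os.fsync(f.fileno())
        os.replace(tmp_path, self.storage_path)

    def crash_and_recover(self):
        # Simulate a crash followed by a recovery: the recovered agent is
        # rebuilt from stable storage only, with empty volatile state.
        return Agent(self.storage_path)


def demo():
    with tempfile.TemporaryDirectory() as directory:
        agent = Agent(os.path.join(directory, "state.json"))
        agent.persist("round", 3)           # kept in stable storage
        agent.volatile["pending"] = ["m1"]  # kept in memory only
        recovered = agent.crash_and_recover()
        return recovered.stable, recovered.volatile
```

In this sketch, `demo()` returns the persisted `round` value together with an empty volatile dictionary, mirroring the statement that state not kept in stable storage is reset after a crash.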
System Model. A distributed system is composed of a set of agents with well-defined roles that cooperate to achieve a common goal. In practice, an agent can be implemented by a process or collection of them, by a processor, or by any computation-enabled entity. Moreover, any single entity that implements one agent could also implement multiple of them. Reasoning in terms of agents allows us to specify problems and algorithms more concisely and in terms of heterogeneous agents. For example, a client/server application may be described in terms of two kinds of agents, client and server agents, while an e-mailing system may be described in terms of senders and receivers. Distributed systems can be classified along different axes according to the way agents exchange information, the way they fail and recover, and the relative speeds at which they perform computation. In this paper we address asynchronous distributed systems in which agents can crash and recover, and use unreliable communication channels to exchange messages. In asynchronous distributed systems there are no bounds on the time it takes an agent to execute any action or for a message to be transmitted. If such bounds exist and the number of failures can be limited in time, however, then the protocols that we present in this paper ensure some liveness properties. Even though we assume that agents may recover, they are not obliged to do so once they have failed. For simplicity, an agent is considered to be nonfaulty iff it never fails. Agents are assumed to have access to local stable storage which they can use to keep their state between failures. State not kept in stable storage is reset after a crash.
Lastly, we assume that agents do not execute any arbitrary step, that is, we do not consider Byzantine failures. Although channels are unreliable, we assume that if agents keep retransmitting their messages, then they eventually succeed in communicating with each other. We also assume that messages are not duplicated and cannot be undetectably corrupted.
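The channel assumption, that agents which keep retransmitting eventually succeed in communicating, can be sketched as a simple retransmission loop. This is an illustration under stated assumptions, not a protocol from the source: `LossyChannel`, `send_until_acked`, the loss probability, and the acknowledgement format are all hypothetical. The `max_attempts` cap exists only so the demonstration terminates; the model itself places no bound on the number of retransmissions.

```python
import random


class LossyChannel:
    """Hypothetical channel that drops each message independently."""

    def __init__(self, loss_probability, seed=0):
        self.loss_probability = loss_probability
        self.rng = random.Random(seed)  # seeded so runs are repeatable

    def send(self, message):
        # Return the message if it gets through, or None if it is dropped.
        # Messages are never corrupted or duplicated by the channel itself.
        if self.rng.random() < self.loss_probability:
            return None
        return message


def send_until_acked(channel, message, max_attempts=10_000):
    """Retransmit `message` until an acknowledgement comes back.

    Returns the number of attempts used and the receiver's accepted
    messages. Both the message and its acknowledgement may be lost.
    """
    accepted = []  # what the receiver has accepted so far
    for attempt in range(1, max_attempts + 1):
        received = channel.send(message)
        if received is None:
            continue  # message lost, retransmit
        if received not in accepted:
            accepted.append(received)  # receiver ignores repeated copies
        ack = channel.send(("ack", received))
        if ack is not None:
            return attempt, accepted
    raise RuntimeError("no acknowledgement within max_attempts")
```

Because the sender may retransmit a message whose acknowledgement was lost, the receiver in this sketch discards copies it has already accepted; that choice is an assumption of the example rather than something the source prescribes.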