SICS Distributed Systems Laboratory
Swedish Institute of Computer Science

 
 
  CONTENTS
  Overview
Members
Projects
Software
  RESEARCH
  Peer-to-Peer Research
Fully Decentralized Distributed Systems
Distribution Middleware
Simulation of Large Asynchronous Systems
  Internal Pages
  Group Calendar

    SICS
    Box 1263
    SE-16429 Kista
    Sweden

    +46 8 633 1500
    +46 8 751 7230 (fax)
space

Simulation of Large Asynchronous Systems

Background

Examples of large asynchronous systems are the World Wide Web, peer-to-peer systems (like Gnutella), and biological systems. When we look at such systems, we may be able to measure and study the system as a whole but they cannot be modeled on that level. Instead such systems are seen as being constructed of many interacting autonomous agents. Modeling is thus done on the micro level: the behavior of individual agents, their internal logic, and their interaction with other agents. On the macro scale the behavior of the system as a whole is obtained by simulation.

There are two cases for simulation. In the first we study some existing system and have some knowledge of the macro behavior of the system, but do not really understand this behavior. Hypotheses are formulated in the form of the defined behavior of the individual agents. For instance, if we are modeling the web, we model end-user agents and site agents. If the behavior of a studied system as a whole is faithfully reflected in a simulation, this indicates, but does not prove, that the agent modeling in some fashion captures some truth about the system. The case may be further strengthened if simulations with alternative agent models fail to match.

In the second case we are designing a large asynchronous system. Say we are designing a peer-to-peer system with some particular decentralized algorithm. There are three kinds of agents, end users, application agents, and infrastructure agents. The end user agent is modeled to reflect the expected service use patterns. The application agent behaves as we have programmed them to subject to certain outside events. An application agent might be running on a user machine and then the user turns off his machine. Infrastructure agents represent routers and other elements that are outside the control of our application, but they are well understood and easily modeled. The infrastructure agents will be affected by random factors that simulate routers being down for some limited period of time. The simulation should, of course, support the correctness of the algorithm but this should, of course, be proved by other means. However, expected case macro behavior, in the presence of router failures, machine crashes, user mischief, etc., is generally not possible to obtain by analytical means.

Modeling the World Wide Web

Here with our partners we attempt to model individual users and sites of the WWW. We attempt to answer the question of why and how certain sites become popular. At first sight this may seem a trivial question. Some sites are considered better by the majority of users and therefore become popular. However this is not the case (as simulation shows) as all users do not visit all sites. Users tip their friends about good sites, so the popularity of sites depends very much on the social interaction between end users. Research Strand 3b: The Simulation Environment The simulations that we are interested in involve an enormous number of agents. Scalability is a definite issue. Our simulation environment has been parallalised so as to be able to run on a network of machines. Care was taken to minimize the communication overhead.

The simulation environment is not only our tool to study large systems, but to some extent is also the subject of ongoing research. We strive to improve the simulation platform. Examples are 1) agent behavior language to simplify the construction of new simulations 2) performance tuning and 3) network models for peer-to-peer simulations.

Peer-to-peer simulations

Here simulation is used to guide design. The current focus is on algorithms for distributed naming services. How do algorithms and variations behave under different failure assumptions and frequencies, and different degrees of individual machine stability? Note that in peer-to-peer systems one must usually assume considerabl? higher rates of failure than in traditional distributed systems. Peers are or may be PCs of ordinary end-users whose system (e.g. operating system) may be poorly maintained and whose machine may be subject to sources of outside interference that do not occur in clusters (e.g. a 2-year-old child pulls out the network connection plug).