CSE 120 Lecture Notes - Lecture 17: Process Migration, Distributed Computing, Scalability
Distributed System - set of cooperating processes over a network
Degree of “integration”
- Loose: Internet applications, email, web browsing - possible to have a process run
- Medium: remote execution, remote file systems
- Tight: process migration, distributed file systems
- Process “migrates” on its own to the “lightly-loaded” machine
- Machines have to be more tightly coupled
Speed: parallelism, less contention
Reliability: redundancy, fault tolerance, “NSPF” - NO SINGLE POINT OF FAILURE
- If it fails on my computer, other machines can still run it!
Scalability: incremental growth, economy of scale
- Need more computing power? BUY MORE MACHINES
Geographic Distribution: low latency, reliability
- Put some processing power at different areas
- In contrast to a singular high-powered system in one area
Fundamental problems of decentralized control (INHERITED! Can’t get rid of them!)
- State uncertainty: NO node knows exactly what’s going on EVERYWHERE else (i.e. no
shared memory or clock!)
- In order to know WHERE to run, it needs to know the state of all the other machines!
- How does it know? Machines have to continuously send messages to each other!
- The more frequent these messages are, the more costly it is!
- Communication takes time (time is delayed!)
- No notion of a single time - everyone has a slightly different notion of time
- Action uncertainty: mutually conflicting decisions
- Assume that every system knew the load of every other system
- Then each system offloads its processes to the system with the lowest load
- However, every other machine may make the SAME decision!
- They will send to the SAME machine which will overload it!
- There is a learning curve, but this can be resolved over time
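The action-uncertainty herd can be shown in a tiny sketch (the load numbers, node names, and greedy rule here are my own toy assumptions): every node sees the same stale snapshot and picks the least-loaded machine, so they all pick the SAME machine.

```python
# Hypothetical stale load snapshot that EVERY node sees at decision time.
loads = {"A": 10, "B": 7, "C": 2}

def pick_target(snapshot):
    # Greedy rule: offload to the machine with the lowest reported load.
    return min(snapshot, key=snapshot.get)

senders = ["A", "B"]                            # two busy nodes deciding independently
choices = [pick_target(loads) for _ in senders]
print(choices)                                  # ['C', 'C'] -- both dump work on C
```

Both senders choose "C" from the same snapshot, so C's real load spikes far above the reported 2, which is exactly the mutually conflicting decision problem.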
Is Distribution Better?
- Lambda (λ): arrival rate (jobs arriving per unit time)
- Mu (μ): service rate (jobs served per unit time)
Single fast server w/ single queue vs Multiple slower servers w/ separate queues
Ex: choosing line to wait in at the supermarket!
Solution: Single queue w/ multiple servers!
Ex: Airport lines
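The queue layouts above can be compared with a minimal sketch (the service times, round-robin assignment, and all-jobs-arrive-at-once setup are my own simplifying assumptions): one long job stuck in a private queue makes everyone behind it wait, while a shared queue lets the other server absorb the short jobs.

```python
def separate_queues(jobs, k):
    """Round-robin jobs to k private queues; return each job's wait time."""
    queues = [[] for _ in range(k)]
    for i, service_time in enumerate(jobs):
        queues[i % k].append(service_time)
    waits = []
    for q in queues:
        t = 0                       # this queue's server frees up at time t
        for service_time in q:
            waits.append(t)         # job waits behind everything ahead of it
            t += service_time
    return waits

def shared_queue(jobs, k):
    """One FIFO queue feeding k servers; return each job's wait time."""
    free = [0] * k                  # time at which each server next frees up
    waits = []
    for service_time in jobs:
        i = free.index(min(free))   # take the earliest-available server
        waits.append(free[i])
        free[i] += service_time
    return waits

jobs = [5, 1, 1, 1]                 # one long job, three short ones, arriving together
sep = separate_queues(jobs, 2)      # waits [0, 5, 0, 1] -> avg 1.5
sha = shared_queue(jobs, 2)         # waits [0, 0, 1, 2] -> avg 0.75
```

With separate queues, the short job routed behind the 5-unit job waits 5 units; with the shared queue, no job waits more than 2 - which is why the single-queue/multiple-server layout wins.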
Results: Little’s Law: (total # of things in the system) N = λW (arrival rate * waiting time)
Ex: airport lines (N = # of people in line, λ = # of people arriving per unit time, W = avg time each person waits)
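Plugging numbers into Little's Law (the rates here are made-up example values):

```python
# Little's Law: N = lambda * W
arrival_rate = 2.0                       # λ: people arriving per minute (assumed)
avg_wait = 5.0                           # W: minutes each person spends in line (assumed)
n_in_line = arrival_rate * avg_wait      # N: average # of people in the line
print(n_in_line)                         # 10.0
```

It also works in reverse: measure N and λ, and W = N / λ gives the average wait without timing anyone individually.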
The Client/Server Model - asymmetric
Clients are small, lightweight processes that are short-lived
- “User-side” of application
- Belong to people and can make requests
Servers are giant, long-lived processes
- Up ALL the time
- Always waiting for requests
- Know information
Peer-to-Peer - symmetric (everyone is equal)
- A peer talks directly w/ another peer
- we may become sources of information!
- NO “intermediary” involved
- a dynamic “client/server” momentary relationship
- A requests from B; A acts as client, B as server
- very rarely are systems purely P-P
Distributed Algorithms - building blocks for building bigger distributed applications
- Remember, NO shared memory or shared clock!
- If we had a clock, we could just impose order using timestamps!
Event Ordering - certain things happen in a certain order
“Happened-before” relation: →
A, B events in same process and A before B; then A → B
If A is a send event, B is a receive event: A → B
If A → B and B → C, then A → C
In reality: very subtle in implementation!
- “Timestamp” ALL events based on a local clock (on the machine)
If the clocks are such that the receive timestamp (RT) is BEFORE the send timestamp (ST), the receiver advances THEIR local clock
- Artificially advance it s.t. the receive happened after the send
What if events happened at exactly the same time: CANNOT ALLOW THAT
- Every machine will have an ID
- If two events have the same time, use the machine IDs to resolve the tie!
- Machine w/ BIGGER ID: happened after/before (depends on your convention!)
Machine X sends @ time 1:01
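The timestamping rules above - tick on every local event, and on receive jump the clock past the sender's timestamp so the receive is ordered after the send - can be sketched as a Lamport-style logical clock (class and method names are my own):

```python
class LamportClock:
    """Logical clock: timestamps every event; receives advance past the send."""

    def __init__(self, machine_id):
        self.machine_id = machine_id  # used to break ties between equal timestamps
        self.time = 0

    def local_event(self):
        self.time += 1
        return self.time

    def send(self):
        self.time += 1
        return self.time              # this timestamp travels with the message

    def receive(self, sent_ts):
        # If our clock is behind the send timestamp, artificially advance it
        # so that the receive event happens strictly after the send.
        self.time = max(self.time, sent_ts) + 1
        return self.time

x = LamportClock(machine_id=1)
y = LamportClock(machine_id=2)
ts = x.send()          # x's clock becomes 1; message carries timestamp 1
r = y.receive(ts)      # y's clock was 0 (behind), so it jumps to max(0, 1) + 1 = 2
```

After the receive, r > ts, so the send → receive ordering holds even though y's clock started behind x's; if two events still carry equal timestamps, the machine IDs break the tie.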