CS431 Chapter 7: Mutable Table

16 views24 pages
Chapter 7 – Mutable Table
The Fundamental Problem - We want to keep track of mutable state in scalable manner
Assumptions
State organized in terms of logical records
State unlikely to fit on single machine, must be distributed
Use RDBMS
Relational model with schemas
Powerful, flexible query language
Transactional semantics: ACID
Rich ecosystem, lots of tool support
NoSQL for scalability of flexibility
Horizontally scale “simple operations”
1.
Replicate/distribute data over many servers
2.
Simple call interface
3.
Weaker concurrency model than ACID
4.
Efficient use of distributed indexes and RAM
5.
Flexible schemas
6.
NoSQL - core ideas
Partitioning (sharding) - increase scalability and to decrease latency
Do this on distributive search engine
a.
1.
Replication - to increase robustness (availability) and to increase throughput
2.
Caching - to reduce latency
3.
Key-Value Stores: Data Model
Stores associations between keys and values
-
Keys are usually primitives
Int, strings, raw bytes
-
Values can be primitive or complex
-
Chapter 7 -Mutable Table
Friday, April 5, 2019
21:51
Unlock document

This preview shows pages 1-3 of the document.
Unlock all 24 pages and 3 million more documents.

Already have an account? Log in
Unlock document

This preview shows pages 1-3 of the document.
Unlock all 24 pages and 3 million more documents.

Already have an account? Log in
Values can be primitive or complex
Primitives: ints, strings, etc.
Complex: JSON, HTML fragments, etc
-
Key-Value Stores: Operations
Get and put
-
Multi-get, multi-put
-
Range queries, secondary index lookups
-
Key-value Stores: Implementation
Non-persistent: In-memory hash table <- may lose all your data in memory,
-
Persistent: Wrapper around a traditional RDBMS
What if data doesn’t fit on a single machine - partition the data, hash the key
-
Simple Solution: Partition:
Partition the key space across multiple machines
Hash partitioning, for n machines, store key k at machine h(k) mod n
-
Q: how do we know which physical machine to contact?
Quasi-idea - hash the machine
-
Each machine holds pointers to predecessor and successor
Gets routed to correct one in o(n) hops
a.
1.
Each machine holds pointers to predecessor and successor + "finger table" (+2, +4, +8)
2.
Service Entry - one node holds
The key-service entry
a.
Query the service registry and return the node
b.
Latency not so long
c.
Common model
d.
3.
If there's a new node, the entry of the new node will be passed on to each node by gossip
4.
Replicate: plus 1 and minus 1 case
Circles don’t fail, adjust things appropriately
a.
Replication in predecessor and successor node
b.
5.
Virtual Nodes
Don’t directly hash servers
1.
Create a large number of virtual nodes, map to physical servers
Better load redistribution in event of machine failure
a.
When new server joins, evenly shed load from other servers
b.
2.
Unlock document

This preview shows pages 1-3 of the document.
Unlock all 24 pages and 3 million more documents.

Already have an account? Log in

Get access

Grade+20% off
$8 USD/m$10 USD/m
Billed $96 USD annually
Grade+
Homework Help
Study Guides
Textbook Solutions
Class Notes
Textbook Notes
Booster Class
40 Verified Answers
Class+
$8 USD/m
Billed $96 USD annually
Class+
Homework Help
Study Guides
Textbook Solutions
Class Notes
Textbook Notes
Booster Class
30 Verified Answers