DS 220 Midterm: ds 220 exam 1 review

16 Jan 2019

Document Summary

Acquire: identify data sets, then retrieve and query the data. Pre-process: clean, integrate and package the data. Analyze: select analytical techniques and build models.

Tables in SQL: a relation (table) is a multiset of tuples that share the attributes defined by the schema. Multiset: an unordered collection in which duplicate instances are allowed. Schema: the table name, its attributes and their types.

Typical NoSQL architecture: a hashing function maps each key to a server (node). NoSQL does not use SQL as its query language. NoSQL API: get(key): extract the value given a key. put(key, value): create or update the value given its key. delete(key): remove the key and its associated value. execute(key, operation, parameters): invoke an operation on the value (given its key), which is a special data structure (e.g. list, set, map). Distributed computing: vertical scalability (scale up), horizontal scaling (scale out), reliability, performance. Data model: key-value, document and column-based stores. Transactions: no full transaction guarantees, no ACID transactions.
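The NoSQL API and the hash-to-node idea above can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not how any particular NoSQL product works: the class name KeyValueStore, the use of SHA-256 with a modulo to pick a node, and the choice to treat "operation" as a method name on the stored value are all hypothetical and were chosen only for the example.

```python
import hashlib
from typing import Any, Dict, List


class KeyValueStore:
    """Toy in-memory key-value store (hypothetical, for review purposes only)."""

    def __init__(self, num_nodes: int = 3) -> None:
        # Each "node" is a plain dict here; a real system would use separate servers.
        self.nodes: List[Dict[str, Any]] = [{} for _ in range(num_nodes)]

    def _node_for(self, key: str) -> Dict[str, Any]:
        # Hashing function: hash the key, then take it modulo the node count.
        digest = hashlib.sha256(key.encode("utf-8")).hexdigest()
        return self.nodes[int(digest, 16) % len(self.nodes)]

    def get(self, key: str) -> Any:
        # Extract the value given a key (None if the key is absent).
        return self._node_for(key).get(key)

    def put(self, key: str, value: Any) -> None:
        # Create or update the value given its key.
        self._node_for(key)[key] = value

    def delete(self, key: str) -> None:
        # Remove the key and its associated value, if present.
        self._node_for(key).pop(key, None)

    def execute(self, key: str, operation: str, *parameters: Any) -> Any:
        # Assumption: "operation" names a method of the stored data structure
        # (list, set, dict, ...). Raises KeyError if the key does not exist.
        value = self._node_for(key)[key]
        return getattr(value, operation)(*parameters)


store = KeyValueStore()
store.put("cart", ["eggs"])               # create
store.execute("cart", "append", "milk")   # operate on the stored list in place
print(store.get("cart"))                  # ['eggs', 'milk']
store.delete("cart")
print(store.get("cart"))                  # None
```

Note that every call touches exactly one key on one node, which is part of why such stores scale out easily and also why they typically offer no multi-key ACID transactions.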