Database systems manage data, which is at the heart of modern computing applications. We are in the era of big data, in which data is generated from many sources, at high velocity, and in great variety. This poses numerous challenges for using and improving database technologies. Big data systems designed to support analytics are maturing and becoming increasingly important to many applications.
This course covers the fundamentals of traditional databases, such as Oracle and MySQL, and the core ideas of recent big data systems. Students will learn the important data management problems that these systems are designed to solve. They will gain experience building applications on top of a traditional database, namely SQLite, and state‐of‐the‐art big data platforms, namely MongoDB and Apache Spark. These systems will run both locally and on the Amazon cloud (Amazon Web Services). Students will be able to determine for themselves the advantages and disadvantages of different systems.
- 10.009 The Digital World (For Intake AY2019)
- [NEW] Data Driven World (For Intake AY2020 and subsequent batches)
- 50.004 Introduction to Algorithms
- 50.005 Computer System Engineering
- Design and implement a database application on top of a relational database management system (RDBMS).
- Identify major components of database and big data systems.
- Estimate the costs of different database operations.
- Explain how state‐of‐the‐art big data systems differ from one another.
- Implement a cloud‐based big data application.
- Explain how database and big data systems fit together in real‐world applications.
- Use cloud‐based systems.
- Develop a database design for an application.
- List and explain major components of database and big data systems.
- Write complex SQL queries.
- Estimate the cost of different database operations.
- Compare different classes of big data systems.
- Write MapReduce and Spark jobs.
- Explain how a database differs from a big data system.
- Design, implement, and deploy database and big data systems on AWS.
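As a rough illustration of the kind of SQL query the outcomes above refer to, the following minimal sketch uses Python's built-in `sqlite3` module with an in-memory SQLite database. The schema (`students`, `enrollments`), the sample rows, and the function names are hypothetical and are not taken from the course materials.

```python
import sqlite3


def build_demo_db():
    """Create an in-memory SQLite database with a small, made-up schema."""
    conn = sqlite3.connect(":memory:")
    conn.executescript(
        """
        CREATE TABLE students (
            id   INTEGER PRIMARY KEY,
            name TEXT NOT NULL
        );
        CREATE TABLE enrollments (
            student_id INTEGER NOT NULL REFERENCES students(id),
            course     TEXT NOT NULL,
            grade      REAL NOT NULL
        );
        """
    )
    conn.executemany(
        "INSERT INTO students VALUES (?, ?)",
        [(1, "Ada"), (2, "Bo"), (3, "Cy")],
    )
    conn.executemany(
        "INSERT INTO enrollments VALUES (?, ?, ?)",
        [
            (1, "databases", 4.0),
            (1, "algorithms", 3.0),
            (2, "databases", 5.0),
            (2, "systems", 4.0),
            (3, "databases", 2.0),
        ],
    )
    conn.commit()
    return conn


def average_grade_per_student(conn):
    """Join, group, and filter: students with at least two courses, best average first."""
    query = """
        SELECT s.name, COUNT(*) AS course_count, AVG(e.grade) AS avg_grade
        FROM students AS s
        JOIN enrollments AS e ON e.student_id = s.id
        GROUP BY s.id, s.name
        HAVING COUNT(*) >= 2
        ORDER BY avg_grade DESC
    """
    return conn.execute(query).fetchall()


if __name__ == "__main__":
    connection = build_demo_db()
    for row in average_grade_per_student(connection):
        print(row)
```

The query combines a join, an aggregate, a `HAVING` filter, and an ordering. The student with a single enrollment is excluded by the `HAVING` clause.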
Required Texts and Readings
- Abraham Silberschatz, Henry Korth, S. Sudarshan. Database System Concepts, 6th edition.
- Raghu Ramakrishnan, Johannes Gehrke. Database Management Systems, 3rd edition.
- Thomas Erl. Big Data Fundamentals: Concepts, Drivers & Techniques. 2016.
- Armbrust et al. “Above the Clouds: A Berkeley View of Cloud Computing”. EECS Technical Report. 2009.