Big Data Processing Technologies (Graduate), Spring 2020

Overview

This course mainly introduces big data storage systems. It covers both structured and unstructured data, organized into storage and database tracks. In particular, traditional big data storage systems are introduced, such as Hadoop (HDFS, HBase), Google GFS/Chubby/BigTable, Amazon S3, Microsoft Azure Storage, and Ali Cloud. Critical issues such as data reliability, data consistency, and metadata management are covered in detail. The course aims to help students understand how file systems and databases actually work for big data storage.

The prerequisites for this course are programming languages, data structures, operating systems, computer organization, computer architecture, and parallel and distributed systems. Students are expected to have a strong programming background (e.g., C/C++/Java/Python) before taking this course.

Collaborators


Instructor

Name | Email | Office | Tel | Office Hours
Chentao Wu | | Room 513, SEIEE Building #3 | (021) 3420 8230 | By appointment

Teaching Assistants

Name | Email | Office | Tel | Office Hours
Huayi Jin | | 3rd Floor, Software School Building | | Just drop by
Han Qiu | | 3rd Floor, Software School Building | | Just drop by

Course Policies

Teamwork

Students are encouraged to talk to each other, to the course staff, or to anyone else about any of the assignments. Assistance must be limited to discussing the problem and sketching general approaches to a solution. Each student (or team) must write up their own solutions to the homework. Any form of copying code is strictly prohibited.

Late Policy

Cheating

Cheating is NOT tolerated! Please read SJTU's Academic Code of Conduct if you are not familiar with the definition of cheating. If you are caught cheating on an assignment, you will receive a zero for that assignment. Other repercussions are also possible.