Large Scale Data Management (academic year 2019/2020)

Who is the professor responsible for the course? Prof. Maurizio Lenzerini.
Who is this course for? This 6-credit course is for students of the Master in Computer Engineering (School of Information Engineering) at Sapienza Università di Roma. It is also open to students of the PhD program in Ingegneria Informatica (Computer Engineering).
What is the structure of the course? The course is divided into 2 sections, which are described below. Each section is worth 3 credits, consists of approximately 30 lectures, and is taught by one professor in a given semester. Students take an exam for each of the two sections, and for each section they must follow the instructions given on that section's web page. The final exam will be registered by the course coordinator (Prof. Maurizio Lenzerini), and the grade will depend on the grades obtained in the two sections. Once a student has passed the exams of both sections, they should send a message to Prof. Maurizio Lenzerini (lenzerini [at]) asking to have the exam registered. Please check the news on this page for the scheduled registration dates.
When are the lectures of the different sections scheduled? Both sections are scheduled in the second semester.


  • May 16, 2020: Students who want their exam registered in the 2020 summer session should book the exam in the Infostud system. There will be one session in June, one in July, and one in September.

Structure of the course

  • Section 1: Information Integration
    • Teacher: Prof. Maurizio Lenzerini
    • Number of credits: 3
    • Lectures: Second semester (March-May 2020)
    • Programme: Information integration is the problem of combining data residing at different sources and providing the user with a unified view of these data. Designing information integration systems is important in real-world applications, and raises a number of issues that are interesting from both a theoretical and a practical point of view. In recent years there has been a huge amount of research on data integration, and a precise, clear picture of a systematic approach to this problem is now available. This section presents an overview of the research carried out in the area of data integration, with emphasis on the theoretical results that are relevant to the development of information integration solutions. Special attention will be devoted to the following aspects: architectures for information integration, modeling an information integration application, processing queries in information integration, data exchange, and reasoning on queries.
    • Additional information: see the web site of the Information Integration section
  • Section 2: Big Data Management
    • Teacher: Prof. Domenico Lembo
    • Number of credits: 3
    • Lectures: Second semester (March-May 2020)
    • Programme: In one sentence, Big Data is data that exceeds the processing capacity of conventional database systems. In particular, Big Data applications deal with huge amounts of data, possibly collected from a huge number of data sources (volume), in highly heterogeneous formats (variety), and arriving at a very high rate (velocity). This scenario calls for new technologies, ranging from new data storage mechanisms to new computing frameworks. In this section we will look at several key technologies used in manipulating, storing, and analyzing big data. In particular, we will study architectures for data-intensive distributed applications, Data Warehouse solutions, and NoSQL storage solutions, including RDF and graph databases.
    • Additional information: see the web site of the Big Data Management section

Past editions