Master of Science in Engineering in Computer Science
Facoltà di Ingegneria dell'Informazione, Informatica e Statistica
Dipartimento di Ingegneria Informatica, Automatica e Gestionale A. Ruberti
Sapienza Università di Roma

Large Scale Data Management

2021/2022

Prof. Domenico Lembo and Prof. Marco Console


Who is the professor responsible for the course? Prof. Domenico Lembo.

Who is this course for? This 6-credit course is for students of the Master in Computer Engineering (School of Information Engineering) at Sapienza Università di Roma. It is also open to students of the PhD program in Ingegneria Informatica (Computer Engineering).

What is the structure of the course? The course is divided into 2 sections, described below. Each section is worth 3 credits, consists of approximately 30 lectures, and is taught by one professor in a given semester. Students take an exam for each of the two sections, following the instructions given on that section's web page. The final exam is registered by the course coordinator (Prof. Domenico Lembo), and the final grade depends on the grades obtained in the two sections. Once a student has passed the exams of both sections, they should send a message to Prof. Domenico Lembo (lembo [at] dis.uniroma1.it) asking to register the exam. Please check the news on this page for the dates on which registration is scheduled.

When are the lectures of the different sections scheduled? Both sections are scheduled in the second semester.


Structure of the course

Section 1: Information Integration

Teacher: Prof. Marco Console

Number of credits: 3

Lectures: Second semester (February-May 2022)

Programme: Information integration is the problem of combining data residing at different sources and providing the user with a unified view of these data. Designing information integration systems is important in current real-world applications, and raises a number of issues that are interesting from both a theoretical and a practical point of view. In recent years, there has been a large amount of research on data integration, and a precise, clear picture of a systematic approach to this problem is now available. This section presents an overview of the research carried out in the area of data integration, with emphasis on the theoretical results that are relevant for the development of information integration solutions. Special attention is devoted to the following aspects: architectures for information integration, modeling an information integration application, processing queries in information integration, data exchange, and reasoning on queries.

Additional information: see the web site of the Information Integration section.

Section 2: Big Data Management

Teacher: Prof. Domenico Lembo

Number of credits: 3

Lectures: Second semester (February-May 2022)

Programme: In one sentence, Big Data is data that exceeds the processing capacity of conventional database systems. In particular, Big Data applications deal with huge amounts of data, possibly collected from a huge number of data sources (volume), in highly heterogeneous formats (variety), and at a very high rate (velocity). This scenario calls for new technologies, ranging from new data storage mechanisms to new computing frameworks. In this course we will look at several key technologies used in manipulating, storing, and analyzing big data. In particular, we will study architectures for data-intensive distributed applications, Data Warehouse solutions, and NoSQL storage solutions, including RDF and graph databases.

Additional information: see the web site of the Big Data Management section.