Over the last ten years, developments in computer applications have placed increasing emphasis on the management of massive data sets, such as very long strings, huge collections of texts and documents, and very large graphs and matrices. The need to handle such data sets arises, for example, in bioinformatics, in meteorology, in the solution of challenging computational problems, and in the exploration and visualization of very large computer networks (such as the Internet) or social networks.

For various reasons, designing efficient algorithms for huge information structures is a major challenge. In most cases the data are stored in external memory, and new algorithmic techniques must be devised to handle them while minimizing disk accesses. Such settings also call for new techniques for compression and visualization. In some cases the data are so large that they cannot even fit in external memory and have to be analyzed 'on the fly'. Finally, when massive data are considered, new algorithms for dealing with faulty data are needed because the probability of error is no longer negligible.


This research project brings together five leading groups in algorithmic research in Italy in a joint effort that aims at:

(1) Discovering new algorithmic techniques and methodologies for processing very large information structures;

(2) Identifying and solving key algorithmic problems in important applications that deal with massive data sets; and

(3) Contributing to the accelerated transfer of advanced algorithmic technologies through experiments and the engineering of efficient algorithmic code.

The main emphasis of the project will be on a novel combination of application-oriented research in important domains, such as networking, large document management, and the simulation of large-scale physical phenomena, with innovative methodological work on algorithm engineering, information visualization, and general algorithmic techniques.


The project starts on February 9, 2007 and ends on February 9, 2009. The scientific coordinator is Giorgio Ausiello. The research work is partially funded by MIUR, the Italian Ministry of University and Scientific Research.