Data Flow Quality Monitoring in Data Infrastructures (ISTI Grants for Young Mobility seminar series)

Day - Time: 07 December 2016, h.10:30
Place: Area della Ricerca CNR di Pisa - Room: C-29

Andrea Esuli


From an abstract point of view, data infrastructures (DIs) can be understood as large (eco)systems made up of data storage and processing components. Such components can in turn be combined into so-called data flows, enabling arbitrarily complex data manipulation actions that serve the consumption needs of DI customers, be they humans or machines. The data resulting from the execution of data flows represent an important asset both for the DI users, who typically crave the information they need, and for the organization (or community) operating the DI, whose existence and cost sustainability depend on the adoption and usefulness of the DI. Hence, it is vital to provide guarantees on the "correctness" of the behaviour of DI data flows over time, to be quantified in terms of "data quality" and "processing quality". In this talk we describe what the state of the art can offer and introduce MoniQ, a novel approach to Data Flow Quality Monitoring, then delve deeper into its details and applications.