LIFT

Using Local Inference in Massively Distributed Systems

Abstract
As the scale of today's networked techno-social systems continues to increase, the analysis of their global
phenomena becomes increasingly difficult, due to the continuous production of streams of data scattered among
distributed, possibly resource-constrained nodes, and requiring reliable resolution in (near) real-time.
We will explore a novel approach for realising sophisticated, large-scale distributed data-stream analysis
systems, relying on processing local data in situ. Our key insight is that, for a wide range of distributed data
analysis tasks, we can employ novel geometric techniques for intelligently decomposing the monitoring of
complex holistic conditions and functions into safe, local constraints that can be tracked independently at
each node (without communication), while guaranteeing correctness for the global-monitoring operation.
While some solutions exist for the limited case of linear functions of the data, it is hard to deal with general,
non-linear functions: in this case, a node's local function value tells us essentially nothing about the
global function value. Our fundamental idea is to design novel algorithmic tools that monitor the input domain
of the global function rather than its range. Each node can then be assigned a safe zone (SZ) for its local
values that can offer guarantees for the value of the global function over the entire collection of nodes. This
represents a marked departure from conventional thinking and the state of the art. We aim to reduce the amount of
communication and data collection across nodes to a minimum, requiring nodes to communicate only when their
local constraints are violated. Our approach also changes how privacy can be protected when transmitted data
contain sensitive information. We investigate real-life scenarios from network health monitoring, large-scale
analysis of human mobility and traffic phenomena, internet-scale distributed querying, and monitoring sensor networks.
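
The safe-zone idea described above can be illustrated with a minimal sketch. It is not the project's algorithm: the global function (the product of the two coordinates of the average vector), the threshold, the box-shaped safe zone, and all names (global_f, BoxSafeZone, violating_nodes) are assumptions chosen for illustration. The sketch relies only on a standard convexity argument: if every node's local vector lies in the same convex set, so does their average, and if that set lies inside the region where the global condition holds, no communication is needed.

```python
from dataclasses import dataclass
from typing import List, Sequence, Tuple

Vector = Tuple[float, float]

# Hypothetical global condition: global_f(average of local vectors) <= THRESHOLD.
THRESHOLD = 3.0


def global_f(v: Vector) -> float:
    """Illustrative non-linear global function: product of the coordinates."""
    return v[0] * v[1]


def average(vectors: Sequence[Vector]) -> Vector:
    """Coordinate-wise mean of the nodes' local vectors."""
    n = len(vectors)
    return (sum(v[0] for v in vectors) / n, sum(v[1] for v in vectors) / n)


@dataclass(frozen=True)
class BoxSafeZone:
    """Convex safe zone [0, max_x] x [0, max_y] in the input domain.

    Inside the box, x * y <= max_x * max_y, so choosing
    max_x * max_y <= THRESHOLD keeps the whole box in the admissible region.
    """
    max_x: float
    max_y: float

    def contains(self, v: Vector) -> bool:
        return 0.0 <= v[0] <= self.max_x and 0.0 <= v[1] <= self.max_y


def violating_nodes(zone: BoxSafeZone, local_vectors: Sequence[Vector]) -> List[int]:
    """Indices of nodes whose local check fails; only these need to communicate."""
    return [i for i, v in enumerate(local_vectors) if not zone.contains(v)]


# Assumed safe zone: 1.5 * 2.0 = 3.0 <= THRESHOLD.
zone = BoxSafeZone(max_x=1.5, max_y=2.0)

# All nodes inside the zone: no messages, and the global condition is guaranteed.
quiet = [(1.0, 0.5), (0.2, 1.9), (1.5, 2.0)]

# Each local function value is 0, yet the global value exceeds the threshold.
# Monitoring the input domain catches this; monitoring local values would not.
misleading = [(4.0, 0.0), (0.0, 4.0)]

if __name__ == "__main__":
    for name, vectors in (("quiet", quiet), ("misleading", misleading)):
        print(name, violating_nodes(zone, vectors), global_f(average(vectors)))
```

In the second data set both nodes leave the safe zone and would report to a coordinator, which could then recompute the global value and assign new zones. A box is used here only because its containment in the admissible region is easy to verify; other convex shapes would serve the same purpose.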

Duration

36 Months

Funding Body

European Union