Recent Internal Seminars

For more information on internal seminars, contact Claudia Raviolo.

Deep Character-Level Click-Through Rate Prediction for Sponsored Search

17 July 2017, 10:00 - Location: C-29

Amin Mantrach (Senior Research Scientist at Criteo)
Raffaele Perego

Predicting the click-through rate of an advertisement is a critical component of online advertising platforms. In sponsored search, the click-through rate estimates the probability that a displayed advertisement is clicked by a user after she submits a query to the search engine. Commercial search engines typically rely on machine learning models trained with a large number of features to make such predictions. This inevitably requires considerable engineering effort to define, compute, and select the appropriate features. In this paper, we propose two novel approaches (one working at character level and the other working at word level) that use deep convolutional neural networks to predict the click-through rate of a query-advertisement pair. Specifically, the proposed architectures only consider the textual content appearing in a query-advertisement pair as input, and produce as output a click-through rate prediction. By comparing the character-level model with the word-level model, we show that language representation can be learnt from scratch at character level when trained on enough data. Through extensive experiments using billions of query-advertisement pairs of a popular commercial search engine, we demonstrate that both approaches significantly outperform a baseline model built on well-selected text features and a state-of-the-art word2vec-based approach. Finally, by combining the predictions of the deep models introduced in this study with the prediction of the model in production of the same commercial search engine, we significantly improve the accuracy and the calibration of the click-through rate prediction of the production system.

Seminars Young Researcher Award 2016

29 June 2017, 10:00 - Location: A-28

Franco Nardini

Seminars by the recipients of the Young Researcher Award 2016:

Speaker: Davide Basile - Title: Specifying Variability in Service Contracts.
Abstract: In Service-Oriented Computing, contracts offer a way to characterise the behavioural conformance of a composition of services, and to guarantee that the results do not lead to spurious compositions. Through variability modelling, a product line of services is enabled to adapt to customer requirements and to changes in the context in which they operate. In this seminar a formal model for service contracts is presented. It allows specifying variability in service product lines, including (i) feature-based constraints and (ii) four classes of service requests to characterise different types of service agreement. Supervisory Control Theory is exploited to synthesise the orchestration of a composition of services that satisfies: (i) all feature constraints of the service product line, and (ii) the maximal number of service requests for which an agreement can be reached. Moreover, the orchestration of a service product line, whose number of products is potentially exponential in the number of features, can be synthesised from only a subset of its products. A prototypical tool has been implemented to support the developed theory.

Speaker: Alessio Ferrari - Title: Ambiguity and Tacit Knowledge in Requirements Elicitation Interviews.
Abstract: Interviews are the most common and effective means to perform requirements elicitation and support knowledge transfer between a customer and a requirements analyst. Ambiguity in communication is often perceived as a major obstacle for knowledge transfer, which could lead to unclear and incomplete requirements documents. In this paper, we analyse the role of ambiguity in requirements elicitation interviews, when requirements are still tacit ideas to be surfaced. To study the phenomenon, we performed a set of 34 customer-analyst interviews. This experience was used as a baseline to define a framework to categorise ambiguity. The framework presents the notion of ambiguity as a class of four main sub-phenomena, namely unclarity, multiple understanding, incorrect disambiguation and correct disambiguation. We present examples of ambiguities from our interviews to illustrate the different categories, and we highlight the pragmatic components that determine the occurrence of ambiguity. During the study, we discovered a peculiar relation between ambiguity and tacit knowledge in interviews. Tacit knowledge is the knowledge that a customer has but, for whatever reason, does not pass to the analyst. From our experience, we have discovered that, rather than an obstacle, the occurrence of an ambiguity is often a resource for discovering tacit knowledge. Again, examples are presented from our interviews to support this view.

Speaker: Riccardo Guidotti - Title: Unveiling Mobility Complexity and Nowcasting Well-Being through complex network analysis.
Abstract: The availability of massive digital traces and of large datasets of retail micro transactions is offering a series of novel insights into the patterns characterizing human behavior. Many studies of mobility or retail data simply focus on the places and products with the highest frequencies. We depart from the concept of frequency and we focus on a high-level representation using network analytics. We model the datasets analyzed as bipartite networks and we use a quantification of the average complexity of the nodes of the network to unveil hidden relationships and phenomena. In particular, we introduce the concept of mobility complexity of drivers and places as a ranking analysis over the nodes of these networks. In addition, by means of community discovery analysis, we differentiate subgroups of drivers and places according both to their homogeneity and to their mobility complexity. On the retail side, we use a quantification of the average complexity of satisfied needs of a population as an alternative to GDP. We show that this quantification can be calculated more easily than GDP and that it is a very promising predictor of the GDP value, anticipating its estimation by six months.

Speaker: Lucia Vadicamo - Title: Improving Metric Search through Finite Isometric Embedding.
Abstract: Metric search is concerned with the efficient evaluation of queries in metric spaces. In general, a large set of objects is arranged in such a way that, when a further object is presented as a query, those objects most similar to the query can be efficiently found. Most mechanisms rely upon the triangle inequality property of the metric governing the space. In this seminar we will examine a class of metric spaces with a stronger property, named the four-point property. This property gives stronger geometric guarantees, and one in particular, which we named the Hilbert Exclusion property, allows any indexing mechanism that uses hyperplane partitioning to perform better. One outcome of this observation is that a number of state-of-the-art indexing mechanisms over high-dimensional spaces can be easily refined to give a significant increase in performance.
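As a rough illustration of why a stronger geometric property helps hyperplane partitioning, the sketch below compares two exclusion tests for a range query. The abstract does not state either condition; the formulas are assumptions based on the commonly cited forms of the triangle-inequality test and of Hilbert Exclusion, and all function and variable names (`can_exclude_triangle`, `can_exclude_hilbert`, `p1`, `p2`, `q`, `t`) are hypothetical. Euclidean distance is used because Euclidean space has the four-point property.

```python
import math

def can_exclude_triangle(d_q_p1, d_q_p2, t):
    """Triangle-inequality test (assumed form): the partition of objects
    closer to pivot p1 cannot hold any result within radius t of the
    query q when q is more than 2t closer to p2 than to p1."""
    return (d_q_p1 - d_q_p2) / 2 > t

def can_exclude_hilbert(d_q_p1, d_q_p2, d_p1_p2, t):
    """Hilbert Exclusion test (assumed form), valid only when the space
    has the four-point property: the left-hand side is a lower bound on
    how far q lies from the boundary between the two partitions."""
    return (d_q_p1 ** 2 - d_q_p2 ** 2) / (2 * d_p1_p2) > t

# Illustrative data: two pivots, one query point, and a query radius.
p1, p2 = (0.0, 0.0), (4.0, 0.0)
q, t = (3.0, 3.0), 0.9

d1 = math.dist(q, p1)    # distance from the query to pivot p1
d2 = math.dist(q, p2)    # distance from the query to pivot p2
d12 = math.dist(p1, p2)  # distance between the two pivots

triangle_excludes = can_exclude_triangle(d1, d2, t)
hilbert_excludes = can_exclude_hilbert(d1, d2, d12, t)
print(triangle_excludes, hilbert_excludes)
```

In this example the query lies at distance 1 from the boundary between the two partitions, so a radius of 0.9 cannot reach the p1 side. The second test detects this, while the first is too conservative and would force a search of both partitions.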

Removing unwanted data from heritage range scans

23 June 2017, 14:30 - Location: C-29

Patrick Marais (University of Cape Town, South Africa)
Matteo Dellepiane

Range scans captured by laser scanners are an important source of data in attempts to capture and preserve heritage sites. Unfortunately, this data almost always contains unwanted sample points, arising from people, trees and cars as well as scanner artefacts. Removing this data - 'cleaning' the scans - is generally a time-consuming and labour-intensive process.
In this talk I will highlight recent work with grad students to address the issue of semi-automating the process of range scan cleaning. I will also talk briefly about the work that we have started with collaborators here at CNR, with the aim of learning some kind of description of the data that is typically discarded.
