Giovani in un'ora - Seminar Series - Part One

Date and time: 21 September 2023, 11:00
Place: Area della Ricerca CNR di Pisa - Room: C-29

Fabio Carrara


Gianluca Carloni - "Observational Causal Discovery in Images: Leveraging Weak Causal Signals for Classification in Medical Imaging Applications"

Abstract: We present a new method for automatically classifying medical images that exploits weak causal signals in the scene to model how the presence of a feature in one part of the image affects the appearance of another feature in a different part. Our method consists of two components: a Convolutional Neural Network backbone and a Causality Factors Extractor module. The latter computes a weight for each feature map, enhancing it according to its causal influence in the image's scene. We evaluate our method on a public dataset of prostate MRI images for prostate cancer diagnosis, using quantitative experiments, qualitative assessment, and ablation studies. We study the behaviour of our models in both Fully-Supervised and One-Shot learning schemes. Our results show that our method improves classification performance and produces more robust predictions that focus on relevant parts of the image. This is especially important in medical imaging, where accurate and reliable classifications are essential for effective diagnosis and treatment planning. The seminar ends with a discussion of possible extensions of the proposed methodology to other areas, such as deep image generation and Cine MRI reconstruction.
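The weighting mechanism described above can be pictured with a minimal numpy sketch. This is an illustrative assumption, not the authors' exact formulation: here the "causal influence" of a feature map is approximated from pairwise co-activation of peak responses, and all function names are hypothetical.

```python
import numpy as np

def causality_factors(feature_maps: np.ndarray) -> np.ndarray:
    """Illustrative sketch: weight each feature map by how strongly its
    activation co-occurs with the others', a rough proxy for its causal
    influence in the scene.

    feature_maps: non-negative activations of shape (k, h, w).
    Returns: weights of shape (k,), summing to 1.
    """
    k = feature_maps.shape[0]
    # Peak activation of each map, a proxy for feature presence.
    peaks = feature_maps.reshape(k, -1).max(axis=1) + 1e-8
    # Pairwise co-activation, column-normalised: roughly approximates
    # the conditional presence of feature i given feature j.
    cooc = np.outer(peaks, peaks)
    cond = cooc / cooc.sum(axis=0, keepdims=True)
    # A map that raises the conditional presence of many others gets more weight.
    influence = cond.sum(axis=1)
    return influence / influence.sum()

def enhance(feature_maps: np.ndarray) -> np.ndarray:
    """Rescale each map by its causality factor before the classifier head."""
    w = causality_factors(feature_maps)
    return feature_maps * w[:, None, None]
```

In this sketch the weighted maps would replace the raw backbone output fed to the classification head, so maps with little co-activation contribute less to the final prediction.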

Eva Pachetti - "Leveraging Disentangled Features for Automated Classification of Medical Images via Few-Shot and Self-Supervised Learning"

Abstract: The demand for robust deep-learning models capable of delivering competitive and generalizable performance from limited datasets continues to rise. Few-shot learning (FSL) is the key to learning from only a few examples, a particularly critical capability in domains such as medical imaging, where data availability is typically scarce. In this context, we propose a novel approach that combines self-supervised learning (SSL), feature disentanglement, and meta-learning to achieve FSL. We employ SSL for pre-training, endowing the model with essential prior knowledge about the specific anatomical structures under investigation. However, since SSL alone often struggles to extract meaningful features, we enhance feature disentanglement during the SSL phase, leveraging the Iterative Partition-based Invariant Risk Minimization (IP-IRM) method. This helps the model focus on the features most relevant to the anatomical structures, improving its representations. Subsequently, we train the model for the downstream task following a meta-learning framework. In this context, we propose a new episodic training approach that exploits different class granularity levels between meta-training and meta-testing tasks. We evaluate our framework's efficacy through a 4-class classification task on MRI images of prostate cancer, which involves categorizing tumours based on their severity. We conduct this evaluation under both 1-shot and 5-shot scenarios, measuring performance with the multi-class AUROC metric. To assess the strength of our approach, we compare its results with a fully-supervised and an SSL-only pre-trained few-shot model, as well as with a fully-supervised trained model. Results show that our method outperforms a fully-supervised trained model and that the SSL+IP-IRM pre-training is crucial to obtaining good classification results.
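The episodic evaluation at the heart of the abstract can be illustrated with a prototypical-network-style episode, the standard building block of few-shot classification. This is a generic sketch, not the authors' method: their scheme additionally varies class granularity between meta-training and meta-testing, which is not modelled here, and the embeddings are assumed to be precomputed.

```python
import numpy as np

def episode_accuracy(support_x, support_y, query_x, query_y):
    """One few-shot episode: build a prototype per class as the mean of its
    support embeddings, then label each query by its nearest prototype.

    support_x, query_x: embeddings of shape (n, d).
    support_y, query_y: integer class labels of shape (n,).
    Returns the fraction of correctly classified queries.
    """
    classes = np.unique(support_y)
    # Class prototypes: mean support embedding per class.
    protos = np.stack([support_x[support_y == c].mean(axis=0) for c in classes])
    # Euclidean distance from each query to each prototype.
    d = np.linalg.norm(query_x[:, None, :] - protos[None, :, :], axis=2)
    pred = classes[d.argmin(axis=1)]
    return float((pred == query_y).mean())
```

In a 1-shot episode each prototype is a single embedding; in a 5-shot episode it is the mean of five, which is why the 5-shot setting is usually easier.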

Luca Ciampi - "Unsupervised Domain Adaptation for Video Violence Detection in the Wild"

Abstract: Video violence detection is a specialized area within human action recognition dedicated to identifying violent behaviors in video clips. Current computer vision techniques, relying on deep learning methodologies, have achieved remarkable outcomes in this domain. However, their efficacy hinges on the availability of extensive labeled datasets for supervised learning, ensuring their ability to perform well across various testing scenarios. While abundant annotated data may exist for certain predefined domains, manual annotation becomes infeasible when dealing with custom or ad-hoc target domains or tasks. Consequently, in numerous real-world applications, a shift occurs in the data distribution between the training (source) and testing (target) domains, resulting in a notable decline in performance during inference. To tackle this challenge, we propose an unsupervised domain adaptation scheme for video violence detection, based on single-image classification, that mitigates the gap between the two domains. Our experiments employ labeled datasets containing violent and non-violent clips in general contexts as the source domain, while the target domain comprises videos tailored explicitly for detecting violent actions in specific scenarios such as public transport or sports games. We show that our proposed solution can significantly enhance the performance of the considered models.
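One common recipe for unsupervised adaptation in a frame-classification setting is confidence-thresholded pseudo-labelling of target-domain frames. The sketch below is an assumption for illustration, not necessarily the authors' scheme: a source-trained classifier labels target frames, and only high-confidence frames are kept as surrogate supervision for fine-tuning.

```python
import numpy as np

def pseudo_label(frame_probs: np.ndarray, threshold: float = 0.9):
    """Select confidently classified target-domain frames (illustrative sketch).

    frame_probs: softmax outputs of shape (n, 2) over
                 [non-violent, violent] for n unlabeled target frames.
    Returns (indices of confident frames, their hard pseudo-labels).
    """
    conf = frame_probs.max(axis=1)
    # Keep only frames the source-trained model is confident about.
    keep = np.where(conf >= threshold)[0]
    return keep, frame_probs[keep].argmax(axis=1)
```

The confident subset would then be mixed with the labeled source data to re-train the classifier, gradually pulling it toward the target distribution without any manual target annotation.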