Research Pillars
Trustworthy AI
The Trustworthy AI pillar addresses security, privacy, compliance and trust issues in AI systems. Its goal is to ensure the resilience, robustness and safety of these systems against both inherent limitations and malicious actions that could compromise their security and lead to undesirable behaviour.
Project: Detecting fake voices from real recordings and synthetic generations
Third parties involved: Mario Fernando Acosta – Universidad del Norte
This project addresses vulnerabilities in voice-based authentication systems by detecting synthetic (fake) voices using complex-valued deep learning (CVDL). We hypothesize that physical speech characteristics provide a unique fingerprint that can distinguish real from fake voices. The work includes data collection (real and fake voices), complex-valued feature extraction, hyperparameter adjustment, classifier design, and performance validation. We expect a high-performing CVDL model (F1-score ≥ 0.95) for fake-voice detection in the Spanish language.
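As an illustration of the validation target above, the following is a minimal sketch of how an F1-score could be computed for the "fake" class and checked against the 0.95 threshold. The labels and predictions are invented for the example and are not project data; the function name `f1_score` is our own.

```python
def f1_score(y_true, y_pred, positive=1):
    """F1-score for the positive class (assumed here: 1 = fake voice)."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == positive and p == positive)
    fp = sum(1 for t, p in pairs if t != positive and p == positive)
    fn = sum(1 for t, p in pairs if t == positive and p != positive)
    if tp == 0:
        return 0.0  # no true positives: precision and recall are both zero
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    return 2 * precision * recall / (precision + recall)


# Hypothetical labels: 1 = fake, 0 = real
y_true = [1, 1, 1, 0, 0, 1, 0, 1]
y_pred = [1, 1, 0, 0, 0, 1, 1, 1]

score = f1_score(y_true, y_pred)
meets_target = score >= 0.95  # target stated in the project description
```

F1 is the harmonic mean of precision and recall, so it penalises a detector that misses fake voices as well as one that flags real speakers.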
Project: A novel approach for AI detection content based on fine-tuned LLMs
Third parties involved: Marco Murgia – University of Cagliari
The objective of this project is to study and evaluate state-of-the-art techniques for detecting AI-generated content and to propose a black-box approach that fine-tunes large language models via reinforcement learning and Direct Preference Optimization (DPO). To verify the effectiveness of this approach, its performance is evaluated against a human baseline. The European added value lies in offering an effective tool to counter the spread of fake news, a sensitive topic in recent years.
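For readers unfamiliar with DPO, the sketch below shows the standard DPO loss for a single preference pair. How the project builds its preference pairs is not stated in the description; the function name, argument names and the default `beta` are assumptions made for illustration only.

```python
import math


def dpo_loss(logp_chosen, logp_rejected,
             ref_logp_chosen, ref_logp_rejected, beta=0.1):
    """Standard DPO loss for one (chosen, rejected) pair.

    Inputs are sequence log-probabilities under the policy being
    fine-tuned (logp_*) and under a frozen reference model (ref_logp_*).
    beta is an assumed default, not a value from the project.
    """
    margin = beta * ((logp_chosen - ref_logp_chosen)
                     - (logp_rejected - ref_logp_rejected))
    # loss = -log(sigmoid(margin)), written in a numerically stable form
    if margin >= 0:
        return math.log1p(math.exp(-margin))
    return -margin + math.log1p(math.exp(margin))


# When the policy equals the reference, the margin is 0 and the loss is log(2).
baseline = dpo_loss(-2.0, -2.0, -2.0, -2.0)
# When the policy favours the chosen response more than the reference does,
# the loss drops below that baseline.
improved = dpo_loss(-1.0, -3.0, -2.0, -2.0)
```

The loss shrinks as the fine-tuned model assigns relatively more probability to the preferred response than the reference model does, which is what makes DPO usable without a separately trained reward model.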
Project: Detecting and Explaining AI Using Language-Image Contrastive Insights
Third parties involved: Klemen Grm – University of Ljubljana
Given the recent advances in generative AI, distinguishing between human-generated and AI-generated content has become paramount. The seamless integration of AI in content creation poses risks of misinformation, undermining authenticity and accountability. On the other hand, the black-box nature of AI tools also presents a unique problem for their trustworthiness. We plan to address these issues using the tools of language-image contrastive learning. By exploiting the contrastive learning mechanism, we aim to investigate the properties of both image- and language-based generative AI tools. This project focuses on constructing a framework that will not only enhance the explainability of AI operations but also detect AI-generated content.
Project: A Multi-Layer Defense System against LLMs Adversarial Prompt Attacks
Third parties involved: Leonardo Piano – University of Cagliari
This project aims to develop a robust defense system for Large Language Models (LLMs) against adversarial prompt attacks by orchestrating three layers: (i) prompt analysis using Named Entity Recognition and Relation Extraction to identify malicious entities and intents, (ii) prompt classification via fine-tuned LLMs to detect subtle malicious patterns, and (iii) factual verification of LLM answers with Knowledge Graphs. The outcome is an advanced defense system aligned with the priorities of the European AI Act, including respect for human rights, harm prevention, safety, and transparency through symbolic reasoning.
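The orchestration of the three layers could be sketched roughly as follows. Every function here is a hypothetical stand-in: simple string checks replace the NER/relation-extraction step and the fine-tuned classifier, and a set of strings replaces the Knowledge Graph. Only the control flow (two checks before generation, one after) reflects the description above.

```python
from dataclasses import dataclass
from typing import Callable, Optional, Set, Tuple


@dataclass
class Verdict:
    allowed: bool
    layer: Optional[str] = None  # name of the layer that blocked, if any
    reason: str = ""


def analyze_prompt(prompt: str) -> Optional[str]:
    """Layer (i) placeholder; a real system would run NER and relation extraction."""
    for phrase in ("ignore previous instructions", "reveal the system prompt"):
        if phrase in prompt.lower():
            return f"suspicious phrase: {phrase!r}"
    return None


def classify_prompt(prompt: str) -> Optional[str]:
    """Layer (ii) placeholder; a real system would call a fine-tuned LLM classifier."""
    if "pretend you have no rules" in prompt.lower():
        return "resembles a jailbreak role-play"
    return None


def verify_answer(answer: str, known_facts: Set[str]) -> Optional[str]:
    """Layer (iii) placeholder; a real system would query a knowledge graph."""
    return None if answer in known_facts else "answer not supported by known facts"


def guarded_generate(prompt: str, llm: Callable[[str], str],
                     known_facts: Set[str]) -> Tuple[Verdict, Optional[str]]:
    # Layers (i) and (ii) inspect the prompt before the model is called.
    for name, check in (("prompt_analysis", analyze_prompt),
                        ("prompt_classification", classify_prompt)):
        reason = check(prompt)
        if reason:
            return Verdict(False, name, reason), None
    # Layer (iii) inspects the model's answer after generation.
    answer = llm(prompt)
    reason = verify_answer(answer, known_facts)
    if reason:
        return Verdict(False, "fact_verification", reason), None
    return Verdict(True), answer
```

Returning the blocking layer and a reason alongside the decision is one way to support the transparency goal mentioned above, since each refusal can be traced to a specific check.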
Project: Democracy in the Age of Algorithm: Enhancing Transparency and Trust in AI-Generated Content through Innovative Detection Techniques
Third parties involved: Prof. Igor Calzada – University of the Basque Country
The project aims to fortify democratic integrity by enhancing the trustworthiness of AI-generated content through cutting-edge detection techniques. It will evaluate current LLM outputs for authenticity, develop novel defenses against AI-driven spoofing and paraphrasing attacks, and devise metrics for assessing LLM randomness. Expected outcomes include a detailed trust framework for AI transparency and a peer-reviewed publication. Leveraging expertise from the AI4Gov and KT4D projects, the project will bolster Europe’s leadership in safeguarding democratic processes against AI-generated misinformation, strengthening both societal trust and policy effectiveness.
Project: Towards encryption-free verifiable federated learning for energy use-cases
Third parties involved: Lukas Stippel – Persee Mines Paris PSL
This project advances energy forecasting via privacy-preserving collaborative data sharing using federated frameworks that meet EU AI Act interpretability demands. It will develop trustworthy methods for KAN and fusion models, leveraging encrypted aggregation to maintain efficiency and accuracy, and will validate the approach on real-world datasets to support the energy transition.
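As background on the aggregation step in a federated framework, here is a minimal sketch of plain federated averaging, in which a server combines client model parameters weighted by local dataset size. It is a generic illustration only: it includes neither the encryption nor the verification mechanisms the project concerns, and the function name and example values are assumptions.

```python
def federated_average(client_weights, client_sizes):
    """Weighted average of client parameter vectors (plain FedAvg step).

    client_weights: list of equal-length lists of floats, one per client.
    client_sizes:   number of local training samples per client.
    """
    total = sum(client_sizes)
    n_params = len(client_weights[0])
    return [
        sum(w[i] * n for w, n in zip(client_weights, client_sizes)) / total
        for i in range(n_params)
    ]


# Two hypothetical clients; the second holds three times as much data.
global_weights = federated_average([[1.0, 2.0], [3.0, 4.0]], [1, 3])
```

Only model parameters leave each client in this scheme, never the raw energy data, which is the basic privacy argument for federated learning.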
Project: Implementing and Evaluating Trustworthy Distributed AI Systems and Models
Third parties involved: Amanda Ericson – Mid Sweden University
This research initiative proposes a framework to improve the trustworthiness of next-generation AI systems, first by contributing to a comprehensive taxonomy and then by building a supporting infrastructure. Operationalizing trust in AI systems is threefold: trust in the model, trust in the distribution, and compliance with EU AI regulations. The framework emphasizes that trust considerations should be built in from the very beginning of system design, supporting “trustworthy by design” principles in AI development. Through evaluation via case studies and a testbed implementation, the project aims to demonstrate the framework’s efficiency and versatility across diverse domains. Integrating trust into the core of distributed AI systems will enable innovation, security, and ethical responsibility in this new digital age.
