Network and Information Technologies

Data Science
Thesis proposal Researchers Research group

Medical image processing

Medical image processing is a key step in the diagnosis of a large number of diseases. Nowadays, we can acquire images of the inside and outside of our bodies using a wide variety of devices (ultrasound, magnetic resonance imaging, optical tomography, computed tomography, etc.). The acquired images then usually need to be denoised, corrected for inhomogeneities, segmented, registered, etc. so that image-based biomarkers can be extracted to support clinical decisions.
 
In this research line, we would like to explore the latest image processing challenges and develop new image-based biomarkers that aid clinicians in their daily work. This work will be done in collaboration with internationally recognised clinical institutions in Barcelona.
 

Dr Ferran Prados

Mail: fpradosc@uoc.edu

 

 

NeuroADaS Lab

Supply Chain Management (SCM) Optimization and Resilience to Disasters and Disruptions

Disasters have a significant and increasing impact all over the world. Concern about them is growing, so Disaster Risk Reduction (DRR) is increasingly on the international agenda. This thesis proposal sets out the scientific and technical basis for significantly improved resilience to hazards (such as climate-related hazards, earthquakes, pandemics, etc.) and to their human and socioeconomic impacts related to Supply Chain Management (SCM).

The proposal is based on three principles, inspired by the UN Sendai Framework and related to the UN 2030 Agenda for Sustainable Development:
1) A focus on prevention and resilience building.
2) An inclusive “whole-of-society” approach that involves non-traditional stakeholders not usually included in DRR planning and decision-making (such as households, SMEs, NGOs, etc.).
3) A data-driven approach that integrates diverse types of data (including small data, thick data, and big data) from a wide range of sources, including reused data, into DRR planning and decision-making.
 
This thesis proposal will conduct research on data-based instruments for the optimization and resilience of SCM in the face of disasters and disruptions.
 
 

Dr Josep Cobarsí

Mail: jcobarsi@uoc.edu

ICSO
Misinformation and disinformation through the lens of data analytics
 
This thesis proposal focuses on case studies of misinformation or disinformation, applying quantitative data analytics methods to large amounts of digital content such as social media, mainstream media news and reports, Wikipedia entries, literature about historical events, open data and/or other open or public domain sources. This digital content may be created, updated, influenced and/or used by a wide range of actors: citizens, anonymous agents or activists, governments and public agencies, companies, international organizations, political parties, social organizations, etc.
 
Research methodologies for these case studies will usually include an advanced conceptualization of misinformation and disinformation events, so as to support the intensive application of quantitative methods to trace and analyse these events across large amounts of digital content and logs. These quantitative methods may be combined with qualitative methods where suitable.
 
Cardoso, G.; Sepúlveda, R.; Narciso, I. (2022). Whatsapp and audio misinformation during Covid-19 pandemic. El Profesional de la Información https://doi.org/10.3145/epi.2022.may.21  
Cobarsí-Morales, J. (2022). Controversial ‘Black Legend’ concept as misinformation or disinformation related to history: where do we go from here in the 21st century information field?. In: Smits, M. Proceedings iConference 2022 – Information for a Better World: Shaping the Global Future.
Salaverría, R.; León, B. (2022). Misinformation beyond the media: ‘fake news’ in the big data ecosystem. In: Vázquez-Herrero, J.; Silva-Rodríguez, A.; Negreira-Rey, M.C.; Toural-Bran, C.; López-García, X. (eds.). Total Journalism. Models, Techniques and Challenges (pp. 109-121). Studies in Big Data, 97. Cham: Springer. https://doi.org/10.1007/978-3-030-88028-6_9
Meel, P.; Vishwakarma, D.K. (2019). Fake news, rumor, information pollution in social media and web: A contemporary survey of state of the arts, challenges and opportunities. Expert Systems With Applications https://doi.org/10.1016/j.eswa.2019.112986
 

Dr Josep Cobarsí

Mail: jcobarsi@uoc.edu

ICSO
Multilayer networks to better understand Multiple Sclerosis
 
Multiple sclerosis (MS) affects neuroaxonal anatomy and function, which in turn impacts brain structure, organization, and function. In general, networks representing particular aspects of the brain (morphology, structure, or dynamics) are studied independently to understand and predict the effects of individual brain damage [1]. A single unified model that jointly captures these aspects is needed to better understand neurological diseases.
 
In this PhD we would like to introduce an interconnected multilayer framework for the joint analysis of morphological, structural, and functional networks. We aim to define a multilayer scheme that combines the information from these three networks into a single representation in order to better assess the effects and evolution of brain damage in MS patients. It will then be necessary to define or adapt graph-mining metrics that evaluate and quantify the deterioration of brain connectivity under the new multilayer scheme.
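As a rough illustration of what combining several networks into a single multilayer scheme can look like, the sketch below stacks three toy adjacency matrices into one supra-adjacency matrix and computes a simple cross-layer node metric. It is only a sketch under stated assumptions: the function names (supra_adjacency, overlapping_strength), the three-region toy networks and the uniform inter-layer coupling are all hypothetical and are not taken from the proposal or from the cited paper.

```python
import numpy as np

def supra_adjacency(layers, coupling=1.0):
    """Build a supra-adjacency matrix: each layer sits on the block
    diagonal, and every node is linked to its own replica in every
    other layer with weight `coupling` (an assumed, uniform value)."""
    n = layers[0].shape[0]
    m = len(layers)
    supra = np.zeros((n * m, n * m))
    for a, layer in enumerate(layers):
        rows = slice(a * n, (a + 1) * n)
        supra[rows, rows] = layer
        for b in range(m):
            if b != a:
                cols = slice(b * n, (b + 1) * n)
                supra[rows, cols] = coupling * np.eye(n)
    return supra

def overlapping_strength(layers):
    """Sum each node's weighted degree over all layers."""
    return np.sum([layer.sum(axis=1) for layer in layers], axis=0)

# Toy undirected networks over the same 3 brain regions (made-up values).
morphological = np.array([[0, 1, 0], [1, 0, 1], [0, 1, 0]], dtype=float)
structural = np.array([[0, 1, 1], [1, 0, 0], [1, 0, 0]], dtype=float)
functional = np.array([[0, 0, 0], [0, 0, 1], [0, 1, 0]], dtype=float)

layers = [morphological, structural, functional]
supra = supra_adjacency(layers)          # 9 x 9 matrix (3 regions x 3 layers)
strength = overlapping_strength(layers)  # one value per brain region
```

Here a drop in a region's overlapping strength between two time points would be one very simple proxy for loss of connectivity; the metrics actually used in the project may differ.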
 
This work will be done in close collaboration with the Multiple Sclerosis group led by Dr. Sara Llufriu at the IDIBAPS-Hospital Clinic, an internationally recognized clinical institution.
 
[1] Jordi Casas-Roma, Eloy Martinez-Heras, Albert Solé-Ribalta, Elisabeth Solana, Elisabet Lopez-Soley, Francesc Vivó, Marcos Diaz-Hurtado, Salut Alba-Arbalat, Maria Sepulveda, Yolanda Blanco, Albert Saiz, Javier Borge-Holthoefer, Sara Llufriu, Ferran Prados; Applying multilayer analysis to morphological, structural, and functional brain networks to identify relevant dysfunction patterns. Network Neuroscience 2022; 6 (3): 916–933. doi: https://doi.org/10.1162/netn_a_00258
 
 
Mail: fpradosc@uoc.edu
 
NeuroADaS Lab

Predict disability in multiple sclerosis using synthetic data and federated learning

This project takes an integrative approach to predicting disability in multiple sclerosis through image analysis, synthetic data generation and the provision of a federated learning platform. The PhD will be carried out in collaboration with the ImaginEM research group at Hospital Clinic BCN.
 
The primary clinical objective of this PhD project is to develop robust predictive models of disability progression in individuals diagnosed with multiple sclerosis [1]. This objective involves leveraging advanced imaging techniques, clinical data, and machine learning algorithms to identify early indicators of disability. The successful candidate will collect and analyse multimodal data, including but not limited to MRI, CT scans, and patient clinical records. The aim is to enhance our understanding of disease progression patterns, ultimately contributing to the development of personalised treatment plans.
 
In parallel with the clinical objectives, the PhD candidate will be responsible for developing an open federated learning platform tailored to medical image analysis [2,3]. This platform will facilitate collaboration across healthcare institutions while ensuring data privacy and security. The candidate will design and implement federated learning algorithms that aggregate insights from diverse datasets without centralised data storage. Developing a synthetic data generation module [4] will also be crucial, as it will provide a larger and more diverse dataset for training machine learning models.
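To make the idea of aggregating insights without centralised data storage more concrete, the sketch below shows federated averaging on a toy linear model: each site trains on its own data and only the model weights are sent to the server, which averages them weighted by sample count. This is a minimal sketch under stated assumptions, not the platform described above: the function names (local_update, federated_average, run_rounds), the linear model, the learning rate and the randomly generated toy data are all hypothetical.

```python
import numpy as np

def local_update(weights, X, y, lr=0.1, epochs=5):
    """A few gradient-descent steps on one site's private data
    (mean squared error of a linear model)."""
    w = weights.copy()
    for _ in range(epochs):
        grad = 2.0 * X.T @ (X @ w - y) / len(y)
        w -= lr * grad
    return w

def federated_average(local_weights, sample_counts):
    """Server step: average the site models, weighted by sample count."""
    counts = np.asarray(sample_counts, dtype=float)
    stacked = np.stack(local_weights)
    return (stacked * counts[:, None]).sum(axis=0) / counts.sum()

def run_rounds(sites, n_features, rounds=50):
    """Alternate local training and server aggregation; raw data
    (X, y) never leaves the site, only weight vectors are exchanged."""
    global_w = np.zeros(n_features)
    for _ in range(rounds):
        updates = [local_update(global_w, X, y) for X, y in sites]
        global_w = federated_average(updates, [len(y) for _, y in sites])
    return global_w

# Toy data for three hypothetical hospitals (random numbers, not real data).
rng = np.random.default_rng(0)
true_w = np.array([1.5, -2.0])
sites = []
for n_samples in (30, 50, 80):
    X = rng.normal(size=(n_samples, 2))
    sites.append((X, X @ true_w))

global_w = run_rounds(sites, n_features=2)
```

In this noiseless toy setting the global model approaches true_w. A real deployment would add secure communication and privacy safeguards, and would typically train image models instead of a two-parameter regression.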
 
[1] https://jnnp.bmj.com/content/early/2023/07/19/jnnp-2022-330203.abstract
[2] https://www.worldscientific.com/doi/abs/10.1142/S0129065722500496
[3] https://link.springer.com/chapter/10.1007/978-3-031-22356-3_12
[4] https://www.nature.com/articles/s41598-023-40364-6
 
 
Mail: fpradosc@uoc.edu
 
 
Mail: lsubirats@uoc.edu
 
NeuroADaS Lab

Are texts written by LLMs better than those of an average human?

Recent studies have found that readers prefer texts generated by LLMs over similar texts extracted from Wikipedia articles [1], and that the results of crowdsourced text summarization tasks are of lower quality than the results obtained with LLMs [2]. To explore and contrast these findings in a broader context, we propose to design tasks and experiments that analyze the quality of texts written by LLMs such as ChatGPT in relation to those written by humans. Specifically, we want to explore research questions like the following: What percentage of humans are still able to write better texts than LLMs? How do these percentages differ depending on the task and type of text? Can we use LLMs to rate the quality of those texts and obtain judgements similar to those of human evaluators?

 
 
[1] Huschens, M., Briesch, M., Sobania, D., & Rothlauf, F. (2023). Do You Trust ChatGPT?--Perceived Credibility of Human and AI-Generated Content. arXiv preprint arXiv:2309.02524.
[2] Veselovsky, V., Ribeiro, M. H., & West, R. (2023). Artificial Artificial Artificial Intelligence: Crowd Workers Widely Use Large Language Models for Text Production Tasks. arXiv preprint arXiv:2306.07899.
 

Dr Andreas Kaltenbrunner

Mail: akaltenbrunner@uoc.edu

Dr Jacopo Amidei

Mail: jamidei@uoc.edu

AID4So 

Let's be Ethical: new ethical-oriented evaluation methodologies for LLMs

In recent years, we have witnessed the impressive development of Artificial Intelligence (AI) in applications such as Natural Language Processing (more specifically, with the introduction of Large Language Models), image recognition, and self-driving cars, to name a few of the many that will have a massive impact on the world economy and society. These applications are undoubtedly an extraordinary opportunity, but they also pose urgent questions. How can we guarantee that AI respects people's privacy, does not spread fake news, and does not become a tool for propaganda, brainwashing or manipulation?

These and other questions call for the definition and development of strong evaluation methodologies for AI systems. Unfortunately, current evaluation methodologies fall short of identifying and tackling these problems [1, 2]. This project aims to define new evaluation methodologies for AI systems. More precisely, given their ever-increasing use and application, the focus will be on Large Language Models (LLMs).
The new evaluation methodologies we will develop, whose definition will be driven by an ethics-oriented view, have a twofold purpose. On the one hand, they will give developers useful insight for improving their models in the direction of ethical AI. On the other hand, they will inform end users about the potential harm of the model they are using.
For this project, we will make use of machine learning techniques, textual network analysis [4] and machine psychology techniques [3]. That said, the project is open to any other innovative methodologies from the AI field.
 
 
 
 
[1] Chang, Y., Wang, X., Wang, J., Wu, Y., Zhu, K., Chen, H., Yang, L., Yi, X., Wang, C., Wang, Y. and Ye, W., 2023. A survey on evaluation of large language models. arXiv preprint arXiv:2307.03109.
[2] Gehrmann, Sebastian, Elizabeth Clark, and Thibault Sellam. "Repairing the cracked foundation: A survey of obstacles in evaluation practices for generated text." Journal of Artificial Intelligence Research 77 (2023): 103-166.
[3] Hagendorff, Thilo. "Machine psychology: Investigating emergent capabilities and behavior in large language models using psychological methods." arXiv preprint arXiv:2303.13988 (2023).
[4] Hevey, David. "Network analysis: a brief overview and tutorial." Health Psychology and Behavioral Medicine 6.1 (2018): 301-328.
 

Dr Andreas Kaltenbrunner

Mail: akaltenbrunner@uoc.edu

Dr Jacopo Amidei

Mail: jamidei@uoc.edu

AID4So