
Public defence: Carla Schenker

Carla Schenker will defend her thesis “A Flexible Framework for Data Fusion Based on Coupled Matrix and Tensor Factorizations for Interpretable Pattern Discovery” for the PhD program in Engineering Science.

Trial Lecture

Title: “Uniqueness of coupled matrix and tensor models with (partially) shared factors and constraints: an overview of known results”

Public defence

Title: “A Flexible Framework for Data Fusion Based on Coupled Matrix and Tensor Factorizations for Interpretable Pattern Discovery”

Ordinary opponents

Leader of the evaluation committee

Hugo Lewi Hammer, Professor, Ph.D., OsloMet, Norway

Leader of the public defence

Anis Yazidi, Professor, OsloMet, Norway

Supervisors

Abstract

Data fusion is the joint analysis of multiple interrelated data sets, so that the data sets can interact and inform each other. This approach is crucial in fields like medicine, chemometrics, and remote sensing, where information about the same phenomenon is gathered from various modalities, such as different sensing technologies. Each modality alone may not provide a complete picture, but together they offer complementary insights. For example, EEG and fMRI capture brain activity at different temporal and spatial resolutions.

Data is often represented as matrices and higher-order tensors. EEG data, for instance, can be organized as a three-way tensor with modes for subjects, time, and electrodes. Coupled matrix and tensor factorizations, which model each data set as a sum of low-rank components, are effective for joint analysis and can reveal latent patterns.
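
As a rough illustration of the idea, the sketch below builds a small synthetic three-way tensor from a CP model and a matrix that shares the subject factor with it. It is not the thesis's algorithm or data: the sizes, the rank, the variable names, and the choice of a subjects-by-voxels matrix as the second data set are all assumptions made for this example.

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical sizes (assumptions for illustration only).
n_subjects, n_time, n_electrodes, n_voxels = 5, 20, 8, 12
rank = 2  # assumed number of components

# Factor matrices, one column per component.
A = rng.standard_normal((n_subjects, rank))    # subject mode, shared by both data sets
B = rng.standard_normal((n_time, rank))        # time mode
C = rng.standard_normal((n_electrodes, rank))  # electrode mode
V = rng.standard_normal((n_voxels, rank))      # voxel mode of the coupled matrix

# CP model: the tensor is a sum of `rank` rank-one terms (outer products).
X = np.einsum("ir,jr,kr->ijk", A, B, C)

# The same tensor written out as an explicit sum of outer products.
X_sum = sum(
    np.multiply.outer(np.multiply.outer(A[:, r], B[:, r]), C[:, r])
    for r in range(rank)
)

# Coupled matrix: it reuses the subject factor A, which is what ties
# the two data sets together in a joint analysis.
Y = A @ V.T
```

In a real analysis the factor matrices are unknown and are estimated jointly from the observed tensor and matrix. The sketch only shows the structure that such a model assumes.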

However, data from multiple sources are often heterogeneous, posing challenges such as different data types, sizes, dimensions, noise characteristics, and sampling rates. Coupled matrix and tensor factorization models must incorporate various tensor decomposition models, loss functions, and coupling structures, along with constraints and regularizations to ensure identifiability and interpretability.

This thesis applies a coupled matrix and tensor factorization model to a multi-modal neuroimaging data set to extract potential biomarkers of a psychiatric disorder. It systematically studies the model’s effectiveness and limitations.

The main part of the thesis proposes a flexible algorithmic framework for constrained linearly coupled matrix and tensor factorizations, supporting various constraints, regularizations, loss functions, and linear coupling relations. It includes the CANDECOMP/PARAFAC (CP) and PARAFAC2 models, with a new algorithm for fitting PARAFAC2 models that allows imposing constraints on all modes.

Experiments on synthetic data show that the proposed approach accurately extracts the underlying components and performs competitively with, and sometimes better than, state-of-the-art methods. Experiments on real data from chemometrics and remote sensing demonstrate the framework’s versatility and applicability.