Trial Lecture: "Uniqueness of coupled matrix and tensor models with (partially) shared factors and constraints: an overview of known results".
Ordinary opponents:
- First opponent: Nikolaos Sidiropoulos, Professor, Ph.D., University of Virginia, Virginia, USA
- Second opponent: Borbála Hunyadi, Assistant Professor, Ph.D., TU Delft, Netherlands
Leader of the evaluation committee: Hugo Lewi Hammer, Professor, Ph.D., OsloMet
Leader of the public defence: Anis Yazidi, Professor, OsloMet
Supervisors:
- Main supervisor: Evrim Acar Ataman, Research Professor, Simula, Norway
- Co-supervisor: Jeremy E. Cohen, CREATIS, France
Abstract
Data fusion involves jointly analyzing multiple interrelated data sets so that they can interact and inform each other. This approach is crucial in fields like medicine, chemometrics, and remote sensing, where information about the same phenomenon is gathered from various modalities, such as different sensing technologies. Each modality alone may not provide a complete picture, but together they offer complementary insights. For example, EEG captures brain activity with high temporal resolution, whereas fMRI offers high spatial resolution.
Data is often represented as matrices and higher-order tensors. EEG data, for instance, can be organized as a three-way tensor with modes for subjects, time, and electrodes. Coupled matrix and tensor factorizations, which model each data set as a sum of low-rank components, are effective for joint analysis and can reveal latent patterns.
However, data from multiple sources are often heterogeneous, posing challenges such as different data types, sizes, dimensions, noise characteristics, and sampling rates. Coupled matrix and tensor factorization models must incorporate various tensor decomposition models, loss functions, and coupling structures, along with constraints and regularizations to ensure identifiability and interpretability.
This thesis applies a coupled matrix and tensor factorization model to a multi-modal neuroimaging data set to extract potential biomarkers of a psychiatric disorder. It systematically studies the model’s effectiveness and limitations.
The main part of the thesis proposes a flexible algorithmic framework for constrained linearly coupled matrix and tensor factorizations, supporting various constraints, regularizations, loss functions, and linear coupling relations. The framework covers the CANDECOMP/PARAFAC (CP) and PARAFAC2 models, and includes a new algorithm for fitting PARAFAC2 models that allows constraints to be imposed on all modes.
Experiments on synthetic data show that the proposed approach accurately extracts the underlying components and performs competitively with, and sometimes better than, state-of-the-art methods. Experiments on real data from chemometrics and remote sensing demonstrate the framework’s versatility and applicability.