A PhD thesis @ NALAG

This is a PhD thesis prepared by a member of the NALAG group or prepared with a (co)promotor from NALAG.

TW 2014_02

Laurent Sorber
Data fusion: Tensor factorizations by complex optimization
May 28, 2014

Advisor(s): Lieven De Lathauwer and Marc Van Barel

Abstract

In our information age, the amount of data observed has increased tremendously in volume, velocity and variety. Concealed inside this new natural resource lies a wealth of information waiting to be mined. Arrays with two dimensions and arrays with more than two dimensions, known as matrices and tensors, respectively, serve as the computer representation of many types of data. Knowledge can be extracted, or inferred, from such arrays by capturing their underlying patterns and structure with so-called matrix and tensor factorizations that attempt to explain the observations with a small number of variables. The ever-increasing volume of data demands new algorithms that can deal with the curse of dimensionality, while the growing variety of data calls for new approaches to analysing several sources of data jointly.

The goal of this thesis is to develop data fusion as a new paradigm for the joint analysis of one or more data sets by coupled low-rank factorization of dense, sparse or incomplete matrices and tensors, leading to insights which are deeper and more accurate than those resulting from a single source of data. There are many choices to be made in a data fusion model. For each data set, one of several types of tensor decompositions must be chosen with which to factorize that data set. Tensor decompositions are composed of building blocks called factors, each of which may be constrained to exhibit a certain structure such as nonnegativity or orthogonality. Furthermore, the coupling between the decompositions must be defined by indicating which factors are shared between data sets. Evidently, it is a hopeless task to attempt to design algorithms individually for each of the myriad combinations of tensor decompositions, factor structures and couplings between factorizations that define a data fusion model.
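To make the notion of a shared factor concrete, the following is a minimal sketch of one possible data fusion model. It is not taken from the thesis or its software. It assumes a third-order tensor T modelled by a canonical polyadic decomposition with factors A, B and C, and a matrix M modelled as A times the transpose of D, so that the factor A is shared between the two data sets. The function names, the choice of decompositions and the equal weighting of the two residual terms are all illustrative assumptions.

```python
# Hypothetical illustration of a coupled factorization; all names are assumptions.
# T[i][j][k] ~ sum_r A[i][r] * B[j][r] * C[k][r]   (canonical polyadic decomposition)
# M[i][l]    ~ sum_r A[i][r] * D[l][r]             (low-rank matrix factorization)
# The factor A appears in both models, which is what couples the two data sets.


def cpd(A, B, C):
    """Build a third-order tensor from CPD factors (lists of rows of length R)."""
    R = len(A[0])
    return [[[sum(A[i][r] * B[j][r] * C[k][r] for r in range(R))
              for k in range(len(C))]
             for j in range(len(B))]
            for i in range(len(A))]


def lowrank(A, D):
    """Build the matrix A * D^T from its two factors."""
    R = len(A[0])
    return [[sum(A[i][r] * D[l][r] for r in range(R))
             for l in range(len(D))]
            for i in range(len(A))]


def coupled_objective(T, M, A, B, C, D):
    """Sum of the two half squared residual norms, weighted equally (an assumption)."""
    T_hat = cpd(A, B, C)
    M_hat = lowrank(A, D)
    err_T = sum((T[i][j][k] - T_hat[i][j][k]) ** 2
                for i in range(len(T))
                for j in range(len(T[0]))
                for k in range(len(T[0][0])))
    err_M = sum((M[i][l] - M_hat[i][l]) ** 2
                for i in range(len(M))
                for l in range(len(M[0])))
    return 0.5 * err_T + 0.5 * err_M


# Small rank-2 example: generate both data sets from the same factor A.
A = [[1.0, 0.0], [0.0, 2.0], [1.0, 1.0]]
B = [[1.0, 1.0], [0.0, 1.0]]
C = [[2.0, 0.0], [1.0, 1.0]]
D = [[1.0, -1.0], [0.5, 0.0], [0.0, 3.0], [2.0, 1.0]]
T = cpd(A, B, C)
M = lowrank(A, D)

# The generating factors fit both data sets exactly.
exact = coupled_objective(T, M, A, B, C, D)

# Perturbing the shared factor degrades the fit of both data sets at once.
A_perturbed = [[a + 0.1 for a in row] for row in A]
perturbed = coupled_objective(T, M, A_perturbed, B, C, D)
```

In this sketch a change to A affects both residual terms, whereas a change to B, C or D affects only one of them. Swapping in a different decomposition for either data set, or constraining a factor to be nonnegative, would change only the corresponding model function, which is the kind of independent choice described above.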

In this thesis we present a framework which allows all of these choices to be made independently and dynamically, we develop accompanying algorithms that fully exploit the structure of the problem on every level, and we deliver an efficient software implementation with which data fusion can be brought into practice on big data. Users can choose from a library of tensor decompositions and factor structures or add their own with little effort, enabling the computation of complex data fusion models and classical matrix and tensor factorizations alike.

Doctadmin 3E100869 / lirias 450617