A nonlinear matrix decomposition for mining the zeros of sparse data
Many statistical models are motivated by the search for low-dimensional structure in high-dimensional data. Often the data can be represented by a large sparse matrix; such a matrix, for example, can record affinities in a social network, n-gram counts of text, or patterns of neural activity. In this case, we can search for distinctly nonlinear expressions of low-dimensional structure that are keyed to the data's sparsity. To this end, we consider the following problem: given a sparse nonnegative matrix X, how can we estimate a low-rank matrix Θ such that X ≈ f(Θ), where f is an elementwise nonlinearity? We develop a latent variable model for this problem and consider those sparsifying nonlinearities, popular in today's neural networks, that map all negative values to zero. We show that exact inference in this model is tractable and derive an Expectation-Maximization algorithm to estimate the low-rank matrix Θ. Notably, this estimation is performed without parameterizing the matrix Θ as the product of smaller factors. Compared to linear approaches (e.g., singular value decomposition, nonnegative matrix factorization), this model is able to explain the variability of sparse data in terms of many fewer degrees of freedom. The model also illustrates, in a simple way, how sparse high-dimensional data may arise from low-dimensional pattern manifolds.
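As a toy illustration of the claim that X ≈ f(Θ) can need far fewer degrees of freedom than a linear model, the sketch below builds a hand-picked low-rank Θ and applies the elementwise nonlinearity f(x) = max(0, x). It is not the Expectation-Maximization algorithm from the talk. The particular matrix Θ[i][j] = 1 - (i - j)^2 and the helper names (`relu`, `matrix_rank`, `low_rank_theta`) are assumptions chosen for this example only.

```python
from fractions import Fraction


def relu(theta):
    """Elementwise max(0, x): maps every negative entry to zero."""
    return [[max(Fraction(0), v) for v in row] for row in theta]


def matrix_rank(matrix):
    """Exact rank via Gaussian elimination over the rationals."""
    rows = [list(r) for r in matrix]
    n_cols = len(rows[0]) if rows else 0
    rank = 0
    for col in range(n_cols):
        # Find a row at or below the current rank with a nonzero entry in this column.
        pivot = next((r for r in range(rank, len(rows)) if rows[r][col] != 0), None)
        if pivot is None:
            continue
        rows[rank], rows[pivot] = rows[pivot], rows[rank]
        # Eliminate this column from every row below the pivot row.
        for r in range(rank + 1, len(rows)):
            factor = rows[r][col] / rows[rank][col]
            if factor != 0:
                rows[r] = [a - factor * b for a, b in zip(rows[r], rows[rank])]
        rank += 1
    return rank


def low_rank_theta(n):
    """Hypothetical example: Theta[i][j] = 1 - (i - j)^2.

    Expanding gives 1 - i^2 + 2*i*j - j^2, a combination of products of the
    vectors (1, i, i^2), so the rank is at most 3 for any n.
    """
    return [[Fraction(1 - (i - j) ** 2) for j in range(n)] for i in range(n)]


n = 8
theta = low_rank_theta(n)
x = relu(theta)  # off-diagonal entries of theta are <= 0, so x is the identity

rank_theta = matrix_rank(theta)
rank_x = matrix_rank(x)
zero_fraction = sum(v == 0 for row in x for v in row) / (n * n)

print(rank_theta, rank_x, zero_fraction)  # prints: 3 8 0.875
```

Here the sparse matrix X is the 8 × 8 identity, which has full linear rank, yet it is reproduced exactly by thresholding a rank-3 matrix. The zeros of X carry the information: each one only requires the corresponding entry of Θ to be nonpositive.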
If you would like to attend, please email firstname.lastname@example.org for the Zoom link.