A study of non-negative matrix factorizations: foundations, methods, algorithms, and applications
Liu, Kai
Date Issued
2019
Abstract
Machine learning problems can generally be categorized into three classes: supervised, semi-supervised, and unsupervised learning. In supervised learning, labels or outputs are available and can be used for classification or regression, while in unsupervised learning no labels are given. Clustering is a major topic in unsupervised learning: given some data, we need to group it into several clusters. Among clustering methods, $K$-means is one of the most popular, being simple yet efficient. Non-negative matrix factorization (\textbf{NMF}) is a classical mathematical problem that seeks two non-negative matrices $F$ and $G$ such that $FG$ approximates a given non-negative matrix $X$. Thanks to its elegant formulation, \textbf{NMF} has been a hot topic for the past decade, and it has been proved that \textbf{NMF} can also perform clustering, being equivalent to $K$-means while providing more detail.

Different from existing matrix factorization methods such as \textbf{SVD}, \textbf{QR}, \textbf{LU}, etc., \textbf{NMF} has an advantage in interpretability. In the real world, some data are naturally non-negative, such as connections between people, or image pixels ranging from 0 to 255. Existing factorization methods cannot guarantee that the factor matrices are non-negative, which hurts interpretability. From the perspective of the feature matrix $F$, if face images are given for clustering, negative entries in $F$ are meaningless. From the perspective of the membership matrix $G$, some operations are purely additive: for example, any color can be generated from red, green, and blue, and the mixing ratios of red, green, and blue should be non-negative. Similar situations include the contribution of certain genes to certain diseases.

Important work has been done over the past years. Basic \textbf{NMF} (\textbf{BNMF}) produces $F, G$ with a guaranteed decrease of the loss function at every update by using an auxiliary function, while preserving non-negativity. Graph NMF (\textbf{GNMF}) exploits both the data themselves and the correlations between data points, and is widely used for clustering manifold data. Tri-factorization, which clusters the data and the features simultaneously, has also been studied extensively in recent years. Besides these methods, constraints on \textbf{NMF} have been proposed, such as orthogonality of the membership matrix $G$ to obtain a unique solution; sparsity-ratio constraints have also been studied with rigorous mathematical analysis and proofs.

However, most previous methods are based on the Multiplicative Updating Algorithm (\textbf{MUA}) of \textbf{BNMF}, which suffers from several disadvantages: poor local minima, long running time, and difficulty in handling non-convex norms in the objective function or penalty. Moreover, real experiments contain many soft-clustering cases in which cluster assignment is ambiguous. More importantly, although the orthogonality constraint was proposed to rule out the infinitely many solutions, the corresponding algorithms cannot yield a strictly orthogonal factor matrix. The objective function in traditional methods is the squared Frobenius norm, which is known to be sensitive to the noise and outliers that are inevitable in real-world data. Recently, $\ell_{2,1}$-norm and $\ell_{1}$-norm objective functions have been proposed, but the corresponding algorithms are either ad hoc or non-rigorous.
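For concreteness, the classical \textbf{MUA} referenced above, namely the Lee–Seung multiplicative updates for the squared Frobenius objective, can be sketched as follows; the notation $X \approx FG$ follows the abstract, though individual chapters may use different variants:

$$\min_{F \ge 0,\, G \ge 0} \|X - FG\|_F^2, \qquad F_{ik} \leftarrow F_{ik}\,\frac{(XG^{\top})_{ik}}{(FGG^{\top})_{ik}}, \qquad G_{kj} \leftarrow G_{kj}\,\frac{(F^{\top}X)_{kj}}{(F^{\top}FG)_{kj}}.$$

Because each update multiplies the current entry by a non-negative ratio, non-negativity is preserved automatically, and the auxiliary-function argument shows that the objective is non-increasing under these updates.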
The Alternating Direction Method of Multipliers (\textbf{ADMM}) is a useful tool for optimization problems; it is very robust thanks to the augmented term it adds to the traditional method of multipliers. By iteratively updating the primal and dual variables along with newly introduced auxiliary variables, we find that \textbf{ADMM} can serve as a new framework for \textbf{NMF} problems, as can alternating minimization and its variants. In this report, several applications of \textbf{NMF} are studied, including multi-relational data, multi-view data, and feature selection. Methods based on \textbf{MUA}, \textbf{ADMM}, and Proximal Alternating Linearized Minimization (\textbf{PALM}) are proposed, and the corresponding algorithms are given together with rigorous proofs. The newly proposed methods are shown to be better in terms of shorter running time, lower local minima, etc.
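As an illustration of how \textbf{ADMM} can frame \textbf{NMF}, consider a standard variable-splitting sketch; the auxiliary copies $U, V$, the scaled dual variables $\Lambda, \Pi$, and the penalty $\rho > 0$ are introduced here for exposition and are not necessarily the exact formulation used in this report:

$$\min_{F, G, U, V} \|X - FG\|_F^2 \quad \text{s.t.} \quad F = U,\; G = V,\; U \ge 0,\; V \ge 0,$$

with the iterations

$$\begin{aligned} F &\leftarrow \arg\min_F \|X - FG\|_F^2 + \tfrac{\rho}{2}\|F - U + \Lambda\|_F^2, \\ G &\leftarrow \arg\min_G \|X - FG\|_F^2 + \tfrac{\rho}{2}\|G - V + \Pi\|_F^2, \\ U &\leftarrow \max(F + \Lambda, 0), \qquad V \leftarrow \max(G + \Pi, 0), \\ \Lambda &\leftarrow \Lambda + F - U, \qquad \Pi \leftarrow \Pi + G - V. \end{aligned}$$

Each subproblem is either a regularized least-squares solve with a closed-form solution or a projection onto the non-negative orthant, which is what makes such splittings attractive for non-smooth objectives such as the $\ell_{2,1}$-norm.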
Rights
Copyright of the original work is retained by the author.