Neuc-MDS: Non-Euclidean Multidimensional Scaling Through Bilinear Forms

Published in The Thirty-Eighth Annual Conference on Neural Information Processing Systems (NeurIPS 2024), 2024

Abstract

We introduce Non-Euclidean-MDS (Neuc-MDS), an extension of classical Multidimensional Scaling (MDS) that accommodates non-Euclidean and non-metric inputs. The key idea is to generalize the standard inner product to symmetric bilinear forms so negative eigenvalues of dissimilarity Gram matrices can be used productively. Neuc-MDS efficiently chooses both positive and negative eigenvalues to reduce STRESS, the sum of squared pairwise error. We provide error analysis, proofs of optimality for minimizing lower bounds of STRESS, and evaluations on synthetic and real-world datasets against linear and non-linear dimension reduction methods.


The Challenge: Dimensionality Paradox

Classical MDS (cMDS) assumes Euclidean geometry and discards negative eigenvalues from the Gram matrix. On non-Euclidean data (e.g., biological sequence dissimilarity, non-metric crowd-sourced similarity), this leads to a Dimensionality Paradox: increasing the number of target dimensions $k$ in cMDS can actually increase the STRESS error.

Our method, Neuc-MDS, embraces the indefinite geometry (pseudo-Euclidean space) by selecting a mix of positive and negative eigenvalues, ensuring that STRESS monotonically decreases as dimensions are added.


Experimental Results

We evaluated Neuc-MDS on datasets with significant non-Euclidean components (indicated by negative eigenvalues), such as “Random-Simplex” and biological datasets (Brain, Breast Cancer).

Table 1: STRESS Comparison (Lower is better)

DatasetClassical MDSLower-MDSSMACOFNeuc-MDS (Ours)
Random-Simplex80.5231.5415.96**1.18**
Euclidean Ball36.9717.304e6**1.19**
Brain (Genomics)2.890.2890.081**0.046**
CIFAR1026.591.271.63e5**0.85**

Key Contributions

  • Generalization: Extends MDS with symmetric bilinear forms to handle non-Euclidean and non-metric inputs.
  • Optimization: Proposes an $O(n)$ greedy algorithm to select the optimal subset of positive and negative eigenvalues.
  • Theoretical Guarantees: Proves that STRESS is decomposed into three terms, and Neuc-MDS minimizes the dominant lower bound.
  • Validation: Superior performance on synthetic and real datasets, resolving the dimensionality paradox.

@inproceedings{deng2024neucmds,
  title={Neuc-MDS: Non-Euclidean Multidimensional Scaling Through Bilinear Forms},
  author={Chengyuan Deng and Jie Gao and Kevin Lu and Feng Luo and Hongbin Sun and Cheng Xin},
  booktitle={Conference on Neural Information Processing Systems (NeurIPS)},
  year={2024},
  url={[http://jackal092927.github.io/files/neucmds.pdf](http://jackal092927.github.io/files/neucmds.pdf)}
}
PaperCode