Johnson-Lindenstrauss Lemma Beyond Euclidean Geometry

Published in Conference on Neural Information Processing Systems (NeurIPS), 2025

Abstract

The Johnson-Lindenstrauss (JL) lemma is a cornerstone of dimensionality reduction in Euclidean space, but its applicability to non-Euclidean data has remained limited. This paper extends the JL lemma beyond Euclidean geometry to handle the general dissimilarity matrices that are prevalent in real-world applications. We present two complementary approaches. First, we show that the JL transform can be applied to vectors in a pseudo-Euclidean space with signature (p, q), with theoretical guarantees that depend on the ratio of the (p, q) norm to the Euclidean norm of two vectors, which measures the deviation from Euclidean geometry. Second, we prove that any symmetric hollow dissimilarity matrix can be represented as a matrix of generalized power distances, with an additional parameter representing the uncertainty level within the data. In this representation, applying the JL transform yields a multiplicative approximation with a controlled additive error term proportional to the deviation from Euclidean geometry. Our theoretical results provide a fine-grained performance analysis based on how far the input data deviates from Euclidean geometry, making practical and meaningful dimensionality reduction accessible to a wider class of data. We validate our approaches on both synthetic and real-world datasets, demonstrating the effectiveness of extending the JL lemma to non-Euclidean settings.


Method: Two Approaches

We propose two distinct frameworks to handle non-Euclidean data:

  1. Pseudo-Euclidean JL Transform: We map dissimilarities to a pseudo-Euclidean space with signature $(p, q)$ and prove that standard random projections can be applied separately to the positive and negative components. The distortion is bounded by $1 \pm \epsilon \cdot C_{ij}$, where $C_{ij}$ measures the local deviation from Euclidean geometry.
  2. Power Distance JL Transform: We prove that any symmetric hollow matrix is a matrix of generalized power distances between balls, and we apply random projections to the centers of these balls. This yields a $(1 \pm \epsilon)$ multiplicative approximation with an additive error term $4\epsilon r^2$, where $r$ is derived from the smallest eigenvalue of the Gram matrix. (Minimal sketches of both constructions follow this list.)
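
To make both constructions concrete, here are minimal numpy sketches. They assume the input dissimilarities have already been converted to pseudo-Euclidean coordinates (e.g., via an eigendecomposition of the doubly centered Gram matrix) and to ball centers with squared radii, respectively; the function names, the dense Gaussian projections, and the exact power-distance convention are illustrative assumptions, not the paper's precise construction.

import numpy as np

def pseudo_euclidean_jl(X_pos, X_neg, k, seed=0):
    """Project the positive- and negative-signature coordinates of points in a
    pseudo-Euclidean space of signature (p, q) with two independent Gaussian JL maps."""
    rng = np.random.default_rng(seed)
    p, q = X_pos.shape[1], X_neg.shape[1]
    # Dense Gaussian projections, scaled so squared norms are preserved in expectation.
    R_pos = rng.standard_normal((p, k)) / np.sqrt(k)
    R_neg = rng.standard_normal((q, k)) / np.sqrt(k)
    return X_pos @ R_pos, X_neg @ R_neg

def pseudo_sq_dissimilarity(A_pos, A_neg, i, j):
    """Squared pseudo-Euclidean dissimilarity: ||x_i^+ - x_j^+||^2 - ||x_i^- - x_j^-||^2."""
    return np.sum((A_pos[i] - A_pos[j]) ** 2) - np.sum((A_neg[i] - A_neg[j]) ** 2)

def power_distance_jl(C, r2, k, seed=0):
    """Project ball centers with one Gaussian JL map; the squared radii r2
    (the per-point uncertainty terms) are carried along unchanged."""
    rng = np.random.default_rng(seed)
    R = rng.standard_normal((C.shape[1], k)) / np.sqrt(k)
    return C @ R, r2

def generalized_power_distance(C, r2, i, j):
    """One common convention for the power distance between balls i and j:
    ||c_i - c_j||^2 - r_i^2 - r_j^2 (assumed form; see the paper for the exact definition)."""
    return np.sum((C[i] - C[j]) ** 2) - r2[i] - r2[j]

In both sketches the random projection acts only on coordinates; the non-Euclidean structure (the negative-signature part or the radii) is kept exactly and re-enters only when dissimilarities are reassembled.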

Experimental Results

We evaluated our methods on 10 datasets, including synthetic non-Euclidean sets (Simplex, Ball) and real-world data (Genomics, Graphs).

Relative Error Comparison

Our methods (JL-PE and JL-Power) significantly outperform classical JL on non-Euclidean datasets, where classical JL often fails outright: distinct points can be mapped to the same location, making the relative error unbounded.

Table 1: Median Relative Error (Lower is better)

Dataset     Classical JL    JL-PE (Ours)    JL-Power (Ours)
Simplex     6.36            2.57            12.79
Ball        6.68            2.78            1.00
Brain       2.69e12         1284.06         1.01e7
Facebook    5.08            5.83            1.16
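
For reference, a minimal sketch of the reported metric, assuming relative error is measured against the projected dissimilarity (so that collapsing distinct points to the same image produces an unbounded error, as noted above); the paper's exact normalization may differ.

import numpy as np

def median_relative_error(D_orig, D_proj):
    """Median pairwise relative error |d_hat - d| / |d_hat| over all unordered pairs."""
    iu = np.triu_indices_from(D_orig, k=1)       # each off-diagonal pair once
    d = D_orig[iu].astype(float)
    d_hat = D_proj[iu].astype(float)
    with np.errstate(divide="ignore", invalid="ignore"):
        rel = np.abs(d_hat - d) / np.abs(d_hat)  # inf when d_hat == 0 but d != 0
    return np.median(rel)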

Key Contributions

  • Theoretical Extension: First formulation of JL lemma guarantees for general symmetric hollow matrices using Pseudo-Euclidean and Power Distance representations.
  • Fine-Grained Analysis: Error bounds are not worst-case but dependent on the data’s specific deviation from Euclidean geometry.
  • Universal Representation: Proof that any symmetric hollow matrix is a generalized power distance matrix.
  • Empirical Validation: Superior performance on biological, graph, and synthetic non-metric datasets compared to classical approaches.

BibTeX Citation

@inproceedings{deng2025jl,
  title={Johnson-Lindenstrauss Lemma Beyond Euclidean Geometry},
  author={Chengyuan Deng and Jie Gao and Kevin Lu and Feng Luo and Cheng Xin},
  booktitle={Conference on Neural Information Processing Systems},
  year={2025}
}
