Evaluation of Jensen-Shannon distance over sparse data

Richard Connor, Franco Alberto Cardillo, Robert Moss, Fausto Rabitti

Research output: Chapter in Book/Report/Conference proceeding › Conference contribution book

2 Citations (Scopus)
55 Downloads (Pure)


Jensen-Shannon divergence is a symmetrised, smoothed version of Kullback-Leibler divergence. It has been shown to be the square of a proper distance metric, and has other properties which make it an excellent choice for many high-dimensional spaces in R^n.
The metric as defined is, however, expensive to evaluate. In sparse spaces over many dimensions the intrinsic dimensionality of the metric space is typically very high, making similarity-based indexing ineffectual, and exhaustive searching over large data collections may be infeasible.
Using a property that allows the distance to be evaluated from only those dimensions which are non-zero in both arguments, and through the identification of a threshold function, we show that the cost of evaluating the metric can be dramatically reduced.
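The property the abstract refers to follows from the per-dimension form of the divergence: any dimension that is zero in one argument contributes a fixed (1/2) log 2 per unit of mass, so for normalised vectors the whole sum can be recovered from the intersection of the two supports alone. The sketch below is not the authors' implementation, just a minimal illustration of that identity, using Python dicts as sparse vectors (the paper's threshold-function optimisation for early termination is not shown):

```python
import math

def jsd_sparse(p, q):
    """Jensen-Shannon divergence (in bits, range [0, 1]) between two
    normalised sparse distributions given as {dimension: weight} dicts.

    Only dimensions non-zero in BOTH arguments are visited; mass that
    falls outside the intersection contributes exactly (1/2) ln 2 per
    unit, so it is accounted for in closed form."""
    shared = p.keys() & q.keys()
    sp = sq = 0.0   # mass of p and of q inside the intersection
    acc = 0.0       # per-dimension contributions over shared dims
    for k in shared:
        pi, qi = p[k], q[k]
        sp += pi
        sq += qi
        s = pi + qi
        acc += pi * math.log(2.0 * pi / s) + qi * math.log(2.0 * qi / s)
    # mass outside the intersection contributes (ln 2)/2 per unit
    jsd_nats = 0.5 * acc + 0.5 * math.log(2.0) * ((1.0 - sp) + (1.0 - sq))
    return jsd_nats / math.log(2.0)   # nats -> bits, so the range is [0, 1]

def js_distance(p, q):
    """The proper metric is the square root of the divergence."""
    return math.sqrt(max(jsd_sparse(p, q), 0.0))
```

For example, two distributions with disjoint supports reach the maximum divergence of 1 without visiting any dimension at all, while overlapping vectors only cost work proportional to the size of the overlap.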
Original language: English
Title of host publication: Similarity Search and Applications
Subtitle of host publication: 6th International Conference, SISAP 2013, A Coruña, Spain, October 2-4, 2013, Proceedings
Editors: Nieves Brisaboa, Oscar Pedreira, Pavel Zezula
Place of Publication: Berlin
Number of pages: 6
ISBN (Print): 9783642410611
Publication status: Published - 13 Sep 2013
Event: 6th International Conference on Similarity Search and Applications, SISAP 2013 - Hotel Riazor, A Coruña, Spain
Duration: 2 Oct 2013 – 4 Oct 2013

Publication series

Name: Lecture Notes in Computer Science
ISSN (Print): 0302-9743


Conference: 6th International Conference on Similarity Search and Applications, SISAP 2013
City: A Coruña


  • distance metrics
  • exhaustive searching
  • high dimensional spaces
  • intrinsic dimensionalities
  • Jensen-Shannon divergence
  • metric spaces
  • other properties
  • threshold functions
  • artificial intelligence
  • computer science


