Abstract
Comparing tree-structured data for structural similarity is a recurring theme and one on which much effort has been spent. Most approaches so far are grounded, implicitly or explicitly, in algorithmic information theory, being approximations to an information distance derived from Kolmogorov complexity. In this paper we propose a novel complexity metric, also grounded in information theory, but calculated via Shannon's entropy equations. This is used to formulate a directly and efficiently computable metric for the structural difference between unordered trees. The paper explains the derivation of the metric in terms of information theory, and proves the essential property that it is a distance metric. The property of boundedness means that the metric can be used in contexts such as clustering, where second-order comparisons are required. The distance metric property means that the metric can be used in the context of similarity search and metric spaces in general, allowing trees to be indexed and stored within this domain. We are not aware of any other tree similarity metric with these properties.
Original language  English 

Pages (from-to)  748-764 
Number of pages  17 
Journal  Information Systems 
Volume  36 
Issue number  4 
Publication status  Published - Jun 2011 
Keywords
 unordered tree
 tree comparison
 distance metric
 algorithmic information theory
 information content
 information distance
 entropy
Title  A bounded distance metric for comparing tree structure 
Projects
 1 Finished

Structural Comparison of Labelled Graph Data
Connor, R.
EPSRC (Engineering and Physical Sciences Research Council)
1/10/09 → 30/09/12
Project: Research