Abstract
The effective grouping, or partitioning, of semistructured data is of fundamental importance when providing support for queries. Partitions allow items within the data set that share common structural properties to be identified efficiently. This allows queries that make use of these properties, such as branching path expressions, to be accelerated. Here, we evaluate the effectiveness of several partitioning techniques by establishing the number of partitions that each scheme can identify over a given data set. In particular, we explore the use of parameterised indexes, based upon the notion of forward and backward bisimilarity, as a means of partitioning semistructured data, and demonstrate that even restricted instances of such indexes can be used to identify the majority of relevant partitions in the data.
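As an illustration of the kind of partitioning the abstract refers to, the following minimal sketch groups the nodes of a small labelled graph by k-bisimilarity and counts the resulting partitions. It is not taken from the paper: the function name `k_bisim_partition`, the `direction` parameter, and the toy data are all assumptions made for this example, and the refinement shown is the textbook signature-based one (two nodes stay together at level k if they were together at level k-1 and their parents, or children, fall into the same set of level k-1 groups).

```python
from collections import defaultdict

def k_bisim_partition(labels, edges, k, direction="backward"):
    """Group nodes by k-bisimilarity (illustrative sketch, hypothetical API).

    labels: dict mapping node id -> label
    edges: iterable of (parent, child) pairs
    direction: "backward" compares parents, "forward" compares children
    """
    neighbours = defaultdict(set)
    for parent, child in edges:
        if direction == "backward":
            neighbours[child].add(parent)
        else:
            neighbours[parent].add(child)

    # Level 0: nodes are grouped by label alone.
    block = dict(labels)
    for _ in range(k):
        ids = {}
        refined = {}
        for node in labels:
            # Signature: the node's current group plus the set of groups
            # that its parents (or children) currently belong to.
            sig = (block[node], frozenset(block[m] for m in neighbours[node]))
            refined[node] = ids.setdefault(sig, len(ids))
        block = refined

    groups = defaultdict(set)
    for node, group_id in block.items():
        groups[group_id].add(node)
    return list(groups.values())

# Toy data (assumed): root -> a -> c and root -> b -> c.
labels = {1: "root", 2: "a", 3: "b", 4: "c", 5: "c"}
edges = [(1, 2), (1, 3), (2, 4), (3, 5)]

for k in (0, 1):
    count = len(k_bisim_partition(labels, edges, k, "backward"))
    print(f"backward, k={k}: {count} partitions")
```

On this toy graph the two `c` nodes share a group at k = 0 but are separated at k = 1 under backward bisimilarity, because their parents carry different labels. Under forward bisimilarity they remain together, since neither has children.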
Original language | English |
---|---|
Pages | 497-506 |
Number of pages | 9 |
Publication status | Published - 4 Sep 2006 |
Event | 17th International Workshop on Database and Expert Systems Applications (DEXA 2006) - Krakow, Poland. Duration: 4 Sep 2006 → 8 Sep 2006 |
Conference
Conference | 17th International Workshop on Database and Expert Systems Applications (DEXA 2006) |
---|---|
City | Krakow, Poland |
Period | 4/09/06 → 8/09/06 |
Keywords
- semistructured data
- data management
- partitions
- indexes
- statistics