Compact in-memory representation of XML data: design and implementation of a compressed DOM for data-centric documents

M. Neumüller, John Wilson

Research output: Book/Report (Other report)

Abstract

Over recent years, XML has evolved from a document exchange format to a multi-purpose data storage and retrieval solution. To make use of the full potential of XML in the domain of large, data-centric documents, it is necessary to have easy and fast access to individual data elements. We describe an implementation of the Document Object Model (DOM) that is designed with these objectives in mind. It uses compression to allow large documents to be stored in the computer's main memory. Query-relevant DOM methods are optimised to work directly on the resulting data structure. Measurements indicate that compression by up to a factor of 5 is possible without losing the ability to directly address individual elements. No prior decompression is needed to query and locate nodes.
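
The abstract does not describe the report's actual data structure, so the following is only a minimal sketch of the general idea, under stated assumptions: repeated tag names and text values are stored once in a string table, and the tree itself is held in flat integer arrays so that an element can be addressed by index and queried without rebuilding node objects. The class `CompactDoc` and all of its method names are hypothetical and are not taken from the report.

```python
import xml.etree.ElementTree as ET
from array import array


class CompactDoc:
    """Flat, dictionary-coded element tree (illustrative only; not the report's design)."""

    def __init__(self, xml_text):
        self.names = []            # distinct tag names and text values, each stored once
        self._ids = {}             # string -> index into self.names
        self.tag = array("l")      # per node: id of its tag name
        self.text = array("l")     # per node: id of its text value, or -1 if none
        self.parent = array("l")   # per node: index of its parent, or -1 for the root
        self._add(ET.fromstring(xml_text), -1)

    def _intern(self, s):
        # Store each distinct string once and return its integer id.
        if s not in self._ids:
            self._ids[s] = len(self.names)
            self.names.append(s)
        return self._ids[s]

    def _add(self, elem, parent):
        # Nodes are numbered in document order (pre-order traversal).
        node = len(self.tag)
        self.tag.append(self._intern(elem.tag))
        value = (elem.text or "").strip()
        self.text.append(self._intern(value) if value else -1)
        self.parent.append(parent)
        for child in elem:
            self._add(child, node)
        return node

    def tag_name(self, node):
        return self.names[self.tag[node]]

    def text_of(self, node):
        t = self.text[node]
        return None if t < 0 else self.names[t]

    def children(self, node):
        # Linear scan over the parent array; a real design would likely index this.
        return [i for i in range(node + 1, len(self.parent)) if self.parent[i] == node]

    def elements_by_tag_name(self, name):
        # Compares integer ids only; no strings or node objects are materialised.
        tid = self._ids.get(name)
        if tid is None:
            return []
        return [i for i, t in enumerate(self.tag) if t == tid]


doc = CompactDoc(
    "<orders>"
    "<order><sku>A1</sku><qty>2</qty></order>"
    "<order><sku>A1</sku><qty>5</qty></order>"
    "</orders>"
)
hits = doc.elements_by_tag_name("sku")
print(hits, [doc.text_of(i) for i in hits])  # prints [2, 5] ['A1', 'A1']
```

In this toy document the seven elements share seven distinct strings, and the repeated `order`, `sku`, `qty` and `A1` values cost one integer each after their first occurrence. That is the kind of redundancy data-centric documents tend to exhibit, although the factor of 5 reported in the abstract refers to the authors' implementation, not to this sketch.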
Language: English
Place of Publication: Glasgow, UK
Publisher: University of Strathclyde
Number of pages: 13
Publication status: Published - 2002

Keywords

  • xml
  • extensible markup language
  • programming language
