Japanese / English
XICS: XML Indexes for Content and Structural search
Recently, the XML full-text search has emerged as an important new research topic as we can found the Working Drafts by W3C [1]. Our target is the speeding up of both structural searches (such as "//title") and full-text searches (such as "//section[contains(.,'XML')]") for XML documents by creating indexes. An important design principle of our indexes is the use of a B+ tree. We used GiST (Generalized Search Tree) [2] for the implementation of B+ tree indexes.
Each node in the XML tree is labeled with the identifier which consists of path information and sibling information. We constructed two types of indexes, one is for handling structural searches (STB-tree) and the other is for handling full-text searches (COB-tree). The search keys of the COB-tree are pairs of text fragments and the identifiers of the leaf nodes that contain the text, whereas the search keys of the STB-tree are the node identifiers.
STB-tree can handle queries which contain "//" such as "//title" efficiently because the search keys of STB-tree are ordered by the reverse path information. COB-tree keeps text fragments based on suffix texts. We also considered a compression technique for COB-tree.

Figure 1. Overview of XICS.
XICS achieved up to hundreds of times faster execution time compared to the earlier work [3]. We can traverse in XICS efficiently by using path information in the search keys.

Figure 2. Comparison of execution time with existing index.
[1] W3C. "XQuery 1.0 and XPath 2.0 Full-Text 1.0," May 2007. http://www.w3.org/TR/xquery-full-text/.
[2] J. M. Hellerstein, J. F. Naughton, and A. Pfeffer, "Generalized search trees for database systems," Proc. of 21st International Conference on Very Large Data Bases, pp.562-573, Zurich, Switzerland, Sept. 1995.
[3] R. Kaushik, R. Krishnamurthy, J. F. Naughton, and R. Ramakrishnan,"On the integration of structure indexes and inverted lists," Proc. of the 2004 ACM SIGMOD international conference on Management of data, pp.779-790, Paris, France, June 2004.