See also: DesignConsiderations

XML Databases

Investigated alternatives:

  • eXist, not completed
  • MonetDB/XQuery, not completed
  • Sedna

Other possible investigation targets:

MonetDB has published some measurements (warning: these date from February 2006). The test appears to search within a single document. This is not exactly our use case (I think we will have many tiny documents), but I assume that many tiny documents perform the same as, or worse than, one big document. The conclusion: MonetDB's competitors are comparable or slightly faster for tiny documents, fall hopelessly behind on slightly larger documents, and mostly just give up on large documents.

eXist

  • + Most popular XML database.
  • - Very slow searches with large documents or large databases. Work is being done to make it faster. This will involve the creation of custom indexes.
  • - Has a fixed limit of 2^31 documents. Limits on document size are not likely to be a problem for us.
  • + Has ACLs: a Unix-like permission model on folders and documents, with the option to integrate with LDAP.
  • + Uses XACML for security on XQuery
  • - Transactions are supported, but only to protect against crashes.
  • - Clustering is not supported.
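
Whether the document limit matters depends on how many tiny documents we expect to store. The sketch below is a back-of-the-envelope check, not a measurement. It assumes the limit is 2^31 documents and that documents arrive at a constant rate; the class name, method names, and the rate of 1000 documents per second are all made up for illustration.

```java
public class DocumentLimit {

    // Assumption: the cap is 2^31 documents (one more than Integer.MAX_VALUE).
    static long maxDocuments() {
        return 1L << 31;
    }

    // Days until the cap is reached at a constant, hypothetical ingest rate.
    static double daysUntilFull(long docsPerSecond) {
        double seconds = (double) maxDocuments() / docsPerSecond;
        return seconds / 86_400.0; // 86,400 seconds per day
    }

    public static void main(String[] args) {
        System.out.println("Max documents: " + maxDocuments());
        System.out.println("Days until full at 1000 docs/s: " + daysUntilFull(1000));
    }
}
```

At that hypothetical rate the limit would be reached in roughly 25 days, so the cap is only harmless if our real ingest rate is far lower or old documents are removed.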

MonetDB/XQuery

  • + Fast, even without custom indexes, for any reasonable size.
  • + Architected for multi-gigabyte databases.
  • - I could not easily find out how to connect to MonetDB from Java. Questions on the internet suggest this is possible either through SOAP or through JDBC (!)
  • - No specifics are available for security; the suggestion is to solve this in the application itself.
  • ± Although it is used in banks, this is essentially a research database.
  • ± Clustering is supported in the sense of partitioning (work in progress), not in the sense of duplication.
  • ± Transactions are supported but I could not quickly figure out how.

Sedna

  • - Not well known.
  • + Supports XQuery and XUpdate.
  • + Has a good Java interface: the XML:DB API (or XAPI).
  • + Supports transactions, appears to be robust.
  • - No clustering support.
  • - Needs hand-made indexes to perform well.