SQL Server & XQuery: XPath Axes, Node Tests, and Predicates

In order to really get a grasp on what’s going on within an XQuery method in SQL Server (like .exist()), we need to examine the relationship between XQuery and XPath.  Semantically, there’s little difference in the usage of the two names, since XQuery as a language uses XPath to navigate through an XML fragment; however, strictly speaking, XPath is a subset of XQuery. 

XPath expressions contain one or more of three elements: an axis, a node test, and a predicate.  An axis refers to the locational reference of a path when navigating an xml structure.  For example consider the following XML:

<root>
  <plant>
    <fruit>tomato</fruit>
    <fruit>banana</fruit>
    <vegetable>tomato</vegetable>
  </plant>
  <animal>
    <canine>Doberman</canine>
  </animal>
</root>

and the XQuery below:

/root/plant/fruit

based on the axis (path) supplied above, the results will be the two <fruit> nodes in the XML document.  It’s important to remember that in the SQL Server implementation of XPath, axis expressions always start at the top of an XML document and work down, but there are some navigational shortcuts.  Below is a small table describing the axes supported:

Abbreviation Full Syntax Notes
/ / steps through nodes
// // steps through all descendant nodes
  child returns a specific child of a given node; e.g. /root/child::plant is the same as /root/plant
.. parent steps up a node; .. equivalent to parent::node()
@ attribute identified attribute-based XML
. self refers to the current node
* * refers to all nodes at the current level; e.g. /root/* will return both the <plant> and <animal> nodes from our previous example.

A node test is a part of the location path that when used with an axis restricts the XQuery to return only the results of interest. Taking our same XQuery from above:

/root/plant/fruit

The axis is the “/” symbol which tells XQuery to walk down a node at a time, and there are two node tests: plant, and fruit.  The first node test tells XQuery to only look at the plant nodes, and the second node test restricts it to just the fruit nodes.  The above XQuery could be shortened to:

//fruit

All nodes would be considered, and all fruit nodes would be returned, regardless of where they live in the XML hierarchy. 

The final expression in XPath is the predicate; a predicate limits the results of an XPath expression through evaluation (much like the WHERE clause of a SQL statement).  Predicates are specified using square brackets, and allow for the use of several functions to return either a positional reference or an evaluation of a value or node.  For example, the results of the following two XQuery expressions against our test XML document:

//fruit[2]

//fruit[last()]

would return the same positional results:

<fruit>banana</fruit>

likewise, the XQuery expression below:

//fruit[text()="banana"]

also returns the same results.  last() and text() are both XPath supported functions, and are part of the subset supported by Microsoft SQL Server.

Summary

Why is all of this important?  It’s dry, I know. However, the foundations of XPath allow us to do some pretty powerful querying of XML before handing the results “across the bridge” to SQL Server.  Now that we’ve got the basics identified, we can continue our previous discussion of the XML methods in SQL Server.

Share