XML

SQL Server & XQuery: XPath Axes, Node Tests, and Predicates

In order to really get a grasp on what’s going on within an XQuery method in SQL Server (like .exist()), we need to examine the relationship between XQuery and XPath.  Semantically, there’s little difference in the usage of the two names, since XQuery as a language uses XPath to navigate through an XML fragment; however, strictly speaking, XPath is a subset of XQuery. 

XPath expressions contain one or more of three elements: an axis, a node test, and a predicate.  An axis refers to the locational reference of a path when navigating an xml structure.  For example consider the following XML:

<root>
  <plant>
    <fruit>tomato</fruit>
    <fruit>banana</fruit>
    <vegetable>tomato</vegetable>
  </plant>
  <animal>
    <canine>Doberman</canine>
  </animal>
</root>

and the XQuery below:

/root/plant/fruit

based on the axis (path) supplied above, the results will be the two <fruit> nodes in the XML document.  It’s important to remember that in the SQL Server implementation of XPath, axis expressions always start at the top of an XML document and work down, but there are some navigational shortcuts.  Below is a small table describing the axes supported:

Abbreviation Full Syntax Notes
/ / steps through nodes
// // steps through all descendant nodes
  child returns a specific child of a given node; e.g. /root/child::plant is the same as /root/plant
.. parent steps up a node; .. equivalent to parent::node()
@ attribute identified attribute-based XML
. self refers to the current node
* * refers to all nodes at the current level; e.g. /root/* will return both the <plant> and <animal> nodes from our previous example.

A node test is a part of the location path that when used with an axis restricts the XQuery to return only the results of interest. Taking our same XQuery from above:

/root/plant/fruit

The axis is the “/” symbol which tells XQuery to walk down a node at a time, and there are two node tests: plant, and fruit.  The first node test tells XQuery to only look at the plant nodes, and the second node test restricts it to just the fruit nodes.  The above XQuery could be shortened to:

//fruit

All nodes would be considered, and all fruit nodes would be returned, regardless of where they live in the XML hierarchy. 

The final expression in XPath is the predicate; a predicate limits the results of an XPath expression through evaluation (much like the WHERE clause of a SQL statement).  Predicates are specified using square brackets, and allow for the use of several functions to return either a positional reference or an evaluation of a value or node.  For example, the results of the following two XQuery expressions against our test XML document:

//fruit[2]

//fruit[last()]

would return the same positional results:

<fruit>banana</fruit>

likewise, the XQuery expression below:

//fruit[text()="banana"]

also returns the same results.  last() and text() are both XPath supported functions, and are part of the subset supported by Microsoft SQL Server.

Summary

Why is all of this important?  It’s dry, I know. However, the foundations of XPath allow us to do some pretty powerful querying of XML before handing the results “across the bridge” to SQL Server.  Now that we’ve got the basics identified, we can continue our previous discussion of the XML methods in SQL Server.

SQL Server XQuery: .exist() method

I’m trying to beef up my writing about XQuery in advance of the upcoming PASS Summit 2013; I got accepted to present, and I want to make sure that I start covering a lot of topics related to XML so that I’ve honed my focus a bit.  Unfortunately, blogging doesn’t come easy to me these days, and while I would generally start a session from the beginning and move forward, I think it would be best for me to start with something easy for me to write about, even though it means I’ll be doing a bit of backtracking later.

A little bit of background

As you may already know, SQL Server supports XQuery against an XML data type through the use of five basic methods:

  • .query()
  • .value()
  • .exist()
  • .modify()
  • .nodes()

Conceptually, each of these methods act a bridge between the the SQL processing engine and the xml parser baked into SQL Server.  This concept of a bridge allows me to think about where query work is being performed; is it being done as a SQL statement against the table containing the XML fragment of interest, or is it being performed by XQuery before bubbling up to the SQL level? 

image

.exist()

I want to focus on probably the easiest method against the xml datatype: .exist().  Simply put, this method checks to see whether or not the results of an XQuery expression against the supplied XML value are nonempty; it returns 1 if true, and 0 if false.  Let’s start with a simple sample:

DECLARE @T TABLE ( ID INT, x XML )

INSERT  INTO @T
        ( ID, x )
VALUES  ( 1, '<root><plant><fruit>tomato</fruit></plant></root>' )
,       ( 2, '<root><plant><fruit>banana</fruit></plant></root>' )
,       ( 3, '<root><plant><vegetable>tomato</vegetable></plant></root>' )


SELECT ID, X, x.exist('//root') AS "Does It Exist?" 
FROM @T 

Running this query will return the following results:

image

Every row has an XML value in the X column which contains a node named <root>.  We can change the output to identify rows which have vegetable nodes like so:

SELECT ID, X, x.exist('//root/plant/vegetable') AS "Does It Exist?" 
FROM @T 

The results then change to the following:

image

.exist() can also be used in the WHERE clause of a SQL statement; if we want to identify only the rows which have an XML fragment containing a tomato, we can alter the SQL statement like so:

SELECT ID, X  
FROM @T 
WHERE  x.exist('//.[text()="tomato"]') = 1

image

Note that the .exist() method interpreted an XQuery statement to look for nodes which had a text() value of “tomato”; two rows matched that requirement, so .exist() returned a 1 for those two rows to the calling SQL statement, and the WHERE clause identified those two rows.

Wrap-up

So, I’ve covered a few basic concepts while talking about the .exist() method:

  1. The xml methods act as a bridge between XQuery and SQL for the xml data types.
  2. The .exist() method supports XQuery expressions including both simple paths (//<node name>) and XQuery functions (e.g, text()).  This will become important later when I get around to discussing FLWOR.