June 2013

SQL Server XQuery: .query() method

Continuing in my discussion of XQuery in SQL Server, let me discuss another XML data type method: .query().  The .query() method is a great illustration of the bridging concept I discussed earlier.  Simply put, this method allows you to run an XQuery against the XML data type, and return the results of that query as an XML document to SQL Server.  Using our example from the .exists() post:

DECLARE @T TABLE ( ID INT, x XML )

INSERT  INTO @T
        ( ID, x )
VALUES  ( 1, '<root><plant><fruit>tomato</fruit></plant></root>' )
,       ( 2, '<root><plant><fruit>banana</fruit></plant></root>' )
,       ( 3, '<root><plant><vegetable>tomato</vegetable></plant></root>' )

SELECT x.query('.') 
FROM @T 

Note that the parameter supplied to the method is an XQuery statement consisting of a single axis (.), which refers to the current node.  Since SQL Server’s implementation of XQuery starts at the top and works it’s way down the document, the current node is the entire document.  Running that query will give you the following output:

image

which isn’t very exciting.  However, if we begin to build out a XPath statement with a node test and a predicate, we can begin to narrow down the results quickly:

SELECT ID, x.query('.//fruit[text()="banana"]')
FROM @T t

And here’s the results:

image

While the results may be simple, there’s a lot going on behind the scenes, and it may have returned something different than what you expected.  Remember that when you execute an XQuery method against an XML column, there’s actually two queries happening:

For the XQuery portion, the XPath axis, node test, and predicate specified that we were looking for an XML node that was a fruit and had a value of a banana; this only existed in the XML in row 2, so the node was returned for row 2, but row 1 and row 3 were empty.  The SQL query then ran, and returned all rows in the table because there was no WHERE clause filtering the results.  If we only wanted to return the results where the XQuery returned a node, we could move the .query() method to the where clause.

One thing to always remember is that regardless of how specific your XQuery is, the type of the data being returned will always be XML.  For example, if you specify the text() function as part of the axis, and not the predicate:

SELECT ID, x.query('.//text()')
FROM @T t 

You’ll get what looks like a table with a column of character values representing the values of the various nodes:

image

However, the visual representation of the result set uses hyperlinks to indicate that it’s an XML column; if you click on the value (banana), the XML explorer will open up, showing you that it is an XML fragment, albeit one that has no tags.  If you need to return the data as the actual datatype, you’ll either need to use a SQL method (like CAST or CONVERT), or the .value() method (to be discussed soon).

Summary

Wrapping up with a few key points:

  1. The query() method allows you to specify an XQuery statement against an XML document, and
  2. It always returns an XML document.

MaTT: Entropy

imageDecided to take a detour today in my blogging thoughts (it is Friday, after all), and opted to revisit the issues of managing a technical team (MaTT); today’s topic: Entropy.

en·tro·py

/ˈentrəpē/

Noun

  • A thermodynamic quantity representing the unavailability of a system’s thermal energy for conversion into mechanical work, often…
  • Lack of order or predictability; gradual decline into disorder.

Simply put, processes decay over time.  A good manager will recognize this, and learn to intervene when those processes begin to break down.  In contrast, code lives on forever (unless someone steps in to change it). 

Let me give an example; about a year and a half ago, I implemented Kanban as a method of managing our workflow in the production environment.  My team quickly adapted to it, and it became very easy to see exactly what people were working on, and what was getting accomplished.  I thought “my job here is done”, and moved on to other things, occasionally touching base with the Kanban board to see what was going on.

This week, I realized that the process had degenerated; sure, my team was still using the board to report what they had done for the day.  However, they’ve gotten in the habit of solving the problem, and then creating the card in the Done pile.  Entropy.  I am no longer able to predict or measure what is being done or what needs to be done.

I’m out next week (hoping my son is born by the end of the week), but when I return, I need to intervene and work toward cleaning up the process.  While I firmly believe that I should trust my team and encourage them to make the right decision, part of the management process is learning when to intervene in order to shore up a process that is decaying.

SQL Server & XQuery: XPath Axes, Node Tests, and Predicates

In order to really get a grasp on what’s going on within an XQuery method in SQL Server (like .exist()), we need to examine the relationship between XQuery and XPath.  Semantically, there’s little difference in the usage of the two names, since XQuery as a language uses XPath to navigate through an XML fragment; however, strictly speaking, XPath is a subset of XQuery. 

XPath expressions contain one or more of three elements: an axis, a node test, and a predicate.  An axis refers to the locational reference of a path when navigating an xml structure.  For example consider the following XML:

<root>
  <plant>
    <fruit>tomato</fruit>
    <fruit>banana</fruit>
    <vegetable>tomato</vegetable>
  </plant>
  <animal>
    <canine>Doberman</canine>
  </animal>
</root>

and the XQuery below:

/root/plant/fruit

based on the axis (path) supplied above, the results will be the two <fruit> nodes in the XML document.  It’s important to remember that in the SQL Server implementation of XPath, axis expressions always start at the top of an XML document and work down, but there are some navigational shortcuts.  Below is a small table describing the axes supported:

Abbreviation Full Syntax Notes
/ / steps through nodes
// // steps through all descendant nodes
  child returns a specific child of a given node; e.g. /root/child::plant is the same as /root/plant
.. parent steps up a node; .. equivalent to parent::node()
@ attribute identified attribute-based XML
. self refers to the current node
* * refers to all nodes at the current level; e.g. /root/* will return both the <plant> and <animal> nodes from our previous example.

A node test is a part of the location path that when used with an axis restricts the XQuery to return only the results of interest. Taking our same XQuery from above:

/root/plant/fruit

The axis is the “/” symbol which tells XQuery to walk down a node at a time, and there are two node tests: plant, and fruit.  The first node test tells XQuery to only look at the plant nodes, and the second node test restricts it to just the fruit nodes.  The above XQuery could be shortened to:

//fruit

All nodes would be considered, and all fruit nodes would be returned, regardless of where they live in the XML hierarchy. 

The final expression in XPath is the predicate; a predicate limits the results of an XPath expression through evaluation (much like the WHERE clause of a SQL statement).  Predicates are specified using square brackets, and allow for the use of several functions to return either a positional reference or an evaluation of a value or node.  For example, the results of the following two XQuery expressions against our test XML document:

//fruit[2]

//fruit[last()]

would return the same positional results:

<fruit>banana</fruit>

likewise, the XQuery expression below:

//fruit[text()="banana"]

also returns the same results.  last() and text() are both XPath supported functions, and are part of the subset supported by Microsoft SQL Server.

Summary

Why is all of this important?  It’s dry, I know. However, the foundations of XPath allow us to do some pretty powerful querying of XML before handing the results “across the bridge” to SQL Server.  Now that we’ve got the basics identified, we can continue our previous discussion of the XML methods in SQL Server.

SQL Server XQuery: .exist() method

I’m trying to beef up my writing about XQuery in advance of the upcoming PASS Summit 2013; I got accepted to present, and I want to make sure that I start covering a lot of topics related to XML so that I’ve honed my focus a bit.  Unfortunately, blogging doesn’t come easy to me these days, and while I would generally start a session from the beginning and move forward, I think it would be best for me to start with something easy for me to write about, even though it means I’ll be doing a bit of backtracking later.

A little bit of background

As you may already know, SQL Server supports XQuery against an XML data type through the use of five basic methods:

  • .query()
  • .value()
  • .exist()
  • .modify()
  • .nodes()

Conceptually, each of these methods act a bridge between the the SQL processing engine and the xml parser baked into SQL Server.  This concept of a bridge allows me to think about where query work is being performed; is it being done as a SQL statement against the table containing the XML fragment of interest, or is it being performed by XQuery before bubbling up to the SQL level? 

image

.exist()

I want to focus on probably the easiest method against the xml datatype: .exist().  Simply put, this method checks to see whether or not the results of an XQuery expression against the supplied XML value are nonempty; it returns 1 if true, and 0 if false.  Let’s start with a simple sample:

DECLARE @T TABLE ( ID INT, x XML )

INSERT  INTO @T
        ( ID, x )
VALUES  ( 1, '<root><plant><fruit>tomato</fruit></plant></root>' )
,       ( 2, '<root><plant><fruit>banana</fruit></plant></root>' )
,       ( 3, '<root><plant><vegetable>tomato</vegetable></plant></root>' )


SELECT ID, X, x.exist('//root') AS "Does It Exist?" 
FROM @T 

Running this query will return the following results:

image

Every row has an XML value in the X column which contains a node named <root>.  We can change the output to identify rows which have vegetable nodes like so:

SELECT ID, X, x.exist('//root/plant/vegetable') AS "Does It Exist?" 
FROM @T 

The results then change to the following:

image

.exist() can also be used in the WHERE clause of a SQL statement; if we want to identify only the rows which have an XML fragment containing a tomato, we can alter the SQL statement like so:

SELECT ID, X  
FROM @T 
WHERE  x.exist('//.[text()="tomato"]') = 1

image

Note that the .exist() method interpreted an XQuery statement to look for nodes which had a text() value of “tomato”; two rows matched that requirement, so .exist() returned a 1 for those two rows to the calling SQL statement, and the WHERE clause identified those two rows.

Wrap-up

So, I’ve covered a few basic concepts while talking about the .exist() method:

  1. The xml methods act as a bridge between XQuery and SQL for the xml data types.
  2. The .exist() method supports XQuery expressions including both simple paths (//<node name>) and XQuery functions (e.g, text()).  This will become important later when I get around to discussing FLWOR.