Stuart Ainsworth

SQL Server XQuery: .modify() method

Continuing with XQuery, today’s post addresses the .modify() method against the xml data type; this method is used to update XML documents by applying one of three XML DML statements (insert, delete, or replace) via the use of a SQL SET statement.  Again, the idea is hand off the work between the XML engine and the SQL engine.  For example, given the following example:

DECLARE @X XML
SET @x = '<root><id>1</id></root>'

SET  @x.modify('insert <splat>hello world </splat>
                into /root[1]')

SELECT @x

the output of this SQL is the following XML:

<root>
  <id>1</id>
  <splat>hello world </splat>
</root>

So what happened?  We took a very simple XML document, and passed XML DML into the .modify() method which stated that we should:

  1. insert a new node “<splat>hello world </splat>” into the XML document
  2. into the first instance of the root node (/root[1]).

That’s it; welcome to the world of XML DML (Data Manipulation Language).  The syntax pattern for insert is pretty simple; action (insert) followed by an XML expression (<splat>hello world </splat> followed by the location of a second XML expression (into /root[1])With some minor variation in pattern, these elements hold true for the other two XML DDL statements (delete; replace value of):

action XML expression (1) XML expression (2) Sample
insert node {as first | as last}

into | after | before

XQuery

SET @x.modify(‘insert <splat>hello world!</splat>

                before (/root/id)[1]’)

delete node **not used** SET @x.modify(‘delete /root/splat[1]’)
replace value of node/value with value SET @x.modify(‘replace value of (/root/id/text())[1] with "2"’)

 

Of course, the variation in the pattern is what makes this set of commands a challenge to keep straight, so I’ll try to briefly explain the patterns in order of increasing complexity:

  • the delete pattern requires that a node singleton be identified for deletion.  The XML expression must evaluate to a node in the XML document that is being updated.
  • the insert pattern identifies a node to be added either into, before, or after a specific location path.  If an instance of that node already exists, you should specify as first or as last in order to place the new node in the correct sequence.
  • the replace value of requires two XML expressions; the first must evaluate to an existing value (as either an attribute or a text() value of an element).  The second must evaluate to a specific value; a later blog post will show how to use XQuery to dynamically set that value.

One more thing; these examples all use a single XML variable, and issue the SET command against it.  If you’re dealing with a table of XML documents, you can use the SQL UPDATE command to SET the XML column using this XQuery method and XML DML like so:

DECLARE @T TABLE (x XML)
INSERT INTO @T
VALUES ('<id>1</id>'), ('<id>2</id>')

UPDATE @T
SET x.modify('insert <new>sample</new>
                into /id[1]')

SELECT *
FROM @T

will return a result set like so:

image

Summary

To sum up:

  1. the .modify() method may be used either against an XML variable or an XML column in a table by using the SET statement (alone or in an UPDATE command, respectively), and
  2. there are three basic patterns for XML DML that are passed to the .modify() method, and
  3. the syntax can be confusing but allows for a great deal of fine-grained manipulation of the XML document when needed.

Cleaning up orphaned users in SQL Server

Short blog post here; I’m working with my team to clean up some older SQL Servers, and we’re removing several logins that have been dormant or disabled for some time.  Removing the logins is easy, but it leaves orphaned users in several of our databases.  I created the script below to cycle through each of the databases on the server and generate a series of DROP USER commands for each database.

DECLARE @t TABLE
    (
      db VARCHAR(100)
    , u VARCHAR(1000)
    , s INT
    )

DECLARE @sql NVARCHAR(4000)
SET @sql = 'USE [?];

SELECT  DB_NAME()
      , dp.name
      , s = CASE WHEN s.Schema_id IS NOT NULL THEN 1
                 ELSE 0
            END
FROM    sys.database_principals AS dp
        LEFT JOIN sys.server_principals AS sp ON dp.sid = sp.sid
        LEFT JOIN sys.schemas s ON s.principal_id = dp.principal_id
WHERE   dp.type IN ( ''U'', ''S'' )
        AND sp.sid IS NULL
        AND dp.authentication_type IN ( 1,  3 )
        AND dp.name <> ''dbo'''

INSERT  INTO @t
        EXEC sp_msforeachdb @sql

SELECT  'USE [' + db + ']; '
        + CASE WHEN s = 1
               THEN 'ALTER AUTHORIZATION ON SCHEMA::[' + u + '] TO dbo; '
               ELSE ''
          END + ' DROP USER [' + u + ']'
FROM    @t

If the script discovers an orphaned user, the output will look something like:

USE [AdventureWorks2012]; DROP USER [olduser]

If that user owns a schema in the database, an ALTER AUTHORIZATION step is added to first transfer the schema to dbo, and then drop the user:

USE [AdventureWorks2012]; ALTER AUTHORIZATION ON SCHEMA::[olduser] TO dbo; DROP USER [olduser]

I’m sure there are other ways to accomplish the same thing, but it works well for this task.

SQL Server XQuery: .nodes() method

Revisiting my XQuery discussion that I abandoned so long ago; with PASS Summit 2013 quickly approaching, I need to get back on the stick, and make some headway in this series. 

The .nodes() method is different than the other xml methods in SQL Server in that it:

  1. returns tabular data, and so must be used in the FROM clause as a table-valued function, and
  2. requires that  the .query() ,.value(), or .exist() method be used in the SELECT clause of a SQL statement.

I don’t want to spend a lot of time on explaining a table-valued function, so if you’re unfamiliar with them, I did a short writeup on them for my Stuff in the From Clause series.

The basic concept behind this method is that it splits an XML document into smaller documents according to the node specified in the method; if you have multiple instances of a node, an XML document is returned for each node.  That XML document must then be queried using one of the other XML methods.  For example, executing the following:

DECLARE @x XML
SET @x = 
'<root>
  <plant>
    <fruit>tomato</fruit>
    <fruit>banana</fruit>
    <vegetable>tomato</vegetable>
  </plant>
</root>' 

SELECT t.c.query('.') 
FROM @x.nodes('//fruit') AS T(c)

will return a data set that looks like this:

image

How does this work?  The xQuery statement (“//fruit”) provided to the .nodes() method found two nodes; the method returned to the SELECT statement two fragments containing the nodes from the original XML document (one for each instance of fruit).  The query method then looks at the root of those two new XML fragments, and returned them as an xml datatype, one on each row.  A lot of work got done for a very simple concept; taking a look at the execution plan, you can see how much work got done:

image

Notice the multiple calls to XML readers; using our bridge analogy, the SQL query first asked the XML engine to find the nodes that matched the xQuery.  The XML engine replied saying “I have 2 instances”; the SQL query then asked the XML engine to split them into rows, and the XML engine did so, returning those rows to the SQL engine.  The SQL engine then asked the XML engine to query those two documents, and the XML engine did so and returned the results.  A lot of back and forth between the engines.

The syntax changes slightly when you’re dealing with a table instead of a scalar value; in this case you must use the CROSS APPLY functionality in order to parse the information.  For example:

DECLARE @T TABLE ( ID INT, x XML )


INSERT  INTO @T
        ( ID, x )
VALUES    ( 1, '<root><plant><fruit>tomato</fruit><fruit>banana</fruit><vegetable>tomato</vegetable></plant></root>' )

SELECT ID, c.value('.', 'varchar(max)')
FROM @T
    CROSS APPLY x.nodes('//fruit') T(c)  

will return a two-column dataset:

image

and the execution plan looks similar in regards to the handoff between the engines; the only addition is the scan of our table variable:

image

Summary:

  1. The .nodes() method is an easy way to shred XML documents into rows based on an xQuery, and
  2. it’s expensive, since it requires the use of another XML method to return data.

#passvotes Why I voted for Allen Kinsel–three times….

I’m a little late with this endorsement, in part because I debated whether or not that I should admit that I’m a serial voter.  In case you haven’t heard, PASS is holding it’s elections for the Board of Directors again; ballots went out on Wednesday, September 25 to “all members in good standing as of June 2013”.  Apparently, I’m a member thrice over, since I received three ballots at three different email addresses.

This has happened in every PASS election that I can remember, and I’ve always taken the moral high ground before, casting one vote and ignoring the other two; however, this time is different.  It’s different because Allen Kinsel’s on the ballot.

Lots of people have lots of good things to say about Allen, so I won’t rehash those; however, the thing that swayed me to vote for him (multiple times) was the following bullet point from his platform page:

    • Making an additional IT investment to bring PASS’s membership roster up-to-date. Updating the roster will allow PASS to have better knowledge of our worldwide membership and to provide members with better benefits.

Allen wants to update the database, and clean it up.  As a database professional, data integrity appeals to me; it’s a little sad that my professional organization has dirty data, and is relying on that dirty data to make key business decisions (like generating ballots).   So this time, I decided to leverage all of my electoral power and cast all of my votes for Allen because I believe he’s going to try and do something to prevent this from happening in the future.

I know; it seems wrong for me to cast three votes, but I figure I’m not the only one with multiple ballots.  Maybe some other members of the organization don’t have the same moral hesitation about “pulling the lever” at every opportunity.  Perhaps some devious prankster has registered 500 email addresses, and is slowly stealing the election away.  All I know is that this should change, and Allen’s the only candidate which has mentioned this as a priority.

Vote.

SQL In The City Atlanta, Oct 11, 2013

SITC13_Banner_300x250_SpeakerJust a quick note to say that I’m super excited to be presenting at the Red Gate SQL In the City event in Atlanta this year; I’ll be covering some of their development tools (many of which I use daily).  As far as I know, there are still some seats left, but I wouldn’t count on them being available too much longer.

It’s free, and you’ll get a chance to hear lots of great topics presented by Grant Fritchey, Steve Jones, and Tim Radney, as well as me. Click on the image for more information, and I hope to see you there.

SQL Server XQuery: .value() method

Back to XQuery; the third method I wanted to discuss in this series is the .value() method. Like the .query() method, .value() bridges the SQL Server database engines and XQuery engines by extracting information from an XML document.  The difference is that while .query() returns XML, .value() translates the specific information requested into a scalar SQL value.  For example:

DECLARE @T TABLE ( ID INT, x XML )

INSERT  INTO @T
        ( ID, x )
VALUES  ( 1, '<root><plant><fruit>tomato</fruit></plant></root>' )
,       ( 2, '<root><plant><fruit>banana</fruit></plant></root>' )
,       ( 3, '<root><plant><vegetable>tomato</vegetable></plant></root>' )

SELECT ID, x.value('(.)[1]', 'varchar(max)')
FROM @T t

Note that the .value() method has two parameters, an XQuery statement and a string representing the expected SQL Server type for the scalar value returned.  The results are as follows:

image

So, what is the XQuery statement requesting?  The axis is referring to the self-node of the XML fragment (and remember that in SQL Server’s implementation of XQuery, the query begins at the top of the document), there is no node test (so any node will satisfy the test), and the predicate is referring to the first position [1] of the path.  The .value() function returns the value found at that location, and converts it to a sql value of type varchar(max).  So what happens if there’s more than one value at a given XPath?

INSERT  INTO @T
        ( ID, x )
VALUES    ( 4, '<root><plant><fruit>tomato</fruit><fruit>banana</fruit><vegetable>tomato</vegetable></plant></root>' )


SELECT ID, x.value('(.)[1]', 'varchar(max)')
FROM @T t

The above script inserts a slightly more complex XML document than used before; there are now multiple nodes under the <plant> node (in addition to the three rows we already have); the output looks like:

image 

Note how row 4 has a single string that represents all of the values of the XML document.  Without a specific XQuery, the value() function returns the values that satisfy the request; in this case, all of them.  In the case where the XQuery cannot be satisfied, .value() will return a NULL value to the SQL Server engine, as illustrated by the following code examples:

SELECT ID, x.value('(//plant/fruit)[1]', 'varchar(max)')
FROM @T t 

SELECT ID, x.value('(//plant/fruit)[2]', 'varchar(max)')
FROM @T t 

The output should look like:

image

In the first query, we are looking for an axis of any node in the document, a node test of fruit under the <plant> node, and a positional predicate of 1.  Rows 1,2, and 4 satisfy this query; the XML document in row 3 doesn’t have a <fruit node>, and thus returns a NULL.  The second query is similar in that several of the XML documents in our table don’t satisfy the query; only the document in row 4 has node in the second [2] position of the <fruit> node.

Summary

Key takeaways:

  1. The .value() function allows you to specify an XQuery statement against an XML datatype looking for a specific value, and
  2. that returned value is cast to a SQL datatype.

Next up: .nodes()

my amazing life

I don’t often talk about personal stuff on this blog (it’s CODEgumbo; not LIFEgumbo), but a lot’s happened in the last few weeks, so I thought I’d share a bit.  Besides, I was on a roll for blogging for a bit there, and then suddenly stopped; I figured I should at least attempt to explain what happened Smile.  So let me explain in chronological order (with a few flashbacks) what’s been going on with me lately.

I’m a dad, again….

Evan Stuart Ainsworth was born on June 25, 2013.  It’s been 14 years since I’ve had a baby in the house, so it’s a complete reboot.  I’ve been totally unprepared for the joys and trials of having a baby; I kept thinking it was going to be easy, but I guess I forgot how worn out you feel all the time, and I’m just the dad.  I don’t understand the sheer strength my wife has; the kid eats all the time.  And when he’s not eating, he’s either happy or mad; my nickname for him is the Hulk (“don’t make me angry”).

Before I go too much further with the story of Evan’s arrival, I should pause for a second and say how grateful I am for my two daughters (Isabel, 16, and Grace, 14); their world just got a little weirder, and they’ve stepped in and accepted it reasonably well.  Blended families are tough, and I appreciate them for trying to balance their relationship with me, their stepmother, and their new little brother, as well as building a relationship with their mom (and her new family).

Anyway, while the birth of any kid is big life change, Evan is special.  He was actually supposed to be an induced birth, so we were at the hospital when labor started.  Betsy (my wife) was all hooked up to monitors, and labor was progressing when things went awry; during some of the contractions, Betsy suffered a condition known as a placental abruption, and Evan went into shock in utero.  Within minutes, Betsy underwent an emergency C-section, and Evan was out within 15-30 minutes of the incident.

Betsy and I spent two weeks hanging out at the NICU, waiting on our little boy to get well; it wasn’t comfortable.  Lots of stress, nowhere to really go, and too little time in between feedings for either of us to leave.  Thankfully, Evan’s condition was relatively minor compared to many babies that spend time in the NICU, but it was still scary.  When I say my prayers, I give thanks for the doctors and midwives at the hospital; without them, I don’t think either Betsy or Evan would be home with me today.  After I give thanks, I then turn my thoughts toward the parents and families of children in the NICU.

I say all of this to explain that even while it’s tough coping to a new baby, I just laugh when I hear him cry.  Life is a blessing, even in the everyday struggles.

I survived another trip around the sun

As of July 5, 2013, I am now twice-21.  I’m a little slower, a little grayer, and happier than I’ve ever been.  42 is a lot more awesome than I thought it would be when I was 21.  Besides, kids are supposed to keep you young, and if I keep having them, I should live forever.

I celebrated a second year of marriage

While I don’t share a lot about my personal life on my blog, I REALLY don’t talk a lot about my wife.   As I pointed out above, this is my second marriage; my ex-wife and I separated in November 2009 after 14 years of marriage.  If you go back and search older posts on here, you’ll find a few references to my original marriage scattered about; I try not to dwell on them too much, because I don’t really want to revisit those feelings.  Let me just say that I spent a lot of my adult life trying to be happy and make someone love me, and I was only marginally successful.

After my divorce was final, I dated a few women and thought I would enjoy being single. And then I met Betsy.  My wife is an amazing woman, and I don’t think I tell her that often enough.  She loves me for being me, and its easy to love her.  I’m happy every day; love is easy with the right person.  At times it seems as though I just met her a week ago; at times, it seems as though life began when I met her.  Enough mushy stuff by me; sing along with the Proclaimers.  It’ll make you feel good.

And now back to our show

Hopefully, I’ll get back on track with technical blogging now that I’ve had some time to adjust to all the recent life events.  Thanks for reading.

SQL Server XQuery: .query() method

Continuing in my discussion of XQuery in SQL Server, let me discuss another XML data type method: .query().  The .query() method is a great illustration of the bridging concept I discussed earlier.  Simply put, this method allows you to run an XQuery against the XML data type, and return the results of that query as an XML document to SQL Server.  Using our example from the .exists() post:

DECLARE @T TABLE ( ID INT, x XML )

INSERT  INTO @T
        ( ID, x )
VALUES  ( 1, '<root><plant><fruit>tomato</fruit></plant></root>' )
,       ( 2, '<root><plant><fruit>banana</fruit></plant></root>' )
,       ( 3, '<root><plant><vegetable>tomato</vegetable></plant></root>' )

SELECT x.query('.') 
FROM @T 

Note that the parameter supplied to the method is an XQuery statement consisting of a single axis (.), which refers to the current node.  Since SQL Server’s implementation of XQuery starts at the top and works it’s way down the document, the current node is the entire document.  Running that query will give you the following output:

image

which isn’t very exciting.  However, if we begin to build out a XPath statement with a node test and a predicate, we can begin to narrow down the results quickly:

SELECT ID, x.query('.//fruit[text()="banana"]')
FROM @T t

And here’s the results:

image

While the results may be simple, there’s a lot going on behind the scenes, and it may have returned something different than what you expected.  Remember that when you execute an XQuery method against an XML column, there’s actually two queries happening:

For the XQuery portion, the XPath axis, node test, and predicate specified that we were looking for an XML node that was a fruit and had a value of a banana; this only existed in the XML in row 2, so the node was returned for row 2, but row 1 and row 3 were empty.  The SQL query then ran, and returned all rows in the table because there was no WHERE clause filtering the results.  If we only wanted to return the results where the XQuery returned a node, we could move the .query() method to the where clause.

One thing to always remember is that regardless of how specific your XQuery is, the type of the data being returned will always be XML.  For example, if you specify the text() function as part of the axis, and not the predicate:

SELECT ID, x.query('.//text()')
FROM @T t 

You’ll get what looks like a table with a column of character values representing the values of the various nodes:

image

However, the visual representation of the result set uses hyperlinks to indicate that it’s an XML column; if you click on the value (banana), the XML explorer will open up, showing you that it is an XML fragment, albeit one that has no tags.  If you need to return the data as the actual datatype, you’ll either need to use a SQL method (like CAST or CONVERT), or the .value() method (to be discussed soon).

Summary

Wrapping up with a few key points:

  1. The query() method allows you to specify an XQuery statement against an XML document, and
  2. It always returns an XML document.

MaTT: Entropy

imageDecided to take a detour today in my blogging thoughts (it is Friday, after all), and opted to revisit the issues of managing a technical team (MaTT); today’s topic: Entropy.

en·tro·py

/ˈentrəpē/

Noun

  • A thermodynamic quantity representing the unavailability of a system’s thermal energy for conversion into mechanical work, often…
  • Lack of order or predictability; gradual decline into disorder.

Simply put, processes decay over time.  A good manager will recognize this, and learn to intervene when those processes begin to break down.  In contrast, code lives on forever (unless someone steps in to change it). 

Let me give an example; about a year and a half ago, I implemented Kanban as a method of managing our workflow in the production environment.  My team quickly adapted to it, and it became very easy to see exactly what people were working on, and what was getting accomplished.  I thought “my job here is done”, and moved on to other things, occasionally touching base with the Kanban board to see what was going on.

This week, I realized that the process had degenerated; sure, my team was still using the board to report what they had done for the day.  However, they’ve gotten in the habit of solving the problem, and then creating the card in the Done pile.  Entropy.  I am no longer able to predict or measure what is being done or what needs to be done.

I’m out next week (hoping my son is born by the end of the week), but when I return, I need to intervene and work toward cleaning up the process.  While I firmly believe that I should trust my team and encourage them to make the right decision, part of the management process is learning when to intervene in order to shore up a process that is decaying.

SQL Server & XQuery: XPath Axes, Node Tests, and Predicates

In order to really get a grasp on what’s going on within an XQuery method in SQL Server (like .exist()), we need to examine the relationship between XQuery and XPath.  Semantically, there’s little difference in the usage of the two names, since XQuery as a language uses XPath to navigate through an XML fragment; however, strictly speaking, XPath is a subset of XQuery. 

XPath expressions contain one or more of three elements: an axis, a node test, and a predicate.  An axis refers to the locational reference of a path when navigating an xml structure.  For example consider the following XML:

<root>
  <plant>
    <fruit>tomato</fruit>
    <fruit>banana</fruit>
    <vegetable>tomato</vegetable>
  </plant>
  <animal>
    <canine>Doberman</canine>
  </animal>
</root>

and the XQuery below:

/root/plant/fruit

based on the axis (path) supplied above, the results will be the two <fruit> nodes in the XML document.  It’s important to remember that in the SQL Server implementation of XPath, axis expressions always start at the top of an XML document and work down, but there are some navigational shortcuts.  Below is a small table describing the axes supported:

Abbreviation Full Syntax Notes
/ / steps through nodes
// // steps through all descendant nodes
  child returns a specific child of a given node; e.g. /root/child::plant is the same as /root/plant
.. parent steps up a node; .. equivalent to parent::node()
@ attribute identified attribute-based XML
. self refers to the current node
* * refers to all nodes at the current level; e.g. /root/* will return both the <plant> and <animal> nodes from our previous example.

A node test is a part of the location path that when used with an axis restricts the XQuery to return only the results of interest. Taking our same XQuery from above:

/root/plant/fruit

The axis is the “/” symbol which tells XQuery to walk down a node at a time, and there are two node tests: plant, and fruit.  The first node test tells XQuery to only look at the plant nodes, and the second node test restricts it to just the fruit nodes.  The above XQuery could be shortened to:

//fruit

All nodes would be considered, and all fruit nodes would be returned, regardless of where they live in the XML hierarchy. 

The final expression in XPath is the predicate; a predicate limits the results of an XPath expression through evaluation (much like the WHERE clause of a SQL statement).  Predicates are specified using square brackets, and allow for the use of several functions to return either a positional reference or an evaluation of a value or node.  For example, the results of the following two XQuery expressions against our test XML document:

//fruit[2]

//fruit[last()]

would return the same positional results:

<fruit>banana</fruit>

likewise, the XQuery expression below:

//fruit[text()="banana"]

also returns the same results.  last() and text() are both XPath supported functions, and are part of the subset supported by Microsoft SQL Server.

Summary

Why is all of this important?  It’s dry, I know. However, the foundations of XPath allow us to do some pretty powerful querying of XML before handing the results “across the bridge” to SQL Server.  Now that we’ve got the basics identified, we can continue our previous discussion of the XML methods in SQL Server.