As part of my New Year’s resolution for 2011, I vowed to do a deep-dive on something technical every month; for January, I’m focusing on XML. I’ve been using XML and XQuery in SQL Server for a while now (even presenting on it), but I still don’t consider myself an expert in the area. For example, I use a lot of untyped XML to transfer data between databases; I’ve never really tackled XSD (XML Schema Definition Language), and now’s the time. I’m reading The Art of XSD by Jacob Sebastian to help get me started.
What’s XSD? In a nutshell, it’s an XML document which validates the structure of another XML document. From the perspective of a database developer, an XSD document describes how data should look in a dataset; if the data doesn’t match the description (i.e, if a table is missing a column), that dataset is invalid. The XSD document can be very precise, or it can offer options for the dataset, but in either case, the point of an XSD is to document the expectations about the dataset. XML without XSD is untyped; XML with an XSD is typed (although XSD’s do more than just provide information about the data types contained within the XML).
Let’s take a look at an untyped XML statement:
DECLARE @NoXSD XML
SET @NoXSD = '<Test1>Hello World!</Test1>'
SELECT @NoXSD
Simple and straightforward; I created an XML variable, and populated it with an XML fragment. I then pulled the data out of that fragment. In this example, we have an element named Test1; what happens if we have a typo when we populate the variable?
SET @NoXSD = '<Test2>Hello World!</Test2>'
SELECT @NoXSD
Nothing happens. It’s a well-formed XML fragment (no root tag, but it does have starting and ending tags); the XML engine in SQL Server doesn’t know that our fragment is supposed to have an element named Test1, so it accepts the fragment as valid. This is where an XSD comes in:
IF EXISTS( SELECT * FROM sys.xml_schema_collections WHERE name = 'TestSchema' )
DROP XML SCHEMA COLLECTION TestSchema
GO
CREATE XML SCHEMA COLLECTION TestSchema AS
'<xsd:schema xmlns:xsd="http://www.w3.org/2001/XMLSchema">
<xsd:element name="Test1" />
</xsd:schema>'
GO
DECLARE @XSD XML ( TestSchema ) --use the schema to validate (type) the xml
SET @XSD = '<Test1>Hello, World!</Test1>'
SELECT @XSD
Since the XML fragment matches the XSD, the assignment of data works; what happens when we assign a fragment that doesn’t match?
SET @XSD = '<Test2>Hello, World!</Test2>'
We get a big fat error message:
XML Validation: Declaration not found for element ‘Test2’. Location: /*:Test2[1]
Straightforward, right? But now what? Well, let’s type the data in our schema:
IF EXISTS( SELECT * FROM sys.xml_schema_collections WHERE name = 'TestSchema' )
DROP XML SCHEMA COLLECTION TestSchema
GO
CREATE XML SCHEMA COLLECTION TestSchema AS
'<xsd:schema xmlns:xsd="http://www.w3.org/2001/XMLSchema">
<xsd:element name="Test1" type="xsd:string"/>
</xsd:schema>'
GO
DECLARE @XSD XML ( TestSchema )
SET @XSD = '<Test1>Hello, World!</Test1>'
So; what does this mean? It means that we can now use the XQuery methods built into SQL Server to cast the data from the XML datatype to a SQL Server data type.
SELECT @XSD .value ( '(//Test1)[1]' , 'varchar(50)' )
More to come, but that’s a good stopping place for now; we’ve built a simple XSD, and validated a simple datatype. I plan to spend some time learning about optional data elements next.
Looking forward to reading more of this series. XML: the final frontier.
Sweet, can’t wait to read rest of this. I recently wrestled with the XML beast and now I’m curious as to what I can do with it with SQL Server.
Stuart,
I attended your session “Dirt, Spit, and Happy FLWOR: Hands on with XQuery.” at SQL Saturday in Columbia, SC 3/19/2011. It was great. Are you going to make your slides available. I really appreciated your session and the talented delivery of FLWOR.
Letron, it was a pleasure to meet you; I really enjoyed teaching that session (a lot of input from the group). The slides are available on the SQLSaturday site, but I will make them available here as well in a couple of days 🙂