Something new for 2011: XML and XSD

As part of my New Year’s resolution for 2011, I vowed to do a deep-dive on something technical every month; for January, I’m focusing on XML.  I’ve been using XML and XQuery in SQL Server for a while now (even presenting on it), but I still don’t consider myself an expert in the area.  For example, I use a lot of untyped XML to transfer data between databases; I’ve never really tackled XSD (XML Schema Definition Language), and now’s the time.  I’m reading The Art of XSD by Jacob Sebastian to help get me started.

What’s XSD?  In a nutshell, it’s an XML document which validates the structure of another XML document.  From the perspective of a database developer, an XSD document describes how data should look in a dataset; if the data doesn’t match the description (i.e, if a table is missing a column), that dataset is invalid.  The XSD document can be very precise, or it can offer options for the dataset, but in either case, the point of an XSD is to document the expectations about the dataset.  XML without XSD is untyped; XML with an XSD is typed (although XSD’s do more than just provide information about the data types contained within the XML).

Let’s take a look at an untyped XML statement:

DECLARE @NoXSD XML
SET
@NoXSD = '<Test1>Hello World!</Test1>'
SELECT @NoXSD

 

Simple and straightforward; I created an XML variable, and populated it with an XML fragment.  I then pulled the data out of that fragment.  In this example, we have an element named Test1; what happens if we have a typo when we populate the variable?

SET @NoXSD = '<Test2>Hello World!</Test2>'
SELECT @NoXSD

 

Nothing happens.  It’s a well-formed XML fragment (no root tag, but it does have starting and ending tags); the XML engine in SQL Server doesn’t know that our fragment is supposed to have an element named Test1, so it accepts the fragment as valid.  This is where an XSD comes in:

IF EXISTS( SELECT * FROM sys.xml_schema_collections WHERE name = 'TestSchema' )
DROP XML SCHEMA  COLLECTION TestSchema
GO

CREATE XML SCHEMA COLLECTION TestSchema AS
'<xsd:schema xmlns:xsd="http://www.w3.org/2001/XMLSchema">
<xsd:element name="Test1" />
</xsd:schema>'
GO

DECLARE @XSD XML ( TestSchema --use the schema to validate (type) the xml
SET @XSD = '<Test1>Hello, World!</Test1>'

SELECT @XSD

 

Since the XML fragment matches the XSD,  the assignment of data works; what happens when we assign a fragment that doesn’t match?

SET @XSD = '<Test2>Hello, World!</Test2>'

We get a big fat error message:

XML Validation: Declaration not found for element ‘Test2’. Location: /*:Test2[1]

Straightforward, right?  But now what?  Well, let’s type the data in our schema:

IF EXISTS( SELECT * FROM sys.xml_schema_collections WHERE name = 'TestSchema' )
DROP XML SCHEMA  COLLECTION TestSchema
GO

CREATE XML SCHEMA COLLECTION TestSchema AS
'<xsd:schema xmlns:xsd="http://www.w3.org/2001/XMLSchema">
<xsd:element name="Test1" type="xsd:string"/>
</xsd:schema>'
GO

DECLARE @XSD XML ( TestSchema )
SET @XSD = '<Test1>Hello, World!</Test1>'

 

So; what does this mean?  It means that we can now use the XQuery methods built into SQL Server to cast the data from the XML datatype to a SQL Server data type.

SELECT @XSD .value ( '(//Test1)[1]' , 'varchar(50)' )

 

More to come, but that’s a good stopping place for now; we’ve built a simple XSD, and validated a simple datatype.  I plan to spend some time learning about optional data elements next.

Share