SQL Server

#SQLPASS – Fluffy Bunnies

I didn’t really have a good name for this post, so I thought I’d just try to pick something as non-offensive as possible.  Everybody likes bunnies, right?  Anyway, the last series of posts that I’ve made regarding the Professional Association for SQL Server has raised a number of questions that I thought I’d try to answer; since I’m still mulling over my next post, I figured this was as good a time as any.

What’s your motivation in writing this series?

Believe it or not, I’m trying to help.  For several years, it seems like there’s been one controversy after the other within the Professional Association of SQL Server, and those controversies dissipate and re-emerge.  My goal is to document what I perceive to be the root causes for some of these issues so that we can work toward a solution.

 

Why are you bashing PASS?

I’m trying very hard to NOT “bash” the Professional Association for SQL Server; I’ve been a member for several years now, and I’ve seen controversies get personal.  I really don’t feel like I’m doing that; I still believe that the Board of Directors are a great bunch of people who are making decisions based on their circumstances at the time.  I’m trying to wrap a schema around those circumstances, so I can understand those decisions.

I’m trying to articulate my own perspective on what I think is wrong, so that I can be better prepared to have an honest discussion about those perceptions.  Relationships aren’t about ignoring issues; the first step in addressing an issue is to identify the issue.

You mentioned “transparency” as an issue with the BoD before; what do you mean by that?

That’s more complicated to answer than I thought it would be.  At first, I thought it was about more information; as an active community member, I felt like I needed to be more involved in the decision-making process.  However, I can’t really name a specific reform that would make the BoD more transparent.  I don’t want to read meeting minutes, and most of the form-letter emails I get from the Association go straight to the trash; I’m too busy for that kind of transparency.

On further consideration, what I’ve been calling transparency is more about leadership and up-front communication than it is about revealing information.   As an example, I help run one of the largest SQL Server chapters in the world; if I had realized that the Professional Association for SQL Server was planning a major re-branding exercise, I feel like I could have contributed some “notes from the field” on what that would mean, and how to best prepare our membership for it.  Instead of appearing defensive about a controversy, the Board would appear to be very proactive.

It’s that feeling of being left out of the conversation that bothers me, I think.  I realize that there are some details that can’t be shared, but I feel like the onus is on the BoD to find better ways to communicate with members (and to me, chapters are an underused resource).  The information flow is very unidirectional; Summit keynotes and email blasts are not an effective way to discuss an issue.  Maybe those conversations are happening with other people, but if so, few people have stepped forward to discuss them.

It’s also an issue of taking action when an issue is raised; for example, the recent password security controversy was documented by Brent Ozar.  He states that he and several other security-oriented members had several private conversations with board members, and yet no comprehensive action was taken until it became a public issue.  The perception is that the only way to get the BoD to act is to publicly shame them.  That’s not healthy in the long run.

What’s your vision for the Association?

That’s the subject of my next post, so I’m going to hold off on that one.  I do think that the unidirectional flow of communication is not just an issue with the organization, but a tone set by our relationship with Microsoft.  As I pointed out in my last post, if the Association knew more about the needs and desires of the membership, that knowledge would become a very valuable resource.  Instead of being a fan club for a product, we could become strong partners in the development of features for that product.

How do you feel about the name change?

Frankly, I feel a little sad about the change in direction, but I understand it.  I predict that SQL Server as an on-premises platform is going to become a niche product, regardless of the number of installations.  It just makes sense for Microsoft to broaden their data platform offerings.  I do think that we (the Association) need to do a better job of preparing membership for this new frontier, and it’s going to take effort to transform administration skills into analytic skills.  Without providing that guidance, it seems like we’re abandoning the people who helped build this organization.

Where is this series headed?  What’s the final destination?

To be honest, I don’t know.  When I first started writing these posts, I was sure that I could describe exactly what was wrong, and suggest a few fixes.  The more I write, the easier it is to articulate my observations; I’m surprised to find that my observations are not what I thought they would be.  Thanks for joining me for the ride; hopefully, it leads to some interesting conversations at Summit.

#SQLServer – Where does my index live?

Today, I got asked by one of my DBAs about a recently deployed database that seemed to have a lot of filegroups with only a few tables.  He wanted to verify that one of the tables was correctly partition-aligned, as well as learn where all of the indexes for these tables were stored.  After a quick search of the Internets, I was able to fashion the following script to help.  The script below will find every index on every user table in a database, and then determine whether it’s partitioned or not.  If it’s partitioned, the scheme name is returned; if not, the filegroup name.  The final column provides an XML list of filegroups (because schemes can span multiple filegroups) and file locations (because filegroups can span multiple files).

WITH C AS (
    SELECT  ps.data_space_id
          , f.name
          , d.physical_name
    FROM    sys.filegroups f
            JOIN sys.database_files d ON d.data_space_id = f.data_space_id
            JOIN sys.destination_data_spaces dds ON dds.data_space_id = f.data_space_id
            JOIN sys.partition_schemes ps ON ps.data_space_id = dds.partition_scheme_id
    UNION
    SELECT  f.data_space_id
          , f.name
          , d.physical_name
    FROM    sys.filegroups f
            JOIN sys.database_files d ON d.data_space_id = f.data_space_id
)
SELECT  [ObjectName] = OBJECT_NAME(i.[object_id])
      , [IndexID] = i.[index_id]
      , [IndexName] = i.[name]
      , [IndexType] = i.[type_desc]
      , [Partitioned] = CASE WHEN ps.data_space_id IS NULL THEN 'No'
                             ELSE 'Yes'
                        END
      , [StorageName] = ISNULL(ps.name, f.name)
      , [FileGroupPaths] = CAST(( SELECT  name AS "FileGroup"
                                        , physical_name AS "DatabaseFile"
                                  FROM    C
                                  WHERE   i.data_space_id = c.data_space_id
                                  FOR XML PATH('')
                                ) AS XML)
FROM    [sys].[indexes] i
        LEFT JOIN sys.partition_schemes ps ON ps.data_space_id = i.data_space_id
        LEFT JOIN sys.filegroups f ON f.data_space_id = i.data_space_id
WHERE   OBJECTPROPERTY(i.[object_id], 'IsUserTable') = 1
ORDER BY [ObjectName], [IndexName]
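The script above tells you where each index lives, but it doesn’t directly answer the partition-alignment question.  A rough follow-up sketch (my own addition, using the same catalog views) is to count partitions per index; if every index on a table reports the same scheme and the same partition count, that’s a good sign the table is aligned:

```sql
-- Sketch: compare the partition scheme and partition count of every
-- index on each user table; mismatches suggest a non-aligned index.
SELECT  [ObjectName] = OBJECT_NAME(i.[object_id])
      , [IndexName]  = i.[name]
      , [SchemeName] = ps.name          -- NULL for non-partitioned indexes
      , [Partitions] = COUNT(p.partition_number)
FROM    sys.indexes i
        JOIN sys.partitions p ON p.[object_id] = i.[object_id]
                             AND p.index_id = i.index_id
        LEFT JOIN sys.partition_schemes ps ON ps.data_space_id = i.data_space_id
WHERE   OBJECTPROPERTY(i.[object_id], 'IsUserTable') = 1
GROUP BY i.[object_id], i.[name], ps.name
ORDER BY [ObjectName], [IndexName]
```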

Determining the Primary Key of a table without knowing its name

So, I’ve been trying to find more technical fodder for blog posts lately in order to get in the habit of blogging on a regular basis, so I thought I would explore a few of my higher-ranked answers on StackOverflow and provide a little more detail than that site allows.  This is my highest-rated answer (sad, I know, compared to some posters on the site):

Determine a table’s primary key using TSQL

I’d like to determine the primary key of a table using TSQL (stored procedure or system table is fine). Is there such a mechanism in SQL Server (2005 or 2008)?

My answer:

This should get you started:

SELECT *
    FROM INFORMATION_SCHEMA.TABLE_CONSTRAINTS tc
        JOIN INFORMATION_SCHEMA.CONSTRAINT_COLUMN_USAGE ccu ON tc.CONSTRAINT_NAME = ccu.Constraint_name
    WHERE tc.CONSTRAINT_TYPE = 'Primary Key'

 

Besides the obvious faux pas of using SELECT * in a query, you’ll note that I used the INFORMATION_SCHEMA views; I try to use these views wherever possible because of the portability factor.  In theory, I should be able to apply this exact same query to a MySQL or Oracle database, and get similar results from the system schema.

The added benefit of this query is that it allows you to discover other things about your PRIMARY KEYs, like looking for system named keys:

SELECT  tc.TABLE_CATALOG, tc.TABLE_SCHEMA, tc.TABLE_NAME, ccu.CONSTRAINT_NAME, ccu.COLUMN_NAME
FROM    INFORMATION_SCHEMA.TABLE_CONSTRAINTS tc
        JOIN INFORMATION_SCHEMA.CONSTRAINT_COLUMN_USAGE ccu ON tc.CONSTRAINT_NAME = ccu.Constraint_name
WHERE   tc.CONSTRAINT_TYPE = 'Primary Key'
    AND ccu.CONSTRAINT_NAME LIKE 'PK%/_/_%' ESCAPE '/'

or finding PRIMARY KEYs which have more than one column defined:

SELECT  tc.TABLE_CATALOG, tc.TABLE_SCHEMA, tc.TABLE_NAME, ccu.CONSTRAINT_NAME
FROM    INFORMATION_SCHEMA.TABLE_CONSTRAINTS tc
        JOIN INFORMATION_SCHEMA.CONSTRAINT_COLUMN_USAGE ccu ON tc.CONSTRAINT_NAME = ccu.Constraint_name
WHERE   tc.CONSTRAINT_TYPE = 'Primary Key'
GROUP BY tc.TABLE_CATALOG, tc.TABLE_SCHEMA, tc.TABLE_NAME, ccu.CONSTRAINT_NAME
HAVING COUNT(*) > 1
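While I’m at it, a related check (my addition here, not part of the original answer, but built on the same views) is finding tables that don’t have a PRIMARY KEY at all:

```sql
-- Find base tables with no PRIMARY KEY constraint defined.
SELECT  t.TABLE_CATALOG, t.TABLE_SCHEMA, t.TABLE_NAME
FROM    INFORMATION_SCHEMA.TABLES t
        LEFT JOIN INFORMATION_SCHEMA.TABLE_CONSTRAINTS tc
            ON  tc.TABLE_SCHEMA = t.TABLE_SCHEMA
            AND tc.TABLE_NAME = t.TABLE_NAME
            AND tc.CONSTRAINT_TYPE = 'PRIMARY KEY'
WHERE   t.TABLE_TYPE = 'BASE TABLE'
    AND tc.CONSTRAINT_NAME IS NULL
ORDER BY t.TABLE_SCHEMA, t.TABLE_NAME
```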

Error 574 Upgrading SQL Server 2012 to SP1

This blog post is way overdue (check the dates in the error log below), but I promised our sysadmin that I would write it, so here it is.  Hopefully, it’ll help some of you with this aggravating issue.  During an upgrade of our SQL cluster, we ran into the following error as we attempted to upgrade one of the instances:

2014-04-15 23:50:14.45 spid14s     Error: 574, Severity: 16, State: 0.

2014-04-15 23:50:14.45 spid14s     CONFIG statement cannot be used inside a user transaction.

2014-04-15 23:50:14.45 spid14s     Error: 912, Severity: 21, State: 2.

2014-04-15 23:50:14.45 spid14s     Script level upgrade for database ‘master’ failed because upgrade step ‘msdb110_upgrade.sql’ encountered error 574, state 0, severity 16. This is a serious error condition which might interfere with regular operation and the database will be taken offline. If the error happened during upgrade of the ‘master’ database, it will prevent the entire SQL Server instance from starting. Examine the previous errorlog entries for errors, take the appropriate corrective actions and re-start the database so that the script upgrade steps run to completion.

2014-04-15 23:50:14.45 spid14s     Error: 3417, Severity: 21, State: 3.

2014-04-15 23:50:14.45 spid14s     Cannot recover the master database. SQL Server is unable to run. Restore master from a full backup, repair it, or rebuild it. For more information about how to rebuild the master database, see SQL Server Books Online.

Google wasn’t helpful; there are apparently lots of potential fixes for this error, none of which helped.  The closest match we found was that we had some orphaned users in a few databases (not system databases), which we corrected; the upgrade still failed.  We eventually had to contact Microsoft support and work our way up to a second-level technician.  Before I reveal the fix, let me give a little more background on how we got orphaned users.

You see, shortly after we upgraded to SQL 2012 (about a year ago), we did what many companies do; we phased out a service offering.  That service offering that we phased out required several database components, including a SQL login associated with users in the database, and several maintenance jobs that were run by SQL Agent.  When we phased out the service, those jobs were disabled, but not deleted.  Our security policy tracks the last time a login was used; if a login isn’t used within 60 days, it’s disabled.  30 days after that (if no one notices), the login is deleted.  Unfortunately, our implementation of this process missed two key steps:

  1. The associated user in each database was not dropped with the login (leaving an orphan), and
  2. any job that was owned by that login was also not dropped or transferred to a sysadmin.

The latter was the key to our particular situation; the upgrade detected an orphaned job even though that job was disabled, and blocked the upgrade from going forward.  Using trace flag -T902, we were able to start the server instance and delete the disabled job.  We then restarted the server without the trace flag, and the upgrade finished successfully.
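If you want to catch this condition before your next upgrade, a quick sketch (assuming the standard msdb schema) is to look for jobs whose owner SID no longer maps to a login; SUSER_SNAME() returns NULL for an orphaned owner:

```sql
-- Find SQL Agent jobs whose owner login has been dropped.
SELECT  j.name
      , j.enabled
      , owner_login = SUSER_SNAME(j.owner_sid)
FROM    msdb.dbo.sysjobs j
WHERE   SUSER_SNAME(j.owner_sid) IS NULL
```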

 

Resources:

Find and fix all orphaned users for all databases.

Brent Ozar Unlimited’s sp_blitz will find jobs that are owned by users other than sa.

Speaking at the #SQLPASS #Summit14

I know I’m a day late with this announcement, but I haven’t blogged in months, so what’s the rush?  I am very excited, however, about presenting a full session and a lightning talk at the PASS Summit in Seattle in November.

THE ELEPHANT IN THE ROOM: A DBA’S GUIDE TO HADOOP AND BIG DATA

Speaker(s): Stuart Ainsworth

Duration: 75 minutes

Track: BI Platform Architecture, Development & Administration

You’re a SQL Server DBA working at Contoso, and your boss calls you out of your cubicle one day and tells you that the development team is interested in implementing a Hadoop-based solution for your customers. She wants you to help plan for the implementation and ongoing administration. Where do you begin?

This session will cover the foundations of Hadoop and how it fundamentally differs from the relational approach. The goal is to provide a map between your current skill set and “big data.” Although we’ll talk about basic techniques for querying data, the focus is on a basic understanding of how Hadoop works, how to plan for growth, and what you need to do to start maintaining a Hadoop cluster.

You won’t walk out of this session a Hadoop administrator, but you’ll understand what questions to ask and where to start looking for answers.

 

TEN-MINUTE KANBAN

Speaker(s): Stuart Ainsworth

Duration: 10 minutes

Track: Professional Development

The goal of this Lightning Talk is to cover the basic principles of the Lean IT movement, and demonstrate how Kanban can be used by Administrators as well as developers. Speaker Stuart Ainsworth will cover the basic concepts of Kanban, where to begin, and how it works.

Kanban boards can be used to highlight bottlenecks in resource and task management, as well as identify priorities and communicate expectations. All this can be done by using some basic tools that can be purchased at an office supply store (or done for free online).

SQL Server XQuery: MORE deleting nodes using .modify()

So after my last post, my developer friend came back to me and noted that I hadn’t really demonstrated the situation we had discussed; our work was a little more challenging than the sample script I had provided.  In contrast to what I previously posted, the challenge was to delete nodes where a sub-node contained an attribute of interest.  Let me repost the same sample code as an illustration:

DECLARE @X XML = 
'<root>
  <class teacher="Smith" grade="5">
    <student name="Ainsworth" />
    <student name="Miller" />
  </class>
  <class teacher="Jones" grade="5">
    <student name="Davis" />
    <student name="Mark" />
  </class>
  <class teacher="Smith" grade="4">
    <student name="Baker" />
    <student name="Smith" />
  </class>
</root>'

SELECT  @x

If I wanted to delete the class nodes which contain a student node with a name of “Miller”, there are a couple of ways to do it; the first method involves two passes:

SET @X.modify('delete /root/class//.[@name = "Miller"]/../*')
SET @X.modify('delete /root/class[not (node())]')
SELECT @x

In this case, we walk the axis and find a node test of class (/root/class); we then apply a predicate to look for a name attribute with a value of Miller ([@name="Miller"]) in any node below the class node (//.).  We then walk back up a node (/..) and delete all subnodes (/*).

That leaves us with an XML document that has three nodes for class, one of which is empty (the first one).  We then have to do a second pass through the XML document to delete any class node that does not have nodes below it (/root/class[not (node())]).

The second method accomplishes the same thing in a single pass:

SET @x.modify('delete /root/class[student/@name="Miller"]')
SELECT @x

In this case, we walk the axis to class (/root/class), and then apply a predicate that looks for a student node with a name attribute of Miller ([student/@name="Miller"]); the difference in this syntax is that the context of the delete statement stays at the specific class node, as opposed to stepping down a node and then back up.
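Either way, you can sanity-check the result with the .exist() method; after the delete has run, no class containing a student named Miller should remain:

```sql
-- Returns 0 once the delete has run against the sample document above.
SELECT @x.exist('/root/class[student/@name="Miller"]')
```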

SQL Server XQuery: deleting nodes using .modify()

Quick blog post; I got asked today by a developer friend of mine about how to delete nodes in an XML fragment using the .modify() method.  After some head-scratching and some fumbling around (it’s been a few months since I’ve done any work with XML), we came up with a version of the following script:

DECLARE @X XML = 
'<root>
  <class teacher="Smith" grade="5">
    <student name="Ainsworth" />
    <student name="Miller" />
  </class>
  <class teacher="Jones" grade="5">
    <student name="Davis" />
    <student name="Mark" />
  </class>
  <class teacher="Smith" grade="4">
    <student name="Baker" />
    <student name="Smith" />
  </class>
</root>'

SELECT  @x

--delete the classes that belong to teacher Smith
SET @X.modify('delete /root/class/.[@teacher="Smith"]')
SELECT @X 

Now, let me try to explain it:

  1. Given a simple document that has a root with classes, and students in each class, we want to delete all classes that are being taught by a teacher named “Smith”.
  2. To do that, we delete the class nodes that belong to Smith:
    1. Using XPath, we walk the axis and use a node test to restrict to /root/class/. (the “.” is the context node, i.e., the class node itself).
    2. We then apply a predicate looking for a teacher attribute with a value of “Smith” ([@teacher="Smith"]).
    3. The .modify() method applies the delete command to the @X variable and updates the XML in place.
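A quick way to verify the delete (using the same sample document) is to count the remaining class nodes with the .value() method; only the class taught by Jones should survive:

```sql
-- With the sample document above, this returns 1 (only Jones remains).
SELECT ClassesRemaining = @X.value('count(/root/class)', 'int')
```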

PASS 2013 Summit Evals are out!

And I didn’t do too badly; I just wish I had done better.  I said that when I was done, I felt like it was a “B” level presentation, and it was; I got a 4 out of 5 on my evals.  If I had been a less experienced speaker, I would be thrilled with that; as it stands, I’m a little bummed.  I know that it’s tough to get accepted to speak at Summit, and I feel bad that I didn’t hit this one out of the park.

However, it was a great experience; 73 people attended my session, which is a big audience for me.  I struggled with my demos throughout (I don’t even want to listen to the audio because I’m worried about how bad it was), and I should have worked on finding ways to better connect with my audience.  The feedback I got was really constructive:

Was a good intro, just would have liked to have seen some broader examples. For example converting XML into relational tables, not in detail but just at a high level.

Lots of demos geared towards people who have already written a lot of XQuery. This should have been a 201 session. A discussion on why you’d even use the XML datatype would have been useful. What problem does the XML datatype even solve for people?

I think I would have benefitted from a hard copy (gasp) of the XML data.  I would have been able to see the data and compared it to your on screen results

Way too fast, too ambitious for a 101 session

Well put together and paced. Very clear and coherent

Scale back expectations if it really is a 101 level session

So it sounds like I didn’t do the best job of making my abstract clear; people had different expectations than what I had for what a 100 level course was supposed to be.  I do agree that it was too much content, and if I present on the topic again, I’ll be sure to go back to splitting this up to focus on the basics of XPath, and save a discussion of FLWOR for later.  Also, I really should have used demos much more judiciously; I kept running code and trying to work the magnifier, when I should have just used slides for the basics, and then done a much more thorough demo.

So what did I learn?  Connect with the audience first and foremost.  If I could have kept them engaged and entertained, I may have covered less material, but may have inspired them to do more research on their own (which in the end, is the point of this whole exercise).

quick blog from #sqlpass #summit13

Been a busy couple of days; hell, the last few weeks have been just nuts. I’m pausing for a few seconds to type a quick post from sunny Charlotte, and just fill in a few thoughts.

First, I think my XQuery session went reasonably well; I got bit by the demo gods, and shouldn’t have tried to use the magnifier without more practice, but I had a lot of questions and a few nods of approval. Overall, I think it was a B effort, and am hopeful that I can improve it.

Second, it’s always exciting to be back at Summit; it’s kind of like New Year’s Day. I make lots of resolutions about how I want to get involved with an active and dynamic community. Let’s see how many of them stick. Mostly, I like being around smart people, and it’s been quite exciting to talk to some of the smartest people I know. I’ve had some great technical conversations with lots of people, and it’s given me a lot of things to mull over in terms of where I want to go and grow.

Third, I also got sucked into a lot of conversations about the whole PASS election/membership issue. My post about Allen Kinsel’s campaign seems to have kicked off more of a firestorm than I realized. I’ve had lots of people ask me what my thoughts were on the issue, and really, it’s kind of simple: we’re database people, and we need a plan to fix a data problem. I don’t have that plan, but there are lots of people who do (see my second point above about smart people).

Fourth, keynotes/sessions are awesome. I’m learning a lot, and I hope others are as well.

More to come soon.

SQL Server XQuery: Functions (sql:variable() & sql:column())

 

Like most query languages, XQuery has several functions that can be used to manipulate and query data.  SQL Server’s implementation supports a limited subset of the XQuery specification, but there’s a lot of power in the functions provided.  I hope to cover some of those functions in more detail at a later date, but for now I’d like to focus on a couple of very specific functions (sql:variable() & sql:column()).  These are proprietary extensions to XQuery, and they (like the xml methods I previously discussed) provide a bridge between the SQL engine and the XQuery engine.

For example, if you wanted to find the value of the third node of a simple XML document in SQL Server, you could do the following:

DECLARE @x XML ='<alpha>a</alpha><alpha>b</alpha><alpha>c</alpha>'
SELECT @x.value('(//alpha)[3]', 'varchar(1)')

The .value() method would return the letter “c” in the form of a varchar to the SQL engine.  However, if you wanted to do this dynamically, and specify which node to return based on a parameter, you would use the sql:variable() function, like so:

DECLARE @x XML ='<alpha>a</alpha><alpha>b</alpha><alpha>c</alpha>'

DECLARE @node INT = 3
SELECT @x.value('(//alpha)[sql:variable("@node")][1]', 'varchar(1)')

The sql:variable() function uses a string literal (a value surrounded by double quotes) to reference a SQL parameter (in this case, @node) and passes its value into the XQuery expression.  The above query is seen as:

(//alpha)[3][1]

by the XQuery engine.  In English, we are looking for the third node named alpha.  You may wonder about the extra positional reference (“[1]”); the .value() method requires that the path expression return a singleton, so a positional reference must be explicitly defined.  In this situation, we are telling the XQuery engine to return the first instance of the third alpha node.  Seems a bit clunky, but it works.  Looking at the execution plan, we can see that this is a relatively complex process, with multiple calls between the two sides of the query processor:

[image: execution plan]

The sql:column() function is similar, but is used to refer to a column instead of a parameter; this allows for the dynamic querying of an XML column on a row by row basis.  For example:

DECLARE @T TABLE ( ID INT, x XML )

INSERT  INTO @T
        ( ID, x )
VALUES  ( 1, '<alpha>a</alpha><alpha>b</alpha><alpha>c</alpha>' ),
        ( 2, '<alpha>a</alpha><alpha>b</alpha><alpha>c</alpha>' ),
        ( 3, '<alpha>a</alpha><alpha>b</alpha><alpha>c</alpha>' )

SELECT  ID, v=x.value('(//alpha)[sql:column("ID")][1]', 'varchar(1)')
FROM    @T

The above query will return a dataset like so:

[image: query results]

Summary

SQL Server provides two functions for passing information from the SQL engine to the XQuery engine: sql:variable() and sql:column().  The nature of these functions is pretty straightforward; you reference either a parameter or a column inside an XML method, and the XQuery engine substitutes its value at execution time.