February 2011

Resolution checkup

As February draws to a close, I thought I’d do a quick check-up to see how well I was keeping up with my New Year’s resolution list.  In sum: not great, but not too bad, either.  I need to make some adjustments, but I think I can pull it back in.

Here’s the rundown (copied and pasted from the original, with some notes below):

Professional

Technical Skills

  • I want to learn something new every month.  My goal is to tackle something challenging, and be able to understand the ins and outs of it within 30 days.  For example, I want to finish tackling XML (including XSD’s) in SQL Server. 

I think I’m doing OK on this one; I haven’t really done great this month, but I have spent a little time each month working on something new.

  • I want to upgrade my certifications by the end of the year; I’ve been dancing around the MCITP exams for a while, and I need to finish them.

Spent a little time studying, but I need to get on this.

Presentation

  • I want to make at least 6 technical presentations by the end of the year; last year, I managed to eke out 8, but given some of the recent changes in my personal life (see below), I think 6 is reasonable.

I have two presentations scheduled for SQL Saturday 70 next month.

  • I will blog at least once a month about some technical topic (see the first bullet point under technical skills).

See the above point; as I learn, I blog.  I did miss the T-SQL Tuesday blog for Feb (which makes me sad).

Management

  • I will understand the SCRUM methodology, and learn how to implement it with my team at work.  Although I’m not a team leader, I AM the Senior Database Architect, and I need to code less, and teach more.  This is my year to do so.

I’ve done this; I’m moving on to something larger. 

Personal

Health

  • I’m getting married again this year, and I want to look good for my new wife.  I also want to avoid long-term health issues.  I was losing weight last year (until I started dating), and I want to get back on track.  I’d like to lose 50 lbs by October.

Started Weight Watchers and have lost about 10 pounds so far.  Have tapered off a bit, and I need to get back on this bandwagon.

  • I have apnea, and I’ve been horrible about using my CPAP on a regular basis.  I will use it regularly.

How about irregularly?

  • I need to exercise more, so I will find 20 minutes a day to do SOMETHING, even if it’s just walking around the office for 20 minutes.

Blech.  I did OK for about two days.

  • I will drink at least 8 glasses of water per day.

Does Diet Coke count as water?  Sigh; it looks like I’m not doing so hot in the Health area.

Spiritual

  • I’ve slacked off in my religious activities; my faith was nourished by church attendance during my divorce, and I need to start growing again.  I will find a new church in the next two months (my old church is too far to drive on a regular basis), and become a regular attendee.

Checked out a church; didn’t like it.

  • I choose to absorb the goodness from people who love me, and I will reject the poison from those who do not.  I will focus on the important things in life (like my kids, and my future bride), and worry less about the unimportant things (like who’s mowing the grass).

Mixed results on this; while I think I do a great job at spending time with my kids and my future bride, I’m still struggling with ways to handle conflict in a positive fashion.  My strategy now is direct confrontation, rather than continuing to tap-dance around issues.

Social

  • I will listen more to my children, my family, and my friends.  I will find ways to let them know I love them.

See above.

  • I will nurture my own friendships; while I love my fiancée’s friends and family, I want to bring more to the table than just me.

Need to do better about this.

Financial

  • My divorce pulled me way off course.  While I’m a long way from being out of debt, I will continue to make strides in that area.  I will pay off at least one credit card ahead of schedule.

Not really making a lot of headway here; this one may have to wait until my fiancée and I combine households (thus saving on rent payments).

  • I will save more; I plan to find ways to cut costs (like taking advantage of coupons, and eating out less).

Ditto.

There you have it; a mixed bag.  I think I’m making some positive steps in the right direction, but I’ve still got a long way to go.

What Should PASS be? #sqlpass

Andy Warren recently threw out a challenge for bloggers to “fix” things with the Professional Association for SQL Server in 3 years.   There have been some great responses so far (and I’m sorry if I’ve missed yours):

All of these posts have great ideas, and they have influenced my thinking on the subject.  Over the last year, I’ve had conversations with most of these authors (at Summit, at SQL Saturdays, over email, etc.) about some of the finer points of the direction that PASS should take.  The ideas that I’m going to post below are probably not too dissimilar from theirs (although we probably differ on some of the implementations of those ideas).

Heading off in a general direction…

Although Andy W. specifically asked for a 3-year plan, I think part of the problem with PASS is that the long-term vision is unclear.  There’s a big debate about whether or not PASS is a community organization, a business serving that community, or something else that’s not been well-defined.  Additionally, PASS struggles with its domain of influence; the organization is viewed as being U.S.-centric by most members outside of the states, and inside the states, the continued reliance on Microsoft’s presence in Seattle makes the organization seem distant to local users.  What should PASS be?

In a conversation with Andy W. a few months ago, I proposed that PASS should borrow from some of the great evangelistic traditions of Western civilization (I was originally thinking of a non-religious version of the five fold ministry of the early Christian church: apostles, prophets, evangelists, pastors, and teachers), and Andy threw out the word “guild”.  I like that concept; PASS should be a guild, providing training both in terms of learning about the tools (SQL Server and associated products) and growth in the guild (moving from a student to a master).  Guilds are both a community of learners, and a powerful force of influence; where the Summit goes, Microsoft should follow (instead of the other way around).  I think this thought echoes Grant’s call:

Get the word out that if you want training this is the place to be. If you want to be a trainer, this is the place to start, if you are a trainer, this is where you grow you brand.

Of course, that’s a long-term definitional goal; in the short term, I see three areas for improvement.

Things to do in the next three years…

1. Have an election process that’s deemed fair and reliable by the majority of the membership. 

I applaud PASS for taking steps in this regard.  I obviously spent a great deal of time discussing this over the last 10 months, and I’ve arrived at a very different place than either Andy Leonard or Mike Walsh (I believe in a strong Nominating Committee with an opaque application process; Andy has called to abandon it altogether, and Mike believes in a simple pass-or-fail review of credentials).  While our viewpoints on the actual implementation may differ, I think we can all agree that PASS will continue to lack credibility if the method by which organizational power is attained is not supported by the constituency.    PASS needs to get the election process stabilized and supported before the next election.

2. Adopt the User Groups as an extension of the organization, rather than just partners in community.

The PASS Chapter model is essentially a good one; there is no better way (in my opinion) to reach SQL Server professionals interested in building their careers than through the User Groups.  Unfortunately, as Mike (and others) have pointed out, the loose affiliation between PASS and the chapters has left many chapter leaders questioning what PASS really does for the chapters.  That needs to change.

Chapters should be the local arms of PASS; attendees at a chapter meeting should leave every meeting thinking that they are getting a monthly shot (albeit a smaller dosage) of the same knowledge that they get from a PASS SQLSaturday, a PASS SQLRally, and a PASS Summit.  Chapters should feel interconnected; as a chapter leader in Atlanta, I should know what topic TJay Belt is discussing in Utah, or what Roy Ernest is covering in Curaçao.  I should feel confident (as should they) that I have access to the same resources for educating my members (including trained, professional speakers as well as online materials) as any other chapter.

Chapters should also be given the tools necessary to recruit new members to the guild, both those members of the community with lots of experience with SQL Server (and little-to-none with PASS) as well as those members of the community who are still figuring out what a clustered index is.  I realize that this is a huge task to take on in 3 years, but the initial groundwork must be laid; chapters need to feel that they are part of a larger organization, and they should be embraced as siblings (not distant cousins).

As a sidebar, I should note that while PASS chapters should not replace the online initiatives that PASS has recently invested in (the blogosphere and social networks), they should be the primary focus.   From my own personal perspective, I’ve recently discovered that as I’ve become less “plugged in” (changes in my personal life as well as new corporate firewall policies have prevented my social networking),  it’s been harder to stay invested in PASS and the SQL community.  For example, I missed the recent call for volunteers for Program Committee members; I’ve also missed quite a few calls for bloggers (like T-SQL Tuesday).  There needs to be better connectedness between “meatspace” (a term I borrowed from Brent Ozar) and the online community.

3.  Invest in the IT structure at HQ.

We’re an organization of information technology professionals, and as far as I know, we have a staff of 2 IT guys (a developer and an admin).  If PASS is going to be the essential tool for the SQL Server Professional, then the organization needs to build an IT infrastructure that can support community connectedness, the sharing of essential information, networking between members, and training resources to move passive members to active masters of their craft.  I am not sure what that would take, but I think the speaker bureau (as well as a speaker training program) is a good start.  PASS doesn’t need to be a SQLServerpedia or a SQL Server Central, but it does need to provide its membership with an awareness of what good SQL Server resources are, and how they should be used in the educational path of the member.

Summing Up…

As I said before, I’m envisioning PASS as a guild for SQL Server professionals; guilds have members with varying skill levels (from apprentice to master craftsman), and the goal of the guild is to train its members not only in the tools they use, but also in the ways of the guild.  We’ve got a long way to go, but I think we have some basic steps we need to master, and soon.

A simple codebuilder for parsing in T-SQL

If you’ve ever tried to parse a wide character column in T-SQL, you know two things:

  1. It’s a pain to do, and
  2. It’s a pain to do.

A lot of the data I deal with comes in syslog format, which can come in one of two formats: positional (the location of the data element is related to the type of data), and named attributes (which usually only include delimiters for complex strings).  Although I haven’t had much luck automating positional parsing, I’ve recently begun using Excel to help me with the named attributes. 

Here’s an example; I have a table with a message column that is pulling over syslog data from a firewall.  In a given day, I may have millions of rows like the following:

sn=AA17D5028EAA time="2011-01-26 13:40:14 UTC" fw=10.1.100.1 pri=1 c=512 m=522 msg="Malformed or unhandled IP packet dropped" n=1 src=10.1.1.23:32795:X1: dst=10.1.1.1:514:: proto=udp/17

Note that each attribute of this particular syslog message is identified with an attribute name (e.g., sn, time, fw, etc.).  In order to break out each of the elements in T-SQL, we can split the string using a combination of SUBSTRING and CHARINDEX, like so:

SELECT TOP 1
        m = CONVERT(INT, SUBSTRING(MESSAGE, CHARINDEX(' m=', MESSAGE) + 3,
                CHARINDEX(' ', MESSAGE, CHARINDEX(' m=', MESSAGE) + 3)
                - ( CHARINDEX(' m=', MESSAGE) + 3 )))
      , time = CONVERT(DATETIME, SUBSTRING(MESSAGE, CHARINDEX(' time="', MESSAGE) + 7,
                CHARINDEX('UTC"', MESSAGE, CHARINDEX(' time="', MESSAGE) + 7)
                - ( CHARINDEX(' time="', MESSAGE) + 7 )))
      , fw = CONVERT(VARCHAR(20), SUBSTRING(MESSAGE, CHARINDEX(' fw=', MESSAGE) + 4,
                CHARINDEX(' ', MESSAGE, CHARINDEX(' fw=', MESSAGE) + 4)
                - ( CHARINDEX(' fw=', MESSAGE) + 4 )))
FROM    syslogng (NOLOCK)

Note the repetition for each column; you need to find the position of the starting delimiter and the position of the ending delimiter, and then supply SUBSTRING with the position just past the starting delimiter and the difference between the two.  That means you also need to know the length of the starting delimiter.  Finally, I CONVERT the result to a specific data type.  Whee!
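To make the position arithmetic easier to follow outside of SQL Server, here is a minimal Python sketch of the same steps.  The charindex and substring helpers are my own stand-ins that imitate the 1-based behavior of the T-SQL functions, and extract is a hypothetical name for "one column's worth" of the query above; none of this comes from the original proof-of-concept.

```python
# Sample row copied from the post.
MESSAGE = ('sn=AA17D5028EAA time="2011-01-26 13:40:14 UTC" fw=10.1.100.1 '
           'pri=1 c=512 m=522 msg="Malformed or unhandled IP packet dropped" '
           'n=1 src=10.1.1.23:32795:X1: dst=10.1.1.1:514:: proto=udp/17')


def charindex(needle, haystack, start=1):
    """Stand-in for T-SQL CHARINDEX: 1-based position, 0 when not found."""
    return haystack.find(needle, max(start, 1) - 1) + 1


def substring(text, start, length):
    """Stand-in for T-SQL SUBSTRING, assuming start >= 1 and length >= 0."""
    return text[start - 1:start - 1 + length]


def extract(message, start_delim, end_delim):
    """Hypothetical helper: the arithmetic behind one column of the SELECT."""
    # Position of the first character after the starting delimiter.
    value_start = charindex(start_delim, message) + len(start_delim)
    # Position of the ending delimiter, searching from the value onward.
    value_end = charindex(end_delim, message, value_start)
    # SUBSTRING takes a start position and a length (the difference).
    return substring(message, value_start, value_end - value_start)


print(repr(extract(MESSAGE, ' m=', ' ')))
print(repr(extract(MESSAGE, ' fw=', ' ')))
print(repr(extract(MESSAGE, ' time="', 'UTC"')))
```

Note that the time value keeps the space that sits in front of UTC; the T-SQL version does the same, and the CONVERT to DATETIME is what tidies it up.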

It gets even more fun when the attributes are optional; some syslog messages may have a proto code, and some may not.   When faced with this, you need to include a CASE option, like so:

SELECT TOP 1
        proto = CONVERT(VARCHAR(20),
                CASE WHEN CHARINDEX(' proto=', MESSAGE) = 0 THEN NULL
                     ELSE SUBSTRING(MESSAGE, CHARINDEX(' proto=', MESSAGE) + 7,
                            CHARINDEX(' ', MESSAGE, CHARINDEX(' proto=', MESSAGE) + 7)
                            - ( CHARINDEX(' proto=', MESSAGE) + 7 ))
                END)
FROM    syslogng (NOLOCK)
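The optional case can be sketched the same way in Python.  Returning None stands in for the NULL branch of the CASE.  One thing I noticed while sketching it: in the sample row, proto is the last attribute, so there may be no trailing space for the ending delimiter to find.  In T-SQL, CHARINDEX would return 0 there, and the resulting negative length makes SUBSTRING raise an error.  The fallback below (treat the end of the string as the terminator) is my own assumption and is not part of the query above; extract_optional and charindex are hypothetical helper names.

```python
def charindex(needle, haystack, start=1):
    """Stand-in for T-SQL CHARINDEX: 1-based position, 0 when not found."""
    return haystack.find(needle, max(start, 1) - 1) + 1


def extract_optional(message, start_delim, end_delim=' '):
    """Return the attribute value, or None when the attribute is absent."""
    start_pos = charindex(start_delim, message)
    if start_pos == 0:
        # Same idea as: CASE WHEN CHARINDEX(...) = 0 THEN NULL
        return None
    value_start = start_pos + len(start_delim)
    value_end = charindex(end_delim, message, value_start)
    if value_end == 0:
        # Assumption (not in the original query): when no ending delimiter
        # follows, read through to the end of the string.
        value_end = len(message) + 1
    return message[value_start - 1:value_end - 1]


print(extract_optional('n=1 dst=10.1.1.1:514:: proto=udp/17', ' proto='))
print(extract_optional('n=1 dst=10.1.1.1:514::', ' proto='))
```

If the real message column always ends with a space (or is padded), the fallback never fires and the behavior matches the CASE version.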

 

One of our developers is working on a syslog parser in .NET code, but I needed a proof-of-concept, and I didn’t want to keep cutting and pasting to see if it was working.  Looking at the parsing, it’s very formulaic SQL.  When I think formulas, I think Excel, and so I whipped out the following:

[Image: screenshot of the Excel worksheet]

Note that I have several input columns:

  • start, the starting delimiter
  • end, the ending delimiter (usually a space)
  • colname, the column name I want to use; usually the same as start, but stripped of extra characters.
  • type, the SQL type I want to convert the data to, and
  • optional, a column to decide if the attribute is optional per row or not.

I also have a hidden column (column F), which generates most of the SQL code:

=CONCATENATE("SUBSTRING(message, CHARINDEX('", A2, "', message)+ ", LEN(A2), ", CHARINDEX('", B2, "', message, CHARINDEX('", A2, "', message)+", LEN(A2), ") - (CHARINDEX('", A2, "', message)+", LEN(A2), "))")

This takes the starting and ending delimiters, the length of the starting delimiter, and plugs those values into a valid SQL statement.  I then create a SQL column, using the following formula:

=CONCATENATE(", ", C2, " = CONVERT(", D2, ", ",  IF(E2="Y", CONCATENATE("CASE WHEN CHARINDEX('", A2, "', message) = 0 THEN NULL ELSE ",F2, " END"), F2), ")")

If I were better at Excel, I’d use named ranges, but for my purposes, this is OK.  The formula prepends the column name, wraps the expression in a CONVERT to the specified type, and includes a CASE expression when my optional column contains a “Y”.
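For anyone who would rather script this than keep a workbook around, the two formulas translate fairly directly into a small generator.  This is a rough Python sketch and not what I actually used: the function names and the ROWS list are made up for illustration, the output is written as `colname = CONVERT(...)` to match the hand-written query earlier, and doubling single quotes inside delimiters is an extra safety step that the spreadsheet does not perform.

```python
def substring_expr(start, end, column='message'):
    """Rough equivalent of hidden column F: the SUBSTRING/CHARINDEX expression."""
    s = start.replace("'", "''")   # escape quotes for a T-SQL string literal
    e = end.replace("'", "''")
    n = len(start)                 # same role as LEN(A2) in the worksheet
    return (f"SUBSTRING({column}, CHARINDEX('{s}', {column}) + {n}, "
            f"CHARINDEX('{e}', {column}, CHARINDEX('{s}', {column}) + {n}) "
            f"- (CHARINDEX('{s}', {column}) + {n}))")


def column_expr(start, end, colname, sqltype, optional, column='message'):
    """Rough equivalent of the SQL column: name, CONVERT, and optional CASE."""
    body = substring_expr(start, end, column)
    if optional:
        s = start.replace("'", "''")
        body = (f"CASE WHEN CHARINDEX('{s}', {column}) = 0 "
                f"THEN NULL ELSE {body} END")
    return f"{colname} = CONVERT({sqltype}, {body})"


def build_select(rows, table='syslogng'):
    """Join the generated columns into one SELECT statement."""
    cols = ",\n    ".join(column_expr(*row) for row in rows)
    return f"SELECT TOP 1\n    {cols}\nFROM {table}"


# Illustrative input rows: start, end, colname, type, optional.
ROWS = [
    (' m=', ' ', 'm', 'INT', False),
    (' fw=', ' ', 'fw', 'VARCHAR(20)', False),
    (' proto=', ' ', 'proto', 'VARCHAR(20)', True),
]

print(build_select(ROWS))
```

The generated text still needs to be pasted into a query window and run against the real table, exactly as with the Excel output.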

It took me longer to write this blog post than it did to generate a proof-of-concept, parsing each of the named attributes out from a syslog message.

I’m doing it wrong…

At some point in your career, you have to realize that you’re going about it in the wrong way.  It may hit you like a ton of bricks, or it might be a subtle realization, but either way you realize that things aren’t working out for you like you expected.  I’ve had a couple of those moments throughout my career; one was shortly after I flunked out of graduate school.  Nothing says “you’re doing it wrong” like sitting outside of your advisor’s office for a meeting that never happens.

I’ve had other epiphanies in my career, such as the time when my ethical standards were a little higher than my employer’s; when I got sent home by a GM after a discussion over my responsibilities, I started polishing my resume.  I was doing it wrong by working for the wrong company.

Recently, I’ve begun to realize that I’m not living up to my full potential in my career.  I’ve spent the last several years building an enterprise solution for my company that has become the core product of that company.  It’s a good product, and I’m proud of it.  However, like many small companies that have grown up fast,  our company is built on a complex ecosystem of ever-changing goals and feature requests.  We built a system based on assumptions, and we’ve become one of the leaders of our industry because we’re often the first to deliver a product for a niche market.  Many of the assumptions we made didn’t pan out, and the applications we’ve built have slowly degenerated into a mass of tangled wire and unrealistic expectations.  I realized this as I’ve struggled to add a new feature and retrofit it into this existing solution; it’s taking more and more time to solve development problems because we’re not sure what features are still being used by some employee in a dark corner of the building.

As I was rewriting a stored procedure for the fifth time trying to eke out a few more milliseconds of performance, I realized that I was thinking like an engineer.  Engineers find creative solutions to problems in a very hands-on way; they worry about wiring things together so that they work, and they work well.  Engineers are worried about the microcosm; as every geek’s favorite engineer (Scotty from Star Trek) would say “In four hours, the ship blows up.”  That’s pretty straightforward; under condition x, outcome y is to be expected in a certain amount of time.

The problem?  My title says Architect.  I’m supposed to be thinking about the big picture, not just how a couple of applications are wired together.  I’m supposed to understand (and enforce) the rules about how events become data, and how data becomes information.  I should be more concerned with defining the specifications for our system than trying to figure out this damned stored procedure (for the fifth time).  Maybe we shouldn’t even have this particular stored procedure; maybe with a little tweaking, we could eliminate the problem altogether.

So what does this mean for me?  Well, as part of my New Year’s resolutions, I’ve been determined to learn something new every month.  This month, I’ve been focused on what it means to be a Data Architect, and I’ve been trying to find a little time every day to transform myself from an engineer to an architect.  I’m not going to master all of these subjects at once, but here’s my working list (from high-level goals to specific action items).  I expect this list to evolve, but it’s a start.

High Level Goal: A Data Architect needs to establish the standards for information and data in the enterprise.

  • I need to document the information architecture of our division of the company, using a standard data flow diagram notation.  I need to spend some time daily refreshing my memory on that notation.
  • I need to spend time with employees throughout the organization, discovering what the business entities are, and what the vocabulary for those entities is.
  • After discovery, I need to publish a standard vocabulary document and data-dictionary, showing how we capture that information today:
    • I need to propose changes to our business vocabulary, and
    • I need to propose changes to our database schema to standardize our notation.

High Level Goal: A Data Architect needs to understand the nature of the enterprise’s information on all levels: physical, logical, and procedural.

  • I need to talk to our production DBAs and understand how our database servers are set up physically, including the clustering structure, the drive arrays, the SAN, etc.
  • I need to talk to our engineers to understand how data gets to the databases.
  • I need to talk to our product owners to understand what information they want from the data, and what’s the best way to deliver it.

High Level Goal: A Data Architect needs to recommend the best architecture for information management, including a plan on how to get there from here.

  • I need to refresh my memory on all aspects of SQL Server, not just the parts I use on a daily basis.
  • After discovery, I need to recommend ways to improve efficiency in our data capture processes.
  • I need to listen to all voices in the organization, even those I don’t normally agree with.  I can’t afford to throw away good ideas simply because I don’t always like the originator of those ideas.

More to come, but this is what I’ve been working on so far this month (February 2011).

Just a quick note… #sqlsat70

I just submitted a couple of sessions to SQL Saturday #70; I feel like I’ve been way off my game in terms of service to the community lately, so hopefully this will provide me a bit of a kick-start.  Even if I don’t get accepted (the list is growing longer each day), it’s at least a reminder that I need to get back out there and present.

Here are the links to the sessions, btw:

Dirt, Spit, and Happy FLWOR- Hands on with XQuery

From DBA to Data Architect: Changing Your Game