Education

#sqlpass disappointed, but needing the kickstart

So, PASS Summit 2011 session selection emails have started going around; it looks like none of my sessions made the cut.  I need some time to digest this, but I mostly need to think about what that means for me moving forward.  These last two years have been a time of great change for me, and I need to get back on track. First things first: I need to figure out what track I want to run on.

Reminder: Data Architecture PASS VC presentation tomorrow

Just a quick note: I’ll be presenting from the wilds of Hoschton, GA via Live Meeting tomorrow night at 8PM EST.  Details cut and pasted below from http://dataarch.sqlpass.org/:

 

Subject:
From DBA to Data Architect: Changing Your Game

Start Time:
Thursday, May 19, 2011 8:00 PM US Eastern Time (May 20, 2011 1:00 AM GMT)

End Time:
Thursday, May 19, 2011 9:00 PM US Eastern Time (May 20, 2011 2:00 AM GMT)

Presenter:
Stuart Ainsworth (blog|@CodeGumbo)

Live Meeting Link:
https://www.livemeeting.com/cc/UserGroups/join?id=JQTC9F&role=attend&pw=h%3E%234n%212Mj

From DBA to Data Architect: Changing Your Game
The role of database administrator has been around for years, but as information collection and storage needs have skyrocketed, a new career opportunity has opened within enterprises: the data architect. This session is intended to give some guidance on the differences between database administrators (and their cousins, the database developer) and data architects, including specific advice on how to transition from one role into the next.

Stuart Ainsworth
Stuart R Ainsworth, MA, MEd is a Database Architect working in the realm of Financial Information Security; over the last 15 years, he’s worked as a Research Analyst, a report writer, a DBA, a programmer, and a public speaking professor. He’s one of the chapter leaders for AtlantaMDF, the Atlanta chapter of PASS. A master of air guitar, he has yet to understand the point of Rock Band ("You push buttons? What’s that all about?").

#msteched Columnstore indexes unveiled–DBI312

Live blogging again; hope you find my notes useful (scattered though they are).  I’ve been waiting on this session because it’s a very specific area of interest.  I work a lot with VLDB’s, and performance is always a concern; claims are that Denali’s columnstore may boost performance of certain queries hundred-fold.  Let’s see how they work, and I’m hoping I can convince my boss to set up a test bed to try this out.

Presenter is Eric N Hanson from Microsoft (Twitter). 

We start off with a story; I like story-time.  Actually, it’s a very effective way to break out user cases.

Buzzphrase for Columnstore: “Enabling interaction with data”.  Supposed to be super efficient, and get large amounts of data back from SQL Server Denali.  Internal project name is Apollo; columnstore is only part of the picture.

Area of focus is BI & DW: load large amounts of data, high-read, incremental loads.  Partitioning is mandatory for this feature.

Curious as to why the examples join tables in the WHERE clause, and not the more accepted syntax of JOIN.

K, here comes the magic: example uses a Fact Table with 100 million rows in it.  Clustered on a date column, and a columnstore index.  Clustered index is still B-TREE; columnstore indexes are nonclustered.

Running duplicate queries; using index hint to force optimizer to use the clustered index in one example.  Wow; 100,000,000 rows of data aggregated in a second on a two-year old laptop.  50x speedup on this particular hardware.  According to presenter: “this is the biggest enhancement to SQL Server since we bought the code from Sybase".

And here’s the meat and potatoes; how does this work?  Vertical partitioning stores each column in a separate page.  Columnstore is based on the same code as PowerPivot and the BI engine;  Vertipaq if you want to do more reading on this.  Columnstore data is highly compressed, so smaller footprint to read from disk and can be stored in main memory.

New query execution plan: batch processing.  “the edsel is the way of the future”.  Actually, the idea is that batches of vectors are stored in query plan; highly efficient data representation.  We can also scale to more cores: tests are showing linear acceleration up to 32 cores.

Instead of storing data as a page, data is stored as a column segment which represents about 1,000,000 rows.

Questions have begun; some questions are good, but this is a 300 level session, folks.  If you don’t understand basic SQL syntax (like how to create an index), this may not be the session for you.  Great question about the relevance of traditional indexes after this is unveiled, and Hanson’s response: in most Decision Support Applications, columnstore is the way to go particularly for scans.

Some index hints for choosing the columnstore or ignoring it:

WITH (index(index_name))

OPTION (ignore_nonclustered_columnstore_index) <—use for bad plan selection if necessary.

Same traditional rules for index hints: trust the optimizer first, rewrite second, and then use hints last.

A couple of new icons for query execution plans: columnstore scan, and batch hash table processing.  Each execution operator now operates in either batch mode or row mode; batch mode is what you want for speed. 

New term of interest: dictionary.  A dictionary is storage for unique values with a lookup so that a column can stores highly compressed information.

Most things just work with SQL Server; Backup and Restore, Mirroring, SSMS, etc.

Lots of datatypes don’t work with column store: long decimals, binary, BLOB, uniqueidentifier, long datetimes, CLR,  (n)varchar(max).

query performance restrictions: outer joins, Unions; Stick with Inner Joins, Star Joins (need to look this one up) Aggregation.  About to show a query which doesn’t benefit from batch processing.  Essence is below:

SELECT t.ID, COUNT(t2.ID)
FROM t LEFT JOIN t2 ON t.ID=t2.ID
GROUP BY t.ID

Left Join knocks it out of batch processing; need to rewrite as an INNER JOIN, but note that you lose the NULL values, so you have to use a CTE; need to get slides for his sample, but you do an INNER JOIN in the CTE, and then do an OUTER JOIN. 

WITH CTE( INNER JOIN)
SELECT blah
FROM t OUTER JOIN CTE ON t.ID yada yada.

Adding data to columnstore; basic methods:

1.  Drop and re-add the index before load.  Expensive, but works well with traditional daily builds

2.  Partition switching.  Sweet spot needs to be tested, but easy one is the hour.  NOLOCK queries pre-empt the ability to do paritioned queries.  Need to read up on this, but may be fixed in future version

3  trickle load can be done, but needs to be tested.

Very awesome; I cannot wait until this is actually released in CTP 3, so I can play around with it.

#msteched “Juneau” preview

Sitting in the Database Dev session at Tech Ed, taking a look at “Juneau”, the new development suite for SQL Server.  I can’t wait to see if this thing slices and dices data like a Ronco food processor.  Again, live blogging, so please excuse the scattered thoughts, poor spelling, and other bad habits below.

Database development is hard.  I’d love to use that sometimes when my boss asks me why something is running behind schedule. 

Great question!  Where does the definition of the database live if most of your code is built on ALTER scripts?

We should work declaratively, not scripted; devs write CREATE statement, and let the tools manage the change statements.

Demos forthcoming…

 

Connected development; working directly with the database server.  Much like SSMS, but it’s obviously the VS shell instead.  Operation is very similar to SSMS in that you build and execute queries inside of the shell; I wonder if execution plans work.

Table designer: uh-oh.  Not sure if this is a good idea.  However, the tools do drift detection before building an active script.

Side note to Microsoft presenters: sometimes the patter works, but only when it’s natural.  Usually, it feels scripted, which makes it clunky instead of funny.

 

Offline Development: Import database into a project, to build a model from the database.  Declarative code is the keyword.  Left-side versus right-side development; idea is that information is arranged depending on the perspective from which you are accustomed.  DBA’s will see the traditional database hierarchy; developers will see the namespaces.

Table designer preview again; nice thing is that it is generating scripts on the fly.

SSTD is deployed to a new lightweight test-and-debug single user instance on your desktop.  Cool stuff; no need to deploy to a test server; it’s there already.

Ughhhhh; live demos are painful.  I’m going to remember to NOT DO THAT EVER AGAIN when I present. 

Yay!!! Execution plans are there!!!!  SWEET.

Interesting way to handle connected and disconnected development.  Since you are building CREATE statements, you can’t run them in a connected sense; execute fails.  But what happens when you run it and are building a new object?  Will it execute?

OMG!!!!!!!! THEY ARE SHOWING A STORED PROCEDURE WITH A F’ING WHILE LOOP IN IT!!!!! I can’t believe that one slipped by.

Cooling off; I totally zoned out on the last view steps.  Lemme calm down and move on to the next one.

 

Publish to SQL Azure: More accurately, the project can be targeted to various editions, including Azure.

Nice feature to work online and offline.  Not much else to say; the tool changes the model to fit the deployment target.

 

SSMS is NOT going away, but the majority of the development functionality will be replicated in Visual Studio. Both will have Server Explorers and query windows, but SSMS will still ship as a part of the client tools for SQL Server.

Schema comparison tools still are lackluster compared to Red Gate, but they seem to have improved a little since I last remember them.  Not really much new here compared to the database projects of 2010 and 2008, but it looks like there has been some thought about best practices.

Overall, a good presentation, but I still think there’s a lot left to do.

#msteched skydiving into the cloud

Here’s a first for me; I’m live-blogging from a keynote.  Diving straight into it, so please excuse the sparseness of the notes.

Robert Wahbe trying to lay out the vision for the cloud from Microsoft; trying to establish that the future is the cloud.  I hate buzzwords, but I think the point of the metaphor is to draw together private vs public application development.  If you build your internal private apps as a private cloud, it should be easy to move to a pubic distributed architecture.

Factors driving the move to the cloud:

  • Extension of existing applications
  • large data sets
  • high perf computing
  • events
  • marketing campaigns (high spike traffic)

Talk about Travelocity’s use of the public cloud.  Focus on scalability, and new products to handle the traffic.  Capacity on demand is the buzzword that they use, and I wish I had a server room like they just showed. 

How easy is it to do this is the question they raise?  I’m still a skeptic.  Biggest problem is not a greenfield scenario, but migration from an existing infrastructure.  Let’s see if demo guy (I missed his name) can convince me. 

Ooh, a Contoso application; I’ve been stuck on AdventureWorks for too long.  Somebody bring back Northwinds.

Ooh, we all need AUTOMATION!  AUTOMATION is what we need! Really?

OK, the VMM service deployment looks cool.  

OK, ADHD just kicked in, and I’ve heard blah, blah, blah for the last couple of minutes.  Apparently, Systems Center will let you do all kinds of one-button stuff.    Now, can I get management to convert from an app that had no clue about that 10 years ago but is still critical to my business?

Now, SQL, and Sharepoint with Amir Netz.  Project Crescent looks interesting. Wow, I like the animation.  I think there will be a lot of future in Management of Business Analysts.  Too much information can cripple a company just as bad as too little; there’s going to have be some training in “what are good questions to ask?”  We’ve finally got the tools to tell us “42”, but do we understand our businesses well enough to understand how we got there?

Average number of devices is 4, so a lot of us are now carrying 6 or 8.  I have two on me right now (work lappy and iPhone), and two at home.  I feel geek-deprived.   Must buy more gear.

Windows Phone 7 interface is nice, but I thin Microsoft has a long way to go to win the smartphone wars.  I really don’t want to go back to carrying multiple devices (smartphone for personal, and smartphone for work).   

Just realized that my butt is starting to hurt; that’s a good indicator that a keynote is running long.  Time for the mid-keynote stretch. 

I’m wondering how many butterflies the Kinect guy has right now; the first few gestures slipped a little.

Second half of the keynote?  Yeesh….  I hope there’s a power plug in the building somewhere.

I chuckled a little when the guy said that “if you’re an ASP.NET developer, you can write Sharepoint Applications.”  I think of all of the swearing that my ASP.NET developer buddies do when you mention Sharepoint…

OK, I’ve done as much as I can for now; I’ll try to blog more today around the sessions. 

knowing when to walk away…

I don’t like to think of myself as a quitter; I especially don’t like to walk away from something I enjoy doing.  However, I’ve recently had to re-evaluate my workload (both personal and professional), and I realized that something had to give.  For me, that something was the Atlanta SQLSaturday 2011.

Now, before you panic, the project is still continuing without me; there’s still a team working very diligently to pull this together.  I’m just no longer working on it.

My reasons for stepping away were many:

  • My personal life is getting very complicated. I’m getting married this summer, and there’s a lot of details left to work out.  On top of that, I have two wonderful kids from my first marriage.  My family deserves no less than my best, and that means I need to put them first.
  • Volunteer work should be fun work.  If you’re leaving a volunteer project more stressed than when you went in, something’s not right.  In my case, I was too invested in the SQLSaturday project to really allow it to grow; I have certain ideas about what should be done, and it’s time to let others take it in a different direction. 
  • Volunteer work should enhance your professional and personal skills.  In my case, I’ve let some of my technical goals slip because I’m investing too much time in volunteering.  I need to finish my technical certifications, for example.
  • There’s other tasks to be done in the Atlanta SQL Server community.  I’ve spent the last few years working with SQLSaturday and AtlantaMDF; one of our original goals for SQLSaturday was to help feed the user group, and to be honest, that hasn’t happened.  Stepping back and looking objectively at the situation has made me realize that there’s some foundational work that needs to be done in the user group in order for it to truly benefit from community outreaches like SQLSaturday.

All of that being said, I’m still going to be doing volunteer work; I just need to make smarter investments of my time.  One of the harder lessons in life is knowing when to walk away from something in order to let it (and yourself) flourish.

SQLSaturday Atlanta 2011–advance notice

Last night a small group of us (Aaron Nelson, Audrey Hammonds, Julie Smith, Tim Radney, and me) met to discuss this year’s upcoming SQL Saturday; our goal is to make it a bigger event than last year, but still try to keep it very community-centric.  Here’s a couple of bullet points that I can tease you with now:

  • We’re still looking for a venue, but our hope is to have at least 7 tracks of content.  We know that we’re going to push for attendee numbers between 300 & 450 (limited seating in the Atlanta area).
  • Our tentative date is September 24; that’s right before PASS Summit.
  • We’re playing around with the idea of a pre-con on the Friday before.  Deep-dive for a low-cost.
  • The event shirts WILL kick a$$.
  • We’re trying to figure out how to work with sponsors to make sure that they get a lot of value out of this; after all, they’re footing the bill, and we want to make sure they leave our event satisfied with the exposure they get.
  • If we can swing it, we’re thinking about one MAC-DADDY prize for attendees.  

A lot depends on finding the right location; good location = more attendee seating = more sponsor funds. Hopefully, we can secure something in the next week or so.

Keep you posted as things develop.

much delayed #sqlsat70 write-up

This write-up will be brief and to the point: SQL Saturday 70 rocked.  K. Brian Kelley (Blog|Twitter) and his team put together a great event (again), and it was a lot of fun catching up with so many SQL people.  Unfortunately, I had a rather severe sinus infection which kept me from really enjoying the event (in fact, I left after my sessions), but I did have a good time.

My sessions went very well; I had small crowds, but they were very involved.  I seem to keep picking esoteric topics (Data Architecture and XQuery), but the beauty of that is that I learn something new every time I start researching the subject.  One thing that stood out for me is that even though I pitched the event as an Intermediate event, I still had a lot of foundational material to cover.  I need to keep that in mind for future versions of my technical presentations.

General notes about the event: I have no complaints.  The space was wonderful, the speaker room was more than adequate, and the food was great.  The only thing that I think was missed was the same sin I’ve been guilty of at our AtlantaMDF events; the hosting organization wasn’t promoted enough.  We get so caught up in making sure the event flows smoothly that I think we forget to tout the monthly events enough.

Anyway, it was a great show.  If you attended my presentations, thank you; I hope you learned something.  The slides are available at the links below:

Data Architect: http://www.sqlsaturday.com/viewsession.aspx?sat=70&sessionid=3754

XQuery: http://www.sqlsaturday.com/viewsession.aspx?sat=70&sessionid=3755

What Should PASS be? #sqlpass

Andy Warren recently threw out a challenge for bloggers to “fix” things with the Professional Association for SQL Server in 3 years.   There have been some great responses so far (and I’m sorry if I’ve missed yours):

All of these posts have great ideas, and have influenced my thinking on my subject; I’ve had conversations with most of these authors about some of the finer points of the direction that PASS should take over the last year at Summit, SQL Saturdays, email, etc; the ideas that I’m going to post below are probably not too dissimilar than their thoughts (although we probably differ on some on the implementations of those ideas).

Heading off in a general direction…

Although Andy W. specifically asked for a 3-year plan, I think part of the problem with PASS is that the long-term vision is unclear.  There’s a big debate about whether or not PASS is a community organization, a business serving that community, or something else that’s not been well-defined.  Additionally, PASS struggles with its domain of influence; the organization is viewed as being U.S.-centric by most members outside of the states, and inside the states, the continued reliance on Microsoft’s presence in Seattle makes the organization seem distant to local users.  What should PASS be?

In a conversation with Andy W. a few months ago, I proposed that PASS should borrow from some of the great evangelistic traditions of Western civilization (I was originally thinking of a non-religious version of the five fold ministry of the early Christian church: apostles, prophets, evangelists, pastors, and teachers), and Andy threw out the word “guild”.  I like that concept; PASS should be a guild, providing training both in terms of learning about the tools (SQL Server and associated products) and growth in the guild (moving from a student to a master).  Guilds are both a community of learners, and a powerful force of influence; where the Summit goes, Microsoft should follow (instead of the other way around).  I think this thought echoes Grant’s call:

Get the word out that if you want training this is the place to be. If you want to be a trainer, this is the place to start, if you are a trainer, this is where you grow you brand.

Of course, that’s a long-term definitional goal ; in the short term, I see three areas for improvement.

Things to do in the next three years…

1. Have an election process that’s deemed fair and reliable by the majority of the membership. 

I applaud PASS for taking steps in this regard.  I obviously spent a great deal of time discussing this over the last 10 months, and I’ve arrived at a very different place than either Andy Leonard or Mike Walsh (I believe in a strong Nominating Committee with an opaque application process; Andy has called to abandon it altogether, and Mike believes in a simple pass-or-fail review of credentials).  While our viewpoints on the actual implementation may differ, I think we can all agree that PASS will continue to lack credibility if the method by which organizational power is attained is not supported by the constituency.    PASS needs to get the election process stabilized and supported before the next election.

2. Adopt the User Groups as an extension of the organization, rather than just partners in community.

The PASS Chapter model is essentially a good one; there is no better way (in my opinion) to reach SQL Server professionals interested in building their careers than through the User Groups.  Unfortunately, as Mike (and others) have pointed out, the loose affiliation between PASS and the chapters have left many chapter leaders questioning what does PASS really do for the chapters?  That needs to change.

Chapters should be the local arms of PASS; attendees to a chapter meeting should leave every meeting thinking that they are getting a monthly shot (albeit a smaller dosage) of the same knowledge that they get from a PASS SQLSaturday, a PASS SQLRally, and a PASS Summit.  Chapters should feel interconnected; as a chapter leader in Atlanta, I should know what topic TJay Belt is discussing in Utah, or what Roy Ernest is covering in Curaco.   I should feel confident (as should they) that I have access to the same resources for educating my members (including trained, professional speakers as well as online materials) as any other chapter.

Chapters should also be given the tools necessary to recruit new members to the guild, both those members of the community with lots of experience with SQL Server (and little-to-none with PASS) as well as those members of the community who are still figuring out what a clustered index is.  I realize that this is a huge task to take on in 3 years, but the initial groundwork must be laid; chapters need to feel that they are part of a larger organization, and they should be embraced as siblings (not distant cousins).

As a sidebar, I should note that while PASS chapters should not replace the online initiatives that PASS has recently invested in (the blogosphere and social networks), they should be the primary focus.   From my own personal perspective, I’ve recently discovered that as I’ve become less “plugged in” (changes in my personal life as well as new corporate firewall policies have prevented my social networking),  it’s been harder to stay invested in PASS and the SQL community.  For example, I missed the recent call for volunteers for Program Committee members; I’ve also missed quite a few calls for bloggers (like T-SQL Tuesday).  There needs to be better connectedness between “meatspace” (a term I borrowed from Brent Ozar) and the online community.

3.  Invest in the IT structure at HQ.

We’re an organization of information technology professionals, and as far as I know, we have a staff of 2 IT guys (a developer and an admin).  If PASS is going to be the essential tool for the SQL Server Professional, then the organization needs to build an IT infrastructure that can support community connectedness, the sharing of essential information, networking between members, and training resources to move passive members to active masters of their craft.  I am not sure what that would take, but I think the speaker bureau (as well as a speaker training program) is a good start.  PASS doesn’t need to be a SQLServerpedia or a SQL Server Central, but it does need to provide its membership with an awareness of what good SQL Server resources are, and how they should be used in the educational path of the member.

Summing Up…

As I said before, I’m envisioning PASS as a guild for SQL Server professionals; guilds have members with varying skill levels (from apprentice to master craftsman), and the goal of the guild is to train its members not only in the tools they use, but also in the ways of the guild.  We’ve got a long way to go, but I think we have some basic steps we need to master, and soon.