Development

Upcoming presentations…

So, if you’ve been wondering where I’ve been, the answer is “too frikkin’ busy to write”.  Settling in to a new marriage, changes at my job(s), and volunteer work have been a little more  time-consuming than I originally planned.  I’m hoping that’s gonna change in the near future, cause I’ve some great ideas for posts brewing in the background.

One idea that I’m really excited about is a change in the monthly meetings for AtlantaMDF (our SQL Server User Group); like most user groups, we have a meet-and-greet followed by a presentation (or two).  The presentations usually cover some mid-level to advanced topic, and usually assume that the attendees have some knowledge of SQL Server.  We’re going to tackle that assumption.

Starting at our next meeting (Monday, September 12), we’re going to present short presentations before the main one that are targeted to new users of SQL Server; the goal is to a) build up our new members, and b) help grow our speaker pool.  I’m excited to present the first topic “Stuff in the FROM clause” on Monday, so if you’re in the Alpharetta area Monday night, come by and see me.

Also coming up is our fourth SQL Saturday (#89); although I haven’t been as involved with the planning on this one as I have in years past, it’s been exciting to see it unfold.  I’m looking forward to presenting a session on XQuery, and hanging out by the user group booth. If you’re gonna be there, stop by and say Hi!

#sqlpass PreCON,Baby! #sqlsat89

Just ordered my tickets to my first ever pre-con, and it’s for a SQLSaturday being hosted by my chapter. I’m excited to attend John Welch’s Data Warehousing with SSIS Deep Dive on September 16, 2011 (BTW, tickets are still available).

Why this precon at this time?  Well, this year, conference funds are tight for me.  With my recent wedding, I’m really limited on both time and personal funds to attend a lot of conferences, so my own personal learning experiences are being restricted.  It was an easy sell to my employer to fund a $100 pre-con.

Second, my job is shifting, and I realized I needed to pick up some skills to finally make the transition.  I have a lot of exposure to ETL processes, but not so much SSIS.  Combine that with the fact that I haven’t done a lot of BI (most of our ETL is clickstream analysis), and there’s a big old gap in my data knowledge. 

Anyway, much longer post to follow on what I’ve been up to; just wanted to drop a quick note to say where I’m going to be before SQLSaturday 89.

#sqlpass–Last call for Community Choice

I’ve been out of pocket for the last couple of weeks (more on that later), but I was very excited to hear that two of my sessions were being considered for the Community Choice slots at PASS Summit this year.  If you haven’t voted yet, please consider voting for me; I’m looking for the opportunity to really grow this next year as a technical speaker, and I’d love to kick it off with a bang at Summit.  Voting closes TOMORROW (July 20, 2011), so vote now!

SQL Server XML 201 [Application and Database Development]
Stuart Ainsworth (Gladiator Technology Services)
 
Basic Guidelines for VLDB’s [Enterprise Database Administration and Deployment]
Stuart Ainsworth (Gladiator Technology Services)

 

VOTE HERE: http://www.sqlpass.org/summit/2011/UserLogin.aspx?returnurl=%2fsummit%2f2011%2fSummitContent%2fCommunityChoice.aspx

 

Just as an aside, I’d also like to plug a couple of other sessions by some AtlantaMDF members:

Bad SQL [Application and Database Development]
Geoff Hiten (Intellinet)
 
ETL Smackdown: PowerShell vs SSIS — with Aaron Nelson [BI Architecture, Development and Administration Topics]
Julie Smith (Key2 Consulting)
Aaron Nelson (@SQLvariant)

 

4 out of the 20 sessions being considered for Community Choice are by speakers from my local chapter, which makes me proud. Thanks for your consideration, and more posts to come soon!

distractions and other news…

Sorry for the absence from blogging for a bit; a lot on my plate.  Next week, I’ll be turning 40 on July 5th, and then getting married on July 9th.  The latter is much bigger news than the former, but both are reasons to celebrate.

Another reason to celebrate?  I got accepted to speak at SQL Saturday 64 (Baton Rouge).  I’m excited to head back home for good food and good times.  I needed a little affirmation after the disappointment over Summit.

Anyway, I may not blog much in the next few weeks; please stay tuned, because good things are happening.

#sqlpass disappointed, but needing the kickstart

So, PASS Summit 2011 session selection emails have started going around; it looks like none of my sessions made the cut.  I need some time to digest this, but I mostly need to think about what that means for me moving forward.  These last two years have been a time of great change for me, and I need to get back on track. First things first: I need to figure out what track I want to run on.

Three Myths about Agile Development

I recently attended Microsoft Tech Ed in Atlanta, and while there wasn’t much new being announced about SQL Server (I had heard about many of the features for Denali at PASS Summit 2010), I did find myself drawn to several sessions regarding Agile principles and development.  My shop has been using the Scrum method for about 2 years now, and it was nice to have a refresher.  I also participated in (and overheard) a lot of conversations about Agile methods, and it made me realize two very important things:

  1. Many people who claimed to be using Agile methods had never read the Agile Manifesto, and
  2. There are several misconceptions in play regarding Agile development.

The point of this blog post is twofold; first, I want to encourage you to read the Agile Manifesto.  If you’ve read it before, read it again.  And then, read it a third time (it’s short, so easy to read).  Done that?  Good, because here’s the crux of my argument:

If you want to do Agile development, you must adhere to the principles of the Agile Manifesto.

It’s simple, really; you shouldn’t claim to be a SQL Server developer if you’ve never written a T-SQL statement.  You can’t call yourself a cubist if you haven’t studied the works of Picasso.  You shouldn’t claim to be doing Agile development if you don’t adhere to the principles of the Agile Manifesto.

And, that leads us to the second part of this post; I believe that lots of us think we’re adhering to the methods and principles of Agile development, but there are at least three basic myths about Agile development which keep development teams from being as agile as they can be; here’s my take on them:

Myth 1: Daily meetings with business people are an impediment to rapid development.

I actually got into a fervent discussion with a gentleman at TechEd about this subject during a Birds of a Feather session on Scrum.  He claimed to be a Scrum Master for 6 teams (including several overseas), and said that he barred business people from the daily standup in order to keep them from dragging the meeting astray.  I think that’s wrong, and here’s why (from the Agile Manifesto principles):

Business people and developers must work together daily throughout the project.

While it’s true that the daily standup in Scrum need not be the daily interaction, it makes sense that business people LISTEN (but not INTERACT) in that meeting in order to understand what issues the development team is working on, and how those issues interplay with each other (Note: Scrum calls this the chicken and the pig; business people need to know what’s going on across the development team, but shouldn’t be involved at this point.  However, the daily standup can spur additional conversations).  If your development team chooses to have a daily standup without business people, your team members MUST still interact with business people daily in order to handle changing requirements; at that time, they must also communicate what the priorities of the development organization are, and why a particular project may not be progressing because some other project takes priority.

Agile development depends on the interaction between developers and business people; to isolate half of the team from the other half of the team will cause disruption to the process.  That leads us to our second myth:

Myth 2: Your development team can be agile in a vacuum.

I call this the Agile-Waterfall mindset; your business organization is separate from your development team.  Your developers are practicing some form of Agile development, but the organization is used to handing off a set of requirements to the developers, and then having them return a product at periodic intervals.  Think of this as the complement to Myth 1; Business people aren’t deemed to be an impediment, but the organization hasn’t endorsed agile development throughout.  Daily meetings with developers aren’t deemed to be a priority by the business people; the organization has developed a culture of handing off responsibilities, and expecting them to be fulfilled without daily guidance.

By definition, you cannot have an agile team without input from both developers and business people.  If you want to respond to changing requirements (as frustrating as that can be to developers), you must have input from business people as soon as those requirements change.  Again, you need to handle prioritization, as changing requirements do not necessarily merit immediate priority.

Myth 3: Self-organizing teams self-manage efficiently.

A couple of great principles from the Agile Manifesto deal with communication:

The most efficient and effective method of conveying information to and within a development team is face-to-face conversation.

The best architectures, requirements, and designs emerge from self-organizing teams.

While I believe in the wisdom of these two principles, I don’t want to de-emphasize the need for good, basic software design principles.  Most enterprise development consists of intertwined projects and resources; in order to minimize maintenance issues, adherence to consistent programming standards is a must.  Developers have different naming standards, procedural methodology, and architectural perspectives; a good team has a playbook that ALL members of the development team (regardless of what project team they serve) follow. If you have one database developer that makes heavy use of schemas, and another one that doesn’t, maintaining each other’s code requires some additional effort on their parts.  Furthermore, when teams are self-formed of roughly equally-experienced developers, resolution of architectural decisions can be difficult.

Development teams need an enforcer; a good manager goes a long way toward resolving interpersonal conflicts before they get started.  Just because teams communicate well (and good communication includes conflict), it does not necessarily mean that those same teams will develop quality code in an efficient manner.  Good teams need good direction.

Summing Up.

If you’ve made it this far, I hope I’ve given you some food for thought, and encouraged you to go back and revisit both the Agile Manifesto and your own organizational processes.  Let me sum up with a final thought from the Agile Manifesto:

At regular intervals, the team reflects on how to become more effective, then tunes and adjusts its behavior accordingly.

Reminder: Data Architecture PASS VC presentation tomorrow

Just a quick note: I’ll be presenting from the wilds of Hoschton, GA via Live Meeting tomorrow night at 8PM EST.  Details cut and pasted below from http://dataarch.sqlpass.org/:

 

Subject:
From DBA to Data Architect: Changing Your Game

Start Time:
Thursday, May 19, 2011 8:00 PM US Eastern Time (May 20, 2011 1:00 AM GMT)

End Time:
Thursday, May 19, 2011 9:00 PM US Eastern Time (May 20, 2011 2:00 AM GMT)

Presenter:
Stuart Ainsworth (blog|@CodeGumbo)

Live Meeting Link:
https://www.livemeeting.com/cc/UserGroups/join?id=JQTC9F&role=attend&pw=h%3E%234n%212Mj

From DBA to Data Architect: Changing Your Game
The role of database administrator has been around for years, but as information collection and storage needs have skyrocketed, a new career opportunity has opened within enterprises: the data architect. This session is intended to give some guidance on the differences between database administrators (and their cousins, the database developer) and data architects, including specific advice on how to transition from one role into the next.

Stuart Ainsworth
Stuart R Ainsworth, MA, MEd is a Database Architect working in the realm of Financial Information Security; over the last 15 years, he’s worked as a Research Analyst, a report writer, a DBA, a programmer, and a public speaking professor. He’s one of the chapter leaders for AtlantaMDF, the Atlanta chapter of PASS. A master of air guitar, he has yet to understand the point of Rock Band ("You push buttons? What’s that all about?").

#msteched Columnstore indexes unveiled–DBI312

Live blogging again; hope you find my notes useful (scattered though they are).  I’ve been waiting on this session because it’s a very specific area of interest.  I work a lot with VLDB’s, and performance is always a concern; claims are that Denali’s columnstore may boost performance of certain queries a hundredfold.  Let’s see how they work, and I’m hoping I can convince my boss to set up a test bed to try this out.

Presenter is Eric N Hanson from Microsoft (Twitter). 

We start off with a story; I like story-time.  Actually, it’s a very effective way to break out use cases.

Buzzphrase for Columnstore: “Enabling interaction with data”.  Supposed to be super efficient, and get large amounts of data back from SQL Server Denali.  Internal project name is Apollo; columnstore is only part of the picture.

Area of focus is BI & DW: load large amounts of data, high-read, incremental loads.  Partitioning is mandatory for this feature.

Curious as to why the examples join tables in the WHERE clause, and not the more accepted syntax of JOIN.
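
For anyone who hasn’t run into both styles, here’s a quick sketch of the difference.  The table variables and column names are made up for illustration; they’re not from the presenter’s demo.

```sql
-- Hypothetical sample data; none of these names come from the session.
DECLARE @DimDate   TABLE (DateKey int PRIMARY KEY, CalendarYear int NOT NULL);
DECLARE @FactSales TABLE (DateKey int NOT NULL, SalesAmount money NOT NULL);

INSERT INTO @DimDate (DateKey, CalendarYear)
VALUES (20100101, 2010), (20110101, 2011);

INSERT INTO @FactSales (DateKey, SalesAmount)
VALUES (20100101, 10.00), (20100101, 15.00), (20110101, 20.00);

-- Older style: both tables listed in FROM, join condition placed in WHERE.
SELECT d.CalendarYear, SUM(f.SalesAmount) AS TotalSales
FROM @FactSales AS f, @DimDate AS d
WHERE f.DateKey = d.DateKey
GROUP BY d.CalendarYear;

-- ANSI-92 style: join condition in ON, leaving WHERE free for filters.
SELECT d.CalendarYear, SUM(f.SalesAmount) AS TotalSales
FROM @FactSales AS f
INNER JOIN @DimDate AS d
    ON f.DateKey = d.DateKey
GROUP BY d.CalendarYear;
```

Both queries return the same rows; the second form just makes it harder to forget a join condition and end up with an accidental cross join.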

K, here comes the magic: example uses a Fact Table with 100 million rows in it.  Clustered on a date column, and a columnstore index.  Clustered index is still B-TREE; columnstore indexes are nonclustered.

Running duplicate queries; using index hint to force optimizer to use the clustered index in one example.  Wow; 100,000,000 rows of data aggregated in a second on a two-year-old laptop.  50x speedup on this particular hardware.  According to presenter: “this is the biggest enhancement to SQL Server since we bought the code from Sybase”.

And here’s the meat and potatoes; how does this work?  Vertical partitioning stores each column in separate pages.  Columnstore is based on the same code as PowerPivot and the BI engine; look up VertiPaq if you want to do more reading on this.  Columnstore data is highly compressed, so there’s a smaller footprint to read from disk, and more of it can be kept in main memory.

New query execution plan: batch processing.  “the edsel is the way of the future”.  Actually, the idea is that batches of vectors are stored in query plan; highly efficient data representation.  We can also scale to more cores: tests are showing linear acceleration up to 32 cores.

Instead of storing data as a page, data is stored as a column segment which represents about 1,000,000 rows.

Questions have begun; some questions are good, but this is a 300 level session, folks.  If you don’t understand basic SQL syntax (like how to create an index), this may not be the session for you.  Great question about the relevance of traditional indexes after this is unveiled, and Hanson’s response: in most Decision Support Applications, columnstore is the way to go particularly for scans.

Some index hints for choosing the columnstore or ignoring it:

WITH (INDEX(index_name))

OPTION (IGNORE_NONCLUSTERED_COLUMNSTORE_INDEX)  -- use for bad plan selection if necessary

Same traditional rules for index hints: trust the optimizer first, rewrite second, and then use hints last.
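
Putting the demo setup and those two hints into full statements, here’s a minimal sketch.  The table, column, and index names are hypothetical, and I’m assuming the syntax is CREATE NONCLUSTERED COLUMNSTORE INDEX; the presenter’s actual demo schema isn’t in my notes.

```sql
-- Hypothetical fact table; all names here are illustrative.
CREATE TABLE dbo.FactSales
(
    SaleDate    date  NOT NULL,
    ProductKey  int   NOT NULL,
    SalesAmount money NOT NULL
);

-- The clustered index stays a regular B-tree.
CREATE CLUSTERED INDEX CIX_FactSales_SaleDate
    ON dbo.FactSales (SaleDate);

-- The columnstore index is nonclustered and names the columns it covers.
CREATE NONCLUSTERED COLUMNSTORE INDEX NCCI_FactSales
    ON dbo.FactSales (SaleDate, ProductKey, SalesAmount);

-- 1. No hint: trust the optimizer first.
SELECT ProductKey, SUM(SalesAmount) AS TotalSales
FROM dbo.FactSales
GROUP BY ProductKey;

-- 2. Table hint: force a specific index (here the clustered B-tree,
--    which is how you would get a baseline to compare against).
SELECT ProductKey, SUM(SalesAmount) AS TotalSales
FROM dbo.FactSales WITH (INDEX(CIX_FactSales_SaleDate))
GROUP BY ProductKey;

-- 3. Query hint: tell the optimizer to ignore the columnstore index.
SELECT ProductKey, SUM(SalesAmount) AS TotalSales
FROM dbo.FactSales
GROUP BY ProductKey
OPTION (IGNORE_NONCLUSTERED_COLUMNSTORE_INDEX);
```

Comparing the plans for queries 1 and 2 is the easiest way to see whether the columnstore is actually being picked up on your own data.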

A couple of new icons for query execution plans: columnstore scan, and batch hash table processing.  Each execution operator now operates in either batch mode or row mode; batch mode is what you want for speed. 

New term of interest: dictionary.  A dictionary is storage for unique values with a lookup so that a column can store highly compressed information.
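
To make the dictionary idea a bit more concrete, here’s an analogy in plain T-SQL.  This is not how the engine stores anything internally; it just shows a repetitive string column being swapped for small integer codes plus a lookup of the distinct values.  The table variables and names are invented.

```sql
-- Hypothetical data: a column with lots of repeated string values.
DECLARE @Sales TABLE (SaleID int PRIMARY KEY, Region varchar(20) NOT NULL);
INSERT INTO @Sales (SaleID, Region)
VALUES (1, 'Southeast'), (2, 'Southeast'), (3, 'Northwest'),
       (4, 'Southeast'), (5, 'Northwest');

-- The "dictionary": one row per distinct value, each with a small integer code.
DECLARE @Dictionary TABLE (RegionCode int PRIMARY KEY, Region varchar(20) NOT NULL);
INSERT INTO @Dictionary (RegionCode, Region)
SELECT ROW_NUMBER() OVER (ORDER BY Region), Region
FROM (SELECT DISTINCT Region FROM @Sales) AS d;

SELECT RegionCode, Region
FROM @Dictionary
ORDER BY RegionCode;

-- The "encoded" column: each row carries only the integer code,
-- and the original string is recovered by looking it up in the dictionary.
SELECT s.SaleID, d.RegionCode
FROM @Sales AS s
INNER JOIN @Dictionary AS d
    ON d.Region = s.Region
ORDER BY s.SaleID;
```

The fewer distinct values a column has, the smaller the dictionary and the better the compression, which is why fact tables with lots of repeated keys are such a good fit.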

Most things just work with SQL Server; Backup and Restore, Mirroring, SSMS, etc.

Lots of data types don’t work with columnstore: long decimals, binary, BLOB, uniqueidentifier, long datetimes, CLR, and (n)varchar(max).

Query performance restrictions: outer joins and unions.  Stick with inner joins, star joins (need to look this one up), and aggregation.  About to show a query which doesn’t benefit from batch processing.  Essence is below:

SELECT t.ID, COUNT(t2.ID)
FROM t LEFT JOIN t2 ON t.ID=t2.ID
GROUP BY t.ID

Left Join knocks it out of batch processing; need to rewrite as an INNER JOIN, but note that you lose the NULL values, so you have to use a CTE; need to get slides for his sample, but you do an INNER JOIN in the CTE, and then do an OUTER JOIN. 

WITH CTE( INNER JOIN)
SELECT blah
FROM t OUTER JOIN CTE ON t.ID yada yada.
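
Since I don’t have the slide, here’s my own guess at what the full rewrite might look like; it is not the presenter’s sample.  It assumes ID is unique in t, and it uses table variables only so the script stands alone (so it shows the shape of the query, not batch mode in action).

```sql
-- Hypothetical data standing in for t and t2.
DECLARE @t  TABLE (ID int PRIMARY KEY);
DECLARE @t2 TABLE (ID int NOT NULL);
INSERT INTO @t  (ID) VALUES (1), (2), (3);
INSERT INTO @t2 (ID) VALUES (1), (1), (3);

-- Original shape: the LEFT JOIN feeds the aggregate directly.
SELECT t.ID, COUNT(t2.ID) AS MatchCount
FROM @t AS t
LEFT JOIN @t2 AS t2
    ON t.ID = t2.ID
GROUP BY t.ID;

-- Rewritten shape: do the heavy aggregation over an INNER JOIN inside the CTE,
-- then OUTER JOIN back to t to restore the IDs that had no match.
WITH Matched AS
(
    SELECT t.ID, COUNT(t2.ID) AS MatchCount
    FROM @t AS t
    INNER JOIN @t2 AS t2
        ON t.ID = t2.ID
    GROUP BY t.ID
)
SELECT t.ID, COALESCE(m.MatchCount, 0) AS MatchCount
FROM @t AS t
LEFT OUTER JOIN Matched AS m
    ON t.ID = m.ID;
```

Both statements return one row per ID in t, with a count of 0 for the unmatched ID; the idea is that the expensive part (join plus aggregate over the big table) stays an inner join, and the outer join only touches the already-aggregated result.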

Adding data to columnstore; basic methods:

1.  Drop the index before the load, then re-create it afterward.  Expensive, but works well with traditional daily builds.

2.  Partition switching.  Sweet spot needs to be tested, but an easy one is the hour.  NOLOCK queries pre-empt the ability to do partitioned queries.  Need to read up on this, but it may be fixed in a future version.

3.  Trickle load can be done, but needs to be tested.
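
Here’s a rough sketch of the first two methods, assuming the table can’t be modified while the columnstore index is in place (which is why these load patterns exist).  Every object name below is hypothetical, and the monthly partition boundaries are just for illustration.

```sql
-- Monthly partitions: P1 < 2011-04-01, P2 = April 2011, P3 >= 2011-05-01.
CREATE PARTITION FUNCTION pfSaleMonth (date)
    AS RANGE RIGHT FOR VALUES ('2011-04-01', '2011-05-01');
CREATE PARTITION SCHEME psSaleMonth
    AS PARTITION pfSaleMonth ALL TO ([PRIMARY]);

CREATE TABLE dbo.FactSales
(
    SaleDate    date  NOT NULL,
    ProductKey  int   NOT NULL,
    SalesAmount money NOT NULL
) ON psSaleMonth (SaleDate);

CREATE CLUSTERED INDEX CIX_FactSales
    ON dbo.FactSales (SaleDate) ON psSaleMonth (SaleDate);
CREATE NONCLUSTERED COLUMNSTORE INDEX NCCI_FactSales
    ON dbo.FactSales (SaleDate, ProductKey, SalesAmount) ON psSaleMonth (SaleDate);

-- Method 1: drop the columnstore index, load, then re-create it.
DROP INDEX NCCI_FactSales ON dbo.FactSales;
INSERT INTO dbo.FactSales (SaleDate, ProductKey, SalesAmount)
VALUES ('2011-04-15', 1, 19.99);
CREATE NONCLUSTERED COLUMNSTORE INDEX NCCI_FactSales
    ON dbo.FactSales (SaleDate, ProductKey, SalesAmount) ON psSaleMonth (SaleDate);

-- Method 2: load a staging table shaped like one partition, index it the same
-- way as the fact table, then switch it in as a metadata-only operation.
CREATE TABLE dbo.FactSales_Stage
(
    SaleDate    date  NOT NULL
        CONSTRAINT CK_FactSales_Stage_Range CHECK (SaleDate >= '2011-05-01'),
    ProductKey  int   NOT NULL,
    SalesAmount money NOT NULL
) ON [PRIMARY];

CREATE CLUSTERED INDEX CIX_FactSales_Stage
    ON dbo.FactSales_Stage (SaleDate);
INSERT INTO dbo.FactSales_Stage (SaleDate, ProductKey, SalesAmount)
VALUES ('2011-05-16', 2, 5.00);
CREATE NONCLUSTERED COLUMNSTORE INDEX NCCI_FactSales_Stage
    ON dbo.FactSales_Stage (SaleDate, ProductKey, SalesAmount);

-- Partition 3 of the fact table is still empty, so the switch is allowed.
ALTER TABLE dbo.FactSales_Stage SWITCH TO dbo.FactSales PARTITION 3;
```

The trade-off: method 1 rebuilds the index over the whole table on every load, while method 2 only ever builds it over the new slice, at the cost of more moving parts.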

Very awesome; I cannot wait until this is actually released in CTP 3, so I can play around with it.

#msteched “Juneau” preview

Sitting in the Database Dev session at Tech Ed, taking a look at “Juneau”, the new development suite for SQL Server.  I can’t wait to see if this thing slices and dices data like a Ronco food processor.  Again, live blogging, so please excuse the scattered thoughts, poor spelling, and other bad habits below.

Database development is hard.  I’d love to use that sometimes when my boss asks me why something is running behind schedule. 

Great question!  Where does the definition of the database live if most of your code is built on ALTER scripts?

We should work declaratively, not scripted; devs write CREATE statement, and let the tools manage the change statements.
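
A small sketch of the difference, using a made-up Customer table (not from the session).  In the scripted world, the current shape of the table only exists as the sum of its change history; in the declarative world, you keep one CREATE statement describing the end state and let the tooling work out the ALTERs.

```sql
-- Scripted: the table's definition is spread across every change script ever run.
CREATE TABLE dbo.Customer
(
    CustomerID int         NOT NULL PRIMARY KEY,
    Name       varchar(50) NOT NULL
);
ALTER TABLE dbo.Customer ADD Email varchar(255) NULL;
ALTER TABLE dbo.Customer ALTER COLUMN Name nvarchar(100) NOT NULL;

-- Reset so the declarative version below can run in the same session.
DROP TABLE dbo.Customer;

-- Declarative: a single statement that says what the table should look like now.
-- A tool compares this against the target database and generates the changes.
CREATE TABLE dbo.Customer
(
    CustomerID int           NOT NULL PRIMARY KEY,
    Name       nvarchar(100) NOT NULL,
    Email      varchar(255)  NULL
);
```

Both paths end with the same table; the difference is which artifact you check into source control and read when you want to know what the table looks like.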

Demos forthcoming…

 

Connected development; working directly with the database server.  Much like SSMS, but it’s obviously the VS shell instead.  Operation is very similar to SSMS in that you build and execute queries inside of the shell; I wonder if execution plans work.

Table designer: uh-oh.  Not sure if this is a good idea.  However, the tools do drift detection before building an active script.

Side note to Microsoft presenters: sometimes the patter works, but only when it’s natural.  Usually, it feels scripted, which makes it clunky instead of funny.

 

Offline Development: Import database into a project, to build a model from the database.  Declarative code is the keyword.  Left-side versus right-side development; idea is that information is arranged depending on the perspective to which you are accustomed.  DBA’s will see the traditional database hierarchy; developers will see the namespaces.

Table designer preview again; nice thing is that it is generating scripts on the fly.

SSDT is deployed to a new lightweight test-and-debug single user instance on your desktop.  Cool stuff; no need to deploy to a test server; it’s there already.

Ughhhhh; live demos are painful.  I’m going to remember to NOT DO THAT EVER AGAIN when I present. 

Yay!!! Execution plans are there!!!!  SWEET.

Interesting way to handle connected and disconnected development.  Since you are building CREATE statements, you can’t run them in a connected sense; execute fails.  But what happens when you run it and are building a new object?  Will it execute?

OMG!!!!!!!! THEY ARE SHOWING A STORED PROCEDURE WITH A F’ING WHILE LOOP IN IT!!!!! I can’t believe that one slipped by.
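
For anyone wondering why that sets me off: I didn’t capture the procedure from the slide, so this is just a generic illustration (with an invented table variable) of row-by-row processing versus the set-based statement that does the same work.

```sql
-- Hypothetical data.
DECLARE @Orders TABLE
(
    OrderID    int   PRIMARY KEY,
    Amount     money NOT NULL,
    Discounted money NULL
);
INSERT INTO @Orders (OrderID, Amount)
VALUES (1, 100.00), (2, 250.00), (3, 75.00);

-- Row-by-row: one UPDATE per order, driven by a WHILE loop.
DECLARE @id int = (SELECT MIN(OrderID) FROM @Orders);
WHILE @id IS NOT NULL
BEGIN
    UPDATE @Orders
    SET Discounted = Amount * 0.9
    WHERE OrderID = @id;

    -- MIN over no rows yields NULL, which ends the loop.
    SELECT @id = MIN(OrderID) FROM @Orders WHERE OrderID > @id;
END;

-- Set-based: the same result in a single statement.
UPDATE @Orders
SET Discounted = Amount * 0.9;

SELECT OrderID, Amount, Discounted
FROM @Orders
ORDER BY OrderID;
```

On three rows nobody will notice the difference; on a few million, the loop version is the one that gets you paged.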

Cooling off; I totally zoned out on the last few steps.  Lemme calm down and move on to the next one.

 

Publish to SQL Azure: More accurately, the project can be targeted to various editions, including Azure.

Nice feature to work online and offline.  Not much else to say; the tool changes the model to fit the deployment target.

 

SSMS is NOT going away, but the majority of the development functionality will be replicated in Visual Studio. Both will have Server Explorers and query windows, but SSMS will still ship as a part of the client tools for SQL Server.

Schema comparison tools still are lackluster compared to Red Gate, but they seem to have improved a little since I last remember them.  Not really much new here compared to the database projects of 2010 and 2008, but it looks like there has been some thought about best practices.

Overall, a good presentation, but I still think there’s a lot left to do.

#msteched skydiving into the cloud

Here’s a first for me; I’m live-blogging from a keynote.  Diving straight into it, so please excuse the sparseness of the notes.

Robert Wahbe trying to lay out the vision for the cloud from Microsoft; trying to establish that the future is the cloud.  I hate buzzwords, but I think the point of the metaphor is to draw together private vs public application development.  If you build your internal private apps as a private cloud, it should be easy to move to a public distributed architecture.

Factors driving the move to the cloud:

  • Extension of existing applications
  • large data sets
  • high perf computing
  • events
  • marketing campaigns (high spike traffic)

Talk about Travelocity’s use of the public cloud.  Focus on scalability, and new products to handle the traffic.  Capacity on demand is the buzzword that they use, and I wish I had a server room like they just showed. 

How easy is it to do this?  That’s the question they raise, and I’m still a skeptic.  Biggest problem is not a greenfield scenario, but migration from an existing infrastructure.  Let’s see if demo guy (I missed his name) can convince me.

Ooh, a Contoso application; I’ve been stuck on AdventureWorks for too long.  Somebody bring back Northwind.

Ooh, we all need AUTOMATION!  AUTOMATION is what we need! Really?

OK, the VMM service deployment looks cool.  

OK, ADHD just kicked in, and I’ve heard blah, blah, blah for the last couple of minutes.  Apparently, System Center will let you do all kinds of one-button stuff.  Now, can I get management to convert from an app that had no clue about that 10 years ago but is still critical to my business?

Now, SQL and SharePoint with Amir Netz.  Project Crescent looks interesting. Wow, I like the animation.  I think there will be a lot of future in Management of Business Analysts.  Too much information can cripple a company just as badly as too little; there’s going to have to be some training in “what are good questions to ask?”  We’ve finally got the tools to tell us “42”, but do we understand our businesses well enough to understand how we got there?

Average number of devices is 4, so a lot of us are now carrying 6 or 8.  I have two on me right now (work lappy and iPhone), and two at home.  I feel geek-deprived.   Must buy more gear.

Windows Phone 7 interface is nice, but I think Microsoft has a long way to go to win the smartphone wars.  I really don’t want to go back to carrying multiple devices (smartphone for personal, and smartphone for work).

Just realized that my butt is starting to hurt; that’s a good indicator that a keynote is running long.  Time for the mid-keynote stretch. 

I’m wondering how many butterflies the Kinect guy has right now; the first few gestures slipped a little.

Second half of the keynote?  Yeesh….  I hope there’s a power plug in the building somewhere.

I chuckled a little when the guy said that “if you’re an ASP.NET developer, you can write SharePoint Applications.”  I think of all of the swearing that my ASP.NET developer buddies do when you mention SharePoint…

OK, I’ve done as much as I can for now; I’ll try to blog more today around the sessions.