management

#DevOps: Embrace the Ops

If you’re at all in touch with the DevOps community, you’re probably aware of the GitLabs Incident on 1/31/2017; I won’t spend too much time rehashing it here, but GitLabs has done a great job of being transparent about the issue and their processes to recover. Mike Walsh (Straight Path Solutions) wrote a great blog post about it entitled DevOps: Don’t Forget The Ops, which covers a lot of ground from a database administration perspective. Mike ultimately ends up with three specific action items for DevOps teams:

  1. Plan to Fail (so you don’t)
  2. Verify Backups (focus on restores, not backups)
  3. Secure your environment (from yourself).

I agree with all of these ideas; I think Mike is spot on about the need to Remember the Ops in DevOps. However, I want to go a step further, and encourage DevOps adoptees to Embrace the Ops.

What do I mean by that? Let me start with this; Brent Ozar posted this on Facebook yesterday (the image will take you to the job description):

Now, it’s obvious that GitLabs had a backup strategy (they detailed it in their notes), so I don’t mean to imply that they didn’t expect administrative tasks from their database people, but I do think we can infer that administrative tasks were not prioritized as much as other tasks (high availability, performance tuning, etc.). Again, we know that GitLabs had strategy for backups, so it appears that this is a cultural issue (at least based on this flimsy evidence and the outage). And to some degree, that’s understandable; one of the longest running challenges on the operations side is being labeled as a cost center as opposed to development being viewed as a revenue generator. This perception is pervasive in traditional IT shops, so it’s probable that even Unicorn shops share some of this mentality. Development (new features) makes money; Operations cost money.

However, in a true DevOps model, the focus is on delivering quality services to customer, faster. New features may bring new clients, but reliable service retains clients; both are revenue generating. So while it may add some cost to deliver quality service to customers, cutting corners in operations risks impacting the bottom line. From this perspective, I’m arguing that DevOps shops should not only remember the ops, they should embrace it. The entire value stream of a business service includes people, procedures, and technology split into teams; the fewer the teams per service, the fewer silos. So how do we embrace the ops?

  1. If Ops is part of the Value Stream, then apply consistent Development principles to it. I’ve written before that “we are all developers“, and I believe that; administrators are creative folk, just like application developers. Operations includes backup, monitoring, and validation. We should apply development principles to these operations, like creating reusable scripts, finding opportunities for automating validation, and logging (and investigating) errors with that pipeline. We should use source control for these tools, and treat the operations pipeline like any other continuous integration project (automate your backup, automate your restores, and log inconsistencies).
  2. Include operational improvements as part of the development pipeline. I’m borrowing a lot from Google’s SRE model; SRE is what you get when you treat operations as if it’s a software problem (see point 1 above).  However, the SRE model is usually a self-contained bubble within operations; they have their own pipelines for toil reduction. I think if DevOps wants to truly embrace operations, developers need to include toil reduction in the service delivery pipeline. If operations folks have to flip 30 switches to bring an app online, development should make it a priority to reduce that (if possible). It goes back to the fundamental rule for DevOps: communicate. Help each other resolve pain points, and commit to improving everything in the value stream.
  3. Finally, balance risk and experimentation with safety. Gene Kim’s The Phoenix Project provides the Three Ways, and the Third Way is all about creating a culture that rewards risk and experimentation. This is great for developers; try something new, and if it breaks, you can deliver a fix within hours. However, as the GitLabs incident shows, some fixes can’t be delivered, and risk needs to be mitigated by secure data handling processes and procedures. While I’m a big fan of controlled failures (e.g., shutting a server down hard in order to see what the impact is), you don’t do that unless you can test it in a lab first and make sure you have good mitigating option (how do you recover? What error messages do you expect to see? Are you sure your backup systems are working?). Don’t forsake basic safety nets while promoting risk; you want competitive advantages, but you also want to stay in business.

5 #DevOps Books I plan to finish this year

New Year.  Resolutions, etc. 🙂

I’m notoriously bad about starting a book and never finishing it, particularly when it’s a technical book.  My goal this year is to finish the following 5 books:

The DevOps Handbook: How to Create World-Class Agility, Reliability, and Security in Technology Organizations

Gene Kim is perhaps best known for his novel “The Phoenix Project”, which lays out the fundamental precepts for DevOps.  The Handbook (by Kim, Patrick Debois, John Willis, and Jez Humble) gets great reviews, and I think it does a good job of translating theory into practice.  I’ve only finished about a third of it, so I’ve still got a lot of reading left to do, but I hope to finish it soon.

Site Reliability Engineering: How Google Runs Production Systems

This one might be a little easier to cheat on my goal; I’ve already read most of it.  It’s a collection of papers written by various SRE’s within Google, and gives some great insights into their vision of applying developmental principles to operation problems.  While it could be argued that the SRE model is distinct from DevOps, there’s enough overlap that it makes sense to apply these techniques to my DevOps study.

Level Up Your Life: How to Unlock Adventure and Happiness by Becoming the Hero of Your Own Story

This one’s a bit of a stretch for most DevOps folks, but if you think of it an approach to personal continual improvements, then it makes sense why this book belongs in a DevOps collection.  I started reading this one last year, and quickly off the bandwagon.  My goal is to try and finish it by the middle of the year, and hopefully begin to apply some of the principles to my personal and professional challenges.

The Art of Capacity Planning: Scaling Web Resources in the Cloud

I heard John Willis at DevOpsDays Nashville this year, and he recommended following and reading John Allspaw (among other people); the second edition of this book is coming out this year, so I’ll probably wait till it arrives.  While I don’t do much with either web or cloud development, the principles of scaling is relevant to all kinds of applications.

Team of Teams: New Rules of Engagement for a Complex World

Damon Edwards actually recommended this book during a webcast I saw a couple of months ago, and while it’s not a technical book, it speaks to the art of transforming a large, complex organization with entrenched policies into a nimble, responsive team.  Brownfield to greenfield (with military references).

My Challenges with #DevOps

As I’ve alluded to in earlier posts, my career goals have transitioned away from database development and administration into DevOps implementations; it’s been a bit of a challenge for me, because I feel like a stranger in a strange land all over again. Looking at familiar problems turned on their sides isn’t for me, and my day job has some particular challenges that I need to figure out appropriate solutions for. All of that being said, I’ve enjoyed it. However, in an attempt to help me wrap my head around things, I wanted to list out the struggles I’m facing, and include my current “solutions”; these may change over time, but it’s where I’m at today.

  1. It is what it is…

One of the challenges of describing DevOps is that it’s a conglomerate of technical and cultural changes. System and software engineers can easily understand the technical components of iterative software deployment, but it’s tough to describe the organizational and procedural changes necessary to implement a rapid deployment environment (“You want the developers on pager duty? How will they administer the system?”). Most engineers have a tough time interpreting The Phoenix Project, because it’s not a technical manual; they’re used to step-by-step guides, not cultural strategies.

My solution is to describe DevOps as a philosophy, not a methodology. Philosophies have general principles that you agree to abide by (such as seeking efficiency through automation, increasing feedback, and documenting problems without blame); methodologies are strategies for implementing philosophies. What this means is that as a manager, my method of implementing DevOps is probably different than yours, and there may even be differences within the organization; the key focus should be on reducing or eliminating silos through communication. Where those silos must remain (say, in a highly regulated environment where development and operations need to be separate), workflow in each silo needs to be as transparent as possible.

  1. Brownfield change is harder than greenfield development.

Greenfield projects are new software projects or initiatives; brownfield development is a revitalization of an existing project. While each has challenges from a DevOps perspective, the technical and cultural debt that is associated with brownfield development often slows down adoption of DevOps practices. It’s tough to maintain a system while making suggestions for improvement, particularly when multiple departments are involved, all with their own goals.

For me, it all goes back to two principles: tight focus, and increased communication. As a change agent, I need to carve out time each week to focus on one small, incremental change that I can make to increase efficiencies; for example, I’m currently working on developing a standard postmortem practice for tracking issues with our business service (using this free guide from VictorOps). The point is that I may not have opportunity to make sweeping changes, but I can do a little bit at a time (and encourage my team to do so as well).

  1. Compliance is a constraint.

Working in the financial industry brings some unique challenges to implementing DevOps; while philosophically the ideal DevOps process is to build automation pipelines from development through deployment, regulatory policies are written to dictate separation and control between environments. The thicker the wall, the more likely you are to successfully pass an audit, but those walls make it tough to attain rapid development. If your developers have one set of goals (deploy new features) and your operations team have another (keep the system stable with few changes), you’ve got to figure out a way to reconcile those.

Communication is key, and that includes have a common issue tracking system to report operational issues to development as soon as they occur; I don’t manage the developers in my business unit, so I can’t set their priorities. But I can make them aware of the pain points, and the expenses associated with those struggles. I can also find ways to make our infrastructure more predictable so that developers can develop code faster, and our QA teams can automate tests with some assurance. It’s tough, but it’s my goals.

Summary

I realize that this post may not be that insightful, but I’m looking at it as an effort to keep writing and thinking about these issues. Expect more from me in the future as I continue to try and learn something new.

#DevOps – Starting the path

Last week, I had the pleasure of attending my first DevOps conference (DevOpsDays Nashville), and the first conference that had absolutely nothing to do with SQL Server since graduate school. The conference was great; I met some new folks, and had some great moments of insight early on. The DevOps community is a very welcoming community, but as a relatively new explorer, I quickly got lost. The sessions ranged from culture to technology, and somewhere in between; the speakers were very casual, and to be honest, a little unstructured. It was very different than my previous experiences at tech conferences, where even professional development talks were goal-focused.

One the last day of the event, during one of the keynotes, I encountered the following tweet:

Paul Reed’s piece is interesting, because it helped me begin to form a schema around all that is DevOps; he defines the following frameworks underneath the umbrella of DevOps:

Automation and Application Lifecycle Management – basically, the tools needed to insure micro-deployments and increased time-to-market. The focus here is on Continuous IntegrationDeploymentDelivery, and Configuration Management.

Cultural Issues – Reed breaks these out into a relatively fine grain in a manner that he didn’t do with technology, but he defines three subsections of culture: Organizational Culture, Workflow, and Diversity. Each of these areas has a slightly different area of focus, but they all deal with the “soft, fuzzy, human interactions that lie just beyond the technology.” Organizational Culture can refer to breaking down silos, restructuring, and/or management & leadership. Workflow refers to the application lifecycle methodology, such as the conjunction of Agile and Lean methods.  Diversity is a focus on bringing different voices into technology, including feminist, LGBTQ, and minority perspectives.

The Indescribable – finally, Reed finishes with basically an “everything else” definition. Some organizations which have pioneered DevOps practices don’t really call them that; they’ve just organically grown their own processes to focus on efficiency, including minimizing silos and rapid delivery.

I have to admit, it helped me to reframe my perspective on various lectures using this structure; that may not be true to the spirit of DevOps itself (which de-emphasizes artificial constraints), but it’s given me some room for thought in terms of the areas I want to study and learn. The technology is interesting, but I don’t see myself working in those areas anytime soon; my real focus will be on organizational culture and workflow. As a manager, I want to increase my efficiency by building a team, and I think the path I’m on is to figure out how to push my team of system engineers and database administrators to prepare for continuous integration.

More to come.

One (Last) Trip to the Emerald City for #SQLPASS

On Monday, I’m flying out to the Emerald City (Seattle, WA) for the annual gathering of Microsoft database geeks known as the PASS Summit; as always, I’m excited to see friends and learn new stuff. However, this will probably be my last Summit. Over the last few years, my career trajectory has taken me away from database development and administration, and it’s time that I start investing in the things that now interest me (technology management, and operational culture). My goals for the next year are to attend conferences like the DevOps Enterprise Summit and the SRECon; I want and need to learn more about making IT efficient, and managing large-scale applications.

I’m not entirely disconnecting from the SQL community; I still plan to stay active and involved in our local chapter (AtlantaMDF), and part of the organizing committee for our SQL Saturdays. I still want to be a data-driven professional; I’m just not a data professional. That’s a subtle distinction, but it’s important to me. I’ll still sling code part-time and for hobbies, but I’m really trying to hone in on what I enjoy these days, and it’s process, procedures, management, and cultural change in IT (all IT, not just SQL Server).

So, this year will be different for me; instead of trying to network and schmooze and elevate my own SQL skillset, I’m going to hang out in sessions like “Overcoming a Culture of FearOps by Adopting DevOps” ,Agile Development Fundamentals: Continuous Integration with SSDT“, and
Fundamentals of Tech Team Leadership“. I may visit some courses on Cloud development and Analytics, but mostly, I want to enjoy spending some time with folks that I may not see again for a while.

I truly hope to see you there; I owe a lot to all of you, so I’m probably going to have a huge bar tab after buying rounds. Should be an exciting week.

#DevOps “We are all developers”

https://youtu.be/RYMH3qrHFEM

While thinking about the Implicit Optimism of DevOps, I started running through some of the cultural axioms of DevOps; I’m not sure if anyone has put together a comprehensive list, but I have a few items that I think are important. Be good at getting better is my new mantra, and now, I’m fond of saying “We are all developers”. I remember eating lunch at SQL Saturday Atlanta 2016 listening to a database developer describing this perspective to a DBA, and hearing how strongly the DBA objected to that label. I tentatively agreed with the developer, but recently, I’ve gotten more enamored with that statement.

Having worked as both a developer and an administrator, I get it; there’s an in-group mentality. The two sides of the operational silo are often working toward very different goals; developers are tasked with promoting change (new features, service packs, etc). DBA’s are tasked with maintaining the stability of the system; change is the opposite of stability. Most technical people I know are very proud of their work, which means that there’s often a desire for accuracy in the work we do. If a DBA is trying to make a system stable, and you call them a developer (think: change instigator), then it could be perceived as insulting.

It’s not meant to be.

Efficient development (to me) revolves around the three basic principles of:

  1. Reduce – changes should be highly targeted, small in scope, and touch only what’s necessary.
  2. Reuse – any process that is repeated should be repeated consistently; and,
  3. Recycle – code should be shared with stakeholders, so that inspiration can be shared.

From that perspective, there’s lots of opportunities to apply development principles to operational problems. For my DBA readers (all three of you), think about all the jobs you’ve written to automate maintenance. Think about the index changes you’ve suggested and/or implemented. Think about the reports you’ve written to monitor the performance of your systems. Any time you’ve created something to help you perform your job more efficiently, that’s development.

DevOps is built on the principle of infrastructure as code, with an emphasis on giving developers the ability to build the stack as needed. Google calls its implementation of DevOps principles Site Reliability Engineering, and characterizes it as “what you get when you treat operations as if it’s a software problem”. Microsoft is committed to DevOps as part of its application lifecycle management (although it’s notably cloud-focused). When dealing with large-scale implementations, operations can benefit from the application of the principles of efficient development.

We are all developers; most of us have always been developers. We just called it something else.

The Implicit Optimism of #DevOps

One of my favorite podcasts lately is DevOps Café; John Willis and Damon Edwards do a great job of talking about the various trends in IT management, and have really opened my eyes to a lot of different ways of thinking about problems in enterprise systems administration. On a recent podcast, John interviewed Damon about his #DOES15 presentation, “DevOps Kaizen: Practical Steps to Start & Sustain a Transformation“. During that conversation, Damon mentioned a phrase that really resonated with me: Be Good at Getting Better.

At the heart of the DevOps philosophy is the desire to improve delivery of services through removal of cultural blockages. Success isn’t measured by the amount of code pushed out the door or the number of releases; it’s the ability to continuously improve over time. Companies that experiment (even with ideas that don’t work) learn a different way to approach any problem that they face. The freedom to experiment means that failure is not an outcome; it’s a method of improvement.

The optimism of that appeals to me; I think if you’re focusing on continuous improvement, then you’ve implicitly accepted two fundamental principles of optimism:

  1. Change is necessary for growth, and
  2. Things CAN improve (you just need to figure out how).

There’s some beauty in that; if you’re an organization facing overwhelming technical debt, it’s not uncommon to sink into a spiral of despair, where changes are infrequent for fear of breaking something. Mistrust breeds, as organizations point fingers at other teams for “failing to deliver”. You quit working toward solutions, and instead focus on fighting fires and maintaining some sort of desperate last stand.

You’re better than that.

DevOps is a cultural change; it’s an optimistic philosophy focused on changing IT culture while being open to different strategies for doing so. If you can commit to Be Good at Getting Better, you can change. It may be slow, it may be frustrating, but every day is an opportunity to incrementally move the ball forward in delivering quality business services. The trick is not to focus on where to begin, but simply to begin.


Where’s your slack?

I’ve been rereading the book Slack: Getting Past Burnout, Busywork, and the Myth of Total Efficiency recently.  As I alluded to in my last post, my life has been rough for the last few months.  My nephew’s passing took the wind out of an already saggy sail; I’ve spent a great deal of time just trying to balance work, family, and life in general.  Some people turn to counselors; I turn to project management books.

The premise of the book is that change requires free time, and that free time (slack) is the natural enemy of efficiency.  This is a good thing; if you are 100% efficient, you have no room to affect change.  Zero change means zero growth.  I’ve been a proponent of slack for a while (less successfully than I’d like); it makes sense to allow people some down time to grow.  Just to be clear, slack isn’t wasted time; it’s an investment in growth.  Slack tasks include:

  • Research into interesting projects.  Lab work allows you to experience the unexpected, which gives you time to prepare for the unexpected in production
  • Building relationships. Teams are built on trust, and trust is earned through building relationships.  Teams that like each other are more likely to be successful when it comes to problem solving.
  • Shadow training.  Allow team members to work in other teams for a while; learn how the rest of the company operates.

In short, slack is necessary in order to promote growth; if you want your organization to stay ahead of it’s competition, cutting resources in the name of efficiency is sure-fire plan for losing.  The best advice for slack time is the 80/20 rule; run your team at 80% capacity, and leave 20% for slack.  In the case of emergency, slack time can be temporarily alleviated, but it’s the responsibility of management to return to normal work levels as soon as possible.

So what does this mean for me personally?  In the name of efficiency, I let slack time go.  I work a full time job, a couple of different consulting gigs, act as a chapter leader for AtlantaMDF, and am an active father.  I have no hobbies, and suck at exercise.  I love to travel, but trips are planning exercises in and of themselves.  In short, I have zero slack to deal with emergencies.  When something goes wrong and time gets compromised, I immediately feel guilty because I’ve robbed Peter to pay Paul in terms of time.  That’s not living,

I’m done with that.

Change is incremental, so I’m not planning on upsetting the apple cart just yet, but I am trying to figure out ways to make my slack time more of a recharge time.  Don’t get me wrong; I waste time.  I sit and stare at Facebook like the rest of the modern world; I binge on Netflix when new series drop.  That’s not slack, and it doesn’t recharge me. Slack is using free time to grow, to change.  My goal is to find an hour a week for growth-promoting free time.  I’ll let you know how I’m doing.

 

(Personal) Kanban Myths: The Myth of The Important Task

Continuing in my efforts to chronicle myths of kanban utilization, I thought I would tackle the second biggest misconception I see surrounding kanban boards.  As I discussed in my previous post, many people mistake kanban to be a process for task management, when in reality, it’s a visualization of some other process.  The key takeaway is that you should spend some time making your board match your process; a kanban board should emulate your workflow, not the other way around.

So you’ve invested the time, and you now have a complex board that accurately reflects how you do work.  You’re humming along, getting things done.  Life is good, right?

Almost.  If you’re just using a kanban board to visualize a process, there’s a temptation to accept the following:

Myth 2: Kanban is a visualization tool primarily focused on (important) task management.

This is partially true; in industrial kanban, workers may use a kanban board to keep track of individual issues as they move throughout the workflow.  Managers, however, should primarily use the tool to look for opportunities to continuously improve their processes.  Once your kanban board matches your process, it becomes easy to understand where bottlenecks occur (both resource allocation and/or unnecessary processes).   Tuning workflow is a critical part of kanban utilization.

For personal kanban, however, managing resource allocation becomes a bit of challenge; how do you manage yourself?  You’re already too busy working through your pile of stuff.  Unless you can recruit other friends or family members (the Tom Sawyer approach), it’s unlikely you’ll be able to adjust resource allocation.  You can, however, begin to look for opportunities to tune processes.  How?

This is where the conversation has to drift away from kanban a bit; as a tool, a board allows you to visualize workflow and primarily focus on improvement, but in and of itself the measure of improvement isn’t part of the board.  In other words, you can see how things work, but there’s no built in visualization for determining if something has room to improve.  You have to decide what that method of improvement will be.  To improve your processes, you must define the metrics for improvements.  Those metrics are known more commonly as goals.

Goals are a critical component of a successful kanban implementation.  For example, if you have a personal goal of “I want to lose 50 pounds in the next year”, that goal should influence your decision on which tasks to pursue (and what the priority of those tasks are).  In other words, if your kanban board shows that you’re getting a lot done, but no tasks are associated with the goal of losing weight, you’ve got some room to improve your processes.

So, in summary:

  1. Spend some time making your board match your processes (at least 30 days).
  2. Define your goals (metrics for improvements)
  3. Take some time to tweak your processes to align them with your goals.

Minor incremental adjustments are more likely to be adopted than sudden and swift changes (see my management notes about change curve).  Kanban is a long-term tool, but can be highly effective at improving workflow.

 

(Personal) Kanban Myths: The Myth of The Process

Recently, my friend Joe Webb has posted some great resources on Personal Kanban on his Facebook timeline.  Joe’s an influential guy in the tech community (particularly the #SQLFamily), so I’ve been excited about the flurry of emails and comments regarding the adoption of Kanban techniques.  I’ve been using Kanban boards for a while now at work, and it’s interesting to see the differences between Personal Kanban and “industrial” Kanban.   The significant distinction between the two appears to be the impetus to “get things done” in Personal Kanban by using a very simple abstraction; in other words, start with a simple board of “To Do”, “Doing”, and “Done” and attack your task list.

I think this is a great way to get started with Kanban, but I also think that it’s easy to forget about some critical components of Lean thinking.  After observing a couple of email chains from friends (and comments on Joe’s Facebook thread), I thought I’d blog about some common misconceptions of Kanban, starting with:

Myth 1: Kanban is a process to manage my task list.

This is probably the biggest trap that most people fall into when they decide to get started.  The simplicity of Kanban is so appealing; just throw up a board and start moving cards left to right.  Getting things done; Kanban helps you do that, right?

Sort of.  Kanban is not a process; it’s a visualization of your process.  The distinction may appear to be subtle, but it’s important.  A simple board showing three columns (i.e., “To Do”, “Doing”, and “Done) assumes that your method of handling tasks is equally simple.  Your process should drive your board, not the other way around.  While the act of defining tasks will yield some immediate benefits, oversimplifying the visualization has some costs.

As a concrete example, let’s assume that one of your tasks is to call and make an appointment with your doctor.  You move the card to the Doing pile, call your doctor, and then get informed that they’ll have to call you back.  Can you move the card to Done?  You haven’t made the appointment.  Do you leave the card in Doing?  Are you doing anything with it besides waiting?  Do you move it back to To Do?  You’ve already started working.   If your board is driving your process, the temptation is to leave the board alone and struggle with task movement.  If your process is driving your board, you change the board.  Add a column for waiting tasks, move the card, and then revisit that pile as needed.

Don’t get me wrong; starting with a simple board is a GREAT way to get started with the fundamentals of visualizing workflow, especially if you don’t know what your process is yet.  However, as you discover more about the way you work, don’t try to change the process (at first); make sure that you spend some time developing your board so that it matches the way you do work.  Improvement will come later.