culture

One Weird Trick to Build a #DevOps Culture

I know it’s clickbait, but I really did want to reinforce the simplicity of this post. I think building a DevOps culture can sometimes be daunting for most folks from traditional IT backgrounds because, well, people aren’t systems. You see, both developers and system engineers are comfortable with technology; it’s usually predictable, and it’s relatively easy to manipulate.

People (and processes) are messy. The social aspect of a sociotechnical environment is full of friction: different backgrounds, different perspectives, differences of opinion. It all creates a challenge for solving business problems quickly because it takes energy to build and maintain a relationship. However, once those relationships are established, the benefits multiply. When you are challenged by a peer you trust to defend an idea, it makes the idea better. The old adage of “iron sharpens iron” is true; smart people make smart people smarter.

But how do you build trust? The first step is easy.

Be thankful.

That’s it; acknowledge when people take a risk, and thank them for stepping of their comfort zone. Be thankful when they offer an opinion that’s different than yours, even if you don’t take the advice. Specifically acknowledge their contributions that make your job easier, and let them know why. It takes some time (and occasionally some effort), but it establishes a pattern of trust. This is particularly important when you’re in a leadership position, and they are not; the best ideas come from people who feel like they can contribute on a regular basis, especially to people in power.

Take the extra time to say “Thank You”, and establish a foundation of trust.

#DevOps: Embrace the Ops

If you’re at all in touch with the DevOps community, you’re probably aware of the GitLabs Incident on 1/31/2017; I won’t spend too much time rehashing it here, but GitLabs has done a great job of being transparent about the issue and their processes to recover. Mike Walsh (Straight Path Solutions) wrote a great blog post about it entitled DevOps: Don’t Forget The Ops, which covers a lot of ground from a database administration perspective. Mike ultimately ends up with three specific action items for DevOps teams:

  1. Plan to Fail (so you don’t)
  2. Verify Backups (focus on restores, not backups)
  3. Secure your environment (from yourself).

I agree with all of these ideas; I think Mike is spot on about the need to Remember the Ops in DevOps. However, I want to go a step further, and encourage DevOps adoptees to Embrace the Ops.

What do I mean by that? Let me start with this; Brent Ozar posted this on Facebook yesterday (the image will take you to the job description):

Now, it’s obvious that GitLabs had a backup strategy (they detailed it in their notes), so I don’t mean to imply that they didn’t expect administrative tasks from their database people, but I do think we can infer that administrative tasks were not prioritized as much as other tasks (high availability, performance tuning, etc.). Again, we know that GitLabs had strategy for backups, so it appears that this is a cultural issue (at least based on this flimsy evidence and the outage). And to some degree, that’s understandable; one of the longest running challenges on the operations side is being labeled as a cost center as opposed to development being viewed as a revenue generator. This perception is pervasive in traditional IT shops, so it’s probable that even Unicorn shops share some of this mentality. Development (new features) makes money; Operations cost money.

However, in a true DevOps model, the focus is on delivering quality services to customer, faster. New features may bring new clients, but reliable service retains clients; both are revenue generating. So while it may add some cost to deliver quality service to customers, cutting corners in operations risks impacting the bottom line. From this perspective, I’m arguing that DevOps shops should not only remember the ops, they should embrace it. The entire value stream of a business service includes people, procedures, and technology split into teams; the fewer the teams per service, the fewer silos. So how do we embrace the ops?

  1. If Ops is part of the Value Stream, then apply consistent Development principles to it. I’ve written before that “we are all developers“, and I believe that; administrators are creative folk, just like application developers. Operations includes backup, monitoring, and validation. We should apply development principles to these operations, like creating reusable scripts, finding opportunities for automating validation, and logging (and investigating) errors with that pipeline. We should use source control for these tools, and treat the operations pipeline like any other continuous integration project (automate your backup, automate your restores, and log inconsistencies).
  2. Include operational improvements as part of the development pipeline. I’m borrowing a lot from Google’s SRE model; SRE is what you get when you treat operations as if it’s a software problem (see point 1 above).  However, the SRE model is usually a self-contained bubble within operations; they have their own pipelines for toil reduction. I think if DevOps wants to truly embrace operations, developers need to include toil reduction in the service delivery pipeline. If operations folks have to flip 30 switches to bring an app online, development should make it a priority to reduce that (if possible). It goes back to the fundamental rule for DevOps: communicate. Help each other resolve pain points, and commit to improving everything in the value stream.
  3. Finally, balance risk and experimentation with safety. Gene Kim’s The Phoenix Project provides the Three Ways, and the Third Way is all about creating a culture that rewards risk and experimentation. This is great for developers; try something new, and if it breaks, you can deliver a fix within hours. However, as the GitLabs incident shows, some fixes can’t be delivered, and risk needs to be mitigated by secure data handling processes and procedures. While I’m a big fan of controlled failures (e.g., shutting a server down hard in order to see what the impact is), you don’t do that unless you can test it in a lab first and make sure you have good mitigating option (how do you recover? What error messages do you expect to see? Are you sure your backup systems are working?). Don’t forsake basic safety nets while promoting risk; you want competitive advantages, but you also want to stay in business.

My Challenges with #DevOps

As I’ve alluded to in earlier posts, my career goals have transitioned away from database development and administration into DevOps implementations; it’s been a bit of a challenge for me, because I feel like a stranger in a strange land all over again. Looking at familiar problems turned on their sides isn’t for me, and my day job has some particular challenges that I need to figure out appropriate solutions for. All of that being said, I’ve enjoyed it. However, in an attempt to help me wrap my head around things, I wanted to list out the struggles I’m facing, and include my current “solutions”; these may change over time, but it’s where I’m at today.

  1. It is what it is…

One of the challenges of describing DevOps is that it’s a conglomerate of technical and cultural changes. System and software engineers can easily understand the technical components of iterative software deployment, but it’s tough to describe the organizational and procedural changes necessary to implement a rapid deployment environment (“You want the developers on pager duty? How will they administer the system?”). Most engineers have a tough time interpreting The Phoenix Project, because it’s not a technical manual; they’re used to step-by-step guides, not cultural strategies.

My solution is to describe DevOps as a philosophy, not a methodology. Philosophies have general principles that you agree to abide by (such as seeking efficiency through automation, increasing feedback, and documenting problems without blame); methodologies are strategies for implementing philosophies. What this means is that as a manager, my method of implementing DevOps is probably different than yours, and there may even be differences within the organization; the key focus should be on reducing or eliminating silos through communication. Where those silos must remain (say, in a highly regulated environment where development and operations need to be separate), workflow in each silo needs to be as transparent as possible.

  1. Brownfield change is harder than greenfield development.

Greenfield projects are new software projects or initiatives; brownfield development is a revitalization of an existing project. While each has challenges from a DevOps perspective, the technical and cultural debt that is associated with brownfield development often slows down adoption of DevOps practices. It’s tough to maintain a system while making suggestions for improvement, particularly when multiple departments are involved, all with their own goals.

For me, it all goes back to two principles: tight focus, and increased communication. As a change agent, I need to carve out time each week to focus on one small, incremental change that I can make to increase efficiencies; for example, I’m currently working on developing a standard postmortem practice for tracking issues with our business service (using this free guide from VictorOps). The point is that I may not have opportunity to make sweeping changes, but I can do a little bit at a time (and encourage my team to do so as well).

  1. Compliance is a constraint.

Working in the financial industry brings some unique challenges to implementing DevOps; while philosophically the ideal DevOps process is to build automation pipelines from development through deployment, regulatory policies are written to dictate separation and control between environments. The thicker the wall, the more likely you are to successfully pass an audit, but those walls make it tough to attain rapid development. If your developers have one set of goals (deploy new features) and your operations team have another (keep the system stable with few changes), you’ve got to figure out a way to reconcile those.

Communication is key, and that includes have a common issue tracking system to report operational issues to development as soon as they occur; I don’t manage the developers in my business unit, so I can’t set their priorities. But I can make them aware of the pain points, and the expenses associated with those struggles. I can also find ways to make our infrastructure more predictable so that developers can develop code faster, and our QA teams can automate tests with some assurance. It’s tough, but it’s my goals.

Summary

I realize that this post may not be that insightful, but I’m looking at it as an effort to keep writing and thinking about these issues. Expect more from me in the future as I continue to try and learn something new.

#DevOps “We are all developers”

https://youtu.be/RYMH3qrHFEM

While thinking about the Implicit Optimism of DevOps, I started running through some of the cultural axioms of DevOps; I’m not sure if anyone has put together a comprehensive list, but I have a few items that I think are important. Be good at getting better is my new mantra, and now, I’m fond of saying “We are all developers”. I remember eating lunch at SQL Saturday Atlanta 2016 listening to a database developer describing this perspective to a DBA, and hearing how strongly the DBA objected to that label. I tentatively agreed with the developer, but recently, I’ve gotten more enamored with that statement.

Having worked as both a developer and an administrator, I get it; there’s an in-group mentality. The two sides of the operational silo are often working toward very different goals; developers are tasked with promoting change (new features, service packs, etc). DBA’s are tasked with maintaining the stability of the system; change is the opposite of stability. Most technical people I know are very proud of their work, which means that there’s often a desire for accuracy in the work we do. If a DBA is trying to make a system stable, and you call them a developer (think: change instigator), then it could be perceived as insulting.

It’s not meant to be.

Efficient development (to me) revolves around the three basic principles of:

  1. Reduce – changes should be highly targeted, small in scope, and touch only what’s necessary.
  2. Reuse – any process that is repeated should be repeated consistently; and,
  3. Recycle – code should be shared with stakeholders, so that inspiration can be shared.

From that perspective, there’s lots of opportunities to apply development principles to operational problems. For my DBA readers (all three of you), think about all the jobs you’ve written to automate maintenance. Think about the index changes you’ve suggested and/or implemented. Think about the reports you’ve written to monitor the performance of your systems. Any time you’ve created something to help you perform your job more efficiently, that’s development.

DevOps is built on the principle of infrastructure as code, with an emphasis on giving developers the ability to build the stack as needed. Google calls its implementation of DevOps principles Site Reliability Engineering, and characterizes it as “what you get when you treat operations as if it’s a software problem”. Microsoft is committed to DevOps as part of its application lifecycle management (although it’s notably cloud-focused). When dealing with large-scale implementations, operations can benefit from the application of the principles of efficient development.

We are all developers; most of us have always been developers. We just called it something else.