DevOps

#DevOpsDays & #SQLSaturdays

I’ve been meaning to write this post for a while, but life rolls on, as it always does. I had the privilege of attending DevOpsDays Atlanta back in April. This was my second DevOpsDays event (the first being Nashville), and overall, I’ve enjoyed the events. However, as a long-time organizer and speaker with the SQL Saturday events, it’s hard for me not to compare my experiences between the two conferences. They’re both community-run, low-cost, volunteer-driven technical events; however, there were some things that I really like about the DevOpsDays format (and some things I wish were different).

Cost

The cost models of the two conferences are different; in short, SQLSaturdays are free to attendees (although lunch is usually offered for an optional fee), and DevOpsDays charges a small fee ($99-$150). Both rely on sponsors to pick up the tab for the bulk of the expenses (usually location fees). Speakers are volunteers, as is the event management staff. In return for the DevOpsDays fee, the attendee gets guaranteed swag (an event t-shirt is typical) and a great lunch (food was fantastic at DevOpsDays Atlanta).

Charging a higher fee does a couple of things; it allows organizers to get a more accurate attendance estimate; if an attendee pays more to go to a conference, it’s more likely that they’ll show up. This has a trickle effect on luring sponsors; it’s easier to justify sponsoring an event if you know that you’re going to get a certain amount of foot traffic. A fee also guarantees amenities that are important to technical folk; good Wifi, and livestreams (although sessions weren’t recorded at the Atlanta DevOpsDays event). You can also direct some of those funds to getting a premier meeting space.

On the other hand, a free event with a nominal lunch has the potential of bringing in a much larger audience; DevOpsDays Atlanta was hosted in a 230-seat theatre, so attendance was probably around 250 (with standing room, vendors, and speakers). Last year’s Atlanta SQL Saturday had over 600 attendees, and this year’s event had slightly over 500 attendees. Attendance counts shouldn’t be considered a metric of superiority, but it does provide a different incentive for pursuing sponsors. As an attendee, I like the SQLSaturday model; as an organizer, I like the DevOpsDays model.

Parent Organization Involvement

DevOpsDays is a highly decentralized model (true to the agile underpinnings of the movement). The parent organization appears (from the outside) to be very hands off; local event organizers handle their own sponsorships, registration, and other details. This allows for a lot of fluidity when it comes to branding, networking, etc. For example, see the differences in advertising logos for the DevOpsDays organization, the Atlanta 2017 event, and the upcoming Nashville 2017 event:

Logos: DevOpsDays (generic), Atlanta 2017, Nashville 2017

In contrast, PASS (the Professional Association for SQL Server) retains tight control over the marketing of SQLSaturday; registrations and event planning are handled by their internally-developed tools, and the branding has recently evolved to provide a more consistent association with the parent organization (although not without some concerns).

Logos: PASS, SQLSaturday

From an attendee perspective, branding probably doesn’t make much of a difference; however, the tools used for registration are highly visible. Both DevOpsDays events I attended used EventBrite, a well-known tool for managing, well, events. SQLSaturday relies on a custom registration site that has improved over time, but still often leaves attendees confused (despite all the best guidance from organizers). Furthermore, if a SQLSaturday event has a precon, that precon is usually managed through EventBrite, which leads to an additional disconnect between the precon event and the actual SQLSaturday. Despite my love for SQLSaturdays, I think the DevOpsDays approach to branding and tooling is better.

Educational Delivery Format

One area where SQLSaturday feels more comfortable to me is the format of the sessions. SQLSaturdays typically follow the traditional multi-track model in a single day (not counting pre-cons), where attendees can choose from multiple sessions at the same time; for example, SQLSaturday Atlanta 2017 had 10 concurrent tracks, each with sessions lasting about an hour. Note that this format is not required; smaller events may only have a single track, or have multiple tracks with longer sessions.

In contrast, the DevOpsDays standard of delivery is multiple days, with a single track of longer talks in the morning, followed by a single track of short talks (“Ignites”), and then open-space sessions in the afternoon. For me, this is a mixed bag of effectiveness; bringing everyone in the conference together to hear the same discussion can (in theory) promote better cross-communication between the various stakeholders in the DevOps audience. For example, having managers and deployment specialists hear a programmer discussing pipelines may promote perspective-taking, one of the fundamentals of good communication. In reality, however, my experience has been that many presenters don’t do a great job of relating to all of the stakeholders in the audience, making it difficult to bridge that gap. Granted, I’ve only been to two events, but of the 16 main talks that I heard across the two events, only about half seemed relevant to me. Ignites have some of the same limitations, but the time constraints mean that they hold attention spans for longer.

Most people either love or hate open spaces; letting the audience drive discussions is a great concept in theory, but in reality, discussions are typically dominated by a few extroverts in the group, and most people merely observe. Although there are usually self-appointed moderators, the dynamic selection of topics just prior to the discussion makes it difficult to engage or guide. When they work, they work well; however, too often open spaces lend themselves to the seven-minute lull.

Ideally, I think the most effective method of delivery would be a blended approach: a couple of keynote sessions in the morning, followed by a few Ignite sessions, and then multiple tracks in the afternoon, including open-space discussion, both free-form and guided. However, this is mostly a matter of personal preference; I’d love to try it and see what people think of it.

Conclusion

I’m enjoying the transition of my career away from being a SQL Person to a DevOps person; both communities seem vibrant and engaged, and I plan on attending more DevOpsDays conferences in the future (and perhaps even help with planning one). Events like these offer a lot of opportunity to learn high-quality material in a low-cost setting, and I only expect them to get better (or I’ll get better) over time.

#SQLSatATL, #DevOps, #Cloud, & the Future of the DBA

Last weekend was SQLSaturday Atlanta 2017, and I was not only an organizer, but a presenter. In the future, I’ll need to balance that a little better (especially when we’re dealing with a lot of unknowns for the day, like a new building). Overall, I think my presentation went well; had a lot of great hallway conversations with folks later, and got some good feedback. You can find the slide deck here, or look on the Code, etc tab above.

However, during my presentation, a couple of questions came up that I didn’t have a great answer for; mostly it was revolving around the first bullet point on this slide:

Why, if DevOps as a philosophy encourages better communication between development and operations, do I believe that there will be increased segregation between those roles? I fumbled for an answer during the presentation, but then went back and realized what I left out in my explanation, so I thought I’d take a stab at rebuilding my argument and explain where I was going with this:

DevOps is built on a Service-Oriented Architecture (SOA) model.

Services logically represent business activities; they are self-contained, and the inner workings of each service are opaque to the consumers. Services can be built using other services, but that rule of opacity stays true; when you consume a service, you don’t care what it’s doing under the covers. It just has to provide a consistent output when given a consistent input.
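To make the opacity rule concrete, here is a minimal sketch in Python. Everything in it (the `AccountService` contract, the two implementations, and `render_statement`) is a hypothetical example of mine, not something from the presentation; the point is only that the consumer depends on the contract and never on what happens under the covers.

```python
from typing import Protocol


class AccountService(Protocol):
    """Hypothetical service contract: the only thing a consumer sees."""

    def get_balance(self, account_id: str) -> int: ...


class InMemoryAccounts:
    """One possible implementation; consumers never depend on these details."""

    def __init__(self, balances: dict[str, int]) -> None:
        self._balances = dict(balances)

    def get_balance(self, account_id: str) -> int:
        return self._balances[account_id]


class CachedAccounts:
    """A service built on another service; the opacity rule still holds."""

    def __init__(self, backend: AccountService) -> None:
        self._backend = backend
        self._cache: dict[str, int] = {}

    def get_balance(self, account_id: str) -> int:
        # Ask the underlying service only the first time an account is seen.
        if account_id not in self._cache:
            self._cache[account_id] = self._backend.get_balance(account_id)
        return self._cache[account_id]


def render_statement(service: AccountService, account_id: str) -> str:
    """A consumer: it relies on the contract, not on how the answer is produced."""
    return f"Account {account_id}: {service.get_balance(account_id)} cents"
```

Swapping `InMemoryAccounts` for `CachedAccounts` changes nothing for `render_statement`; the same input produces the same output either way.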

The Cloud Paradigm is also built on a SOA model.

Software-as-a-Service is built on a Platform-as-a-Service, which is in turn built on an Infrastructure-as-a-Service. Communication between service layers must be consistent and repeatable, but processes and procedures within each service should be opaque. Furthermore, the consumers of each service are not the same. For example, suppose you have a web portal displaying account information to a client. The client consumes Software-as-a-Service; they just want to see their account information. They don’t care how many servers are involved or how the network is laid out. Software-as-a-Service consumes from the Platform layer; software engineers may have a requirement that they use a particular database system, or OS, but specific configuration isn’t exposed to them. Software engineers define performance expectations (e.g., “we need to commit 1000 transactions per second”), and leave it up to the Platform (and Infrastructure) engineers to meet that expectation.
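One way to picture that hand-off between layers is as a small, explicit contract. The sketch below is only an illustration under my own assumptions: `ThroughputExpectation` and `meets_expectation` are made-up names, and a real platform team would measure throughput with its own monitoring tools rather than raw counts.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ThroughputExpectation:
    """Hypothetical contract the software layer hands to the platform layer."""

    min_commits_per_second: float


def meets_expectation(
    expectation: ThroughputExpectation, commits: int, elapsed_seconds: float
) -> bool:
    """Compare measured throughput against the stated expectation."""
    if elapsed_seconds <= 0:
        raise ValueError("elapsed_seconds must be positive")
    return commits / elapsed_seconds >= expectation.min_commits_per_second


# The software layer states *what* it needs, not *how* the platform delivers it.
expectation = ThroughputExpectation(min_commits_per_second=1000)
```

The software engineers own the number in the contract; how many servers or which storage configuration it takes to satisfy `meets_expectation` stays inside the platform layer.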

The traditional tasks associated with SQL Server Database Administration can be roughly divided into two roles: Development and Administration (Operations).

From this slide, I outline the general breakdown between skills:

 

SQL Server as a product spans the top two layers of the Cloud Paradigm:

Basically, I believe that traditional development skills belong to the Software-as-a-Service layer, and traditional administration skills belong to the Platform layer.

So by segregation of responsibilities, I mean that as companies embrace the Cloud paradigm, the current role of a DBA will fork into both Software-as-a-Service engineering (Dev) and Platform-as-a-Service engineering (Ops). I need to clarify that thought more in future presentations, because I may be using those terms differently than others would.

Thanks for reading, and if you attended, thanks for coming!

-Stu

#DevOps: Remote Workers and Minimizing Silos

Been a while since I’ve posted, so I thought I’d try to put down some thoughts on a lightweight topic. I’m a full-time remote worker (I have been for the last 10 years). My company has embraced remote workers, and provides lots of tools for people to contribute from all over the country (including the wilds of North Georgia); tools include instant messaging clients, VOIP, remote presentation software, etc. Document sharing and discussion is easy, but as you probably know, DevOps is as much about relationship building as it is about knowledge sharing. How do you minimize silos between teams when teams aren’t physically located near each other?

Here are some different methods:

  1. First, in-person communication provides the greatest avenue for relationship-building. Bringing a remote worker in from the field from time to time can greatly reduce isolation. If your chief developer is in Wisconsin, and your main sysops guy is in Georgia, flying both in periodically is probably the best way to create opportunities for conversation. Better yet, send them both to a conference somewhere in between.
  2. If in-person conversation is the gold standard for discussion but isn’t an option for economic or practical reasons, seek methods to emulate that experience. Conference calls or web conferencing tools are common, but video conferencing adds an additional dimension to discussions. In general, the higher the bandwidth, the better because it forces “presence” in conversations.
  3. Encourage remote employees to add depth to relationships by providing them with a virtual space to connect. Internal blogs (with personal pictures or activities), or Slack channels for goofing off provide teams with meta information beyond their work ability. Knowing that another person loves Die Hard as much as you do gives you a common place to start building relationships.
  4. Organize virtual non-work events, such as multi-player gaming marathons (Leeroy Jenkins would be proud; NSFW) or virtual parties.

The main point is that while face-to-face interaction is desirable, it isn’t necessary. Employees (and companies) can thrive if they actively seek methods of encouraging high-bandwidth interactions with depth. Distance increases difficulty, but it’s not insurmountable.

Feel free to drop a suggestion for enhancing remote communication and decreasing silos.

#DevOps: Embrace the Ops

If you’re at all in touch with the DevOps community, you’re probably aware of the GitLab incident on 1/31/2017; I won’t spend too much time rehashing it here, but GitLab has done a great job of being transparent about the issue and their processes to recover. Mike Walsh (Straight Path Solutions) wrote a great blog post about it entitled DevOps: Don’t Forget The Ops, which covers a lot of ground from a database administration perspective. Mike ultimately ends up with three specific action items for DevOps teams:

  1. Plan to Fail (so you don’t)
  2. Verify Backups (focus on restores, not backups)
  3. Secure your environment (from yourself).

I agree with all of these ideas; I think Mike is spot on about the need to Remember the Ops in DevOps. However, I want to go a step further, and encourage DevOps adoptees to Embrace the Ops.

What do I mean by that? Let me start with this; Brent Ozar posted this on Facebook yesterday (the image will take you to the job description):

Now, it’s obvious that GitLab had a backup strategy (they detailed it in their notes), so I don’t mean to imply that they didn’t expect administrative tasks from their database people, but I do think we can infer that administrative tasks were not prioritized as much as other tasks (high availability, performance tuning, etc.). Again, we know that GitLab had a strategy for backups, so it appears that this is a cultural issue (at least based on this flimsy evidence and the outage). And to some degree, that’s understandable; one of the longest-running challenges on the operations side is being labeled as a cost center, as opposed to development being viewed as a revenue generator. This perception is pervasive in traditional IT shops, so it’s probable that even Unicorn shops share some of this mentality. Development (new features) makes money; Operations costs money.

However, in a true DevOps model, the focus is on delivering quality services to customers, faster. New features may bring new clients, but reliable service retains clients; both are revenue generating. So while it may add some cost to deliver quality service to customers, cutting corners in operations risks impacting the bottom line. From this perspective, I’m arguing that DevOps shops should not only remember the ops, they should embrace it. The entire value stream of a business service includes people, procedures, and technology split into teams; the fewer the teams per service, the fewer silos. So how do we embrace the ops?

  1. If Ops is part of the Value Stream, then apply consistent Development principles to it. I’ve written before that “we are all developers“, and I believe that; administrators are creative folk, just like application developers. Operations includes backup, monitoring, and validation. We should apply development principles to these operations, like creating reusable scripts, finding opportunities for automating validation, and logging (and investigating) errors with that pipeline. We should use source control for these tools, and treat the operations pipeline like any other continuous integration project (automate your backup, automate your restores, and log inconsistencies).
  2. Include operational improvements as part of the development pipeline. I’m borrowing a lot from Google’s SRE model; SRE is what you get when you treat operations as if it’s a software problem (see point 1 above).  However, the SRE model is usually a self-contained bubble within operations; they have their own pipelines for toil reduction. I think if DevOps wants to truly embrace operations, developers need to include toil reduction in the service delivery pipeline. If operations folks have to flip 30 switches to bring an app online, development should make it a priority to reduce that (if possible). It goes back to the fundamental rule for DevOps: communicate. Help each other resolve pain points, and commit to improving everything in the value stream.
  3. Finally, balance risk and experimentation with safety. Gene Kim’s The Phoenix Project provides the Three Ways, and the Third Way is all about creating a culture that rewards risk and experimentation. This is great for developers; try something new, and if it breaks, you can deliver a fix within hours. However, as the GitLab incident shows, some fixes can’t be delivered, and risk needs to be mitigated by secure data handling processes and procedures. While I’m a big fan of controlled failures (e.g., shutting a server down hard in order to see what the impact is), you don’t do that unless you can test it in a lab first and make sure you have a good mitigating option (how do you recover? What error messages do you expect to see? Are you sure your backup systems are working?). Don’t forsake basic safety nets while promoting risk; you want competitive advantages, but you also want to stay in business.
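
The first point in the list above (automate your backup, automate your restores, and log inconsistencies) can be sketched in a few lines of Python. This is a toy under stated assumptions, not anyone’s production process: the `backup` and `restore` functions are hypothetical stand-ins that just copy a file, where a real pipeline would call the database’s own backup and restore tooling. What it shows is the shape of the check: a backup only counts once a restore of it has been verified.

```python
import hashlib
import logging
import shutil
import tempfile
from pathlib import Path

logger = logging.getLogger("backup_check")


def checksum(path: Path) -> str:
    """Return the SHA-256 digest of a file's contents."""
    return hashlib.sha256(path.read_bytes()).hexdigest()


def backup(source: Path, backup_dir: Path) -> Path:
    """Stand-in for a real backup command: copy the file into backup_dir."""
    target = backup_dir / (source.name + ".bak")
    shutil.copy2(source, target)
    return target


def restore(backup_file: Path, restore_dir: Path) -> Path:
    """Stand-in for a real restore command: copy the backup into restore_dir."""
    target = restore_dir / backup_file.name.removesuffix(".bak")
    shutil.copy2(backup_file, target)
    return target


def verify_backup(source: Path) -> bool:
    """Back up, restore to a scratch location, and compare against the source."""
    with tempfile.TemporaryDirectory() as scratch:
        backup_dir = Path(scratch) / "backup"
        restore_dir = Path(scratch) / "restore"
        backup_dir.mkdir()
        restore_dir.mkdir()
        restored = restore(backup(source, backup_dir), restore_dir)
        ok = checksum(restored) == checksum(source)
    if ok:
        logger.info("restore of %s verified", source)
    else:
        # Log the inconsistency so it can be investigated, not silently ignored.
        logger.error("restore of %s does not match the source", source)
    return ok
```

A script like this belongs in source control and on a schedule, just like any other build step, so that a failed restore shows up as a logged error long before anyone needs the backup.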

My Challenges with #DevOps

As I’ve alluded to in earlier posts, my career goals have transitioned away from database development and administration into DevOps implementations; it’s been a bit of a challenge for me, because I feel like a stranger in a strange land all over again. Looking at familiar problems turned on their sides isn’t easy for me, and my day job has some particular challenges that I need to figure out appropriate solutions for. All of that being said, I’ve enjoyed it. However, in an attempt to help me wrap my head around things, I wanted to list out the struggles I’m facing, and include my current “solutions”; these may change over time, but it’s where I’m at today.

  1. It is what it is…

One of the challenges of describing DevOps is that it’s a conglomerate of technical and cultural changes. System and software engineers can easily understand the technical components of iterative software deployment, but it’s tough to describe the organizational and procedural changes necessary to implement a rapid deployment environment (“You want the developers on pager duty? How will they administer the system?”). Most engineers have a tough time interpreting The Phoenix Project, because it’s not a technical manual; they’re used to step-by-step guides, not cultural strategies.

My solution is to describe DevOps as a philosophy, not a methodology. Philosophies have general principles that you agree to abide by (such as seeking efficiency through automation, increasing feedback, and documenting problems without blame); methodologies are strategies for implementing philosophies. What this means is that as a manager, my method of implementing DevOps is probably different than yours, and there may even be differences within the organization; the key focus should be on reducing or eliminating silos through communication. Where those silos must remain (say, in a highly regulated environment where development and operations need to be separate), workflow in each silo needs to be as transparent as possible.

  2. Brownfield change is harder than greenfield development.

Greenfield projects are new software projects or initiatives; brownfield development is a revitalization of an existing project. While each has challenges from a DevOps perspective, the technical and cultural debt that is associated with brownfield development often slows down adoption of DevOps practices. It’s tough to maintain a system while making suggestions for improvement, particularly when multiple departments are involved, all with their own goals.

For me, it all goes back to two principles: tight focus, and increased communication. As a change agent, I need to carve out time each week to focus on one small, incremental change that I can make to increase efficiencies; for example, I’m currently working on developing a standard postmortem practice for tracking issues with our business service (using this free guide from VictorOps). The point is that I may not have opportunity to make sweeping changes, but I can do a little bit at a time (and encourage my team to do so as well).

  3. Compliance is a constraint.

Working in the financial industry brings some unique challenges to implementing DevOps; while philosophically the ideal DevOps process is to build automation pipelines from development through deployment, regulatory policies are written to dictate separation and control between environments. The thicker the wall, the more likely you are to successfully pass an audit, but those walls make it tough to attain rapid development. If your developers have one set of goals (deploy new features) and your operations team has another (keep the system stable with few changes), you’ve got to figure out a way to reconcile those.

Communication is key, and that includes having a common issue tracking system to report operational issues to development as soon as they occur; I don’t manage the developers in my business unit, so I can’t set their priorities. But I can make them aware of the pain points, and the expenses associated with those struggles. I can also find ways to make our infrastructure more predictable so that developers can develop code faster, and our QA teams can automate tests with some assurance. It’s tough, but it’s my goal.

Summary

I realize that this post may not be that insightful, but I’m looking at it as an effort to keep writing and thinking about these issues. Expect more from me in the future as I continue to try and learn something new.

#DevOps “We are all developers”

https://youtu.be/RYMH3qrHFEM

While thinking about the Implicit Optimism of DevOps, I started running through some of the cultural axioms of DevOps; I’m not sure if anyone has put together a comprehensive list, but I have a few items that I think are important. Be good at getting better is my new mantra, and now, I’m fond of saying “We are all developers”. I remember eating lunch at SQL Saturday Atlanta 2016 listening to a database developer describing this perspective to a DBA, and hearing how strongly the DBA objected to that label. I tentatively agreed with the developer, but recently, I’ve gotten more enamored with that statement.

Having worked as both a developer and an administrator, I get it; there’s an in-group mentality. The two sides of the operational silo are often working toward very different goals; developers are tasked with promoting change (new features, service packs, etc.). DBAs are tasked with maintaining the stability of the system; change is the opposite of stability. Most technical people I know are very proud of their work, which means that there’s often a desire for accuracy in the work we do. If a DBA is trying to make a system stable, and you call them a developer (think: change instigator), then it could be perceived as insulting.

It’s not meant to be.

Efficient development (to me) revolves around the three basic principles of:

  1. Reduce – changes should be highly targeted, small in scope, and touch only what’s necessary.
  2. Reuse – any process that is repeated should be repeated consistently; and,
  3. Recycle – code should be shared with stakeholders, so that inspiration can be shared.

From that perspective, there’s lots of opportunities to apply development principles to operational problems. For my DBA readers (all three of you), think about all the jobs you’ve written to automate maintenance. Think about the index changes you’ve suggested and/or implemented. Think about the reports you’ve written to monitor the performance of your systems. Any time you’ve created something to help you perform your job more efficiently, that’s development.

DevOps is built on the principle of infrastructure as code, with an emphasis on giving developers the ability to build the stack as needed. Google calls its implementation of DevOps principles Site Reliability Engineering, and characterizes it as “what you get when you treat operations as if it’s a software problem”. Microsoft is committed to DevOps as part of its application lifecycle management (although it’s notably cloud-focused). When dealing with large-scale implementations, operations can benefit from the application of the principles of efficient development.

We are all developers; most of us have always been developers. We just called it something else.

The Implicit Optimism of #DevOps

One of my favorite podcasts lately is DevOps Café; John Willis and Damon Edwards do a great job of talking about the various trends in IT management, and have really opened my eyes to a lot of different ways of thinking about problems in enterprise systems administration. On a recent podcast, John interviewed Damon about his #DOES15 presentation, “DevOps Kaizen: Practical Steps to Start & Sustain a Transformation“. During that conversation, Damon mentioned a phrase that really resonated with me: Be Good at Getting Better.

At the heart of the DevOps philosophy is the desire to improve delivery of services through removal of cultural blockages. Success isn’t measured by the amount of code pushed out the door or the number of releases; it’s the ability to continuously improve over time. Companies that experiment (even with ideas that don’t work) learn a different way to approach any problem that they face. The freedom to experiment means that failure is not an outcome; it’s a method of improvement.

The optimism of that appeals to me; I think if you’re focusing on continuous improvement, then you’ve implicitly accepted two fundamental principles of optimism:

  1. Change is necessary for growth, and
  2. Things CAN improve (you just need to figure out how).

There’s some beauty in that; if you’re an organization facing overwhelming technical debt, it’s not uncommon to sink into a spiral of despair, where changes are infrequent for fear of breaking something. Mistrust breeds, as organizations point fingers at other teams for “failing to deliver”. You quit working toward solutions, and instead focus on fighting fires and maintaining some sort of desperate last stand.

You’re better than that.

DevOps is a cultural change; it’s an optimistic philosophy focused on changing IT culture while being open to different strategies for doing so. If you can commit to Be Good at Getting Better, you can change. It may be slow, it may be frustrating, but every day is an opportunity to incrementally move the ball forward in delivering quality business services. The trick is not to focus on where to begin, but simply to begin.


#DevOps Two Books for Operations

Over the last couple years, there’s been a subtle shift in my responsibilities at my day job (and my interests in technology overall).  I’ve been doing much less database development and administration work, and more general system architecture work.  That’s harder to write up in blog posts than SQL code, so I’ve struggled with writing, but I want to get back into the habit.  So excuse the choppiness, and let me try to put some thoughts on digital paper.

I’m pushing very hard for my company to adopt DevOps principles.  There’s a lot of material out there about DevOps from the developer perspective, but there are few resources for those of us on the operations side of the house.  In a pure sense, there’s no such thing as sides, but in a regulated industry like healthcare or financial services, old walls are tough to break down, so they’re useful as organizational frameworks for general responsibilities.  However, we are all developers, whether we sling code or manage infrastructure as code; the goal is to produce repeatable patterns and tools that allow growth and change.

Two great books that I’m reading right now are:

The Practice of Cloud System Administration by Limoncelli, Chalup, and Hogan.  Tons of practical advice for building large-scale distributed processing systems, and DevOps philosophy is woven throughout (and specifically highlighted in Chapter 8).  This is one of those books where you’ll feel like diving in on some sections, and skimming over others; it’s a thorough examination of system administration from development through implementation, so there’s lots of conceptual hooks to grab hold of (and conversely, things that you may not have experienced).

The second book that I’ve recently started reading is Site Reliability Engineering: How Google Runs Production Systems.  This book is a collection of essays which explore Google’s method of approaching reliability; like most things Google, Site Reliability Engineering is similar to DevOps, but specific to the ways that Google does things.  It’s also light on documentation (insert joke about Google and beta products here).  However, it does offer several insights into day-to-day system administration at Google.  While the SRE model is not exactly like DevOps, there’s lots of overlap, and differences may be attributed more to practice than to concepts.

More to come.