Mitch Wyle's Web Log

Monday, February 17, 2020

New Salt release

There are some folks who still use Ansible, Chef, Puppet, or SaltStack as part of managing their configurations, deployments, or infrastructure in some way. One of the most vital, growing, and well-maintained tools in this family is Salt (part of SaltStack), which just released a new version this month. The release notes are comprehensive, with many motivating examples.

monitoring changes in kubernetes by persisting kubernetes audit logs

If you are using a reliable public cloud Kubernetes (k8s) infrastructure, you should probably persist, monitor, and possibly alert security about certain Kubernetes changes. The k8s audit logs can also help developers understand what happens when deployments or other changes occur, which helps them diagnose or prevent issues. Unfortunately, in my case, the subset of k8s infrastructure available is so woefully unreliable and has so few features of real k8s that the logs would not be useful to our teams, even if they were made available.
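To make the idea concrete, here is a minimal sketch of how you might pull the "who changed what" events out of a persisted audit log. It assumes the log is stored as one JSON audit event per line, using the field names of the audit.k8s.io/v1 Event type (stage, verb, user.username, objectRef). How your cloud provider exports the log may differ, and the function name and sample events are made up for illustration.

```python
import json

# Verbs that change cluster state; read-only verbs (get, list, watch) are skipped.
MUTATING_VERBS = {"create", "update", "patch", "delete", "deletecollection"}


def summarize_changes(lines):
    """Yield one short summary per completed mutating request in an audit log.

    Assumes each line is a JSON-encoded audit event (hypothetical storage format).
    """
    for line in lines:
        line = line.strip()
        if not line:
            continue
        event = json.loads(line)
        # Each request can be logged at several stages; keep only the final one.
        if event.get("stage") != "ResponseComplete":
            continue
        if event.get("verb") not in MUTATING_VERBS:
            continue
        ref = event.get("objectRef", {})
        yield {
            "when": event.get("requestReceivedTimestamp"),
            "who": event.get("user", {}).get("username"),
            "verb": event["verb"],
            "resource": ref.get("resource"),
            "namespace": ref.get("namespace"),
            "name": ref.get("name"),
        }


# Two invented events: one change to a deployment and one read-only request.
SAMPLE_LOG = [
    json.dumps({
        "stage": "ResponseComplete",
        "verb": "patch",
        "requestReceivedTimestamp": "2020-02-17T10:00:00.000000Z",
        "user": {"username": "alice@example.com"},
        "objectRef": {"resource": "deployments", "namespace": "shop", "name": "cart"},
    }),
    json.dumps({
        "stage": "ResponseComplete",
        "verb": "list",
        "user": {"username": "system:kube-scheduler"},
        "objectRef": {"resource": "pods"},
    }),
]

for change in summarize_changes(SAMPLE_LOG):
    print(change["when"], change["who"], change["verb"],
          change["namespace"], change["resource"], change["name"])
```

From summaries like these you could feed an alerting rule (for example, any delete in a production namespace) or simply give developers a searchable timeline of changes.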

zero-downtime, rolling Database Schema Migrations at github

Shlomi Noach walks us through the history and details of GitHub's approach to their database schema migrations, up to and including details of their current zero-touch automated method that uses GitHub Actions. Like many of us, they started with Ruby on Rails because it automated schema migrations for them. The current method is interesting but forgives a few bad practices of their developers, including not testing compatibility of clients and servers with disparate schema versions. Still worth a read though.

Organizational Friction and the "side quest"

Tanya Reilly voices some thoughts and wisdom about overcoming the organizational friction that she calls the "side quest" when shipping value to customers, as well as how to persevere and enable broad initiatives to succeed. She links to this awesome talk (video here) from SREcon. The deck is stand-alone if you don't have time to watch the talk at 2x speed on YouTube.

Sunday, February 9, 2020

The Space Merchants by Frederik Pohl

This 1952 book holds up well and the marketing dystopia of consumerism hits close to home, 4/5 stars. Fun.

2020 Week 6 Dev-Ops Scouting Update

I have decided to post my dev-ops musings and scouting on this public Internet blog instead of internally where I work because there is nothing here that is specific or proprietary to my employer and because a few friends have asked for my thoughts. I welcome all feedback, especially bad (criticism).

TL;DR

Gene Kim hawks his new book The Unicorn Project on the Ship Happens podcast

Accenture's Markos Rendell wrote a summary of Team Topologies (slides here)

Rick Branson explains why we should never count bugs / incidents

Great talk on Terraform without the mess by James Nugent, one of the Terraform authors

Control costs for no-ops (Google cloud functions, Amazon lambda) presentation

Here is a short deck about fun and infrequently used features of YAML

Netflix has open-sourced the riskquant python library for helping you quantify risks

ap4k has rebranded itself as dekorate and is in use by some teams where I work

I assume everyone knows that all software and services always take on the design of the organization that ships them. Features and menu hierarchies in user experience, and micro-services decomposition, always align to the shape of the organization that produces them. This phenomenon is known as "Conway's Law." Gene Kim, in his new book, explores other such phenomena and gives some prescriptive advice for creating a high-performing engineering team. Much of the book is fluff and common sense. But there are numerous counter-intuitive phenomena and some good easy-to-implement practices.


On the topic of organizing a high-performing team, a pair of dev-ops authors wrote another interesting book recently on "cognitive load" and org structures that optimize for the success measures of your product. The slides from their presentation in October 19 are very approachable. The book summary by Markos Rendell is dry and comprehensive. The book itself has some important concepts and should be skimmed by senior leaders in a position to design dev-ops organizations.


Teams will always "game the system" when their success is measured by bug counts or incidents. If the goal is fewer bugs and fewer incidents, they will go to great lengths to hide their bugs and incidents from tracking. If the goal is "yield" (the ratio of bugs we find before customers find them to the number of customer-reported bugs), then engineers will stuff the bug tracking system with trumped-up, ticky-tack P6 bugs. Reactive support organizations will push every 30-second bit of customer interaction through their ticketing system and demand more headcount because of the flood of tickets.
Don't count bugs or incidents. Just don't! A count is the wrong single-number summary. Instead, consider measurements from the customer's point of view (service level indicators, SLIs). Severe, ship-stopping, or money-losing bugs or incidents should be aggregated and combined with a few other measurements into management's desired single-number summary.
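A bit of arithmetic shows how easy the "yield" measure described above is to inflate. This is only a sketch: the function name and every number in it are invented for illustration.

```python
def bug_yield(found_internally, found_by_customers):
    """Ratio of internally found bugs to customer-reported bugs."""
    if found_by_customers == 0:
        raise ValueError("yield is undefined with zero customer-reported bugs")
    return found_internally / found_by_customers


# An honest quarter: 40 bugs caught in-house, 10 reported by customers.
honest = bug_yield(found_internally=40, found_by_customers=10)

# The same quarter with 160 trivial P6 tickets filed to pad the numerator.
# Customers still hit the same 10 bugs, so their experience is unchanged.
gamed = bug_yield(found_internally=40 + 160, found_by_customers=10)

print(f"honest yield: {honest:.1f}, gamed yield: {gamed:.1f}")
```

The metric improves fivefold while nothing the customer sees has changed, which is why a customer-facing SLI is the safer thing to report.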

“What can be counted doesn’t always count, and not everything that counts can be counted.”

— William Bruce Cameron.

Terraform has evolved rapidly; James Nugent believes we should apply the "Composition Root" software pattern to factor our Terraform modules. In a recent talk he makes very strong and convincing arguments, especially in light of security and separation of concerns.


The reason you are using serverless cloud functions is to reduce operations cost to zero (no-ops). You were wise enough to realize that over 80% of the total cost of your software during its lifetime is maintenance and operation. Now you are looking at customer experience and that hefty public cloud bill with an eye on reducing latency and cutting cost. How can you further reduce the cloud function costs? This presentation explains where to look. Hint: Don't use Java or a JVM language if you can avoid it.
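As a rough way to see where the money goes, here is a small sketch of the usual pay-per-use billing shape: a charge per invocation plus a charge for compute time scaled by allocated memory. The function name, the workload, and both prices are placeholders I made up, not any provider's actual rates, and real bills include other items (network egress, free tiers, and so on).

```python
def monthly_cost(invocations, avg_duration_ms, memory_mb,
                 price_per_gb_second, price_per_million_requests):
    """Estimate a monthly bill from a simplified two-part pricing model."""
    # Compute is billed in GB-seconds: run time multiplied by allocated memory.
    gb_seconds = invocations * (avg_duration_ms / 1000) * (memory_mb / 1024)
    compute_charge = gb_seconds * price_per_gb_second
    request_charge = invocations / 1_000_000 * price_per_million_requests
    return compute_charge + request_charge


# Placeholder prices and workload, for illustration only.
PRICES = {"price_per_gb_second": 0.00002, "price_per_million_requests": 0.20}

slow = monthly_cost(10_000_000, avg_duration_ms=200, memory_mb=512, **PRICES)
fast = monthly_cost(10_000_000, avg_duration_ms=100, memory_mb=512, **PRICES)

print(f"200 ms average: ${slow:.2f}   100 ms average: ${fast:.2f}")
```

Under this model, duration and memory dominate once traffic is non-trivial, so anything that lengthens each invocation (slow startup, oversized memory settings) shows up directly on the bill.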

A good craftsman understands the breadth, depth, and capabilities of the tools she uses. Learn and use more of YAML's interesting features by reading this fun deck.

I have frequently tried, without much success, to explain risk analysis and prioritization to people. Netflix has open-sourced "riskquant" (get it?), a Python library for risk people to do risk analysis. It's not just financial services institutions that need risk analysis. Software and service failure risks have hefty economic impact as well.
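For readers who have never seen this kind of risk quantification, here is a minimal standard-library sketch of one common approach: describe a risk by how often it happens per year and by a 90% confidence range for the loss each time, fit a lognormal distribution to that range, and multiply the mean loss by the frequency. This is not riskquant's API; the function names and the example risk are my own assumptions.

```python
import math

# z-score of the 95th percentile of a standard normal distribution.
Z95 = 1.6448536269514722


def lognormal_params(low, high):
    """Return (mu, sigma) of a lognormal whose 5th/95th percentiles are low/high."""
    mu = (math.log(low) + math.log(high)) / 2
    sigma = (math.log(high) - math.log(low)) / (2 * Z95)
    return mu, sigma


def expected_annual_loss(frequency_per_year, low, high):
    """Expected yearly loss: event frequency times the mean loss per event."""
    mu, sigma = lognormal_params(low, high)
    mean_loss = math.exp(mu + sigma ** 2 / 2)  # mean of a lognormal distribution
    return frequency_per_year * mean_loss


# Hypothetical risk: an outage about once every two years, costing
# between $1,000 and $100,000 per event with 90% confidence.
loss = expected_annual_loss(frequency_per_year=0.5, low=1_000, high=100_000)
print(f"expected annual loss: ${loss:,.0f}")
```

Putting several risks through the same calculation gives a common unit (dollars per year) for ranking them, which is usually what a prioritization discussion is missing.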


Dekorate generates your kubernetes (k8s) configurations for you when you include a dependency in your Java classpath; you can customize your k8s config by setting an annotation or application property in your code.

Saturday, February 8, 2020

Galaxy's Edge: Legionnaire by Jason Anspach & Nick Cole

I am enjoying the series, the character building, and the close combat. Great mindless airplane reading, 4/5 stars.

Starship Congress 2019: Bend Metal by the Planetary Society

Great reporting and interviews in an extended podcast by the Planetary Society on the 2019 Starship Congress, 3/5 stars.
