Saturday, February 22, 2020

Naming Things

Mitch is reading this book about code craftsmanship.  Recommended.

Monday, February 17, 2020

New Salt release


Some folks still use Ansible, Chef, Puppet, or SaltStack to manage their configurations, deployments, or infrastructure. One of the most vital, growing, and well-maintained tools in this family is Salt (part of SaltStack), which released a new version this month. The release notes are comprehensive, with many motivating examples.

Monitoring changes in Kubernetes by persisting Kubernetes audit logs

If you are using a reliable public-cloud Kubernetes (k8s) infrastructure, you should probably persist and monitor certain Kubernetes changes, and possibly alert security about them. The k8s audit logs can also help developers understand what happens during deployments or other changes, which helps them diagnose or prevent issues.  Unfortunately, in my case, the subset of k8s infrastructure available is so woefully unreliable and has so few features of real k8s that the logs would not be useful to our teams, even if they were made available.
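
As a rough sketch of what monitoring persisted audit logs could look like: assuming the log is stored as JSON lines in the `audit.k8s.io/v1` Event format (with `stage`, `verb`, `user.username`, and `objectRef` fields), the snippet below picks out completed requests that change cluster state. The sample events, the `WATCHED_VERBS` set, and the function name are hypothetical; which verbs deserve an alert depends on your own policy.

```python
import json

# Verbs that change cluster state; which ones matter is an assumption.
WATCHED_VERBS = {"create", "update", "patch", "delete"}


def mutating_events(lines):
    """Yield (user, verb, object path) for each completed mutating request."""
    for line in lines:
        line = line.strip()
        if not line:
            continue
        event = json.loads(line)
        # A request is logged at several stages; keep only the final one.
        if event.get("stage") != "ResponseComplete":
            continue
        if event.get("verb") not in WATCHED_VERBS:
            continue
        ref = event.get("objectRef") or {}
        parts = (ref.get("namespace"), ref.get("resource"), ref.get("name"))
        path = "/".join(part for part in parts if part)
        user = (event.get("user") or {}).get("username", "unknown")
        yield user, event["verb"], path


# Hypothetical sample events, not taken from a real cluster.
SAMPLE_LOG = [
    json.dumps({"stage": "ResponseComplete", "verb": "get",
                "user": {"username": "alice"},
                "objectRef": {"namespace": "shop", "resource": "pods", "name": "web-1"}}),
    json.dumps({"stage": "ResponseComplete", "verb": "delete",
                "user": {"username": "bob"},
                "objectRef": {"namespace": "shop", "resource": "deployments", "name": "web"}}),
    json.dumps({"stage": "RequestReceived", "verb": "patch",
                "user": {"username": "carol"},
                "objectRef": {"namespace": "shop", "resource": "configmaps", "name": "cfg"}}),
]

if __name__ == "__main__":
    for user, verb, path in mutating_events(SAMPLE_LOG):
        print(f"{user} {verb} {path}")
```

Only the `delete` by `bob` is reported here: the `get` does not change anything, and the `patch` is skipped because that entry is not at the final stage.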

Zero-downtime, rolling database schema migrations at GitHub

Shlomi Noach walks us through the history of GitHub's approach to database schema migrations, up to and including details of their current zero-touch automated method that uses GitHub Actions.  Like many of us, they started with Ruby on Rails because it automated schema migrations for them.  The current method is interesting but forgives a few bad practices of their developers, including not testing compatibility of clients and servers with disparate schema versions.  Still worth a read though.

Organizational Friction and the "side quest"

Tanya Reilly shares some thoughts and wisdom about overcoming the organizational friction she calls the "side quest" when shipping value to customers, as well as how to persevere and enable broad initiatives to succeed.  She links to this awesome talk (video here) from SREcon.  The deck stands alone if you don't have time to watch the talk at 2x speed on YouTube.

Saturday, February 15, 2020

Galaxy's Edge Kill Team by Jason Anspach & Nick Cole


Fun, 4/5 stars.

Sunday, February 9, 2020

The Space Merchants by Frederik Pohl


This 1952 book holds up well and the marketing dystopia of consumerism hits close to home, 4/5 stars.  Fun.

2020 Week 6 Dev-Ops Scouting Update

I have decided to post my dev-ops musings and scouting on this public Internet blog instead of internally where I work, because nothing here is specific or proprietary to my employer and because a few friends have asked for my thoughts.  I welcome all feedback, especially criticism.

TL;DR

  • A presentation on controlling costs for no-ops (Google Cloud Functions, AWS Lambda)
  • Here is a short deck about fun and infrequently used features of YAML
  • Netflix has open-sourced the riskquant Python library for helping you quantify risks
  • ap4k has rebranded itself as dekorate and is in use by some teams where I work
 #


I assume everyone knows that software and services take on the design of the organization that ships them.  Features and menu hierarchies in the user experience, as well as micro-services decomposition, align to the shape of the organization that produces them.  This phenomenon is known as "Conway's Law." Gene Kim, in his new book, explores other such phenomena and gives some prescriptive advice for creating a high-performing engineering team.  Much of the book is fluff and common sense, but it also covers numerous counter-intuitive phenomena and some good, easy-to-implement practices.

 #

On the topic of organizing a high-performing team, a pair of dev-ops authors wrote another interesting book recently on "cognitive load" and org structures that optimize for the success measures of your product.  The slides from their presentation in October 19 are very approachable.  The book summary by Markos Rendell is dry and comprehensive.  The book itself has some important concepts and should be skimmed by senior leaders in a position to design dev-ops organizations.

 

 #

Teams will always "game the system" when their success is measured by bug counts or incidents. If the goal is fewer bugs and fewer incidents, they will go to great lengths to hide their bugs and incidents from tracking.  If the goal is "yield" (the ratio of bugs we find before customers find them to the number of customer-reported bugs), then engineers will stuff the bug tracking system with trumped-up, ticky-tack P6 bugs.  Reactive support organizations will push every 30-second bit of customer interaction through their ticketing system and demand more headcount because of the flood of tickets.
Don't count bugs or incidents. Just don't! A count is the wrong single-number summary.  Instead, consider measurements from the customer's point of view (service level indicators, SLIs).  Severe, ship-stopping, or money-losing bugs and incidents should be aggregated and combined with a few other measurements into management's desired single-number summary.
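
To make "measurements from the customer's point of view" concrete, here is a small sketch of two common SLIs, each expressed as the fraction of requests that were good for the customer. The `Request` type, the 300 ms threshold, and the sample data are all hypothetical; real SLIs would be computed from your own request logs or metrics.

```python
from dataclasses import dataclass


@dataclass
class Request:
    ok: bool           # did the customer get a correct response?
    latency_ms: float  # how long the customer waited


def availability_sli(requests):
    """Fraction of requests that succeeded."""
    if not requests:
        raise ValueError("need at least one request")
    return sum(1 for r in requests if r.ok) / len(requests)


def latency_sli(requests, threshold_ms):
    """Fraction of requests that succeeded within threshold_ms."""
    if not requests:
        raise ValueError("need at least one request")
    good = sum(1 for r in requests if r.ok and r.latency_ms <= threshold_ms)
    return good / len(requests)


# Hypothetical sample of four customer requests.
SAMPLE = [
    Request(ok=True, latency_ms=120),
    Request(ok=True, latency_ms=900),
    Request(ok=False, latency_ms=50),
    Request(ok=True, latency_ms=200),
]

if __name__ == "__main__":
    print("availability:", availability_sli(SAMPLE))
    print("fast enough:", latency_sli(SAMPLE, threshold_ms=300))
```

Unlike a bug count, neither number improves when work is hidden from a tracker; they only move when customers actually get better or worse service.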


“What can be counted doesn’t always count, and not everything that counts can be counted.”

           — William Bruce Cameron.
 #

Terraform has evolved rapidly; James Nugent believes we should apply the "Composition Root" software pattern to factor our Terraform modules.  In a recent talk he makes very strong and convincing arguments, especially in light of security and separation of concerns.

 #

The reason you are using serverless cloud functions is to reduce operations cost to zero (no-ops).  You were wise enough to realize that over 80% of the total cost of your software during its lifetime is maintenance and operation.  Now you are looking at customer experience and that hefty public cloud bill with an eye on reducing latency and cutting costs.  How can you further reduce the cloud function costs?  This presentation explains where to look.  Hint: Don't use Java or a JVM language if you can avoid it.
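
A back-of-the-envelope model helps show where the levers are. Cloud function bills are typically driven by the number of invocations plus compute time scaled by allocated memory (GB-seconds). The sketch below assumes exactly that model; the prices are placeholders I made up for illustration, not any provider's actual rates, and the function name is hypothetical.

```python
# Placeholder prices for illustration only; check your provider's price list.
PRICE_PER_GB_SECOND = 0.00002
PRICE_PER_MILLION_REQUESTS = 0.20


def monthly_cost(invocations, avg_duration_ms, memory_mb,
                 price_per_gb_second=PRICE_PER_GB_SECOND,
                 price_per_million_requests=PRICE_PER_MILLION_REQUESTS):
    """Estimate a monthly bill from invocation count, duration, and memory."""
    gb_seconds = invocations * (avg_duration_ms / 1000) * (memory_mb / 1024)
    compute = gb_seconds * price_per_gb_second
    requests = invocations / 1_000_000 * price_per_million_requests
    return compute + requests


if __name__ == "__main__":
    baseline = monthly_cost(10_000_000, avg_duration_ms=200, memory_mb=512)
    faster = monthly_cost(10_000_000, avg_duration_ms=100, memory_mb=512)
    print(f"baseline: {baseline:.2f}  half the duration: {faster:.2f}")
```

Under this model, the compute portion scales linearly with both duration and memory, so a runtime that starts faster and needs less memory shrinks the bill directly. That is presumably part of the reasoning behind the hint about avoiding the JVM.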


 # 


A good craftsman understands the breadth, depth, and capabilities of the tools she uses.  Learn and use more of YAML's interesting features by reading this fun deck.

 #

I have frequently tried, without much success, to explain risk analysis and prioritization to people. Netflix has open-sourced "riskquant" (get it?), a Python library for quantifying risk. It's not just financial services institutions that need risk analysis.  Software and service failures carry hefty economic risk as well.
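
To give a flavor of quantitative risk prioritization, here is a standard-library sketch of one common approach: describe each risk by how often it happens per year and a 90% confidence range for its cost, fit a lognormal distribution to that range, and rank risks by expected annual loss. This is not riskquant's API; the function names and the example risks and figures are hypothetical.

```python
import math
from statistics import NormalDist

# z-score of the 95th percentile of a standard normal (about 1.645).
Z_95 = NormalDist().inv_cdf(0.95)


def lognormal_params(low_loss, high_loss):
    """Fit a lognormal whose 5th/95th percentiles are low_loss/high_loss."""
    mu = (math.log(low_loss) + math.log(high_loss)) / 2
    sigma = (math.log(high_loss) - math.log(low_loss)) / (2 * Z_95)
    return mu, sigma


def annualized_loss(frequency, low_loss, high_loss):
    """Expected yearly loss: events per year times mean loss per event."""
    mu, sigma = lognormal_params(low_loss, high_loss)
    mean_loss = math.exp(mu + sigma ** 2 / 2)  # mean of a lognormal
    return frequency * mean_loss


def rank_risks(risks):
    """Return risk names sorted from largest to smallest expected yearly loss."""
    return sorted(risks, key=lambda name: annualized_loss(*risks[name]), reverse=True)


# Hypothetical risks: (events per year, low loss, high loss).
RISKS = {
    "bad deploy": (12, 500, 20_000),
    "database outage": (0.5, 10_000, 1_000_000),
}

if __name__ == "__main__":
    for name in rank_risks(RISKS):
        print(f"{name}: {annualized_loss(*RISKS[name]):,.0f} per year")
```

In this made-up example the rare database outage outranks the frequent bad deploy, because the long tail of the lognormal pulls its mean loss well above the midpoint of the range.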

 #

Dekorate generates your Kubernetes (k8s) configurations for you when you include it as a dependency on your Java classpath; you can customize your k8s config by setting an annotation or application property in your code.

Saturday, February 8, 2020

Galaxy's Edge: Legionnaire by Jason Anspach & Nick Cole


I am enjoying the series, its character building, and its close combat.  Great mindless airplane reading, 4/5 stars.

Starship Congress 2019: Bend Metal by the Planetary Society


Great reporting and interviews in an extended podcast by the Planetary Society on the 2019 Starship Congress, 3/5 stars.