Saturday, February 29, 2020
Monday, February 24, 2020
Sunday, February 23, 2020
Kotlin domain specific language (DSL) instead of YAML
This codetalk video by Fedor Korotkov presents another alternative to copy/pasting large YAML files. Here, Fedor shows how to use Kotlin instead of Bash, JavaScript, or Go.
Labels:
devops
Too much copy/pasted YAML
Daniele Polencic tells us how and why we should use templating tools like yq and kustomize to template our YAML with real code instead of trying to shoehorn our Kubernetes configurations into Helm charts. We are still in the early days of abstracting, factoring, and choosing dynamic configuration policies for Kubernetes, so which templating and customization framework you pick is your own decision. Use the tools and programming language you prefer, but stop copy/pasting!
Labels:
devops
Amazon's passion for operational excellence
Adrian Hornsby, an Amazon architecture evangelist (whatever that means), is writing a long, three-part blog post espousing his views on operational excellence. He does not touch on architecture principles or design for operability, such as recovery-orientation or redundancy, but he does have some good insights into high-level policies and tools for operations.
Labels:
devops
shadow org chart (graph) of influencers in an organization
Labels:
devops
Saturday, February 22, 2020
Another Open Policy Agent application for k8s
Preflight is another Kubernetes configuration checker that uses Open Policy Agent. The included libraries are a great set of policy checks for your YAML.
Labels:
devops
DevOps Days in NY is March 3-4
DevOpsDays in New York on March 3-4 has a great set of talks in the program.
Labels:
devops
Google Go Language advantages for cloud
Another Go evangelist explains some of the powerful features and reasons Go is a good choice for cloud development, especially for cloud infrastructure code.
Labels:
devops
Infrastructure code (Terraform) is still code and needs code maintenance
Just as you must refactor and maintain your application code, so too should you carefully refactor your Terraform code.
Labels:
devops
Migrating from Jenkins to Concourse
There are too many lighter-weight, container-oriented continuous integration and continuous delivery tools to count, and their number keeps exploding. This story about adopting Concourse to replace Jenkins is a great example. The nice part about Concourse is that it can be plugged in to many smaller developer testing tasks, such as git merges.
Labels:
devops
Monday, February 17, 2020
New Salt release
Labels:
devops
monitoring changes in kubernetes by persisting kubernetes audit logs
If you are using a reliable public cloud Kubernetes (k8s) infrastructure, you should probably persist, monitor, and possibly alert security about certain Kubernetes changes. The k8s audit logs can also help developers understand what happens when deployments or other changes occur, which helps them diagnose or prevent issues. Unfortunately, in my case the subset of k8s infrastructure available is so woefully unreliable and has so few features of real k8s that the logs would not be useful to our teams, even if they were made available.
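As a rough illustration of the kind of monitoring described above, here is a minimal sketch that filters audit events for changes worth alerting on. It assumes the audit log is persisted as JSON lines in the audit.k8s.io/v1 Event format; the watched resource list and the function name interesting_changes are hypothetical choices for illustration only.

```python
import json

# Verbs that change cluster state; read-only verbs such as get/list/watch are ignored.
MUTATING_VERBS = {"create", "update", "patch", "delete", "deletecollection"}

# Assumption: resources a security team might care about; tune for your cluster.
WATCHED_RESOURCES = {"secrets", "rolebindings", "clusterrolebindings"}


def interesting_changes(lines):
    """Yield a summary dict for each completed, mutating request on a watched resource."""
    for line in lines:
        line = line.strip()
        if not line:
            continue
        event = json.loads(line)
        # Each request can be logged at several stages; keep only the final one.
        if event.get("stage") != "ResponseComplete":
            continue
        ref = event.get("objectRef") or {}
        if event.get("verb") in MUTATING_VERBS and ref.get("resource") in WATCHED_RESOURCES:
            yield {
                "user": (event.get("user") or {}).get("username"),
                "verb": event["verb"],
                "resource": ref["resource"],
                "namespace": ref.get("namespace"),
                "name": ref.get("name"),
            }


if __name__ == "__main__":
    # Made-up sample event, not taken from a real cluster.
    sample = [json.dumps({
        "stage": "ResponseComplete",
        "verb": "delete",
        "user": {"username": "alice"},
        "objectRef": {"resource": "secrets", "namespace": "prod", "name": "db-password"},
    })]
    for change in interesting_changes(sample):
        print(change)
```

In practice the yielded summaries would be forwarded to whatever alerting or log storage you already use; that part is deliberately left out.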
Labels:
devops
zero-downtime, rolling Database Schema Migrations at github
Shlomi Noach walks us through the history and details of GitHub's approach to their database schema migrations, up to and including details of their current zero-touch automated method that uses GitHub Actions. Like many of us, they started with Ruby on Rails because it automated schema migrations for them. The current method is interesting but forgives a few bad practices of their developers, including not testing compatibility of clients and servers with disparate schema versions. Still worth a read, though.
Labels:
devops
Organizational Friction and the "side quest"
Tanya Reilly shares some thoughts and wisdom about overcoming the organizational friction she calls the "side quest" when shipping value to customers, as well as how to persevere and enable broad initiatives to succeed. She links to this awesome talk (video here) from SREcon. The deck is stand-alone if you don't have time to watch the talk at 2x speed on YouTube.
Labels:
devops
Saturday, February 15, 2020
Sunday, February 9, 2020
2020 Week 6 Dev-Ops Scouting Update
I have decided to post my dev-ops musings and scouting on this public Internet blog instead of internally where I work because there is nothing here that is specific or proprietary to my employer and because a few friends have asked for my thoughts. I welcome all feedback, especially bad (criticism).
TL;DR
- Gene Kim hawks his new book The Unicorn Project on the Ship Happens podcast
- Accenture's Markos Rendell wrote a summary of Team Topologies (slides here)
- Rick Branson explains why we should never count bugs / incidents
- Great talk on Terraform without the mess by James Nugent, one of the Terraform authors
- Control costs for no-ops (Google cloud functions, Amazon lambda) presentation
- Here is a short deck about fun and infrequently used features of YAML
- Netflix has open-sourced the riskquant python library for helping you quantify risks
- ap4k has rebranded itself as dekorate and is in use by some teams where I work
I assume everyone knows that all software and services take on the design of the organization that ships them. Features and menu hierarchies in user experience, and micro-services decomposition, always align to the shape of the organization that produces them. This phenomenon is known as "Conway's Law." Gene Kim, in his new book, explores other such phenomena and gives some prescriptive advice for creating a high-performing engineering team. Much of the book is fluff and common sense. But there are numerous counter-intuitive phenomena and some good, easy-to-implement practices.
#
On the topic of organizing a high-performing team, a pair of dev-ops authors wrote another interesting book recently on "cognitive load" and org structures that optimize for the success measures of your product. The slides from their presentation in October 19 are very approachable. The book summary by Markos Rendell is dry and comprehensive. The book itself has some important concepts and should be skimmed by senior leaders in a position to design dev-ops organizations.
#
Teams will always "game the system" when their success is measured by bug counts or incidents. If the goal is fewer bugs and fewer incidents, they will go to great lengths to hide their bugs and incidents from tracking. If the goal is "yield" (the ratio of bugs we find before customers find them to the number of customer-reported bugs), then engineers will stuff the bug tracking system with trumped-up, ticky-tack P6 bugs. Reactive support organizations will push every 30-second bit of customer interaction through their ticketing system and demand more headcount because of the flood of tickets.
Don't count bugs or incidents. Just don't! "Counts" is the wrong single-number summary. Instead, consider measurements from the customer's point of view (service level indicators, SLIs). Severe, ship-stopping or money-losing bugs or incidents should be aggregated and combined with a few other measurements into Management's desired single-number summary.
“What can be counted doesn’t always count, and not everything that counts can be counted.”
— William Bruce Cameron.
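To make the contrast concrete, here is a minimal sketch of one customer-facing measurement, an availability SLI, set next to a raw incident count. The Week record, its field names, and the sample numbers are all made up for illustration; a real SLI would be derived from your own request telemetry.

```python
from dataclasses import dataclass


@dataclass
class Week:
    incidents: int        # what a "count" metric would report
    total_requests: int   # all customer requests in the window
    good_requests: int    # requests served correctly and fast enough


def availability_sli(week: Week) -> float:
    """Fraction of customer requests that were good (1.0 when there is no traffic)."""
    if week.total_requests == 0:
        return 1.0
    return week.good_requests / week.total_requests


if __name__ == "__main__":
    # Same incident count, very different customer experience.
    minor = Week(incidents=3, total_requests=1_000_000, good_requests=999_900)
    major = Week(incidents=3, total_requests=1_000_000, good_requests=950_000)
    for label, week in (("minor", minor), ("major", major)):
        print(f"{label}: incidents={week.incidents} sli={availability_sli(week):.4f}")
```

Both weeks report three incidents, yet the second one failed five percent of customer requests. The count hides exactly the difference that matters.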
#
Terraform has evolved rapidly; James Nugent believes we should apply the "Composition Root" software pattern to factor our Terraform modules. In a recent talk he makes very strong and convincing arguments, especially in light of security and separation of concerns.
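For readers who have not met the pattern, here is a minimal sketch of a Composition Root in plain Python rather than Terraform. It is my own illustration of the general idea, not code from the talk: leaf components receive everything they need as explicit inputs, and one top-level function does all the wiring. Every name here (Network, Database, make_network, make_database, composition_root) is hypothetical.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Network:
    """Stands in for the outputs of a hypothetical network module."""
    vpc_id: str
    subnet_ids: tuple


@dataclass(frozen=True)
class Database:
    """Stands in for the outputs of a hypothetical database module."""
    name: str
    vpc_id: str
    subnet_ids: tuple


def make_network(cidr: str) -> Network:
    # A leaf "module": it knows only its own inputs.
    return Network(vpc_id=f"vpc-{cidr}", subnet_ids=(f"{cidr}-a", f"{cidr}-b"))


def make_database(name: str, vpc_id: str, subnet_ids: tuple) -> Database:
    # Another leaf: it is handed the network details and never looks them up itself.
    return Database(name=name, vpc_id=vpc_id, subnet_ids=subnet_ids)


def composition_root() -> dict:
    """The only place that knows how the pieces fit together."""
    network = make_network("10.0.0.0/16")
    database = make_database("orders", network.vpc_id, network.subnet_ids)
    return {"network": network, "database": database}


if __name__ == "__main__":
    stack = composition_root()
    print(stack["database"])
```

The rough Terraform analogue would be a root module that passes one child module's outputs into another child module's input variables, so that no child reaches out to discover its own dependencies.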
#
The reason you are using serverless cloud functions is to reduce operations cost to zero (no-ops). You were wise enough to realize that over 80% of the total cost of your software during its lifetime is maintenance and operation. Now you are looking at customer experience and that hefty public cloud bill with an eye on reducing latency and cutting cost. How can you further reduce the cloud function costs? This presentation explains where to look. Hint: Don't use Java or a JVM language if you can avoid it.
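A back-of-the-envelope model shows where the money goes. This is a minimal sketch that assumes the common pay-per-request plus pay-per-GB-second billing shape; the function name monthly_cost, the prices, and the workload numbers are all placeholders rather than any provider's real rates, and it ignores free tiers and duration rounding.

```python
def monthly_cost(invocations, avg_duration_ms, memory_mb,
                 price_per_million_requests, price_per_gb_second):
    """Estimate a monthly bill from request volume, duration, and memory size."""
    # Compute time is billed as memory (in GB) multiplied by execution time (in seconds).
    gb_seconds = invocations * (avg_duration_ms / 1000) * (memory_mb / 1024)
    request_cost = invocations / 1_000_000 * price_per_million_requests
    return request_cost + gb_seconds * price_per_gb_second


if __name__ == "__main__":
    # Placeholder prices; substitute your provider's published rates.
    prices = dict(price_per_million_requests=0.20, price_per_gb_second=0.00002)
    # Hypothetical workloads: a heavier runtime versus a lighter one.
    heavy = monthly_cost(50_000_000, avg_duration_ms=400, memory_mb=1024, **prices)
    light = monthly_cost(50_000_000, avg_duration_ms=100, memory_mb=256, **prices)
    print(f"heavy runtime: {heavy:.2f}")
    print(f"light runtime: {light:.2f}")
```

Under this model the request charge is fixed by traffic, so the levers you control are duration and memory, which is where a runtime with slow startup and a large footprint hurts.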
#
A good craftsman understands the breadth, depth, and capabilities of the tools she uses. Learn and use more of YAML's interesting features by reading this fun deck.
#
I have frequently tried to explain risk analysis and prioritization to people without much success. Netflix has open-sourced "riskquant" (get it?), a Python library that helps risk people do risk analysis. It's not just financial services institutions that need risk analysis. Software and service failure risks have hefty economic impact as well.
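Here is a minimal sketch of the kind of arithmetic involved in quantifying and ranking risks. It deliberately does not use riskquant's API; it is a standard-library illustration that assumes each loss follows a lognormal distribution described by a 90% confidence interval (low, high) and that events occur at some average yearly frequency. The function names and the example risks are made up.

```python
import math

# z-score of the 95th percentile of a standard normal distribution.
Z_95 = 1.6448536269514722


def lognormal_params(low, high):
    """Fit lognormal (mu, sigma) so that low/high are the 5th/95th percentiles."""
    mu = (math.log(low) + math.log(high)) / 2
    sigma = (math.log(high) - math.log(low)) / (2 * Z_95)
    return mu, sigma


def expected_annual_loss(frequency, low, high):
    """Events per year multiplied by the mean loss of a single event."""
    mu, sigma = lognormal_params(low, high)
    return frequency * math.exp(mu + sigma ** 2 / 2)


def rank_risks(risks):
    """Sort (name, frequency, low, high) tuples by expected annual loss, largest first."""
    return sorted(risks, key=lambda r: expected_annual_loss(*r[1:]), reverse=True)


if __name__ == "__main__":
    # Hypothetical risks: (name, events per year, low loss, high loss).
    risks = [
        ("regional outage", 2.0, 1_000, 10_000),
        ("data breach", 0.1, 100_000, 1_000_000),
    ]
    for name, freq, low, high in rank_risks(risks):
        print(f"{name}: {expected_annual_loss(freq, low, high):,.0f} per year")
```

Even a crude model like this makes prioritization arguments easier: a rare but expensive event can outrank a frequent but cheap one once both are expressed as expected loss per year.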
#
Dekorate generates your kubernetes (k8s) configurations for you when you include a dependency in your Java classpath; you can customize your k8s config by setting an annotation or application property in your code.
Labels:
devops
Saturday, February 8, 2020
Monday, February 3, 2020
Our Mathematical Universe by Max Tegmark
Max Shapiro (Tegmark) writes better for a popular science audience than Roger Penrose. But having just read Cycles of Time, I am wondering how Tegmark's very odd theory at the end accounts for the second law of thermodynamics. The end of Tegmark's book is slightly off-topic and very depressing, but the entire book is very well-written, worthwhile, and approachable by a lay audience (like me). 4/5 stars.
Labels:
science
Sunday, February 2, 2020
What's Next with Containers?
Chris Hickman speculates about the future of containers in his blog this week with a nominal and partial tour of virtualization directions. He touches on my favorite concept of unikernels in containers. I have a different point of view from Chris' but I think his ideas are more mainstream.
Why Unikernels? (The application is the container!)
I am very passionate about elegance and simplicity in software design. In software service security, I believe there is no attack surface like no attack surface. That is, if you remove everything from your container that can be attacked, you are more likely to be secure than if you bring along all of the attack surfaces of a full-blown kernel and operating system. And even if you are compromised, other microservices should be protecting themselves from you (authentication, parameter checking). Further, there is almost nothing in your unikernel the attacker can use to attack the rest of your microservices ecosystem.
Obviously it is more difficult to debug complex microservices to discover why they are failing in production if there are no helper tools and capabilities in the container. But an instrumented service mesh can let you "play back" payloads and traffic against a non-production version of your microservice to diagnose your problem. And removing complex components prevents these useless, resource-wasting elements from interfering with your service, so bugs frequently become shallower (all of the bugs are yours).
However, coders are lazy and will prefer to have convenient shell access to their containers running in production so that they can debug under live traffic circumstances. And most developers prefer to bolt on convenient tools, libraries, deep stacks, and monolithic, resource-hogging pieces to their run-time environment because they perceive it is faster and easier to copy/paste a few annotations or changes into a larger software monolith. Therefore, I am pessimistic that we shall see a rise in unikernels outside of environments where security is important and leadership understands the value of simplicity.
Labels:
devops