Thursday, June 6, 2019

Software quality analysis from Web of Knowledge by Dave Jarvis

https://dave.autonoma.ca/blog/2019/06/06/web-of-knowledge/  

Here is an excerpt from a recent blog post by Dave Jarvis that appeared at work:

During the 1980s, the Therac-25, a radiation therapy machine operated through a VT100 terminal, massively overdosed six people, several of them fatally. After an extensive investigation by MIT professor Nancy Leveson, a number of lessons were put forward for safety-critical systems development, including:

    • Overconfidence – Engineers tend to ignore software
    • Reliability versus safety – False confidence grows with successes
    • Defensive design – Software must have robust error handling (see the sketch after this list)
    • Eliminate root causes – Patching symptoms does not increase safety
    • Complacency – Prefer proactive development to reactive
    • Bad risk assessments – Analyses make invalid independence claims
    • Investigate – Apply analysis procedures when any accidents arise
    • Ease versus safety – Ease of use may conflict with safety goals
    • Oversight – Government-mandated software development guidelines
    • Reuse – Extensively exercised software is not guaranteed to be safe

The remaining lesson was about inadequate software engineering practices. In particular, the investigation noted that basic software engineering principles were violated for the Therac-25, such as:

    • Documentation – Write formal, up-front design specifications
    • Quality assurance – Apply rigorous quality assurance practices
    • Design – Avoid dangerous coding practices and keep designs simple
    • Errors – Include error detection methods and software audit trails (see the sketch after this list)
    • Testing – Subject software to extensive testing and formal analysis
    • Regression – Apply regression testing for all software changes
    • Interfaces – Carefully design input screens, messages, and manuals

In 2017, Leveson revisited those lessons and concluded that modern software systems still suffer from the same issues. In addition, she noted:

    • Error prevention and detection must be included from the outset.
    • Software designs are often unnecessarily complex.
    • Software engineers and human factors engineers must communicate more.
    • Blame still falls on operators rather than interface designs.
    • Overconfidence in reusing software remains rampant.

Whatever the reasons (market pressures, rushed processes, inadequate certification, fear of being fired, or poor project management), Leveson's insights are being ignored. For example, after the first fatal Boeing 737 Max crash, why was the entire fleet not grounded indefinitely? Why not after an Indonesian safety committee report uncovered multiple failures, or after an off-duty pilot had to help avert a crash? What analysis procedures failed to prevent the second fatal Boeing 737 Max crash?
