Sunday, September 8, 2019

AI, human behaviors, bias, subtle unobserved data, & causality (long but worth it)

In my day job, I am now trying to measure and predict how much incremental revenue each software effort from my teams will deliver for my company.  The inherent uncertainties (errors) of these estimations are larger than any forecast value.  70% (or more) of all software efforts fail, across most industries.  The most frustrating part of my experience is that everyone lies and pretends their estimates are always accurate, and there are no data or scholarly analysis behind the justifications.  And, of course, the real "attribution" of any revenue to any specific effort in the complex ecosystem of a marketplace is itself dodgy and uncertain.
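To make that concrete, here is a minimal sketch (Python, with hypothetical numbers invented purely for illustration) of what it means for the estimation error to exceed the forecast value: the interval for incremental revenue straddles zero, so we cannot even say whether the effort helps or hurts.

```python
import numpy as np

# Hypothetical forecast: an effort predicted to add $1.0M in
# incremental revenue, but with a $1.5M standard error on the
# estimate (the error is larger than the forecast value itself).
rng = np.random.default_rng(0)
forecast_mean = 1.0   # $M, point estimate of incremental revenue
forecast_se = 1.5     # $M, standard error of that estimate

# Monte Carlo draws of "true" outcomes consistent with the estimate.
draws = rng.normal(forecast_mean, forecast_se, size=100_000)

lo, hi = np.percentile(draws, [2.5, 97.5])
print(f"95% interval: [{lo:.2f}, {hi:.2f}] $M")
print(f"P(incremental revenue <= 0) = {np.mean(draws <= 0):.1%}")
# The interval straddles zero: even the sign of the effect is
# unknown, so ranking efforts by these point estimates is noise.
```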

My colleagues in "analytics" data science claim to measure (perfectly, of course!) the exact percentage of people who "would have bought anyway" without my team's marketing effort / campaign / incentive.  The assumptions behind those measurements are (of course!) validated with tests whose formulation bakes in the same biased assumptions, and the predictive accuracy on unseen data is never tested.  But in general they do great work, and I agree with all of their reasoning if not all of their numerical methods.
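For reference, the honest version of the "would have bought anyway" question requires a randomized holdout, not an assumption. Here is a minimal sketch (Python, with made-up numbers) of how a control group bounds that fraction, including the uncertainty that rarely makes it onto the slide:

```python
import numpy as np

# Hypothetical campaign with a randomized holdout design.
# "Would have bought anyway" is estimated by the conversion
# rate of the held-out control group, not assumed.
n_treated, buys_treated = 50_000, 2_600   # saw the campaign
n_control, buys_control = 50_000, 2_400   # held out (counterfactual)

p_t = buys_treated / n_treated   # conversion with the campaign
p_c = buys_control / n_control   # conversion without it
lift = p_t - p_c                 # incremental conversion rate

# Standard error of a difference of two proportions.
se = (p_t * (1 - p_t) / n_treated + p_c * (1 - p_c) / n_control) ** 0.5

print(f"Would have bought anyway: {p_c:.2%} of buyers' base rate")
print(f"Incremental lift: {lift:.2%} +/- {1.96 * se:.2%} (95% CI)")
# With these invented numbers the lift is 0.40% +/- 0.27%:
# real, but with an error bar nearly as large as the effect.
```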

But the main point of my rant here, and the reason everyone in AI should pay more attention to the points Taleb raises in Incerto (his series on uncertainty), is that many of the axioms and foundations upon which we are basing "AI" and "data science" are themselves very questionable:

Take AI in our judiciary and legal proceedings as an example.  There are very many reasons we need to be even more careful there, not just the points Taleb raises.  The origin data upon which judgments are made and the AI calculates are biased by the humans who acted, recorded, selected, and encoded them.  Judges, meanwhile, rely on ancient, unconscious human perceptions of other humans' feelings, motivations, and trustworthiness, none of which is captured in the recorded data.
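To illustrate that last point, here is a minimal sketch (Python with scikit-learn, entirely synthetic data) of how a model trained on human-recorded outcomes inherits the recorders' bias: the "ground truth" labels already encode a penalty against one group, and the model faithfully learns to reproduce it.

```python
import numpy as np
from sklearn.linear_model import LogisticRegression

rng = np.random.default_rng(0)
n = 20_000

# Synthetic "case files": one legitimate risk factor and a group flag.
risk = rng.normal(0, 1, n)        # the factor that *should* matter
group = rng.integers(0, 2, n)     # demographic group 0 or 1

# Historical human decisions: same risk, but group 1 was judged
# more harshly (a +0.8 shift in the decision logit).  That bias
# is now baked into the "ground truth" labels.
logits = risk + 0.8 * group
labels = rng.random(n) < 1 / (1 + np.exp(-logits))

# A model trained to predict those labels learns the bias too.
X = np.column_stack([risk, group])
model = LogisticRegression().fit(X, labels)

# Two identical cases, differing only in group membership:
same_risk = [[0.0, 0], [0.0, 1]]
p0, p1 = model.predict_proba(same_risk)[:, 1]
print(f"P(adverse outcome) group 0: {p0:.2f}, group 1: {p1:.2f}")
# The model scores the group-1 case as higher risk at identical
# merit, because the recorded judgments it learned from did too.
```

No amount of clever modeling downstream fixes this: the bias lives in the labels themselves, which is exactly why the provenance of origin data matters so much in legal applications.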
