ML is uniquely powerful, complex, and opaque (as we’ve written about before!). It’s also relatively new. This combination of factors often creates a disconnect between stakeholders and data scientists — which, unfortunately, can lead to relationships becoming so dysfunctional that they undermine data science success.
Let’s talk about tradeoffs, as an example. Back at Uber in 2018, my (Duncan’s) teams were building models that set rider prices. These models were incredibly powerful, responsible for pricing tens of billions of dollars a year in trips. As part of the pricing optimization, we needed to encode the right balance between trip volume and profit margins, but struggled to work well with our business counterparts to define what that should look like. “Tradeoffs” just weren’t really how folks thought about it.
Worse yet, a month after we finally came to an agreement on how to do it, our stakeholders noticed that our model was making short trips cheap relative to long trips. Depending on who you asked, this either made a ton of sense (let’s win the frequent short commutes!) or was a terrible idea (we’re going to lose the airport trips!). This got acrimonious, quickly.
Working productively on hammering out what the right tradeoffs were — amidst the larger political dynamics at the company of who gets to make these decisions, no less — was a real struggle.
This is just one example, and to paraphrase Tolstoy, no two unhappy data science/stakeholder families are exactly alike. There are quite a few ways these conflicts can arise. But over the years, we’ve seen some recurring themes in stakeholders stalling out ML projects that many data leaders may recognize — and should be ready to address.
1. Analysis paralysis interferes with deep R&D
Data science and ML are inherently experimental. There is almost never a “sure thing”. Making progress involves trying things out, learning from results, and continuously improving. That means you need to move on to the next phase of iteration and not get stuck forever in the here-and-now.
But stakeholders tend to focus on exactly the here-and-now. In fairness, that’s what they see in the P&L. And they can have so many questions and concerns about what’s happening in the current state that it prevents the data science team from moving on to build the next, better version.
This can be debilitating when you’re working on mission-critical algorithms. Take surge pricing at Uber. Countless times, we knew the algorithm wasn’t doing exactly what we wanted, and we also knew we had a long roadmap of ideas to make it better. Yet it was all too easy to spend all our time explaining and defending the current model to stakeholders. A key job of leadership was to provide (often uncomfortable) air cover to focus on R&D.
This issue is closely related to…
2. Unrealistic expectations of the ML build-cadence
Most stakeholders are accustomed to the traditional software development process, but ML functions more like a research lab — and the differences can be jarring.
Here’s what many stakeholders simply don’t get: in ML, the first thing you deliver is far from the best thing. And not in the way that software teams deliver an MVP to learn from beta users — with ML, it’s more fundamental. In software, maybe the beta delivers 90% of the final value; in ML, the first version might deliver only 10%.
Take moving your homepage from a static page to an ML-driven recommender system. You probably start super simple — using a very basic model that predicts clicks. Everyone knows clicks aren’t a good thing to optimize for, so as you work on making your click model better, you also start to layer on a multi-objective optimization (MOO) framework. Under the hood, maybe you also start to build more sophisticated user activity embeddings. Each of these new additions solves real shortcomings of the model that came before.
When stakeholders don’t grok a vision like this, and expect what’s coming off the assembly line in the first three months to fundamentally solve the final problem, conflict will inevitably arise.
Our experience is that ML is much more often experienced as a continuous set of improvements over time than as a single step-change improvement, and it’s key that management understand and expect that pattern. As a data leader, you need to constantly dance between selling the vision and managing expectations for the here-and-now.
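To make the recommender progression above a little more concrete, here is a minimal sketch of what layering multi-objective scoring on top of a click model might look like. Everything in it is an assumption for illustration: the per-item predictions (`p_click`, `p_purchase`, `p_hide`), the weights, and the simple weighted-sum approach are hypothetical, and a real system would likely be more involved.

```python
from dataclasses import dataclass


@dataclass
class Candidate:
    """A homepage item with hypothetical model predictions attached."""
    item_id: str
    p_click: float     # predicted probability of a click
    p_purchase: float  # predicted probability of a purchase
    p_hide: float      # predicted probability the user hides the item


def moo_score(c: Candidate, w_click: float = 1.0,
              w_purchase: float = 5.0, w_hide: float = 3.0) -> float:
    """Weighted-sum scalarization: reward clicks and purchases, penalize hides.

    The default weights are made up; choosing them is a business decision.
    """
    return w_click * c.p_click + w_purchase * c.p_purchase - w_hide * c.p_hide


def rank(candidates, **weights):
    """Return candidates sorted from highest to lowest combined score."""
    return sorted(candidates, key=lambda c: moo_score(c, **weights), reverse=True)


candidates = [
    Candidate("clickbait", p_click=0.30, p_purchase=0.01, p_hide=0.10),
    Candidate("solid", p_click=0.15, p_purchase=0.04, p_hide=0.01),
]

# Version 1: a click-only model puts the clickbait item first.
click_only = rank(candidates, w_purchase=0.0, w_hide=0.0)

# Version 2: the multi-objective score prefers the item users actually value.
multi_objective = rank(candidates)
```

Each version fixes a visible shortcoming of the one before it, which is why the first delivery is rarely the one that carries most of the value.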
3. Inherent tradeoffs aren’t how stakeholders think
As mentioned earlier, tradeoffs are a common friction point. These tradeoffs are hard! Taking the Uber pricing example from the intro, asking stakeholders for an acceptable amount of profit to trade for better user experience typically makes them very uncomfortable — these are metrics and KPIs they’ve been trusted to protect and improve. They will typically say they care a lot about both.
But tradeoffs are inherent to ML models, and without knowing what compromises you’re willing to make, you can’t build algorithms that will work effectively. Do we want fewer trips at higher prices? Or the most possible trips at reduced prices? There’s an ideal balance somewhere in between those two extremes, but getting stakeholders to nail down the specifics isn’t easy.
Our best advice here is to start by encoding the tradeoffs that your business teams are already making. In other words, look at their current decisions, and work out what that means for tradeoffs. And then go as high as you need to in the org to get that tradeoff stamped as the way to move forward. If that tradeoff lives in your team’s head, and would be surprising to your CEO, you have a problem.
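One way to “work out what that means for tradeoffs” is to treat each past decision as evidence about an implied exchange rate. The sketch below assumes, purely for illustration, that the business is implicitly maximizing `profit + lam * trips`, where `lam` is the profit they are willing to give up for one extra trip. The `Option` type, the function name, and the numbers are all hypothetical.

```python
from typing import NamedTuple


class Option(NamedTuple):
    """Expected outcome of one choice the business could have made."""
    profit: float
    trips: float


def implied_lambda_range(decisions):
    """Bound lam in the assumed objective `profit + lam * trips`.

    `decisions` is a list of (chosen, rejected) Option pairs. Each pair says
    the chosen option scored at least as well as the rejected one, which
    gives either a lower or an upper bound on lam.
    """
    lo, hi = 0.0, float("inf")
    for chosen, rejected in decisions:
        d_profit = chosen.profit - rejected.profit
        d_trips = chosen.trips - rejected.trips
        if d_trips > 0:
            # They accepted a profit change to gain trips: lam is at least this.
            lo = max(lo, -d_profit / d_trips)
        elif d_trips < 0:
            # They gave up trips for a profit change: lam is at most this.
            hi = min(hi, -d_profit / d_trips)
        # d_trips == 0 says nothing about lam.
    return lo, hi


past_decisions = [
    # Gave up 10 in profit to gain 20 trips.
    (Option(profit=90.0, trips=120.0), Option(profit=100.0, trips=100.0)),
    # Declined 10 extra trips that would have cost 30 in profit.
    (Option(profit=100.0, trips=100.0), Option(profit=70.0, trips=110.0)),
]

lo, hi = implied_lambda_range(past_decisions)
```

If the lower bound ends up above the upper bound, the past decisions are not consistent with any single tradeoff, which is itself a useful thing to bring to leadership.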
4. Complex, noisy experiments make success murky
Major data science products can be major product innovations, which is exciting, and why many of us love this field. But this also makes testing them tricky, and getting alignment on how you measure success can be difficult.
For instance, when we were launching Uber Express Pool (which allows users to share rides with other passengers) we couldn’t do a simple A/B test to assess its impact on revenue. Since you can’t get clean experiment results if you let riders in treatment match to other riders in control, we had to do city-level tests. And measuring the impact of city-level interventions is really difficult, because there are so many other factors at play — the weather, major events, or traffic incidents all impact user behavior in one market but not another. What do you do when a major snowstorm blankets the entire east coast and ruins your rollout measurement? (Fun fact, this is the final question of an HBS case study featuring yours truly.)
Additionally, because of their deep influence on user experience, we’ve found that many ML products have substantial long-term effects. And long-term effects are complicated. As Archie Abrams, VP Product, Head of Growth at Shopify recently explained on Lenny’s podcast, “Shopify has found that 30% to 40% of experiments that show positive short-term results have no long-term impact. Initial lifts can be misleading, and some of your ‘losers’ might actually yield unexpected long-term value.”
None of this is what stakeholders want to hear. They want simple A/B tests, where all the numbers are stat sig and colored green. If some are red and some are gray and some are green, as you almost always see with ML projects, you have a much more complex stakeholder situation to manage. If they aren’t adept at making sense of those numbers, it’s easy to trip up — and if you are only allowed to roll out if everything is green, your team can miss out on delivering a lot of value.
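For city-level tests like the one described above, one common starting point is a difference-in-differences estimate: compare the change in treated cities to the change in control cities over the same period. This is a minimal sketch under that assumption, not a description of what the Uber team actually ran; the function name and the numbers are made up.

```python
from statistics import mean


def diff_in_diff(treat_pre, treat_post, ctrl_pre, ctrl_post):
    """Estimate the treatment effect as (treated change) - (control change).

    Each argument is a list of a metric (e.g. weekly revenue) per city.
    Assumes treated and control cities would have moved in parallel
    without the launch.
    """
    treated_change = mean(treat_post) - mean(treat_pre)
    control_change = mean(ctrl_post) - mean(ctrl_pre)
    return treated_change - control_change


# Hypothetical weekly revenue per city, before and after launch.
effect = diff_in_diff(
    treat_pre=[100.0, 102.0],
    treat_post=[110.0, 112.0],
    ctrl_pre=[200.0, 198.0],
    ctrl_post=[204.0, 202.0],
)
```

The estimate is only as good as the parallel-trends assumption. A snowstorm that hits the treated cities but not the control cities shows up in the number as if it were a product effect, which is exactly why these results are hard to present as a simple green or red.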
5. Data scientists struggle to explain their work
In many of these scenarios (when the veteran business executive is misinterpreting the latest test results or upset about how long it’s taking to see results from the latest model deployment), the only person who understands exactly what’s going on is a data scientist two years out of school and quaking in their boots.
Most data scientists come from academia, and their coursework did not include explaining models to the C-suite. Many don’t really know how to operate in the business world. As a data leader, it’s your responsibility to teach them how to educate stakeholders, explain their models to non-practitioners, and back them up when these conflicts arise.
A few years back I was fortunate to take a professional class on presenting put on by Lauren Weinstein (here’s a workshop of hers on YouTube). Wow, am I glad I did. The value of investing in communication skills (there are real algorithms here to learn!) is something we can’t emphasize enough.
Hard, but necessary
If any of these scenarios are familiar, take solace: you’re not alone. These issues are fundamentally challenging, and yet making it work is how we all create real value from data together.
Beyond helping data scientists learn to communicate better, you can improve ML literacy among stakeholders. Many business folks simply haven’t learned much about data science. Bring them into the fold: that’s the first step toward proactively creating an environment where it’s OK to ask questions. Send them this article, host a lunch-and-learn, get to know them socially, and try to laugh about how tricky this stuff is.
But whatever you do, don’t ignore these issues. If you leave them to fester, they’ll keep getting worse out of sight, potentially cratering morale and bringing productivity to a standstill.