The Dark Side of Metric Fixation


One of the most misquoted sayings in business is “if you can’t measure it, you can’t manage it”. This statement (and its variations) is often meant to say that, to improve something, we need a precise metric that captures it and that should be tracked in order to understand if our efforts to improve it are effective. 

It is interesting that this “quote” is actually the complete opposite of the original, which was: 

“It is wrong to suppose that if you can’t measure it, you can’t manage it – a costly myth.” — W. Edwards Deming (The New Economics).

This difference between the original and the commonly used version highlights why it’s dangerous to rely on a single metric to assess how well a business is performing: that one metric can be manipulated in ways unrelated to what it is supposed to measure. This is the phenomenon described by Campbell’s law.

Campbell’s law states that the more important a metric is in social decision making, the more likely it is to be manipulated. 

In other words, when a single metric is used to determine success or failure, human beings are likely to try to optimize their behavior to improve that metric — sometimes with ridiculous or dangerous consequences. People manage the metric, rather than using the metric to help manage the underlying issue of interest.

A classic example is “teaching to the test” — a common practice familiar to all those who have taken any standardized test of some meaningful consequence. For example, some critics of standardized tests like the SAT and GRE point out that they don’t measure knowledge or potential, but instead measure how well individuals can study for the test.

A similar law, Goodhart’s law, states that “When a measure becomes a target, it ceases to be a good measure” (as paraphrased by Marilyn Strathern). Both statements come from social scientists (Campbell was a social psychologist, and Goodhart was an economist) who were frustrated that single metrics were used as a substitute for a holistic, nuanced understanding of complicated behaviors.

Campbell’s Law in Everyday Life 

We can find numerous examples of the perverse incentives of Campbell’s law in our daily lives. 

Wait Time vs. Hold Time in Healthcare

Consider this example shared by a nurse working at an outpatient healthcare facility. Healthcare administration is often metric-obsessed, even to ridiculous or harmful degrees. 

One performance metric used by this particular healthcare facility was how long patients had to wait on hold on a phone call. Employees were constantly pushed to improve this metric and thus reduce patients’ wait time (which is certainly a reasonable goal).

A common cause of long hold times was that it would take time for the receptionists to track down the patient’s doctor or nurse and transfer the call. One manager figured out that, if the receptionists muted the phone call and walked down the hall to find a nurse or doctor, rather than putting patients on hold while they transferred the call, the hold time was technically zero (even though the overall wait time for patients was often longer than if they were put on hold). 

The metric improved drastically, so the manager got a huge bonus and a lot of respect within the organization. Other managers kept asking him how he had achieved such a dramatic change in the KPI. Of course, his solution didn’t improve the patient experience at all!
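The gap between the tracked KPI and the patient's real experience can be sketched in a few lines of Python. This is only an illustration: the `Call` record, its field names, and all the durations are hypothetical and do not come from the facility in the story.

```python
from dataclasses import dataclass
from statistics import mean


@dataclass
class Call:
    """One inbound call; durations are in seconds (hypothetical data)."""
    hold_seconds: float   # time the phone system records as "on hold"
    muted_seconds: float  # time the caller waits while the line is muted


def average_hold_time(calls):
    """The KPI as tracked: only time flagged as 'on hold' counts."""
    return mean(c.hold_seconds for c in calls)


def average_total_wait(calls):
    """What the patient experiences: hold time plus muted waiting."""
    return mean(c.hold_seconds + c.muted_seconds for c in calls)


# Before the workaround: callers are put on hold during the transfer.
before = [Call(120, 0), Call(90, 0), Call(150, 0)]
# After the workaround: the line is muted while staff walk down the hall.
after = [Call(0, 180), Call(0, 140), Call(0, 200)]

print("before:", average_hold_time(before), average_total_wait(before))
print("after: ", average_hold_time(after), average_total_wait(after))
```

With these made-up numbers, the tracked hold time falls to zero while the total wait goes up, which is the pattern the anecdote describes.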

Rating Racketeering

Anyone who has ever used a gig-economy service (Airbnb, Uber, Lyft, etc.) may have experienced Campbell’s law. Many of these services use five-star ratings to evaluate customer satisfaction. If drivers’ or Airbnb hosts’ ratings fall below a certain threshold, they risk losing their jobs or customers. As a result, drivers and hosts are motivated not only to provide high-quality service (the metric’s intent), but also to manipulate the metric by directly asking customers to give them a perfect score.

For example, one of our coworkers recently stayed at an Airbnb. At the end of her stay, she received this message from her host:

“It would help me out tremendously if you enjoyed the apartment and are willing to leave me five stars. Airbnb actually has the rights to remove listings that have below 4.4 star rating, so every five star review goes a really long way in helping me host more guests like you in the future.”

By explaining that the score could impact his ability to continue on Airbnb, this host was clearly attempting to pressure our colleague to give him a five-star review (or no review at all), instead of an honest rating.

Rating racketeering occurs in many cultures. For example, Chinese users on the popular shopping platform Taobao are often pressured to provide high ratings. Many Taobao shops directly ask their customers to provide positive reviews. Some of them even provide coupons to shoppers who provide good reviews — and a few will even harass customers who leave negative ones. 

Campbell’s Law in Design

While Campbell’s law impacts many industries, it’s particularly nefarious in the world of digital design.

Anonymous Streaming Service

The following story was shared by a product designer — we’ll call her Keily — who wanted to remain anonymous.  Keily was working for a leading television-streaming subscription service. Despite having a fairly modern service, this company’s culture was somewhat similar to that of an old-fashioned cable provider, and that showed up in their metrics.

This company was highly concerned with customer retention — the longer a subscriber stayed with it, the more money it made. As a result, the company’s leadership wanted to reduce subscription cancellations. 

Keily noticed something odd. The company defined “saving” an account as occurring if someone entered the cancellation flow, but then abandoned it and did not cancel their subscription. While this metric was intended to capture the ability to change consumers’ minds and make them stay with the company, it had an unintended consequence: it incentivized the company’s designers to make the cancellation process intentionally difficult.

During user testing, Keily observed that customers would become overwhelmed while trying to cancel their accounts online and would give up. In the company’s analytics metrics, this type of behavior was counted as “saving” an account. But the account wasn’t saved — the customer simply switched channels and called the customer-service department to cancel. 

As a result, the metric looked good, but the outcome of this farce was:

  • Angrier and more annoyed customers, not likely to return or recommend the company to others
  • Higher cost, because the company had to pay for the customer-service department to cancel the account instead of letting it happen online
  • Accounts that were still eventually canceled
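The difference between the reported “save” metric and the real outcome can be sketched as follows. This is a minimal illustration under stated assumptions: the `CancellationAttempt` record, its fields, and the sample data are hypothetical and are not the company’s actual analytics.

```python
from dataclasses import dataclass


@dataclass
class CancellationAttempt:
    """One visit to the online cancellation flow (hypothetical record)."""
    completed_online: bool          # user finished cancelling online
    cancelled_by_phone_later: bool  # user later cancelled via customer service


def reported_save_rate(attempts):
    """The metric as defined in the story: any abandoned flow counts as a save."""
    saved = sum(1 for a in attempts if not a.completed_online)
    return saved / len(attempts)


def actual_save_rate(attempts):
    """A stricter definition: the account must still be active afterwards."""
    saved = sum(
        1 for a in attempts
        if not a.completed_online and not a.cancelled_by_phone_later
    )
    return saved / len(attempts)


# Made-up sample: most users abandon the online flow, then cancel by phone.
attempts = [
    CancellationAttempt(completed_online=True, cancelled_by_phone_later=False),
    CancellationAttempt(completed_online=False, cancelled_by_phone_later=True),
    CancellationAttempt(completed_online=False, cancelled_by_phone_later=True),
    CancellationAttempt(completed_online=False, cancelled_by_phone_later=True),
    CancellationAttempt(completed_online=False, cancelled_by_phone_later=False),
]

print("reported save rate:", reported_save_rate(attempts))
print("actual save rate:  ", actual_save_rate(attempts))
```

The stricter definition requires joining data across channels (here, phone cancellations), which is more work to collect but much harder to inflate by making the online flow frustrating.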

Keily tried to bring this issue to the attention of the company’s leadership. She argued that an account is only saved if the company solved the user’s problem, not if it intentionally made it difficult to cancel the subscription online (an unethical, dark practice). Leadership didn’t want to listen, so Keily left her job to work for a company with higher UX maturity.

Metric Obsession Weakens UX: The Facebook Case

Facebook’s focus on increasing engagement with its products has been one of the most consequential case studies of metric obsession in recent years. 

As recently reported by whistleblower Frances Haugen, Facebook’s corporate culture heavily emphasized and incentivized user engagement on the platform. According to Haugen’s testimony, “the metrics make the decision.”  The company’s leadership prioritized the number of daily active users (DAUs) over moral and ethical priorities. 

Allegedly, Facebook’s leadership preferred using its algorithms to increase DAUs even when presented with evidence that doing so would be detrimental to users. Haugen accused them of intentionally prioritizing DAUs at the cost of exacerbating:

  • Habitual behaviors bordering on addiction
  • Eating disorders among teenagers
  • Ever-more-extreme political content
  • Misinformation   

Facebook wasn’t falsifying or manipulating the DAU metric. Still, prioritizing it at all costs corrupted the intention behind the metric, which was to capture the overall health and growth of the company. The example is therefore consistent with the underlying observation behind Campbell’s law: obsession with one metric ultimately distorts the reality that the metric is supposed to evaluate. In this case, it resulted in sacrificing the actual user experience and the long-term relationship with users.

The Facebook case is an extreme example. Because of the nature of its products and the size of its audience, when such a company is hyperfixated on narrow metrics, the consequences are dire. However, examples of Campbell’s law abound in our industry. While these don’t always have ethical consequences, as in Facebook’s case, they can still be disastrous to business goals.

How to Avoid Dangerous Metric Obsession 

So, how can UX professionals avoid the trap of tracking metrics that are easily abused? 

First, recognize that all metrics are limited in their ability to describe the world fully and accurately; every metric that you collect reflects a decision about what you consider to be important. 

A metric is a signal that reflects the goal or outcome you’re seeking, but is not the full picture. There is no one truly absolute metric that completely captures a real-world behavior or phenomenon. Just because a measure is quantitative does not mean that the data collected will be free from bias. The North Star framework (where a single metric is chosen as the overall ‘health’ indicator for a company) can be dangerous, because it amplifies the inherent limitations of that individual metric.

Second, combine metrics to obtain a fuller understanding of the phenomenon you wish to study. Satisfaction ratings can be a useful metric, but they are a lot more powerful when combined with behavior metrics such as time on task or success rate, or with analytics metrics. It’s a lot harder to game all these metrics than to game any one of them.
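One simple way to use combined metrics is to flag cases where an attitudinal metric and a behavioral metric disagree. The sketch below assumes a 1-to-5 satisfaction scale and a task success rate between 0 and 1. The `DesignMetrics` record, the thresholds, and the sample values are all hypothetical and should be tuned to your own data.

```python
from dataclasses import dataclass


@dataclass
class DesignMetrics:
    """Metrics for one design variant (hypothetical values)."""
    name: str
    avg_satisfaction: float  # mean rating on a 1-5 scale
    success_rate: float      # share of tasks completed, 0.0-1.0


def diverging_metrics(m, min_satisfaction=4.0, min_success=0.7):
    """Return True when ratings look good but behavior does not back them up.

    The default thresholds are illustrative assumptions, not recommendations.
    """
    return m.avg_satisfaction >= min_satisfaction and m.success_rate < min_success


variants = [
    DesignMetrics("A", avg_satisfaction=4.6, success_rate=0.85),
    DesignMetrics("B", avg_satisfaction=4.7, success_rate=0.40),
]

# Variants whose high ratings are not supported by task success.
flagged = [v.name for v in variants if diverging_metrics(v)]
print(flagged)
```

A flagged variant is not necessarily a gamed metric. It is a prompt to investigate, ideally with qualitative research.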

Third, never rely on quantitative measures alone. Triangulation with qualitative data (such as the insights that come from user interviews, usability tests, field studies, and diary studies) helps you understand the nuanced consequences of design choices that might otherwise be missed entirely if you rely only on passively collected analytics data.

Finally, treat data as a tool to assist in decision making, but do not allow the metrics alone to determine the decision. Don’t lose the forest for the trees. Keep an eye on what really counts: building positive long-term relationships that add value to users’ lives rather than damage them.

Conclusion

Quantitative metrics are useful and necessary for successful businesses and designs. But when those metrics are optimized at the cost of all else, we fail our business and our users. 

These case studies all illustrate the negative consequences of prioritizing short-term metric growth over positive long-term relationships with our customers.

Learn about how to select, apply, and interpret metrics appropriately and ethically in our Analytics & UX, and Measuring UX & ROI seminars. 

References

W. Edwards Deming. 2019. The New Economics for Industry, Government, Education (3rd ed.). MIT Press, Cambridge, MA.

Charles Goodhart. 1975. Problems of Monetary Management: The UK Experience. In Papers in Monetary Economics. Sydney: Reserve Bank of Australia.

Marilyn Strathern. 1997. ‘Improving ratings’: audit in the British University system. European Review, 5, 3 (July 1997), 305 – 321. DOI:10.1017/s1062798700002660.


