Verboten! Cross Cultural NPS Comparisons


I was prepared for the worst.

It was a short domestic flight in France so I thought I would gamble on EasyJet.

I had some experience with “ultra low cost” airlines in the States, so I assumed EasyJet would be the same. I braced for the worst: an old, rickety airplane; an upcharge for every small amenity; and sparse, sub-par customer service.

I was pleasantly surprised. While they did charge for baggage and amenities, they were very upfront about everything from booking forward. The plane was older, but clean and freshly painted. The customer service was helpful, knowledgeable, friendly, and polite. We even got “speedy” boarding because I was traveling with my family. All in all, our EasyJet experience was much better than the majority of domestic carriers I fly regularly in the United States.

Does EasyJet do well in NPS? According to npsbenchmarks.com, EasyJet ranks at -16. Not good. This is far below mainline carriers such as British Airways, Lufthansa, and other carriers in Europe. But when your point of comparison is US domestic providers, they kick it out of the football arena (or at least through that net thing at the end of the field). Did they have a good day? Maybe. But based on my experience traveling both in the US and Europe, air travel in Europe is a dream compared to the United States.

According to the same site, all US providers are on the positive side of NPS, with Southwest at 62, JetBlue at 59, Delta at 41, United at 10, and American Airlines at 3. Would it be fair to conclude that EasyJet has worse service than all the mainstream US providers? Based on NPS alone, you might be tempted to say yes.

I would argue it is an unfair and unwise comparison. In fact, it illustrates the first of three fundamental reasons why cross-cultural comparisons of many attitudinal metrics (including NPS) are fraught with problems.

Reason 1: Your Experience Sets the Baseline

First, as illustrated in my EasyJet example, your past experience strongly influences the baseline for your future comparisons. If you have always had nearly flawless experiences with shipping in the United States and Europe and then move to a developing country, of course you are going to be disappointed. Reverse the situation and you will be ecstatic.

This baseline effect is one of the reasons why older car buyers are generally more pleased with the experience of purchasing a vehicle than younger buyers. I remember the youth-oriented Scion brand getting trounced in J.D. Power rankings when it launched. Its scores were much worse than even those of the Toyota parent brand. What was really strange was that Toyota had invested heavily in training and new customer-friendly processes for a brand that was housed within Toyota dealerships. Was the Scion customer experience bad? Nope. Scion just focused on younger buyers who had different (higher) expectations.

This “experience” gap has been usefully exploited by startups such as Lemonade, Uber, Airbnb, and others to disrupt whole industries. Your baseline experience will influence your metrics. Are they real differences? They are to your customers. Can you compare them? You should do so with great care.

Reason 2: Lack of Language Equivalency

Another major issue is the concept of language equivalency. The word “good” or the phrase “would recommend” in English does not always translate into hedonically equivalent terms in other languages. Take, for example, the word “malo” in Spanish. I am not an expert in Spanish, but my understanding is that it is probably not as bad as “terrible” but not as good as “poor.” I am sure there are better examples, but you get the idea.

Even the most careful screening and testing of Likert-based anchor points may not work out, as there may not be exact hedonic equivalents in other languages. There are also other subtle language differences that may introduce bias.

The NPS scale canon holds that it should run from 0 (definitely would not recommend) to 10 (definitely would recommend), from low on the left to high on the right. This works well for Western participants, but what about other cultures? Many Middle Eastern languages read right to left, and some Asian languages are written top to bottom. Does this influence how their speakers respond to a Western, left-to-right scale? Probably. But there is still one even larger issue.

Reason 3: Cultural Response Bias

Different cultures tend to respond differently to Likert-scale questions in general. For example, some cultures tend to be more generous in their grading, while others are harsher graders. In the research I have conducted in the US (and in work from others globally), Hispanic respondents tend to be much more lenient (give higher scores), while Asian respondents are much harsher graders. Is this due to the actual service provided? Nope. These are simply cultural differences in response style. This is further exacerbated when we expand to look across different countries for some kind of global comparison. Can this be corrected statistically? Perhaps, but I have encountered no practitioner who has taken the trouble to do so (I would be happy to hear from you if you have!).
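To give a sense of what such a statistical correction could look like, here is a minimal sketch, assuming you have respondent-level ratings tagged by culture group (the data shape and group labels are hypothetical). Standardizing scores within each group puts lenient and harsh response styles on a common footing:

```python
from collections import defaultdict
from statistics import mean, stdev

def standardize_within_group(scores):
    """Convert raw 0-10 ratings to z-scores within each culture group,
    so lenient and harsh response styles land on a common scale.
    `scores` is a list of (group, rating) pairs -- a hypothetical shape."""
    by_group = defaultdict(list)
    for group, rating in scores:
        by_group[group].append(rating)
    # Per-group mean and sample standard deviation
    stats = {g: (mean(v), stdev(v)) for g, v in by_group.items()}
    return [(g, (r - stats[g][0]) / stats[g][1]) for g, r in scores]

# Hypothetical data: group "A" grades leniently, group "B" grades harshly
data = [("A", 9), ("A", 10), ("A", 8), ("B", 5), ("B", 6), ("B", 4)]
adjusted = standardize_within_group(data)
```

After standardization, the top rater in each group ends up with the same relative score, even though their raw numbers differ by several points. This only removes mean and spread differences; it does not address why the styles differ.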

Another unhappy psychometric issue is that customer experience ratings are also affected by where you live. Without fail, those who live in more densely populated areas are much harsher graders on nearly everything. This means that customers in Hong Kong will tend to score lower than those in Cheyenne, Wyoming, even if the experience was identical.

What To Do Instead

So should you just give up hope of comparing different regions of the world? While I would not recommend direct comparisons of NPS or other NPS-like measures, there are practical alternatives. After all, large multinational organizations must have a way to understand the health of their customer experience globally and where to invest. The good news is there are many other ways to judge where to allocate your time and effort than simple (and misleading) direct comparisons between geographies.

Idea 1: Link Attitudes to Outcomes

A good way of doing this is by conducting linkage analysis within each geography. In linkage analysis you connect the exogenous attitudinal variables (perceptions of price, service, product, etc.) to business outcome variables. Since this is often done at an aggregate level, it is useful to include mediator variables such as NPS in the analysis. In this way you know what “score” is good by geography by connecting it to actual business outcomes. What is important in Turkey may not be important in Brazil. Knowing what drives outcomes is much more important than a simple index for comparison. While statistically sound, some front-line operators might not trust the perceived voodoo of the statistical analysis that underlies this approach. If this is an issue, simpler approaches can also be applied.
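As a toy illustration of the linkage idea (not the full mediation model), the sketch below correlates each attitudinal driver with a business outcome within a single geography; the field names and data are hypothetical:

```python
from statistics import mean

def pearson(xs, ys):
    """Pearson correlation between two equal-length numeric lists."""
    mx, my = mean(xs), mean(ys)
    num = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    den = (sum((x - mx) ** 2 for x in xs) *
           sum((y - my) ** 2 for y in ys)) ** 0.5
    return num / den

def driver_linkage(rows, drivers, outcome):
    """Within one geography, correlate each attitudinal driver with a
    business outcome. `rows` is a list of dicts -- a hypothetical shape."""
    return {d: pearson([r[d] for r in rows], [r[outcome] for r in rows])
            for d in drivers}

# Hypothetical survey rows for one geography
rows = [
    {"price": 1, "service": 5, "retention": 1},
    {"price": 2, "service": 3, "retention": 2},
    {"price": 3, "service": 4, "retention": 3},
]
linkage = driver_linkage(rows, drivers=["price", "service"], outcome="retention")
```

Run per geography, this tells you which perceptions actually move the outcome there; a real analysis would use regression or structural models rather than raw correlations, but the logic is the same.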

Idea 2: Look at Improvement vs. Raw Scores

One very simple approach is to look at the amount of improvement a geographic unit shows over a period of time. In this way you are not looking at the score by itself, but at the improvement in the score over time. While not perfect (ceiling effects tend to put a damper on the party over time), it is simple to apply and everyone can understand it.
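The computation is as simple as the idea. A minimal sketch, assuming you have a prior-period and current-period score per geography (the geographies and scores below are made up):

```python
def rank_by_improvement(scores):
    """Rank geographies by period-over-period change rather than raw score.
    `scores` maps geography -> (previous_score, current_score) -- hypothetical."""
    deltas = {geo: cur - prev for geo, (prev, cur) in scores.items()}
    # Largest improvement first
    return sorted(deltas.items(), key=lambda kv: kv[1], reverse=True)

# A unit with a low raw score can still lead on improvement
scores = {"Japan": (10, 18), "United States": (55, 57)}
ranked = rank_by_improvement(scores)
```

Here the harsher-grading market tops the ranking because it improved by 8 points, even though its raw score is far lower; that is exactly the reframing this approach buys you.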

Idea 3: Focus on Antecedent Behaviors

A third approach is to not focus on attitudinal measures at all, but on behaviors. How many cases were closed? How many action plans were implemented? How many complaints were registered? These are all antecedents to an attitudinal construct and are usually influential on business outcome variables (e.g., retention, share of wallet). While not perfect either, these behavioral measures are not plagued (as much) by the cultural issues.

Idea 4: Get to Language Sentiment

Probably the best approach, if cross-cultural comparisons are needed, is to start with the true voice of the customer: the verbatim. Build native text taxonomies of positive and negative feedback in the customer's own language. You can then build indices of the ratio of positive to negative feedback relative to the culture and language in which the experience is embedded. Many text analytics providers offer great solutions for this today. It will take a while for your stakeholders to get comfortable with this approach, but it has the added benefit of being harder to game by “coaching” customers to provide a specific answer. If you want to get really sophisticated, hook this in with the linkage approach (Idea #1) and you have a very robust method.
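In skeletal form, the index is just a ratio of positive to negative mentions scored against a language-specific taxonomy. The sketch below uses naive substring matching on hypothetical English terms; a real text analytics pipeline would use far richer matching, built separately for each language:

```python
def sentiment_index(verbatims, positive_terms, negative_terms):
    """Ratio of positive to negative verbatims, scored against a taxonomy
    built in the customer's own language. Matching here is illustrative."""
    pos = sum(any(t in v.lower() for t in positive_terms) for v in verbatims)
    neg = sum(any(t in v.lower() for t in negative_terms) for v in verbatims)
    return pos / neg if neg else float("inf")

# Hypothetical taxonomy and comments for an English-speaking market
positive = {"great", "friendly"}
negative = {"rude", "delay"}
comments = ["Great crew", "Long delay on the tarmac", "Friendly staff"]
idx = sentiment_index(comments, positive, negative)
```

Because the taxonomy is built per language, each market's index is anchored in its own vocabulary, and you compare ratios rather than raw Likert scores across cultures.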

Practically Speaking

Country and global managers need to make comparisons. This is a business reality. If you really need to make these comparisons, I would strongly advocate transitioning to one of the four ideas above. At the very least, you should educate your management about the perils of cross-cultural NPS comparisons. Just as what is considered spicy in Calcutta is very different from what is considered spicy in Cincinnati, so too is your customer experience and how it is measured.
