Sampling in Customer Experience research
From simple lies to difficult truths, or how to avoid statistical mistakes
The statement “statistics don’t lie” hides an important truth: in research analytics, it is easy to draw wrong conclusions even from correct data. Data collection methods, interpretation, and the comparability of results are all at risk of error. For an analyst conducting CX studies, knowledge of statistics and discipline in following the rules that distinguish real trends from appearances are a lifesaver.
The retail trade, by moving to the Internet, has created an interesting precedent. For the first time in history, marketing research has become a low-cost option available to all retailers, not just the largest brands. Using analytical tools to monitor customer flows and activity on websites has become commonplace. Knowledge about visits, shopping carts, conversions, and transactions has been supplemented with demographic data. It is now easy to ask questions in the form of surveys, which provide marketers and researchers with additional information about declared preferences, moods, and intentions.
The availability of digital research tools makes data-driven decisions easier to make. Nowadays, a growing number of people responsible for business development must analyze data independently in order to monitor key performance indicators and draw correct conclusions. This is certainly necessary, but it also carries the risk of statistical mistakes.
A small sample is a big lie?
In 2014, thirty-year-old Elizabeth Holmes, the founder of Theranos, persuaded investors that her company could run hundreds of laboratory tests on just a few drops of blood. Her charisma, rather than verified results, carried the company to a multi-billion-dollar valuation before the claims were exposed as unsupported.
Fortunately, in the work of an analyst, it is not charisma that determines success; the principles leading to credibility remain the same. What counts are the results and a correct assessment of the sample size.
Let us start by noting that statistical research does not have to cover the entire population (that is, the “set of research elements” to which we want to refer) for the results to hold for the whole group. In CX terms, researching customer experience does not require surveying every consumer who has purchased a given brand’s product or service. In statistics, a survey is called “partial” when information is collected only from units selected from the group. These units form a so-called research sample. Data on the sample allows judgments to be made about the entire population, provided the sample is “representative” of that population. How can this be achieved?
Two methods of sample selection are possible. In the first, the sample is selected by the researcher (so-called purposive, or quota, selection), and the research sample must reflect the structure of the whole community. For example, when drawing conclusions about the population of Poland, the sample cannot consist only of people from large cities with higher education. For the results to be representative of Poles as a group, the sample should mirror the distribution of key characteristics of Polish society, in which almost 40% of people live in rural areas and no more than 21% of the population have completed higher education. Knowing the percentage distributions of such variables in the population, we can reproduce those distributions in the sample. With this method, a lot of attention must be paid to whom we invite to the survey, and there is still a significant risk of error that is difficult to estimate.
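The quota check described above can be sketched in a few lines of Python. The respondent data below is hypothetical; the population targets (about 40% rural residents, no more than 21% with higher education) come from the figures quoted in the text, and the 5-percentage-point tolerance is an arbitrary choice for illustration:

```python
# Hypothetical sample of respondents: (lives_rural, has_higher_education)
sample = [
    (True, False), (False, True), (False, False), (True, False),
    (False, False), (False, True), (True, False), (False, False),
    (False, False), (True, False),
]

n = len(sample)
rural_share = sum(1 for rural, _ in sample if rural) / n
degree_share = sum(1 for _, degree in sample if degree) / n

# Population shares quoted in the article.
targets = {"rural": 0.40, "higher_education": 0.21}
observed = {"rural": rural_share, "higher_education": degree_share}

# Flag any characteristic that deviates from its population share
# by more than a chosen tolerance (here 5 percentage points).
for key, target in targets.items():
    gap = abs(observed[key] - target)
    status = "OK" if gap <= 0.05 else "adjust quota"
    print(f"{key}: sample {observed[key]:.0%} vs population {target:.0%} -> {status}")
```

In a real quota survey, the same comparison would be run against every quota variable (age, sex, region, and so on), and recruitment would continue until all gaps fall within the tolerance.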
Therefore, an easier idea for selecting participants is a random selection, in which the respondents have an equal probability of joining the sample. This method requires a much larger sample, the exact number of which depends on many factors.
These include, but are not limited to:
- the acceptable measurement error (lower acceptable error = larger sample)
- the variability of the measured characteristic in the population (greater variance = larger sample)
- the assumed confidence interval (narrower interval = larger sample)
- the population size (a larger population requires a somewhat larger sample, though the required size quickly levels off; only for small populations can the sample be noticeably reduced).
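These factors come together in the standard sample-size formula for estimating a proportion. The sketch below, using only the Python standard library, applies the normal-approximation formula n₀ = z² · p(1−p) / e² with the finite population correction; the example figures (95% confidence, ±3% error, a customer base of 5,000) are illustrative assumptions:

```python
from math import ceil
from statistics import NormalDist

def sample_size(margin_of_error, confidence, population=None, p=0.5):
    """Required simple-random-sample size for estimating a proportion.

    Uses the normal-approximation formula n0 = z^2 * p(1-p) / e^2,
    with the finite population correction when a population size is given.
    p = 0.5 is the worst case (largest variance), a common default.
    """
    z = NormalDist().inv_cdf(1 - (1 - confidence) / 2)
    n0 = z ** 2 * p * (1 - p) / margin_of_error ** 2
    if population is not None:
        n0 = n0 / (1 + (n0 - 1) / population)  # finite population correction
    return ceil(n0)

# 95% confidence, ±3% error, effectively unlimited population:
print(sample_size(0.03, 0.95))  # ≈ 1068 respondents
# The same precision for a population of 5,000 customers needs fewer responses:
print(sample_size(0.03, 0.95, population=5000))
```

Note how weakly the result depends on population size: the correction only helps noticeably when the population is small, which matches the last bullet above.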
When conducting CX studies in the online channel, i.e. e-commerce, corporate websites, or mobile applications, one should take into account data from analytical systems that indicate the volume of traffic divided into unique visits and subsequent sessions. When looking for results representative of the whole group of customers, it is usually necessary to obtain a large number of opinions from users visiting the company’s website or application.
Many online marketing professionals make this mistake. Internal website research often aims to improve conversion via A/B experiments. A layman may think that comparing the results of 100-person groups is enough to assess the impact of different creatives on conversion, but for the results to be statistically significant, each group usually needs to number in the thousands. How many exactly? For A/B tests, it is worth using a calculator that estimates the necessary sample size: https://www.optimizely.com/sample-size-calculator/
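The arithmetic behind such calculators can be sketched directly. The function below is a standard two-proportion power calculation (normal approximation, two-sided test); the baseline conversion of 5% and the hoped-for lift to 6% are hypothetical figures chosen for illustration:

```python
from math import ceil, sqrt
from statistics import NormalDist

def ab_test_sample_size(p_baseline, p_expected, alpha=0.05, power=0.80):
    """Approximate sample size PER VARIANT for detecting the difference
    between two conversion rates (two-sided two-proportion z-test)."""
    z_alpha = NormalDist().inv_cdf(1 - alpha / 2)   # significance threshold
    z_beta = NormalDist().inv_cdf(power)            # desired power
    p_bar = (p_baseline + p_expected) / 2
    numerator = (z_alpha * sqrt(2 * p_bar * (1 - p_bar))
                 + z_beta * sqrt(p_baseline * (1 - p_baseline)
                                 + p_expected * (1 - p_expected))) ** 2
    return ceil(numerator / (p_expected - p_baseline) ** 2)

# Detecting a lift from 5% to 6% conversion takes thousands of visitors per variant:
print(ab_test_sample_size(0.05, 0.06))
```

With these assumptions the answer is on the order of eight thousand visitors per variant, which is why comparing two 100-person groups tells you almost nothing.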
The same is true in the area of customer experience. When conducting this type of research as part of YourCX activity, we observe that companies are in a hurry to analyze data and draw conclusions from just a few dozen questionnaires collected over a few days. Meanwhile, low scores in small research samples may simply be a coincidence. Conclusions based on small samples can lead to large errors.
Small changes need a lot of time
Among many fans of conversion optimization, there is a belief that small changes can and should be made often. Let us assume a hypothetical situation. Bearing in mind that a large research sample is needed, we made some changes to the website and collected responses from 5,000 respondents over the following month. Opinions about the new website are slightly better than in the previous survey. Was the experiment successful? Not necessarily.
Another trap in data analysis is relying only on sample size to validate the results, instead of on proper sample selection. When introducing pro-customer changes to the website, we cannot track satisfaction ratings only month to month. It is not unusual for service ratings not to increase significantly in the month after changes are introduced. Change is an abstract concept; we perceive it only when we know the state “before” and compare it to the state “after”. Here again, internet analytics comes in handy, allowing us to track so-called cohorts, i.e. groups of users together with their history of visits.
Changes made to the service may be evaluated by new people who have never seen its previous version and never experienced the problem. Such people tend to give fairly high scores, because they have no negative experience to compare against. Thanks to cohort analysis with YourCX, it is possible to track returning user groups over time. This is the only way to verify that the changes made at a particular point in a customer’s journey affect not only new customers but also those who previously complained about the problem. And since waiting for enough users from this cohort to return, in numbers that make the sample representative, can take several months, it is good to conduct regular surveys and analyze the results at longer intervals.
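The cohort split described above can be illustrated with a minimal sketch. The visit log, dates, and scores below are hypothetical; the idea is simply to separate post-change ratings from users who knew the old version from ratings given by entirely new users:

```python
from datetime import date
from statistics import mean

# Hypothetical visit log: (user_id, visit_date, satisfaction score 1-10).
visits = [
    ("u1", date(2023, 3, 10), 4), ("u1", date(2023, 5, 2), 7),
    ("u2", date(2023, 4, 20), 5), ("u2", date(2023, 5, 15), 8),
    ("u3", date(2023, 5, 3), 9),   # new user, never saw the old version
    ("u4", date(2023, 5, 9), 8),   # new user
]
change_date = date(2023, 5, 1)     # the website change went live here

# Users who visited before the change form the "returning" cohort.
seen_before = {uid for uid, d, _ in visits if d < change_date}

returning_scores = [s for uid, d, s in visits
                    if d >= change_date and uid in seen_before]
new_scores = [s for uid, d, s in visits
              if d >= change_date and uid not in seen_before]

print("returning cohort avg:", mean(returning_scores))  # knew the old version
print("new-user avg:", mean(new_scores))                # no basis for comparison
```

Only the returning cohort's scores say anything about whether the change actually fixed the earlier problem; the new users' higher average is exactly the bias the text warns about.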
Apples and pears
Amazon, the largest online store in the world, is as a group of companies currently worth almost a trillion dollars. However, 19 years ago it was only one of many online stores created on the wave of dot-coms, i.e. companies offering products in the online ordering model. America went crazy about them, believing that everything that develops on the Internet will succeed. Wall Street took advantage of this craze, dragging hundreds of dot-coms onto the stock market. The public offering of the Amazon online bookstore was priced at $18 per share, and three years later, in January 2000, the same shares were worth $89. When the dot-com bubble burst, the shares of all e-commerce companies and portals went down without exception; Amazon’s price fell to $6 in September 2001. One Internet company after another closed its doors, and many predicted the collapse of Amazon as well. When asked about this period, Jeff Bezos says he wasn’t worried about the share price, because he tracked a number of other internal indicators that determined the condition of his business.
Like a company’s value, any other metric can be measured using different methods. Therefore, comparing two nominally identical indicators may not make any sense. In CX studies, the NPS methodology is commonly used and is shared publicly in many industry benchmarks. This often creates a desire to compete for a higher score, but tracking the competition without the right context can be misleading. Although the Net Promoter Score methodology strives for objective knowledge about the strength of recommendations based on the quality of experience, in practice it is difficult to compare results without knowing how they were measured. The same satisfaction score measured after the purchase versus after the receipt of the shipment may describe two completely different experiences.
Not everyone understands numbers in the same way; it is customers’ emotions that matter most. There are surveys with an NPS rating of “9” whose open-ended answers reveal unpleasant experiences. The numbers are not always accurate, and a score is a simplification of reality.
As on the stock market, there is no point in comparing day-to-day performance in a CX study, especially as it increases the risk of relying on a small sample. Every investor knows that trends matter more than values, and trends are what to look for. With an NPS score the same as last year’s, we may still have pulled ahead of the competition. In the software industry, for example, the general trend is that the industry’s average Net Promoter Score is falling as customer requirements rise. If a software vendor’s annual NPS stays at the same level, its customer experience has nevertheless improved relative to the market. Knowing the trends, you can compare results more effectively.
How to find the truth?
Thanks to the YourCX research platform, it is possible to create an unlimited number of surveys tailored to the context of a visit, as well as to accurately track customer behavior. This gives a special opportunity to create “smaller”, contextual surveys asking only about a specific point in a customer’s journey.
On the other hand, it is possible to understand the context of a negative experience by observing the client’s activities during a visit to the e-commerce platform. Web surveys on individual areas of the online experience, e.g. on return and complaint pages, are combined with user history data in YourCX. This makes it possible to analyze relevant situations not only for a specific point of contact (viewing information on returns or complaints) but also for customers who have returned to the store after a purchase.
For example, suppose we have a hypothesis that our negative ratings are due to a return and complaint policy that has not been properly adjusted to expectations. Let us remember how important the research sample size is: a sample that is too small may falsify the results. Even if 10,000 people assessed the overall experience with the website in a given month, that may not be enough to assess this single aspect. What matters is not how many people we surveyed, but how many actually experienced the aspect being rated. Sample selection in this particular case may be crucial!
Why? Imagine that 30% of the people surveyed in the general CX study gave a negative opinion, that only 20% of those people made a purchase, and that only 5% of those returned the goods. That leaves a research sample of only 30 people! For this reason, we need more time to gather a satisfactory sample.
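The shrinking funnel from the example above is worth writing out, because the drop is easy to underestimate. The figures are the ones used in the text (10,000 respondents, 30% negative, 20% purchasers, 5% returners):

```python
# The funnel from the example: even 10,000 responses shrink fast
# once we narrow down to the specific experience being studied.
respondents = 10_000
negative = respondents * 0.30      # gave a negative opinion
purchasers = negative * 0.20       # of those, actually made a purchase
returners = purchasers * 0.05      # of those, returned the goods

print(int(returners))  # only 30 people actually rated the return experience
```

At 30 relevant responses per month, gathering a few hundred would take the better part of a year, which is why the text recommends patience before drawing conclusions about a single touchpoint.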
We have found the real reasons for a low NPS, so now it’s time for change! We have introduced a new return policy and a dedicated phone number, so that frustrated customers don’t have to go through the general hotline or other complicated contact processes. In a new survey of the website, the NPS score increased, but how can we be sure it is due to this particular change?
When comparing the NPS results for the new customer service standard, we cannot just look at the new month’s result, because some of the assessments may come from people who remember their previous experience. Again, cohort and context analysis is needed to separate recurring and new customer groups. A large sample will allow you to filter the results in the YourCX platform, e.g. to exclude people who filed a complaint or requested a refund under the previous procedure. The most interesting cohort consists of the customers who returned goods under the old procedure and have now evaluated the same experience after the changes. By collecting a sample of, say, 500 people per cohort, we can test whether the observed improvement is statistically significant; if the result clears the 95% confidence level, we can report positive changes without hesitation. In this way, we not only measure but also realistically improve clients’ experiences.
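One common way to run that 95% check is a two-proportion z-test on, say, the share of promoters before and after the change. The sketch below uses only the Python standard library; the counts (150 of 500 promoters before, 190 of 500 after) are hypothetical figures for illustration, not real survey data:

```python
from math import sqrt
from statistics import NormalDist

def two_proportion_z_test(successes_a, n_a, successes_b, n_b):
    """Two-sided p-value for the difference between two proportions
    (pooled two-proportion z-test, normal approximation)."""
    p_a, p_b = successes_a / n_a, successes_b / n_b
    p_pool = (successes_a + successes_b) / (n_a + n_b)
    se = sqrt(p_pool * (1 - p_pool) * (1 / n_a + 1 / n_b))
    z = (p_b - p_a) / se
    return 2 * (1 - NormalDist().cdf(abs(z)))

# Hypothetical cohort: share of promoters before vs after the new return policy.
p_value = two_proportion_z_test(150, 500, 190, 500)   # 30% -> 38% promoters
print(f"p-value: {p_value:.4f}")
if p_value < 0.05:
    print("The improvement is statistically significant at the 95% level.")
```

With a p-value below 0.05, the chance that an improvement this large appeared by coincidence is under 5%, which is the kind of certainty the text is asking for before reporting success.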
Don’t wait, let your customers equip your company with the knowledge necessary for its development. Start customer experience research with YourCX, fill in the form on the website and contact our representative.