Selection Bias
Selection bias is one of the most common and dangerous forms of bias because it can occur subtly during the data collection or participant selection process. Whether you're running an academic study, designing a product survey, or analyzing user behavior on an ecommerce site, selection bias can skew results and lead to poor decision-making.
In digital analytics and Conversion Rate Optimization (CRO), failing to account for selection bias can result in flawed A/B tests, wasted resources, and missed growth opportunities. That's why understanding how and when it arises is essential for researchers, marketers, and analysts alike.
What is Selection Bias?
Selection bias occurs when the group of individuals or data points chosen for a study or analysis is not representative of the larger population that you're trying to understand. This discrepancy introduces a systematic error, distorting results and limiting the ability to generalize findings.
It often happens during the sampling process when certain individuals have a higher or lower chance of being included. For instance, if you're testing a new website layout but only show it to repeat customers, your conclusions might not apply to new visitors.
Selection bias can creep in due to poor study design, self-selected participants, exclusion criteria, or dropouts (attrition). In all cases, the common thread is a lack of randomness in who is included in the dataset.
Impact of Selection Bias
- Invalid Conclusions: Results do not reflect reality due to a skewed sample.
- Overgeneralization: Findings may be applied to groups that weren’t accurately represented.
- Missed Opportunities: In CRO or product analytics, biased data can lead teams to optimize for the wrong audience.
- Wasted Resources: Campaigns or features based on faulty insights often fail to produce the expected results.
Types of Selection Bias
Understanding selection bias variations is key to recognizing and preventing it in your own work. Below are the most common types, starting with two that frequently appear in both research and real-world business contexts.

1. Sampling Bias
Sampling bias occurs when certain groups or individuals within a target population are systematically more or less likely to be included in the sample than others. This leads to a sample that doesn’t accurately reflect the overall population, and the results derived from it can be significantly skewed.
This often arises due to flaws in the way participants or data points are selected. It could be due to convenience (e.g. surveying only people in one geographic area), technical limitations (e.g. excluding users who access a website via mobile), or deliberate choices (e.g. targeting only high-spending customers in a usability test).
Example:
An ecommerce company wants to evaluate whether their new checkout design reduces cart abandonment. They test it only on logged-in customers. Since logged-in users are typically more engaged and more likely to purchase, the results show a much lower abandonment rate than would occur in the general population of site visitors.
Impact:
- Poor generalizability of insights
- Inflated or deflated performance metrics
- Misguided strategy or product decisions
2. Attrition Bias
Attrition bias happens when participants drop out of a study or test over time, and the final results are based only on those who remain. These remaining subjects may differ significantly from those who left, leading to misleading conclusions.
Survivorship bias is a closely related concept, often seen in retrospective analyses where only “successful” examples are considered, and failed or dropped-out cases are ignored.
Attrition bias is common in long-term studies or multi-step funnels. For example, in product testing or onboarding flows, users who don’t complete the process are often excluded from final analysis, yet their behavior is highly relevant.
Example:
A SaaS company tests a new onboarding flow and reports a high activation rate. However, they only analyze users who completed the onboarding sequence, excluding those who dropped off early. This leads them to wrongly conclude the new flow is effective, when in fact many users abandon it midway.
Impact:
- Overestimation of success rates
- Ignoring critical friction points
- Misidentification of what drives user behavior
3. Self-Selection Bias
Self-selection bias occurs when individuals decide for themselves whether to participate in a study or dataset, rather than being randomly selected. This can lead to a skewed sample because those who choose to participate may share certain characteristics or motivations that differ from those who don’t.
This type of bias often arises in surveys, product feedback forms, or opt-in studies where participation is voluntary. People who feel strongly about a topic are more likely to respond, while those who are neutral or indifferent may not engage.
Example:
A product team collects user feedback through a voluntary survey embedded in their app. Only users who are highly satisfied (loyal fans) or very frustrated (angry users) take the time to fill it out. The resulting data overrepresents extreme opinions and misses the silent majority.
Impact:
- Distorted insights into user sentiment or satisfaction
- Product or UX decisions based on biased opinions
- Misalignment between internal assumptions and actual user needs
4. Exclusion Bias
Exclusion bias happens when certain individuals or data are intentionally or unintentionally left out of the sample. This can occur during the design of the study or in the analysis stage, when certain groups are removed due to technical issues, data filters, or inclusion/exclusion criteria.
In analytics or A/B testing, this might occur if visitors using outdated browsers are excluded, or if mobile users are not tracked properly due to tracking script limitations. In research studies, it might happen when specific demographic groups are left out of the recruitment process.
Example:
An ecommerce team analyzes purchasing behavior but filters out all sessions with less than 30 seconds on site, assuming they’re not meaningful. However, some of these excluded sessions included quick repeat purchases by returning customers. The team misses key revenue-driving behaviors.
Impact:
- Missing out on key insights from specific user groups
- Biased metrics that fail to reflect real-world usage
- Flawed conclusions about product performance or campaign success
5. Observer Bias
Observer bias occurs when a researcher or analyst’s expectations, beliefs, or prior knowledge influence how they collect, interpret, or report data. In the context of selection bias, this often means certain outcomes or participants are emphasized or selected based on subjective judgment rather than objective criteria.
In qualitative research or usability testing, moderators might give more attention or weight to participants who confirm their assumptions. In analytics, analysts might favor data segments that support a hypothesis and disregard contradictory evidence.
Example:
A CRO team runs a user test to evaluate a new product page design. The facilitator, believing the new layout is superior, unintentionally leads users during the interview (“Wouldn’t you say this version is easier to use?”). They only flag usability issues in the old version and ignore hesitations in the new one.
Impact:
- Skewed insights and confirmation bias
- Wrong changes implemented based on subjective evidence
- Decreased reliability of qualitative findings
6. Non-Response Bias
Non-response bias happens when the people who don’t respond to a survey, feedback form, or study invitation differ significantly from those who do, and their absence distorts the results.
This is common in email or website surveys where only a fraction of users respond. If non-respondents are systematically different (e.g. less engaged, more price-sensitive), then their absence leads to biased conclusions about the overall user base.
Example:
An online store sends out a post-purchase survey to recent customers. Only 10% respond, and most of them are frequent shoppers. The feedback suggests high satisfaction and ease of checkout, but ignores the less engaged or frustrated customers who chose not to reply.
Impact:
- Overestimation of satisfaction, loyalty, or product-market fit
- Failure to detect friction points or unmet needs
- Misleading metrics for business decisions
How to Identify Selection Bias
Spotting selection bias early in your research or analytics process is essential to avoid misleading conclusions. Whether you're conducting academic research, user testing, or a conversion rate audit, asking the right questions about your sampling method can help detect issues before they affect your results.
Here are key questions and considerations:
- Was the selection process random or non-random? If participants or users were selected based on convenience, availability, or self-selection (e.g., voluntary sign-ups), your sample may not reflect the broader population.
- Does the sample accurately represent your target population? Review the composition of your sample: age, location, device usage, buying behavior, or any other demographic or psychographic variables relevant to your study. A lack of diversity or coverage may signal underrepresentation.
- Were any groups unintentionally excluded? Sometimes biases creep in because of how surveys are distributed, when they’re conducted, or who has access. For example, surveying only desktop users ignores mobile behavior, which could be critical in ecommerce.
- Are dropout rates high in specific segments? If participants from certain groups (e.g., low-intent users or first-time visitors) are more likely to abandon the study, this could point to attrition bias, a subset of selection bias.
- Is your sample size large and varied enough? A small or overly homogenous sample limits generalizability and makes it easier for bias to skew your insights. The more varied and representative your sample, the more reliable your conclusions will be.
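One quick way to sanity-check these questions is to compare your sample's composition against the known make-up of your full audience. The sketch below (Python with SciPy, using made-up device counts and shares) runs a chi-square goodness-of-fit test on the device mix; a very small p-value suggests the sample deviates meaningfully from the population.

```python
# A minimal sketch (Python + SciPy) of a representativeness check: compare the
# device mix of the sample you analyzed against the known site-wide mix.
# All counts and shares below are made-up illustration values.
from scipy.stats import chisquare

sample_counts = {"mobile": 1200, "desktop": 2600, "tablet": 200}      # your sample
population_share = {"mobile": 0.70, "desktop": 0.25, "tablet": 0.05}  # full visitor base

total = sum(sample_counts.values())
observed = [sample_counts[d] for d in population_share]
expected = [share * total for share in population_share.values()]

stat, p_value = chisquare(f_obs=observed, f_exp=expected)
print(f"chi2 = {stat:.1f}, p = {p_value:.4g}")
if p_value < 0.05:
    print("Sample composition differs markedly from the population -> possible sampling bias")
```

The same check works for any categorical attribute you can pull from your analytics tool, such as new vs. returning visitors or traffic source.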
Examples of Selection Bias
Selection bias can creep into many different contexts: academic research, digital marketing, A/B testing, UX research, and customer analytics. Below are several real-world examples that illustrate how this type of bias can distort data and lead to flawed conclusions.
1. Ecommerce A/B Testing Bias
Imagine you launch an A/B test for a new checkout page, but you exclude mobile users because the design isn’t mobile-ready yet. After two weeks, the test shows a 20% increase in conversions.
The problem? Your sample only includes desktop users, while mobile accounts for 70% of your total traffic. The test results may look impressive but won’t translate to real-world performance when the full user base is included.
This is a classic example of sampling bias in digital experimentation.
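To see why the headline number is misleading, here is a rough back-of-the-envelope calculation with illustrative figures only, assuming the baseline conversion rates shown below and no change at all for mobile users:

```python
# Back-of-the-envelope check with illustrative numbers: a 20% lift measured on
# desktop-only traffic, blended back into a site where mobile is 70% of visits
# and (by assumption) completely unaffected by the new checkout.
desktop_share, mobile_share = 0.30, 0.70
desktop_cr, mobile_cr = 0.040, 0.015      # assumed baseline conversion rates
desktop_lift = 0.20                       # lift observed in the desktop-only test

baseline = desktop_share * desktop_cr + mobile_share * mobile_cr
blended = desktop_share * desktop_cr * (1 + desktop_lift) + mobile_share * mobile_cr

print(f"Site-wide lift: {blended / baseline - 1:.1%}")  # ~10.7%, not 20%
```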
2. CRO Form Optimization Study
Let’s say you run a survey asking visitors why they didn’t complete a purchase. Only users who stayed on the page longer than 60 seconds get the survey popup.
The result? You’re only hearing from more engaged visitors. Those who bounced in under 10 seconds are excluded.
This introduces non-response bias and may lead you to draw the wrong conclusions about what’s broken in your funnel.
3. User Interviews for Product Research
A product team recruits beta testers from their existing user base to provide feedback on a new feature. Most of the respondents are loyal customers who already love the brand.
The result? You get overwhelmingly positive feedback, but it’s not representative of the broader audience, especially new or skeptical users. That’s self-selection bias in action.
4. Attrition in Longitudinal Studies
An ecommerce subscription service wants to measure customer satisfaction over time. They survey the same group of customers every three months.
But only the happiest and most loyal subscribers continue to participate, while dissatisfied customers cancel their subscription and leave the panel.
The result? Your satisfaction scores go up, not because you’re doing better, but because unhappy customers dropped out. This is survivorship (attrition) bias.
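A tiny simulation makes the mechanism easy to see. In the sketch below (Python, entirely made-up numbers), average panel satisfaction climbs quarter after quarter purely because less satisfied subscribers are more likely to leave the panel before the next survey:

```python
# A tiny illustrative simulation (made-up numbers): panel satisfaction "improves"
# each quarter only because less satisfied subscribers churn out of the panel.
import random

random.seed(0)
panel = [random.uniform(1, 10) for _ in range(1000)]  # satisfaction scores, 1-10

for quarter in range(1, 5):
    avg = sum(panel) / len(panel)
    print(f"Q{quarter}: n = {len(panel):4d}, average satisfaction = {avg:.2f}")
    # assume retention rises with satisfaction: unhappy customers cancel and drop out
    panel = [s for s in panel if random.random() < 0.2 + 0.08 * s]
```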
How to Avoid Selection Bias
Avoiding selection bias is essential if you want to produce reliable results, whether you're running A/B tests, conducting user research, or analyzing marketing data. While it's difficult to eliminate bias entirely, there are several strategies you can implement to minimize its impact.
1. Define a Representative Sample
Selection bias often begins at the sampling stage. If the people (or users) you're studying don’t accurately reflect the broader population, your conclusions won't either.
In academic research, this could mean choosing participants from a single university and trying to apply the findings to an entire country. In digital analytics or CRO, it might mean evaluating the success of a landing page using only returning users, while ignoring new visitors.
A non-representative sample leads to conclusions that are only valid for that subgroup, yet are often mistakenly generalized. In ecommerce, this might mean optimizing for loyal customers but losing potential conversions from first-time buyers who behave differently.
How to apply it:
- Segment your audience before testing (new vs. returning, mobile vs. desktop, etc.)
- Compare sample demographics with your full user base
- Use analytics tools (e.g., Google Analytics, Mixpanel) to assess who is included
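If you draw the sample yourself, proportional (stratified) sampling is one way to keep segment shares intact. The sketch below uses pandas with a hypothetical segment column and toy data; substitute your own segmentation (new vs. returning, device, traffic source, etc.).

```python
# A minimal sketch of proportional (stratified) sampling with pandas. The
# `segment` column and the toy data are hypothetical placeholders.
import pandas as pd

def stratified_sample(visitors: pd.DataFrame, n: int, by: str = "segment") -> pd.DataFrame:
    """Draw roughly n rows so each segment keeps its share of the full base."""
    shares = visitors[by].value_counts(normalize=True)
    parts = []
    for segment, share in shares.items():
        group = visitors[visitors[by] == segment]
        parts.append(group.sample(n=min(int(round(share * n)), len(group)), random_state=42))
    return pd.concat(parts)

# Example usage with dummy data
visitors = pd.DataFrame({
    "segment": ["new-mobile"] * 700 + ["returning-desktop"] * 250 + ["new-tablet"] * 50,
})
sample = stratified_sample(visitors, n=200)
print(sample["segment"].value_counts(normalize=True))  # mirrors the full base
```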
2. Use Randomization
Randomization helps prevent bias by ensuring that each participant or visitor has an equal chance of being selected for any given group or condition. This is a gold standard in A/B testing and scientific research.
When you allow users to self-select into experiences or groups, whether knowingly or due to how your traffic is segmented, certain traits can cluster (e.g., high-intent users see Version A, low-intent users see Version B). This invalidates comparisons.
How to apply it:
- Use CRO platforms that support true random allocation
- Avoid “rolling” tests where different users are exposed to different versions at different times
- Don’t filter traffic before randomizing unless you’re intentionally targeting a segment
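If you build the split yourself rather than relying on a testing platform, a common approach is to hash a stable visitor identifier: every visitor gets the same chance of each variant, and the assignment stays consistent across sessions. The snippet below is a minimal Python sketch; the experiment name and id format are hypothetical.

```python
# A minimal sketch of sticky, unbiased assignment by hashing a stable visitor id.
import hashlib

def assign_variant(visitor_id: str, experiment: str = "checkout-redesign") -> str:
    """Deterministically map a visitor to 'A' or 'B' with a 50/50 split."""
    digest = hashlib.sha256(f"{experiment}:{visitor_id}".encode()).hexdigest()
    return "A" if int(digest, 16) % 100 < 50 else "B"

print(assign_variant("visitor-12345"))  # same visitor, same variant, every time
```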
3. Track Drop-Offs and Attrition
Attrition bias (a form of selection bias) happens when certain participants or users drop out of a study more than others, leaving behind a skewed sample.
In digital analytics, this occurs when you only analyze the behavior of people who complete a process (e.g., make a purchase or finish a survey), ignoring those who drop off along the way.
People who complete a purchase or survey often have different motivations or experiences than those who don’t. If you only study “completers,” you miss insights about pain points and barriers that prevent others from converting.
How to apply it:
- Set up funnel analysis to track where and when users abandon the process
- Compare the characteristics of completers vs. non-completers (e.g., device, source, behavior)
- Run exit surveys or polls at key drop-off points (e.g., cart abandonment, sign-up form)
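As a minimal sketch of the completer vs. non-completer comparison, assuming an event log with hypothetical user_id, step, and device columns (pandas, toy data):

```python
# A minimal sketch of a completer vs. non-completer comparison on a toy event log.
import pandas as pd

events = pd.DataFrame({
    "user_id": [1, 1, 1, 2, 2, 3, 4, 4, 4],
    "step":    ["cart", "checkout", "purchase", "cart", "checkout",
                "cart", "cart", "checkout", "purchase"],
    "device":  ["desktop"] * 3 + ["mobile"] * 2 + ["mobile"] + ["desktop"] * 3,
})

completed = events.loc[events["step"] == "purchase", "user_id"].unique()
users = events.groupby("user_id").agg(device=("device", "first")).reset_index()
users["completed"] = users["user_id"].isin(completed)

# Does the make-up of completers differ from those who dropped off?
print(pd.crosstab(users["device"], users["completed"], normalize="columns"))
```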
4. Avoid Self-Selection
Self-selection bias occurs when participants or users decide on their own whether to participate in a study, test, or experience. This leads to skewed results because people who opt in typically have stronger opinions, more interest, or a specific motivation compared to those who don’t.
You end up learning about the behavior and preferences of highly engaged or enthusiastic users, while overlooking the silent majority who might represent your real business opportunity. This is especially common with surveys, reviews, or feedback forms.
How to apply it:
- Use passive data collection methods (e.g. behavioral analytics, heatmaps) alongside surveys
- Trigger surveys or popups randomly across your audience, not just for those who click a button or reach a specific page
- Make participation easy and appealing for a broader user base (e.g. no long forms, mobile-friendly UI)
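A simple illustration of the random-trigger idea: instead of waiting for users to click a feedback button, invite a fixed random share of sessions. The snippet below is a minimal Python sketch with an assumed 5% invite rate.

```python
# A minimal sketch of a randomly triggered survey invitation (assumed 5% of
# sessions) rather than a user-initiated feedback button, so the invitation
# itself doesn't favor the loudest opinions.
import random

SURVEY_RATE = 0.05  # invite roughly 5% of sessions

def should_show_survey() -> bool:
    return random.random() < SURVEY_RATE

if should_show_survey():
    print("Show survey prompt")
```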
5. Validate Assumptions with Multiple Data Sources
Relying on a single data source can reinforce selection bias if that source has limitations you’re unaware of. Instead, triangulating your insights from various tools and user perspectives helps detect inconsistencies and blind spots.
You may be basing decisions on an unrepresentative segment or missing data altogether. Cross-validating can uncover contradictions that signal something’s off in your sampling or segmentation.
How to apply it:
- Combine qualitative and quantitative methods: e.g., user testing + analytics + surveys
- Compare results from different platforms (e.g., heatmaps vs. GA4 behavior reports)
- Analyze both successful and failed user journeys
Other Types of Research Bias: Response Biases
While selection bias stems from how participants are chosen or filtered, response biases arise during the data collection process, specifically in how people respond to surveys, interviews, or questionnaires. These types of bias can significantly affect the accuracy of your findings, especially when collecting customer feedback or conducting user research for CRO and product decisions.
- Response Bias
- Non-Response Bias (covered above)
- Voluntary Response Bias (closely related to self-selection bias)
To Wrap Things Up
Selection bias is one of the most common threats to accurate research and data analysis.
By understanding the different types of selection bias and how they show up in real-world scenarios, you can design better studies, make smarter decisions, and avoid costly missteps. Always question how your data is collected, who it represents, and whether key segments may have been unintentionally left out.
The more intentional and inclusive your sampling methods, the more trustworthy your insights will be.