DOC 392 #493

base: main
**content/collections/experiment/en/dimensional-analysis.md**

@@ -0,0 +1,22 @@
---
id: f544b5a1-0689-4137-ab14-690903ae7902
blueprint: experiment
title: 'Dimensional Analysis'
landing: false
exclude_from_sitemap: false
updated_by: 0c3a318b-936a-4cbd-8fdf-771a90c297f0
updated_at: 1737480058
---
Sometimes, you might want to remove QA users or other internal traffic from your analyses because they're not representative of your customer base, and may skew results.

Amplitude's Dimensional Analysis capabilities enable you to exclude groups of users that you define from analysis on a per-experiment basis.

## Define your testers

In your Feature Experiment, use Targeting settings to define your test users.
Oftentimes, you may want to remove QA users or your internal traffic from analysis because those users aren't representative of your customer base and may skew results. You can do this by clicking the "All Users" dropdown and selecting "All Users without testing users". Doing so removes the users in the "Testing" section on the "Settings" tab from the analysis you're seeing. If you have selected multiple targeting segments, you may want to analyze each segment individually, because you may see a lift on iOS, for example, but not on Android. You can do this with a single click by clicking the segment name in the "All Users" dropdown. The users in the "Testing" section on the "Settings" tab are also filtered out of the analysis and diagnostics charts.

It can be helpful to investigate the impact of experiments on specific user segments. Experiments that are not statistically significant overall can often contain a small group of users for which the result is statistically significant. Likewise, for statistically significant results, the overall performance can be driven by a small segment of users.

You can further investigate the impact of the experiment on specific user segments by clicking the "All Users" button to look at particular Amplitude out-of-the-box segments, saved segments, or cohorts. If you want to add other user property filters, click the "Add Filter" button.
**content/collections/workflow/en/experiment-learnings.md**

@@ -13,59 +13,117 @@
In the *Analysis* card, you’ll be able to tell at a glance whether your experiment has yielded **statistically significant** results, as well as what those results actually are. Amplitude Experiment takes the information you gave it during the design and rollout phases and plugs it in for you automatically, so there’s no repetition of effort. It breaks the results out by variant, and provides you with a convenient, detailed tabular breakdown.

{{partial:admonition type='note'}}
This article continues directly from the [article in our Help Center on rolling out your experiment](/docs/feature-experiment/workflow/experiment-test). If you haven’t read that and followed the process it describes, do so before continuing here.
Amplitude doesn't generate p-values or confidence intervals for experiments using binary metrics (for example, unique conversions) until each variant has 100 users **and** 25 conversions. Experiments using non-binary metrics need only to reach 100 users per variant.
## Filter card

On the Filter card, set criteria that update the analysis on the page. Filter your experiment results with the following:

* Date
* Segment
* Property

### Date filter

The date filter defaults to your experiment's start and end date. Adjust the range to scope experiment results to those specific dates.

### Segment filter
The segment filter enables you to select predefined segments, or create one ad hoc. Predefined segments include:

* Experiment
    * All exposed users. Users who were exposed to a variant.
    * Testers. Users added as "testers" during experiment configuration.
    * Exclude testers. Excludes users added as "testers" during experiment configuration.
    * Exclude users who variant jumped. Excludes users who were exposed to more than one variant.
* Amplitude
    * New user. Users who triggered at least one new user event during the selected date range.
    * Mobile web. Users who triggered events on the web from a mobile device.
    * Desktop web. Users who triggered events on the web from a desktop device.

{{partial:admonition type="note" heading="Support for segments"}}
The Testers and Exclude Testers segments are available on feature experiments that use [Remote evaluation](/docs/feature-experiment/remote-evaluation).

The Exclude users who variant jumped segment is available on experiment types other than [multi-armed bandit](/docs/feature-experiment/workflow/multi-armed-bandit-experiments).
{{/partial:admonition}}

Amplitude will not generate p-values or confidence intervals for experiments using binary metrics (i.e., unique conversions) until each variant has 100 users **and** 25 conversions. Experiments using non-binary metrics need only to reach 100 users per variant.
These segments update in real time.

## View results
Click *+Create Segment* to open the Segment builder, where you can define a new segment on the fly. Segments you create in one experiment are available across all other experiments, and appear in the *All Saved Segments* category.
> Reviewer: If we have a doc that explains saved segments, we can link to that doc.

To generate and view experimental results, follow these steps:

### Property filter

1. In your experiment, the *Activity* page includes two sections to view your results: the *Summary* section and the *Analysis* card. The *Summary* section will describe your experiment's hypothesis and note whether it has or has not reached statistical significance.

Filter your experiment results based on user properties. For example, create a filter that excludes users from a specific country or geographic region, or users that have a specific account type on your platform.
An experiment is said to be **statistically significant** when we can confidently say that the results are highly unlikely to have occurred due to random chance. (More technically, it’s when we reject the null hypothesis.) That might sound pretty subjective, but it’s grounded solidly in statistics. Stat sig relies on a variant’s **p-value**, which is the probability of observing the data we see, assuming there is no difference between the variant and the control. If this probability drops below a certain threshold (statisticians refer to this threshold as the **alpha**), then we consider our experiment to have achieved statistical significance.
## Data Quality card

The *Summary* section will display a badge labeled *Significant* if stat sig was met, and a badge labeled *Not Significant* if stat sig was not met.
{{partial:admonition type="note" heading="Availability"}}
Data Quality is available to organizations with access to Experiment who have recommendations enabled.
{{/partial:admonition}}

> Reviewer: Do we have a doc to link out to for how to enable the recommendations? Recommendations are enabled by default, but people can disable them.
>
> Reviewer: Is this experiment-specific recommendations? Or the recommendations in Audiences?

The *Summary* section may include multiple badges simultaneously:
Data Quality checks the setup, instrumentation, and statistical integrity of your experiment as it runs, and alerts you to issues it finds.
* *Inconclusive*: the test was inconclusive for the primary metric.
* *Above Goal* or *Below Goal:* the primary metric's mean was either **above** or **below** its goal depending on the direction of the test (increase = above, decrease = below).
* *Above Control* or *Below Control:* the primary metric's mean was either **above** or **below** the control's mean, depending on the direction of the test (increase = above, decrease = below). These badges are only relevant to stat sig results.
When you expand a category, or click *Guide*, the Data Quality Guide opens in a side panel where you can address or dismiss issues.



## Summary card

2. At the top of the *Analysis* section is an overview of how your experiment performed, broken down by metric and variant. Below that is the experiment's **exposure definition:** how many variants were shown, what the primary metric was, and what the **exposure event** was. This is the event users will have to fire before being included in an experiment.

{{partial:admonition type='note'}}
The exposure event is **not the same thing** as the assignment event. If, for example, you’re running an experiment on your pricing page, a user might be evaluated on the home page for the experiment—but if they don’t visit the pricing page, they'll never actually be exposed to it. For that reason, this user should not be considered to be part of the experiment.
{{/partial:admonition}}

To learn more about exposure events, see [this article in the Amplitude Developer Center](/docs/feature-experiment/under-the-hood/event-tracking).
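To illustrate the distinction in code, here is a minimal sketch assuming the Amplitude Experiment JavaScript SDK (`@amplitude/experiment-js-client`). The `automaticExposureTracking` option, the `exposure()` call, and the `pricing-page-test` flag key are assumptions for illustration only; check the SDK reference for the exact API:

```typescript
import { Experiment } from '@amplitude/experiment-js-client';

// Initialize the client; the deployment key is a placeholder.
const experiment = Experiment.initialize('DEPLOYMENT_KEY', {
  automaticExposureTracking: false, // assumption: track exposures manually
});

async function onAppLoad(): Promise<void> {
  // Assignment: the user is evaluated and bucketed into a variant,
  // even if they never reach the experiment surface.
  await experiment.fetch();
  const variant = experiment.variant('pricing-page-test');
  console.log('assigned variant:', variant.value);
}

function onPricingPageView(): void {
  // Exposure: tracked only when the user actually views the pricing page,
  // so analysis counts only users who saw the experience.
  experiment.exposure('pricing-page-test');
}
```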
Click _Chart Controls_ to see the chart definition.

You can also create a chart in Amplitude Analytics from this experiment by clicking *Open in Chart*.

{{partial:admonition type='note'}}
If you are running an A/B/n test, Amplitude Experiment displays the confidence interval / p-value for the control against each treatment individually. To instead see the comparison between two non-control treatments, either change the control variant, or open the test in Analytics and create a chart using the two treatments you're interested in.
{{/partial:admonition}}

3. If desired, adjust the experiment’s **confidence level**. The default is 95%. You can also [choose between a sequential test and a T-test](/docs/feature-experiment/workflow/finalize-statistical-preferences).
{{partial:admonition type="note" heading="Availability"}}
Summary is available to organizations with access to Experiment who have recommendations enabled.
{{/partial:admonition}}

The Summary card describes your experiment's hypothesis and lets you know if it's reached statistical significance.

{{partial:admonition type="note" heading="Statistical significance and Amplitude"}}
Amplitude considers an experiment to be **statistically significant** when Amplitude can confidently say that the results are unlikely to have occurred due to random chance. More technically, it’s when Amplitude rejects the null hypothesis. That may sound subjective, but it’s grounded solidly in statistics. Statistical significance relies on a variant’s **p-value**, which is a value that represents the likelihood that your results occurred by chance. A lower p-value means your results are probably not random, and there's evidence to support your hypothesis. If this value drops below a threshold, Amplitude considers the experiment to be statistically significant.
{{/partial:admonition}}
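To make that concrete, here is a minimal sketch of how a p-value for a binary metric could be computed and compared against a threshold. It uses a classical two-sided two-proportion z-test purely for illustration; it is not Amplitude's statistics engine, which can also run a sequential test:

```typescript
// Approximate the standard normal CDF (Abramowitz–Stegun 26.2.17).
function normalCdf(z: number): number {
  const t = 1 / (1 + 0.2316419 * Math.abs(z));
  const d = 0.3989423 * Math.exp((-z * z) / 2);
  const tail =
    d * t * (0.3193815 + t * (-0.3565638 + t * (1.781478 + t * (-1.821256 + t * 1.330274))));
  return z > 0 ? 1 - tail : tail;
}

interface VariantCounts {
  users: number;       // exposed users in the variant
  conversions: number; // users who converted (binary metric)
}

// Two-sided two-proportion z-test. Returns null until both variants meet the
// minimum sample noted above (100 users and 25 conversions for binary metrics).
function pValue(control: VariantCounts, treatment: VariantCounts): number | null {
  for (const v of [control, treatment]) {
    if (v.users < 100 || v.conversions < 25) return null;
  }
  const p1 = control.conversions / control.users;
  const p2 = treatment.conversions / treatment.users;
  const pooled =
    (control.conversions + treatment.conversions) / (control.users + treatment.users);
  const se = Math.sqrt(pooled * (1 - pooled) * (1 / control.users + 1 / treatment.users));
  const z = (p2 - p1) / se;
  return 2 * (1 - normalCdf(Math.abs(z)));
}

// With a 95% confidence level, alpha = 0.05: stat sig when p < 0.05.
const p = pValue({ users: 1000, conversions: 120 }, { users: 1000, conversions: 150 });
console.log(
  p !== null && p < 0.05 ? `significant (p=${p.toFixed(4)})` : `not significant (p=${p})`,
);
```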
The Summary card displays a badge labeled *Significant* if the experiment reached statistical significance, and a badge labeled *Not Significant* if it didn't. This card can display several badges at once:

* *Inconclusive*: the test was inconclusive for the primary metric.
* *Above Goal* or *Below Goal:* the primary metric's mean was either **above** or **below** its goal depending on the direction of the test (increase = above, decrease = below).
* *Above Control* or *Below Control:* the primary metric's mean was either **above** or **below** the control's mean, depending on the direction of the test (increase = above, decrease = below). These badges are only relevant to stat sig results.



## Analysis card

At the top of the Analysis card is an overview that explains how your experiment performed, broken down by metric and variant. Below that is a collection of experiment results charts, which you can analyze by metric. These charts display information about:

* Confidence intervals
* Cumulative exposure
* Event totals
* Mean value over time

> Reviewer: This chart is not always for event totals. For example, the metric can be prop sum or uniques or anything.

For more information, see [Dig deeper into experimentation data with Experiment Results](/docs/analytics/charts/experiment-results/experiment-results-dig-deeper#interpret-your-results).
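As a rough companion to the p-value sketch above, this is what the arithmetic behind a confidence-interval chart could look like for a binary metric. It uses a simple Wald interval for the difference in conversion rates, shown only as an illustration rather than Amplitude's exact method:

```typescript
// Confidence interval for the difference in conversion rates
// (treatment minus control). zCrit 1.96 corresponds to 95% confidence.
function differenceCI(
  controlUsers: number, controlConversions: number,
  treatmentUsers: number, treatmentConversions: number,
  zCrit = 1.96,
): [number, number] {
  const pC = controlConversions / controlUsers;
  const pT = treatmentConversions / treatmentUsers;
  const se = Math.sqrt((pC * (1 - pC)) / controlUsers + (pT * (1 - pT)) / treatmentUsers);
  const diff = pT - pC;
  return [diff - zCrit * se, diff + zCrit * se];
}

// If the interval excludes zero, the lift is stat sig at that confidence level.
console.log(differenceCI(1000, 120, 1000, 150));
```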
{{partial:admonition type="tip" heading="Chart filtering"}}
The Experiment Results chart on the Activity tab responds to the selections you make in the [Filter card](#filter-card).
{{/partial:admonition}}

Click *Open in Chart* to open a copy of the Experiment Results in a new chart.

{{partial:admonition type='note'}}
If you are running an A/B/n test, Amplitude Experiment displays the confidence interval / p-value for the control against each treatment individually. To instead see the comparison between two non-control treatments, either change the control variant, or open the test in Analytics and create a chart using the two treatments you're interested in.
{{/partial:admonition}}

If desired, adjust the experiment’s **confidence level**. The default is 95%. You can also [choose between a sequential test and a T-test](/docs/feature-experiment/workflow/finalize-statistical-preferences).

{{partial:admonition type='note'}}
Lowering your experiment’s confidence level makes it more likely that your experiment achieves statistical significance, but the trade-off is that doing so increases the likelihood of a false positive.
{{/partial:admonition}}
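As a general statistical relationship (not anything Amplitude-specific), the significance threshold is the complement of the confidence level:

$$
\alpha = 1 - \text{confidence level}
$$

so moving from a 95% to a 90% confidence level raises the false-positive rate you accept from 5% to 10%.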
4. Set the **time frame** for your experiment analysis, either from the selection of pre-set durations, or by opening the date picker and choosing a custom date range.
## Diagnostics card

The tables, graphs, and charts shown in the Analysis section are explained in depth in the articles on [understanding the Experiment Analysis view](/docs/feature-experiment/analysis-view) and [interpreting the cumulative exposures graph in Amplitude Experiment](/docs/feature-experiment/advanced-techniques/cumulative-exposure-change-slope).
The Diagnostics card provides information about how your experiment is delivering. It includes charts for:

* Assignment events (cumulative and non-cumulative)
* Exposure events (cumulative and non-cumulative)
* Assignment to exposure conversion
* [Variant jumping](/docs/feature-experiment/troubleshooting/variant-jumping)
* Anonymous exposures (cumulative and non-cumulative)
* [Exposures without Assignments](/docs/feature-experiment/troubleshooting/exposures-without-assignments) (cumulative and non-cumulative)

{{partial:admonition type='note'}}
Amplitude Experiment needs something to compare your control to in order to generate results. If you neglect to include **both** the control and **at least one** variant, your chart will not display anything.
{{/partial:admonition}}

For more control, open any of these charts in the chart builder.
## Interpret notifications

@@ -75,10 +133,10 @@

Click the check box next to the desired notification:

* **Experiment end reached:** Amplitude sends this notification when your experiment is complete.
* **SRM detected:** Amplitude sends this notification if it identifies a [sample ratio mismatch](/docs/feature-experiment/troubleshooting/sample-ratio-mismatch) issue (see the sketch below).
* **Long-running experiments:** Amplitude sends this notification when your long-running experiment is complete.
* **Stat sig for the recommendation metric is reached:** Amplitude sends this notification when your experiment's recommendation metric has reached stat sig.

Amplitude Experiment sends a notification to the editors of the experiment.
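For intuition about what an SRM check looks for, here is a minimal sketch for a two-variant, 50/50 experiment using a normal approximation to the binomial. The 3.29 threshold (roughly p < 0.001) is a common convention for SRM alerts, not necessarily the threshold Amplitude uses:

```typescript
// Flags a sample ratio mismatch for an intended 50/50 split when the observed
// allocation is extremely unlikely under that ratio.
function hasSrm(controlCount: number, treatmentCount: number): boolean {
  const n = controlCount + treatmentCount;
  const expected = n / 2;
  const sd = Math.sqrt(n * 0.5 * 0.5); // binomial standard deviation
  const z = (controlCount - expected) / sd;
  return Math.abs(z) > 3.29; // roughly p < 0.001, two-sided
}

// A ~48.4% / 51.6% split on 20,650 users is suspicious for a 50/50 test.
console.log(hasSrm(10_000, 10_650)); // true
```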
> Reviewer: This section header is about "testers" but there is information that is not specific to "testers".
>
> Reviewer: We should probably retire this article and move the information about testers to somewhere more logical.
>
> Reviewer: this is a new article right?