Coterie
Neon Blue treatment vs control across 9 whitelist-gated Klaviyo flows
2026-05-12
One row per experiment. The Target metric column names the intent-aligned primary metric; the lift (Δ) and significance (z, Confidence) are read against it.
| Experiment | Target metric | Treatment | Control | n (T / C) | Δ | z | Confidence |
|---|---|---|---|---|---|---|---|
| Pilot — Cart Abandonment Wipe Upsell | Wipe revenue per exposure-day (d30) | 1.1508 | 1.0979 | 35,893 / 120,589 | +4.8% | 4.56 | stat-sig |
| Pilot — Browse Abandonment Wipe Upsell | Wipe revenue per exposure-day (d30) | 1.1486 | 1.1008 | 24,094 / 82,975 | +4.3% | 4.14 | stat-sig |
| Flow 1 — First Purchase OTP, Diaper NB/1/2 | First AR Order — d60 unique conversion rate | 0.2308 | 0.1617 | 130 / 371 | +42.7% | 1.66 | marginal |
| Flow 2 — First Purchase OTP, Diaper Size 3+ | First AR Order — d60 unique conversion rate | 0.0959 | 0.1182 | 73 / 220 | -18.9% | -0.55 | n.s. |
| Flow 3 — First Purchase OTP, No-Hero | First Diaper-or-Pant non-trial order — d60 unique conversion rate | 0.2254 | 0.2197 | 71 / 173 | +2.6% | 0.10 | n.s. |
| Flow 4 — First Purchase OTP, The Pant | First AR Order — d60 unique conversion rate | 0.0000 | 0.3333 | 2 / 3 | -100.0% | -1.22 | n.s. |
| Flow 6 — Trial Pack | First non-trial order — d30 unique conversion rate | 0.2222 | 0.2941 | 27 / 51 | -24.4% | -0.70 | n.s. |
| Flow 10 — Milestone Pant | Skincare/Wipes on subscription — d30 unique | n/a | n/a | 0 / 0 | n/a | n/a | n/a |
| Pilot — Reclaim Browse Abandonment (Retention) | Wipe revenue per exposure-day (d30) | 0.8381 | 0.8969 | 2,274 / 5,810 | -6.6% | -1.30 | n.s. |
stat-sig = |z| ≥ 1.96, marginal = 1.64 ≤ |z| < 1.96, n.s. (not significant) = |z| < 1.64. Δ is the relative difference of treatment over control. Values for normalized metrics are events (or USD) per capped exposure-day; multiply by the window length for a back-of-envelope d{N}-equivalent.
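The legend can be sketched in a few lines of Python. The report does not state how z is computed; the unpooled two-proportion formula below is an assumption that appears to reproduce the z values of the conversion-rate rows from their rounded rates. It does not apply to the revenue-per-exposure-day pilots, whose z needs per-user variances that are not in the table. All function names are illustrative.

```python
import math


def relative_lift(treatment: float, control: float) -> float:
    """Relative difference (T - C) / C; NaN when the control value is zero."""
    if control == 0:
        return math.nan
    return (treatment - control) / control


def unpooled_z(p_t: float, n_t: int, p_c: float, n_c: int) -> float:
    """Two-proportion z-score with unpooled variance (assumed, not confirmed)."""
    if n_t == 0 or n_c == 0:
        return math.nan
    se = math.sqrt(p_t * (1 - p_t) / n_t + p_c * (1 - p_c) / n_c)
    return (p_t - p_c) / se if se > 0 else math.nan


def confidence_label(z: float) -> str:
    """Map a z-score to the legend's labels.

    NaN falls through to "n.s." because comparisons with NaN are False.
    """
    if abs(z) >= 1.96:
        return "stat-sig"
    if abs(z) >= 1.64:
        return "marginal"
    return "n.s."


def window_equivalent(per_exposure_day: float, window_days: int) -> float:
    """Back-of-envelope d{N}-equivalent: per-day value times window length."""
    return per_exposure_day * window_days


# Flow 1 row: rates 0.2308 vs 0.1617 at n = 130 / 371.
z_flow1 = unpooled_z(0.2308, 130, 0.1617, 371)
print(round(z_flow1, 2), confidence_label(z_flow1))
```

For the cart pilot, `window_equivalent(1.1508, 30)` gives a rough d30 figure of about 34.5 USD of wipe revenue per treated user.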
Pilot: Cart Abandonment Wipe Upsell
Audience
Shoppers who started checkout but bailed before paying. Caught 5 minutes after the abandonment event by Klaviyo's checkout-abandonment flow.
Intent
Two jobs in one email: recover the abandoned cart AND attach wipes to whatever the user already had in their basket. Treatment swaps the plain "finish your order" creative for a Neon Blue-rendered template centered on the wipes attach.
Why this metric
Sum of wipe-product line revenue per user, divided by capped exposure days. Captures both attach-rate AND per-attach basket size — the upsell substitution effect the variant is designed to create. Uses the products taxonomy (metric_category = 'wipe').
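A minimal per-user sketch of this metric follows. Only `metric_category = 'wipe'` comes from the report; the `OrderLine` type, the `revenue_usd` field, and the choice to cap exposure at the window length (30 days) are assumptions made for illustration. How per-user values are aggregated into the table figure is not stated and is left out.

```python
from dataclasses import dataclass
from typing import Iterable


@dataclass
class OrderLine:
    metric_category: str  # products-taxonomy label, e.g. "wipe" or "diaper"
    revenue_usd: float    # hypothetical field: line revenue in USD


def wipe_revenue_per_exposure_day(
    lines: Iterable[OrderLine],
    exposure_days: float,
    window_days: int = 30,
) -> float:
    """Wipe line revenue divided by exposure days, capped at the window."""
    capped_days = min(exposure_days, window_days)  # assumed capping rule
    if capped_days <= 0:
        return 0.0  # no exposure yet, so nothing to normalize
    wipe_revenue = sum(
        line.revenue_usd for line in lines if line.metric_category == "wipe"
    )
    return wipe_revenue / capped_days


# One user, exposed for 45 days, with two wipe lines and one diaper line.
user_lines = [
    OrderLine("wipe", 24.0),
    OrderLine("diaper", 80.0),
    OrderLine("wipe", 12.0),
]
print(wipe_revenue_per_exposure_day(user_lines, exposure_days=45))
```

Because the numerator is revenue rather than a binary flag, a user who attaches more wipes moves the metric more than one who attaches a single pack, which is the basket-size effect the paragraph describes.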
Read
Stat-sig positive (z=+4.56) at full cohort N. Treatment lifts wipe revenue per exposure-day by ~5% without cannibalizing other categories. Clearest positive read in the portfolio.
Leading variant
variant_id: 9580970a… · slot: default · n=104 · mean=0.7084 · LB=0.4561
Pilot: Browse Abandonment Wipe Upsell
Audience
Users who viewed product pages but didn't add-to-cart or check out. Caught 15 minutes after the browse event inside Klaviyo's product-abandonment series.
Intent
Same wipe-upsell test as the cart pilot, but earlier in the funnel — replace the existing browse-abandon wipe-upsell creative with a Neon Blue-rendered variant.
Why this metric
Same definition as the cart pilot. The d30 horizon gives the lift time to materialize: d7 attach effects are sometimes just pull-forward, while d30 captures durable behavior change.
Read
Stat-sig positive (+4.3%, z=+4.14). Slightly weaker than the cart pilot but same direction at meaningful scale.
Leading variant
variant_id: a5fd7431… · slot: default · n=273 · mean=1.2365 · LB=1.0670
Flow 1: First Purchase OTP, Diaper NB/1/2
Audience
First-time Coterie buyers placing a one-time (non-subscription) order for newborn-tier diapers (sizes NB, 1, 2). Reached ~1 day after that first OTP purchase.
Intent
Convert OTP buyers into AR Customers. Coterie's success event for this flow: "First AR order." Variants test alternative framings to nudge first-time newborn-tier diaper buyers into a recurring subscription.
Why this metric
Coterie's design-spec event for this flow. Binary per user (did they place a subscription order within 60 days of cohort entry). Strict d60 window — counts only users whose 60-day clock has fully elapsed.
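The strict-window rule can be sketched as below. The input shape (pairs of cohort-entry date and first subscription-order date) and the inclusive window boundary are assumptions; the report only says the event is binary per user and that users with an incomplete 60-day clock are excluded.

```python
from datetime import date, timedelta
from typing import Iterable, Optional, Tuple


def strict_window_conversion(
    users: Iterable[Tuple[date, Optional[date]]],
    as_of: date,
    window_days: int = 60,
) -> Tuple[int, int, float]:
    """Return (converted, eligible, rate) under a strict d{N} window.

    users: (cohort_entry, first_ar_order_date or None) per user.
    """
    window = timedelta(days=window_days)
    eligible = 0
    converted = 0
    for entry, first_order in users:
        if entry + window > as_of:
            continue  # window still open: excluded from the denominator
        eligible += 1
        # Binary per user: one qualifying order inside the window counts once.
        if first_order is not None and entry <= first_order <= entry + window:
            converted += 1
    rate = converted / eligible if eligible else float("nan")
    return converted, eligible, rate


cohort = [
    (date(2026, 1, 1), date(2026, 2, 1)),  # converted inside the window
    (date(2026, 1, 1), None),              # never converted
    (date(2026, 4, 1), date(2026, 4, 5)),  # window not yet elapsed: skipped
    (date(2026, 1, 1), date(2026, 4, 1)),  # converted after day 60: not counted
]
print(strict_window_conversion(cohort, as_of=date(2026, 5, 12)))
```

Dropping open-window users keeps the rate from being biased downward by recent entrants, at the cost of the small n seen in the table.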
Read
+42.7% relative lift in first-AR conversion (z=+1.66): marginal, but the strongest directional read among the OTP→AR flows. Volume will keep ramping; revisit when more users complete d60.
Leading variant
No qualifying variant — all treatment variants underpowered (n < 50).
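The qualification rule behind this line can be sketched as follows. The n ≥ 50 floor comes from the report; ranking qualifying variants by their lower bound (LB) is an assumption, since the report does not say how the leader is chosen or how LB is computed. The `VariantStat` type and the variant IDs in the example are hypothetical.

```python
from typing import Iterable, NamedTuple, Optional

MIN_N = 50  # variants below this sample size are treated as underpowered


class VariantStat(NamedTuple):
    variant_id: str
    n: int
    mean: float
    lower_bound: float  # the report's "LB"; its exact definition is not given


def leading_variant(variants: Iterable[VariantStat]) -> Optional[VariantStat]:
    """Pick the qualifying variant with the highest lower bound, if any."""
    qualifying = [v for v in variants if v.n >= MIN_N]
    if not qualifying:
        return None  # corresponds to "No qualifying variant"
    return max(qualifying, key=lambda v: v.lower_bound)


# Hypothetical variants: the first has the best mean but is underpowered.
candidates = [
    VariantStat("variant-a", n=30, mean=1.90, lower_bound=1.20),
    VariantStat("variant-b", n=104, mean=0.71, lower_bound=0.46),
    VariantStat("variant-c", n=273, mean=1.24, lower_bound=1.07),
]
print(leading_variant(candidates))
```

Ranking on a lower bound rather than the mean favors variants whose advantage is unlikely to be noise, which is one plausible reason to report LB next to the mean.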
Flow 2: First Purchase OTP, Diaper Size 3+
Audience
First-time Coterie buyers placing a one-time order for Size 3+ diapers, reached ~1 day after purchase. Older-baby households (Size 3+ = ~6+ months) trying the brand without committing.
Intent
Convert OTP Diaper Buyers (sizes 3+) into AR Customers. Coterie's success event: "First AR order." Variants test alternative framings for the older-baby cohort, where the next-purchase cycle is slower and brand-fit takes more deciding.
Why this metric
Same as flow-1 — binary per user for first subscription order within 60 days of cohort entry. The older-baby (Size 3+) audience has a slower reorder cadence, so d60 catches the natural decision window.
Read
-18.9% directional on first-AR conversion, z=-0.55 at n=293 with d60 complete. Underpowered for a verdict.
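To give "underpowered" a rough scale, the standard normal-approximation sample-size formula for comparing two proportions can be applied to the observed rates. This is a back-of-envelope sketch, not part of the report's method: it assumes equal allocation between arms, a two-sided α of 0.05, 80% power, and that the observed rates are the true ones.

```python
import math
from statistics import NormalDist


def required_n_per_arm(
    p_control: float,
    p_treatment: float,
    alpha: float = 0.05,
    power: float = 0.80,
) -> int:
    """Approximate users per arm needed to detect p_treatment vs p_control."""
    effect = p_treatment - p_control
    if effect == 0:
        raise ValueError("rates are identical; no effect to detect")
    z_alpha = NormalDist().inv_cdf(1 - alpha / 2)  # two-sided critical value
    z_beta = NormalDist().inv_cdf(power)
    variance = p_control * (1 - p_control) + p_treatment * (1 - p_treatment)
    return math.ceil((z_alpha + z_beta) ** 2 * variance / effect ** 2)


# Flow 2 observed rates: control 0.1182, treatment 0.0959.
print(required_n_per_arm(0.1182, 0.0959))
```

Under these assumptions the answer is roughly 3,000 users per arm, against 73 treatment and 220 control users with d60 complete.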
Leading variant
No qualifying variant — all treatment variants underpowered (n < 50).
Flow 3: First Purchase OTP, No-Hero
Audience
OTP buyer whose first one-time order had no hero diaper or pant in the basket (skincare, wipes, accessories). Reached 25 minutes after that purchase.
Intent
Convert into a Diapering Customer. Success event per Coterie's spec: "First order containing diaper or pant (excluding trial)." Variants test whether per-user generated copy can move a hard-to-merchandise no-hero audience into the core diapering category.
Why this metric
Coterie's spec event for this flow. Binary per user: did they place a non-trial order containing a diaper or pant line within d60. Uses the products taxonomy (metric_category IN ('diaper','pant') AND NOT is_trial).
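The taxonomy filter can be read as an order-level predicate. The field names `metric_category` and `is_trial` are taken from the filter quoted above; representing an order as a list of dicts is an assumption for illustration.

```python
from typing import Iterable, Mapping

HERO_CATEGORIES = {"diaper", "pant"}


def is_qualifying_order(lines: Iterable[Mapping]) -> bool:
    """True if any line is a non-trial diaper or pant product."""
    return any(
        line["metric_category"] in HERO_CATEGORIES and not line["is_trial"]
        for line in lines
    )


# A trial diaper alone does not qualify; a full-size pant line does.
trial_only = [{"metric_category": "diaper", "is_trial": True}]
wipes_plus_pant = [
    {"metric_category": "wipe", "is_trial": False},
    {"metric_category": "pant", "is_trial": False},
]
print(is_qualifying_order(trial_only), is_qualifying_order(wipes_plus_pant))
```

A user then counts as converted if at least one order inside the d60 window satisfies this predicate.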
Read
+2.6% on first diaper-or-pant non-trial order conversion, z=+0.10: flat. The earlier read (-53%) was an artifact of incomplete product categorization; the corrected taxonomy now counts The 360° as a pant, The Swimsuit as a diaper, etc.
Leading variant
No qualifying variant — all treatment variants underpowered (n < 50).
Flow 4: First Purchase OTP, The Pant
Audience
Recent one-time buyers of The Pant (pull-up training pants), entering a post-purchase nurture ~2 days after purchase. Users haven't yet converted to subscription on pants.
Intent
Convert OTP Pant Buyers into AR Customers. Coterie's success event: "First AR order." Variants test alternative framings to move a first-time pants buyer onto a recurring subscription.
Why this metric
Same as flow-1/2 — binary per user for first subscription order within d60 post-cohort.
Read
Severely underpowered (only 5 users with d60 complete). No verdict possible.
Leading variant
No qualifying variant — all treatment variants underpowered (n < 50).
Flow 6: Trial Pack
Audience
First-time buyers who took the lowest-commitment entry — the Trial Pack — rather than a full order or subscription. Reached ~2 days after that purchase.
Intent
Convert trial samplers into Full Size Customers. Coterie's success event: "First order with non-trial products." Variants test whether per-user copy can move trial buyers off the no-commitment entry SKU.
Why this metric
Spec event. Binary per user — did they place an order containing any non-trial product line in the 30 days post-cohort. Uses the products taxonomy (NOT is_trial).
Read
-24.4% directional on first-non-trial conversion (z=-0.70) at n=78 with d30 complete. Underpowered; could be real or noise.
Leading variant
No qualifying variant — all treatment variants underpowered (n < 50).
Flow 10: Milestone Pant
Audience
Existing subscribers whose baby has aged into The Pant life stage. Klaviyo detects the milestone and fires the email immediately.
Intent
Reduce Pant Churn + Upsell Flush Wipe + Education. Coterie's success event: "Flush Wipe added to subscription OR first Flush Wipe order." Variants pitch a Flush Wipe upsell to pant-milestone users.
Why this metric
The spec event ('Flush Wipe added to subscription') collapses into our broader 'wipes-on-subscription' check since we don't separate flush wipes from regular wipes. Strict d30 — requires completed 30-day window.
Read
No users have completed d30 yet; the flow launched too recently. Revisit in ~2 weeks.
Leading variant
No qualifying variant — all treatment variants underpowered (n < 50).
Pilot: Reclaim Browse Abandonment (Retention)
Audience
Existing customer/subscriber who browsed the wipes catalog but didn't add-to-cart or check out. Caught 45 minutes after the browse event — the retention version of the abandonment flow.
Intent
Re-engage a known buyer who lapsed into browsing. Treatment swaps the soft "See something you like? We caught you" nudge for a Neon Blue-rendered wipe-upsell variant with model-chosen copy.
Why this metric
Same as the other wipe-upsell pilots. Retention audience has higher baseline wipes-buying, so the relative-lift bar is higher.
Read
Directionally negative (-6.6%, z=-1.30) but not significant at n≈8K (2,274 treatment / 5,810 control). The high control baseline on this already-engaged retention cohort makes the bar harder to clear.
Leading variant
variant_id: e022776e… · slot: default · n=211 · mean=1.2126 · LB=1.0377