Coterie Variant Performance Evaluation

Neon Blue treatment vs control across 9 whitelist-gated Klaviyo flows

2026-05-12

Summary
Each experiment is read against the metric that best matches its underlying intent: first subscription (AR) or first qualifying order conversion for first-purchase nurtures, wipe-line revenue for upsell pilots, and wipes on subscription for the Milestone Pant flow. Long-horizon metrics use exposure normalization so partial-window users contribute proportionally to the cohort, letting us read effects before strict d30/d60 windows complete.

Headline read per experiment

One row per experiment. The Target metric column names the intent-aligned primary; the lift and significance are read against it.

| Experiment | Target metric | Treatment | Control | n (T / C) | Δ | z | Confidence |
| --- | --- | --- | --- | --- | --- | --- | --- |
| Pilot — Cart Abandonment Wipe Upsell | Wipe revenue per exposure-day (d30) | 1.1508 | 1.0979 | 35,893 / 120,589 | +4.8% | 4.56 | stat-sig |
| Pilot — Browse Abandonment Wipe Upsell | Wipe revenue per exposure-day (d30) | 1.1486 | 1.1008 | 24,094 / 82,975 | +4.3% | 4.14 | stat-sig |
| Flow 1 — First Purchase OTP, Diaper NB/1/2 | First AR Order — d60 unique conversion rate | 0.2308 | 0.1617 | 130 / 371 | +42.7% | 1.66 | marginal |
| Flow 2 — First Purchase OTP, Diaper Size 3+ | First AR Order — d60 unique conversion rate | 0.0959 | 0.1182 | 73 / 220 | -18.9% | -0.55 | n.s. |
| Flow 3 — First Purchase OTP, No-Hero | First Diaper-or-Pant non-trial order — d60 unique conversion rate | 0.2254 | 0.2197 | 71 / 173 | +2.6% | 0.10 | n.s. |
| Flow 4 — First Purchase OTP, The Pant | First AR Order — d60 unique conversion rate | 0.0000 | 0.3333 | 2 / 3 | -100.0% | -1.22 | n.s. |
| Flow 6 — Trial Pack | First non-trial order — d30 unique conversion rate | 0.2222 | 0.2941 | 27 / 51 | -24.4% | -0.70 | n.s. |
| Flow 10 — Milestone Pant | Skincare/Wipes on subscription — d30 unique | 0.0000 | 0.0000 | 0 / 0 | 0% | NaN | n.s. |
| Pilot — Reclaim Browse Abandonment (Retention) | Wipe revenue per exposure-day (d30) | 0.8381 | 0.8969 | 2,274 / 5,810 | -6.6% | -1.30 | n.s. |

stat-sig = |z| ≥ 1.96; marginal = 1.64 ≤ |z| < 1.96; n.s. = not significant (|z| < 1.64). Values for normalized metrics are events (or USD) per capped exposure-day; multiply by the window length for a back-of-envelope d{N}-equivalent.
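The legend can be expressed as a small helper. This is a minimal Python sketch, not the reporting pipeline's code: the function names `confidence_label` and `window_equivalent` are illustrative, and treating an undefined z (empty cohort) as n.s. is an assumption based on how the Flow 10 row is labeled.

```python
import math

def confidence_label(z: float) -> str:
    """Map a z-score to the report's confidence labels (two-sided thresholds)."""
    if math.isnan(z):
        return "n.s."  # ASSUMPTION: undefined z is reported as n.s., as in the Flow 10 row
    if abs(z) >= 1.96:
        return "stat-sig"
    if abs(z) >= 1.64:
        return "marginal"
    return "n.s."

def window_equivalent(per_exposure_day: float, window_days: int) -> float:
    """Back-of-envelope d{N}-equivalent: per-exposure-day value times window length."""
    return per_exposure_day * window_days

print(confidence_label(4.56))                    # cart pilot
print(confidence_label(1.66))                    # Flow 1
print(confidence_label(-0.55))                   # Flow 2
print(round(window_equivalent(1.1508, 30), 2))   # cart pilot treatment, d30-equivalent USD
```

For the cart pilot's treatment value of 1.1508 USD per exposure-day, the d30-equivalent works out to roughly 34.52 USD per user.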

Per-experiment detail

Pilot — Cart Abandonment Wipe Upsell
+4.8% · z=4.56 · stat-sig
Wipe revenue per exposure-day (d30)

Audience

Shoppers who started checkout but bailed before paying. Caught 5 minutes after the abandonment event by Klaviyo's checkout-abandonment flow.

Intent

Two jobs in one email: recover the abandoned cart AND attach wipes to whatever the user already had in their basket. Treatment swaps the plain "finish your order" creative for a Neon Blue-rendered template centered on the wipes attach.

Why this metric

Sum of wipe-product line revenue per user, divided by capped exposure days. Captures both attach-rate AND per-attach basket size — the upsell substitution effect the variant is designed to create. Uses the products taxonomy (metric_category = 'wipe').
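A minimal sketch of the exposure-normalized metric, under stated assumptions: the cohort value is taken as total wipe revenue divided by total capped exposure-days (the report does not say whether the ratio is pooled or averaged per user), and the `UserExposure` record and its field names are hypothetical. The input values are illustrative, not report data.

```python
from dataclasses import dataclass
from datetime import date

@dataclass
class UserExposure:
    entry_date: date      # cohort entry date (hypothetical field name)
    wipe_revenue: float   # wipe-line revenue in USD observed inside the window

def capped_exposure_days(entry: date, as_of: date, window_days: int = 30) -> int:
    """Days observed since cohort entry, capped at the window length."""
    elapsed = (as_of - entry).days
    return max(0, min(elapsed, window_days))

def revenue_per_exposure_day(users, as_of: date, window_days: int = 30) -> float:
    """ASSUMPTION: pooled ratio, total revenue over total capped exposure-days."""
    total_revenue = sum(u.wipe_revenue for u in users)
    total_days = sum(capped_exposure_days(u.entry_date, as_of, window_days) for u in users)
    return total_revenue / total_days if total_days else float("nan")

as_of = date(2026, 5, 12)
cohort = [
    UserExposure(date(2026, 3, 1), 45.0),  # full window elapsed: contributes 30 days
    UserExposure(date(2026, 5, 2), 5.0),   # partial window: contributes 10 days
]
print(revenue_per_exposure_day(cohort, as_of))  # 50 USD over 40 exposure-days
```

The partial-window user contributes only the 10 days actually observed, which is what lets recent entrants count proportionally instead of being excluded.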

Read

Stat-sig positive (z=+4.56) at full cohort N. Treatment lifts wipe revenue per exposure-day by ~5% without cannibalizing other categories. Clearest positive read in the portfolio.

Leading variant

variant_id: 9580970a · slot: default · n=104 · mean=0.7084 · LB=0.4561

Pilot — Browse Abandonment Wipe Upsell
+4.3% · z=4.14 · stat-sig
Wipe revenue per exposure-day (d30)

Audience

Users who viewed product pages but didn't add-to-cart or check out. Caught 15 minutes after the browse event inside Klaviyo's product-abandonment series.

Intent

Same wipe-upsell test as the cart pilot, but earlier in the funnel — replace the existing browse-abandon wipe-upsell creative with a Neon Blue-rendered variant.

Why this metric

Same definition as the cart pilot. The d30 horizon gives the lift time to materialize: d7 attach effects are sometimes pull-forward (purchases that would have happened anyway, just sooner), while d30 captures durable behavior change.

Read

Stat-sig positive (+4.3%, z=+4.14). Slightly weaker than the cart pilot but same direction at meaningful scale.

Leading variant

variant_id: a5fd7431 · slot: default · n=273 · mean=1.2365 · LB=1.0670

Flow 1 — First Purchase OTP, Diaper NB/1/2
+42.7% · z=1.66 · marginal
First AR Order — d60 unique conversion rate

Audience

First-time Coterie buyers placing a one-time (non-subscription) order for newborn-tier diapers (sizes NB, 1, 2). Reached ~1 day after that first OTP purchase.

Intent

Convert OTP buyers into AR Customers. Coterie's success event for this flow: "First AR order." Variants test alternative framings to nudge first-time newborn-tier diaper buyers into a recurring subscription.

Why this metric

Coterie's design-spec event for this flow. Binary per user (did they place a subscription order within 60 days of cohort entry). Strict d60 window — counts only users whose 60-day clock has fully elapsed.
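The strict-window rule can be sketched as follows. This is an illustration under assumptions, not the production query: users are represented as hypothetical `(entry_date, first_ar_order_date)` tuples, and treating both window boundaries as inclusive is a guess. The dates are made up for the example.

```python
from datetime import date, timedelta

def strict_window_rate(users, as_of: date, window_days: int = 60) -> float:
    """Conversion rate over users whose full window has elapsed.

    users: iterable of (entry_date, first_ar_order_date or None).
    """
    eligible = converted = 0
    for entry, first_order in users:
        window_end = entry + timedelta(days=window_days)
        if window_end > as_of:
            continue  # 60-day clock still running: excluded from the strict read
        eligible += 1
        # ASSUMPTION: an order on the boundary day counts as inside the window
        if first_order is not None and entry <= first_order <= window_end:
            converted += 1
    return converted / eligible if eligible else float("nan")

as_of = date(2026, 5, 12)
users = [
    (date(2026, 1, 10), date(2026, 2, 1)),   # eligible, converted inside d60
    (date(2026, 2, 1), None),                # eligible, never converted
    (date(2026, 2, 15), date(2026, 5, 1)),   # eligible, but order landed after d60
    (date(2026, 4, 20), date(2026, 4, 25)),  # window incomplete: not counted at all
]
print(strict_window_rate(users, as_of))  # 1 conversion out of 3 eligible users
```

Note that the fourth user converted quickly but is still excluded, which is why strict-window cohorts are much smaller than the flow's total sends.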

Read

+42.7% relative lift in first-AR conversion, z=+1.66: marginal, but the strongest positive direction among the OTP→AR flows. Volume will keep ramping; revisit when more users complete d60.
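The lift and z-score for this flow can be approximately reproduced from the published rates and cohort sizes. The report does not state which test it uses; the sketch below assumes an unpooled (Wald) two-proportion z-statistic, which happens to match the published figures for the conversion-rate flows. The function names are illustrative.

```python
import math

def relative_lift(p_t: float, p_c: float) -> float:
    """Relative lift of treatment over control."""
    return p_t / p_c - 1.0

def unpooled_z(p_t: float, n_t: int, p_c: float, n_c: int) -> float:
    """Two-proportion z-statistic with unpooled (Wald) standard error.

    ASSUMPTION: this is the test behind the report's z column.
    Returns NaN when either arm is empty or the standard error is zero.
    """
    if n_t == 0 or n_c == 0:
        return float("nan")
    se = math.sqrt(p_t * (1 - p_t) / n_t + p_c * (1 - p_c) / n_c)
    return (p_t - p_c) / se if se > 0 else float("nan")

# Flow 1 inputs as published: rates 0.2308 vs 0.1617, n = 130 / 371
lift = relative_lift(0.2308, 0.1617)
z = unpooled_z(0.2308, 130, 0.1617, 371)
print(f"{lift:+.1%}  z={z:.2f}")  # +42.7%  z=1.66
```

With an empty arm the function returns NaN, consistent with how a flow that has no completed-window users would show an undefined z.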

Leading variant

No qualifying variant — all treatment variants underpowered (n < 50).
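The qualification rule can be sketched as a filter followed by a ranking. Only the n ≥ 50 floor comes from the report; ranking the survivors by their lower bound (LB) is an assumption, as is the `VariantStats` record, and the report does not say how LB is computed. The variant IDs and numbers below are hypothetical.

```python
from dataclasses import dataclass
from typing import Iterable, Optional

MIN_N = 50  # floor stated in the report: variants with n < 50 are underpowered

@dataclass
class VariantStats:
    variant_id: str
    n: int
    mean: float
    lb: float  # lower bound as reported; its derivation is not given in the report

def leading_variant(variants: Iterable[VariantStats]) -> Optional[VariantStats]:
    """Return the qualifying variant with the highest lower bound, or None."""
    qualifying = [v for v in variants if v.n >= MIN_N]
    if not qualifying:
        return None  # corresponds to "No qualifying variant"
    return max(qualifying, key=lambda v: v.lb)  # ASSUMPTION: rank by LB, not mean

# Hypothetical variants for illustration only
candidates = [
    VariantStats("v_a", 30, 2.0, 1.5),   # best numbers, but below the n floor
    VariantStats("v_b", 120, 1.1, 0.9),
    VariantStats("v_c", 80, 1.3, 0.7),
]
best = leading_variant(candidates)
print(best.variant_id if best else "No qualifying variant")
```

Ranking by a lower bound rather than the mean favors variants whose advantage is better supported by sample size, which fits the report's habit of showing LB next to each leading variant.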

Flow 2 — First Purchase OTP, Diaper Size 3+
-18.9% · z=-0.55 · n.s.
First AR Order — d60 unique conversion rate

Audience

First-time Coterie buyers placing a one-time order for Size 3+ diapers, reached ~1 day after purchase. Older-baby households (Size 3+ = ~6+ months) trying the brand without committing.

Intent

Convert OTP Diaper Buyers (sizes 3+) into AR Customers. Coterie's success event: "First AR order." Variants test alternative framings for the older-baby cohort, where the next-purchase cycle is slower and brand-fit takes more deciding.

Why this metric

Same as flow-1 — binary per user for first subscription order within 60 days of cohort entry. The older-baby (Size 3+) audience has a slower reorder cadence, so d60 catches the natural decision window.

Read

-18.9% directional on first-AR conversion, z=-0.55 at n=293 with d60 complete. Underpowered for a verdict.
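To make "underpowered" concrete, a rough minimum detectable effect (MDE) can be computed from the cohort sizes. This is a back-of-envelope sketch, not the report's method: it assumes a two-sided 5% test at 80% power and approximates both arms' variance with the control rate.

```python
import math
from statistics import NormalDist

def minimum_detectable_effect(p_c: float, n_t: int, n_c: int,
                              alpha: float = 0.05, power: float = 0.80) -> float:
    """Approximate absolute MDE for a two-proportion comparison.

    ASSUMPTION: variance in both arms is approximated by p_c * (1 - p_c).
    """
    z_alpha = NormalDist().inv_cdf(1 - alpha / 2)  # two-sided critical value
    z_beta = NormalDist().inv_cdf(power)
    se = math.sqrt(p_c * (1 - p_c) * (1 / n_t + 1 / n_c))
    return (z_alpha + z_beta) * se

# Flow 2 inputs as published: control rate 0.1182, n = 73 / 220
mde = minimum_detectable_effect(0.1182, 73, 220)
print(f"absolute MDE ~ {mde:.3f}, relative ~ {mde / 0.1182:.0%}")
```

Under these assumptions the cohort could only reliably detect a shift of roughly 12 percentage points, about the size of the control rate itself, while the observed gap is about 2 points.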

Leading variant

No qualifying variant — all treatment variants underpowered (n < 50).

Flow 3 — First Purchase OTP, No-Hero
+2.6% · z=0.10 · n.s.
First Diaper-or-Pant non-trial order — d60 unique conversion rate

Audience

OTP buyer whose first one-time order had no hero diaper or pant in the basket (skincare, wipes, accessories). Reached 25 minutes after that purchase.

Intent

Convert into a Diapering Customer. Success event per Coterie's spec: "First order containing diaper or pant (excluding trial)." Variants test whether per-user generated copy can move a hard-to-merchandise no-hero audience into the core diapering category.

Why this metric

Coterie's spec event for this flow. Binary per user: did they place a non-trial order containing a diaper or pant line within d60. Uses the products taxonomy (metric_category IN ('diaper','pant') AND NOT is_trial).
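The taxonomy predicate can be sketched in Python as below. The field names `metric_category` and `is_trial` come from the report; representing order lines as dictionaries and applying the predicate per line (rather than per order) are assumptions. The sample orders are illustrative.

```python
HERO_CATEGORIES = {"diaper", "pant"}

def is_qualifying_order(order_lines) -> bool:
    """True if at least one line is a non-trial diaper or pant.

    ASSUMPTION: the predicate
    metric_category IN ('diaper','pant') AND NOT is_trial
    is evaluated line by line.
    """
    return any(
        line["metric_category"] in HERO_CATEGORIES and not line["is_trial"]
        for line in order_lines
    )

wipes_only = [{"metric_category": "wipe", "is_trial": False}]
trial_diaper = [{"metric_category": "diaper", "is_trial": True}]
pant_plus_wipes = [
    {"metric_category": "pant", "is_trial": False},
    {"metric_category": "wipe", "is_trial": False},
]

print(is_qualifying_order(wipes_only))       # False
print(is_qualifying_order(trial_diaper))     # False
print(is_qualifying_order(pant_plus_wipes))  # True
```

Because the check depends entirely on `metric_category`, a product filed under the wrong category silently drops out of the conversion count.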

Read

+2.6% on first diaper-or-pant non-trial conversion, z=+0.10: flat. The earlier read (-53%) was an artifact of incomplete product categorization; the corrected taxonomy now counts The 360° as a pant, The Swimsuit as a diaper, etc.

Leading variant

No qualifying variant — all treatment variants underpowered (n < 50).

Flow 4 — First Purchase OTP, The Pant
-100.0% · z=-1.22 · n.s.
First AR Order — d60 unique conversion rate

Audience

Recent one-time buyers of The Pant (pull-up training pants), entering a post-purchase nurture ~2 days after purchase. Users haven't yet converted to subscription on pants.

Intent

Convert OTP Pant Buyers into AR Customers. Coterie's success event: "First AR order." Variants test alternative framings to move a first-time pants buyer onto a recurring subscription.

Why this metric

Same as flow-1/2 — binary per user for first subscription order within d60 post-cohort.

Read

Severely underpowered (only 5 users with d60 complete). No verdict possible.

Leading variant

No qualifying variant — all treatment variants underpowered (n < 50).

Flow 6 — Trial Pack
-24.4% · z=-0.70 · n.s.
First non-trial order — d30 unique conversion rate

Audience

First-time buyers who took the lowest-commitment entry — the Trial Pack — rather than a full order or subscription. Reached ~2 days after that purchase.

Intent

Convert trial samplers into Full Size Customers. Coterie's success event: "First order with non-trial products." Variants test whether per-user copy can move trial buyers off the no-commitment entry SKU.

Why this metric

Spec event. Binary per user — did they place an order containing any non-trial product line in the 30 days post-cohort. Uses the products taxonomy (NOT is_trial).

Read

-24.4% directional on first-non-trial conversion at n=78 with d30 complete. Underpowered; could be real or noise.

Leading variant

No qualifying variant — all treatment variants underpowered (n < 50).

Flow 10 — Milestone Pant
0% · z=NaN · n.s.
Skincare/Wipes on subscription — d30 unique

Audience

Existing subscribers whose baby has aged into The Pant life stage. Klaviyo detects the milestone and fires the email immediately.

Intent

Reduce Pant Churn + Upsell Flush Wipe + Education. Coterie's success event: "Flush Wipe added to subscription OR first Flush Wipe order." Variants pitch a Flush Wipe upsell to pant-milestone users.

Why this metric

The spec event ('Flush Wipe added to subscription') collapses into our broader 'wipes-on-subscription' check since we don't separate flush wipes from regular wipes. Strict d30 — requires completed 30-day window.

Read

No users have completed d30 yet — flow was launched too recently. Revisit in ~2 weeks.

Leading variant

No qualifying variant — all treatment variants underpowered (n < 50).

Pilot — Reclaim Browse Abandonment (Retention)
-6.6% · z=-1.30 · n.s.
Wipe revenue per exposure-day (d30)

Audience

Existing customer/subscriber who browsed the wipes catalog but didn't add-to-cart or check out. Caught 45 minutes after the browse event — the retention version of the abandonment flow.

Intent

Re-engage a known buyer who lapsed into browsing. Treatment swaps the soft "See something you like? We caught you" nudge for a Neon Blue-rendered wipe-upsell variant with model-chosen copy.

Why this metric

Same as the other wipe-upsell pilots. Retention audience has higher baseline wipes-buying, so the relative-lift bar is higher.

Read

Directionally negative (-6.6%, z=-1.30) but not significant at n≈8K. The high control baseline for wipe revenue in this already-engaged retention cohort makes the bar harder to clear.

Leading variant

variant_id: e022776e · slot: default · n=211 · mean=1.2126 · LB=1.0377