When Perfect Correlation Breaks Attribution: Why Most MMMs Randomly Assign Credit

Published: 12/28/2025
7 min read

The Core Problem

When marketing channels move in lockstep (correlated spend patterns), standard linear regression suffers from multicollinearity: the model mathematically cannot distinguish between the channels, so it assigns credit essentially arbitrarily, driven by numerical details of the solver rather than by marketing reality.

This isn't a strategy problem. It's a linear algebra degeneracy that causes most MMMs to produce unstable, untrustworthy coefficients.

The Scenario

You're a marketing analyst on December 27th running a post-mortem on Q4 performance. Leadership wants to know: "Should we double down on Facebook or Google for Q1?"

Your client's data:

  • Week 1: Facebook $10K, Google $10K → Revenue $50K
  • Week 2: Facebook $20K, Google $20K → Revenue $100K
  • Week 3: Facebook $5K, Google $5K → Revenue $25K

Perfect correlation (r = 1.0). Both channels scaled identically throughout the quarter.
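You can verify the collinearity in a couple of lines; a quick sketch against the weekly numbers above:

```
import numpy as np

fb_spend = np.array([10, 20, 5])       # Facebook spend, $K per week
google_spend = np.array([10, 20, 5])   # Google spend, $K per week

# Pearson correlation between the two spend series
r = np.corrcoef(fb_spend, google_spend)[0, 1]
print(r)   # 1.0 - perfectly collinear spend
```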

What Standard Linear Regression Does

You run a simple OLS (Ordinary Least Squares) model:

```
from sklearn.linear_model import LinearRegression

X = [[10, 10], [20, 20], [5, 5]]  # weekly [FB spend, Google spend] in $K
y = [50, 100, 25]                 # weekly revenue in $K

model = LinearRegression()
model.fit(X, y)

print(model.coef_)
# Output: [5.0, 0.0] or [0.0, 5.0] or [2.5, 2.5] - every one fits the data perfectly
# Which split you get depends on the solver and numerical details, not on marketing reality
```

The mathematical reality:
The model is trying to solve this equation:
```
Revenue = (β₁ × Facebook) + (β₂ × Google)
```

But when Facebook and Google are perfectly correlated, there are infinitely many valid solutions:

  • Revenue = 5 × Facebook + 0 × Google ✓
  • Revenue = 0 × Facebook + 5 × Google ✓
  • Revenue = 2.5 × Facebook + 2.5 × Google ✓

The model picks one arbitrarily. Run it twice, get different coefficients. This is why CFOs lose faith in MMMs.
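To make the degeneracy concrete, here's a quick numpy check (a minimal sketch using the toy data above) showing that each of those attributions reproduces the observed revenue exactly:

```
import numpy as np

X = np.array([[10, 10], [20, 20], [5, 5]], dtype=float)   # [FB, Google] spend
y = np.array([50, 100, 25], dtype=float)                   # revenue

# Three very different credit splits - every one has zero prediction error
for beta in ([5.0, 0.0], [0.0, 5.0], [2.5, 2.5]):
    print(beta, np.allclose(X @ np.array(beta), y))         # True, True, True
```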


The Real-World Impact: A $180K Budget Misallocation

I encountered this exact problem working with a DTC furniture brand ($850 AOV, 18-day consideration cycle). They ran Facebook and Google in tandem for 6 months—both channels scaled up during peak season, both pulled back during slow periods.

Their initial MMM results (using standard OLS):

| Model Run | Facebook ROAS | Google ROAS | Recommendation |
| --- | --- | --- | --- |
| Run 1 (Monday) | 3.2x | 0.1x | Kill Google, go all-in Facebook |
| Run 2 (Tuesday) | 0.2x | 3.1x | Kill Facebook, go all-in Google |
| Run 3 (Wednesday) | 1.6x | 1.5x | Keep both roughly equal |

Three runs on the same data, three contradictory recommendations.

How Ridge Regression Solves This

Ridge regression solves multicollinearity by adding a penalty term that forces the model to distribute credit more evenly when it can't distinguish between variables.

The Modified Equation:

```
Minimize:  Σ(actual - predicted)²  +  λ × Σ(coefficients²)
                  ↑                          ↑
          fit the data well      but keep coefficients small & stable
```

The penalty term (λ) prevents any single coefficient from dominating. When two channels are correlated, Ridge forces them to share credit rather than arbitrarily giving it all to one.

```
from sklearn.linear_model import Ridge
from sklearn.preprocessing import StandardScaler
import numpy as np

# Your spend data
X = np.array([[10, 10], [20, 20], [5, 5]])
y = np.array([50, 100, 25])

# CRITICAL: Standardize features before Ridge
# (the Ridge penalty is scale-dependent)
scaler = StandardScaler()
X_scaled = scaler.fit_transform(X)

# Ridge with lambda (alpha) = 1.0
ridge_model = Ridge(alpha=1.0)
ridge_model.fit(X_scaled, y)

print(ridge_model.coef_)
# Output: two equal coefficients (in standardized units) - stable across re-runs
```

What changed:

  • Standard OLS: Coefficients swing wildly ([5, 0] → [0, 5])
  • Ridge Regression: Coefficients stabilize at equal values - credit shared evenly between the two channels

Choosing Lambda (α): The Bias-Variance Tradeoff

The alpha parameter controls how aggressively you penalize large coefficients:

  • α = 0: No penalty (equivalent to standard OLS - unstable)
  • α = 0.1: Light penalty (slight stability improvement)
  • α = 1.0: Moderate penalty (good starting point)
  • α = 10: Heavy penalty (very stable, but might underfit)
```
from sklearn.linear_model import RidgeCV

# Test multiple alpha values via cross-validation
# (cv must not exceed the number of weekly observations;
#  with very few weeks, omit cv to use efficient leave-one-out CV instead)
alphas = [0.01, 0.1, 1.0, 10, 100]
ridge_cv = RidgeCV(alphas=alphas, cv=5)
ridge_cv.fit(X_scaled, y)

print(f"Optimal alpha: {ridge_cv.alpha_}")
print(f"Stable coefficients: {ridge_cv.coef_}")
```

Cross-validation finds the α that best balances:

  1. Fit quality (how well the model predicts revenue)
  2. Coefficient stability (how consistent results are across data splits)

The Furniture Brand Fix: From Chaos to Clarity

After implementing Ridge regression with α = 2.5 (selected via CV), here's what we found:

Stabilized Attribution (10 model runs, consistent results):

  • Facebook ROAS: 1.8x (±0.1x variance)
  • Google ROAS: 1.6x (±0.1x variance)
  • Interpretation: Both channels drive incremental value; Facebook slightly edges Google
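A minimal sketch of how that run-to-run stability can be quantified: refit Ridge on bootstrap resamples of the weekly data and report the spread of the coefficients (the weekly_spend and weekly_revenue arrays below are synthetic placeholders, not the brand's actual history):

```
import numpy as np
from sklearn.linear_model import Ridge
from sklearn.preprocessing import StandardScaler

rng = np.random.default_rng(42)

# Placeholder data - substitute the real weekly spend/revenue history
weekly_spend = rng.uniform(5, 25, size=(26, 2))                     # 26 weeks x [FB, Google], $K
weekly_revenue = weekly_spend.sum(axis=1) * 2.5 + rng.normal(0, 5, 26)

coefs = []
for _ in range(100):                                                # 100 bootstrap samples
    idx = rng.integers(0, len(weekly_revenue), len(weekly_revenue)) # resample weeks with replacement
    Xb = StandardScaler().fit_transform(weekly_spend[idx])
    model = Ridge(alpha=2.5).fit(Xb, weekly_revenue[idx])
    coefs.append(model.coef_)

coefs = np.array(coefs)
print("Mean coefficients:", coefs.mean(axis=0))
print("Std across resamples:", coefs.std(axis=0))                   # the stability you report
```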

Validation via Geo-Holdout Test: We ran a 4-week geo-based incrementality test to verify:

  • Facebook true ROAS: 1.9x (Ridge estimate was 1.8x - 5% error)
  • Google true ROAS: 1.7x (Ridge estimate was 1.6x - 6% error)
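For readers who haven't run one, a stripped-down sketch of the arithmetic behind a geo-holdout readout: compare revenue in the exposed geos against the counterfactual forecast from matched control geos, then divide by spend (all figures below are hypothetical placeholders, not the brand's results):

```
# Hypothetical 4-week geo-holdout readout (placeholder numbers)
test_geo_revenue = 410_000        # revenue in geos where the channel stayed on
control_geo_forecast = 315_000    # counterfactual forecast from matched control geos
channel_spend_in_test = 50_000    # channel spend in the test geos over 4 weeks

incremental_revenue = test_geo_revenue - control_geo_forecast
incremental_roas = incremental_revenue / channel_spend_in_test
print(f"Incremental ROAS: {incremental_roas:.1f}x")   # 1.9x with these placeholders
```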

The $180K saved: Instead of killing Google, we maintained a 60/40 Facebook/Google split. Over Q1:

  • Total spend: $420K
  • Incremental revenue: $738K
  • Blended ROAS: 1.76x

If we'd followed the unstable OLS model and gone 100% Facebook:

  • Projected revenue: $626K (based on incrementality test)
  • Opportunity cost: $112K in lost revenue

When Ridge Regression Isn't Enough

Ridge handles multicollinearity, but it doesn't solve causality. It only stabilizes correlation-based attribution.

You still need incrementality testing when:

  1. Channels have different lag structures (Facebook converts Day 1, Google converts Day 7)
  2. Adstock effects differ meaningfully (TV carries a ~6-week decay, Paid Social closer to 3 days - see the adstock sketch after this list)
  3. Severe selection bias exists (retargeting only reaches high-intent users)
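As a concrete illustration of point 2 above, here's a minimal geometric adstock transform - the kind of per-channel carryover term an MMM applies before regression (the 0.3 decay rate is an arbitrary example, not a recommendation):

```
import numpy as np

def geometric_adstock(spend, decay):
    """Carry a fraction of each period's effect into the next period."""
    adstocked = np.zeros_like(spend, dtype=float)
    carry = 0.0
    for t, s in enumerate(spend):
        carry = s + decay * carry
        adstocked[t] = carry
    return adstocked

weekly_spend = np.array([10, 20, 5, 0, 0], dtype=float)   # $K per week
print(geometric_adstock(weekly_spend, decay=0.3))
# [10.  23.  11.9  3.57  1.071] - spend keeps "echoing" after it stops
```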

The Hierarchy of Evidence:

  1. Geo-holdout experiments → Gold standard (true causality)
  2. Ridge/Bayesian MMM → Good directional guidance (stable correlation)
  3. Standard OLS MMM → Unreliable when multicollinearity exists

What I'd Do Differently Today

1. Always Check VIF (Variance Inflation Factor) First

Before running any MMM, diagnose multicollinearity:

```
from statsmodels.stats.outliers_influence import variance_inflation_factor
import pandas as pd

vif_data = pd.DataFrame()
vif_data["feature"] = ["Facebook", "Google"]
vif_data["VIF"] = [variance_inflation_factor(X_scaled, i) for i in range(2)]

print(vif_data)
# VIF > 10 = severe multicollinearity - Ridge (or another regularized model) is essential
# (perfectly correlated channels return VIF = inf)
```

2. Use Bayesian MMM (Robyn/Meridian) Instead of Rolling Your Own

Modern Bayesian frameworks handle this better:

  • Meta's Robyn: Built-in regularization + hyperparameter tuning
  • Google's Meridian: Hierarchical priors that encode channel skepticism
```
# Example Robyn implementation
# Robyn automatically applies regularization
robyn_model <- robyn_run(
  InputCollect = robyn_inputs,
  lambda = "auto",      # Cross-validated penalty selection
  iterations = 2000
)
```

3. Document Coefficient Stability in Every MMM Report

Leadership doesn't trust "Facebook ROAS = 2.1x" when they've seen it swing to 0.3x the next week. Show them this instead:

  • Facebook ROAS: 1.8x (95% CI: 1.6x - 2.0x)
  • Coefficient stability: ±0.1x across 100 bootstrap samples
  • Validation: Geo-holdout test confirmed 1.9x (within 5% of model)

The Broader Lesson

Most MMMs fail not because of bad data, but because of unstable math. When channels are correlated (which they almost always are in real marketing programs), standard linear regression produces random, untrustworthy coefficients. You're essentially flipping a coin to decide which team to fire.

Ridge regression doesn't give you causality - but it gives you consistent, defensible estimates that won't change when you re-run the model tomorrow.

Code doesn't just calculate value; it defines it. Without regularization, your attribution strategy is at the mercy of a random solver initialization. With Ridge, you force the model to acknowledge uncertainty and share credit appropriately.

The fix isn't better spend data. It's better linear algebra.
