Working Paper · Methodological Appendix

Econometric Evidence Matrix

THE ALGORITHMIC WEDGE · Companion Econometric Appendix · 2026-05-20

Author: Alex Lima
Regressor: DatingRev = global dating app revenue ($B), verified 2015–2024
Section 0

Data Inventory & Variable Construction

DatingRev = global dating app industry revenue ($B), Business of Apps / Statista. Verified 2015–2024 (±5%); estimated 2010–2014 (±30%). Pre-2015 points shown with hollow markers throughout.

Data quality note (DatingRev pre-2015): The 2010–2014 values are market estimates (±30% uncertainty), not verified industry figures. The reliable Business of Apps series begins in 2015. All regressions are run on the full 2010–2024 series (T=15 in levels; T=14 in first differences), but robustness to excluding 2010–2014 is reported in §11 along with COVID exclusion. Scatter plots use hollow points for estimated years and solid points for verified years.
T0.1 · Descriptive Statistics
Core Panel US 2010–2024 (T=15)
Note: std uses sample SD (n-1) — corrected from v2. Sources: DatingRev: Business of Apps/Statista. Smartphones: Pew. TFR/Marriage/Divorce: CDC NCHS. Depression: SAMHSA NSDUH. SSRI: IQVIA. STIs: CDC STD Surveillance.
F0.1 · Z-scores 2010–2024
All Series Standardised — Tinder launch annotated (2012)
2012 annotation: Tinder launched September 2012. Visual inspection shows a post-2012 acceleration in DatingRev, STI series, and continued decline in TFR/Marriage. Pre-2015 DatingRev segment shown with reduced opacity (estimated data).
Section 1

Pearson Correlation Matrix

Pairwise Pearson correlations in levels (Panel A) and first differences (Panel B). * p<0.10, ** p<0.05 (t-distribution, df=N-2).

Reminder: Levels correlations are inflated by shared trends. First-difference panel (B) is the diagnostic of interest.
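The shared-trend problem can be illustrated with a minimal sketch on synthetic data (the series below are invented for illustration and are not the panel data; `pearson_r` and `first_diff` are hypothetical helper names). Two series that share only a linear trend correlate almost perfectly in levels, while their first differences do not.

```python
import math

def pearson_r(x, y):
    """Pearson correlation of two equal-length sequences."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)

def first_diff(series):
    """Year-on-year changes; a series of length T yields T-1 differences."""
    return [b - a for a, b in zip(series, series[1:])]

# Synthetic illustration only: a common trend plus unrelated wiggles.
trend = list(range(15))
wiggle_x = [0.3, -0.2, 0.1, -0.4, 0.2, 0.0, -0.1, 0.3,
            -0.3, 0.1, 0.2, -0.2, 0.0, 0.4, -0.1]
wiggle_y = [-0.1, 0.3, -0.2, 0.2, 0.1, -0.3, 0.4, -0.2,
            0.0, 0.3, -0.4, 0.1, 0.2, -0.1, 0.0]
x_levels = [t + w for t, w in zip(trend, wiggle_x)]
y_levels = [2 * t + w for t, w in zip(trend, wiggle_y)]

r_levels = pearson_r(x_levels, y_levels)                       # dominated by the trend
r_diff = pearson_r(first_diff(x_levels), first_diff(y_levels))  # trend removed
print(f"levels r = {r_levels:.3f}, first-difference r = {r_diff:.3f}")
```

With T=15 in levels, differencing leaves 14 observations, which matches the T=14 used for Panel B.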
T1.1 · Levels
Correlation Matrix — Levels
T1.2 · First Differences
Correlation Matrix — First Differences
Key signal after differencing: ΔDatingRev–ΔSyphilis and ΔDatingRev–ΔGonorrhea remain positive and relatively strong. TFR and Marriage correlations weaken considerably. Depression and SSRI approach zero — consistent with their being shared-trend artifacts.
Section 2

Bivariate OLS — HC1 Standard Errors

Y = α + β·DatingRev + ε, T=15. HC1 = HC0 × n/(n-2) where HC0 = Σ(xi-x̄)²ei² / [Σ(xi-x̄)²]². Scatter plots show 95% confidence band around OLS line. Hollow points = pre-2015 estimated data. 2012 annotated.
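As a minimal sketch of the HC1 formula above for the bivariate case (the function name `ols_hc1` and the demo data are assumptions for illustration, not the appendix's own code), the slope variance is the HC0 expression scaled by n/(n-2), and the reported SE is its square root.

```python
import math

def ols_hc1(x, y):
    """Bivariate OLS with an HC1 standard error for the slope.

    HC0 variance = sum((xi - xbar)^2 * ei^2) / (sum((xi - xbar)^2))^2
    HC1 variance = HC0 * n / (n - 2)
    """
    n = len(x)
    mx = sum(x) / n
    my = sum(y) / n
    sxx = sum((xi - mx) ** 2 for xi in x)
    sxy = sum((xi - mx) * (yi - my) for xi, yi in zip(x, y))
    beta = sxy / sxx
    alpha = my - beta * mx
    resid = [yi - alpha - beta * xi for xi, yi in zip(x, y)]
    hc0_var = sum((xi - mx) ** 2 * e ** 2 for xi, e in zip(x, resid)) / sxx ** 2
    hc1_var = hc0_var * n / (n - 2)
    return alpha, beta, math.sqrt(hc1_var)

# Synthetic demo data (hypothetical, not the panel).
x_demo = [1.0, 2.0, 3.0, 4.0, 5.0]
y_demo = [2.1, 3.9, 6.2, 7.8, 10.1]
alpha_hat, beta_hat, se_hc1 = ols_hc1(x_demo, y_demo)
t_stat = beta_hat / se_hc1  # compare against t critical values with df = n - 2
print(f"beta = {beta_hat:.3f}, HC1 SE = {se_hc1:.4f}, t = {t_stat:.2f}")
```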

Stationarity warning: Most series are I(1) in levels (see §6). Levels OLS R² is inflated by shared trends. Use first-difference specifications (§3) for substantive conclusions.
T2.1 · Bivariate OLS — Levels
HC1 = White (1980) heteroskedasticity-consistent SE with small-sample correction n/(n-2). t-critical (df=13): |t|>1.77 (10%), |t|>2.16 (5%), |t|>3.01 (1%).
F2.1
DatingRev → TFR + 95% CI band
F2.2
DatingRev → Marriage + 95% CI band
F2.3
DatingRev → Syphilis + 95% CI band
F2.4
DatingRev → Depression + 95% CI
F2.5
Smartphones → TFR (general digitisation check)
F2.6
DatingRev → Gonorrhea + 95% CI
Section 3 — Preferred Specification

First-Difference Regressions — Preferred Specification

ΔY = α + β·ΔDatingRev + ε, T=14. HC1 corrected. 2020 COVID point flagged with triangle marker. 95% CI band shown.

ΔYt = α + β·ΔDatingRevt + εt  |  t=2011,...,2024 (T=14)  |  HC1 SEs (White 1980, n/(n-2) correction)
T3.1 · First-Difference OLS (HC1, Preferred)
Preferred Specifications — Results with Corrected Standard Errors
HC1 SEs: Heteroskedasticity-consistent standard errors (White, 1980) with small-sample correction n/(n-2). The 2020 COVID outlier creates heteroskedastic residuals; see §11 for COVID-excluded robustness.
Multiple testing correction applied in §13. See §11 for COVID robustness (T=11 excluding 2020–2022). See §12 for placebo tests.
F3.1
ΔDatingRev → ΔTFR · 2020 flagged
F3.2
ΔDatingRev → ΔSyphilis · 2020 flagged
F3.3
ΔDatingRev → ΔGonorrhea · 2020 flagged
F3.4
Residuals — ΔTFR ~ ΔDatingRev · COVID outlier
Large negative residual in 2020: COVID-19 disruption. This outlier inflates HC1 SEs across all specifications due to heteroskedastic error variance. §11 tests robustness by excluding 2020–2022.
Section 4

Lagged Regression Models

Y_{t} = α + β·DatingRev_{t-k} + ε, k=0,1,2,3. Lagged associations are more consistent with causal mechanisms than contemporaneous correlations.
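The lag alignment in the specification above can be sketched as follows (a hedged illustration: `lagged_pairs`, `ols_slope`, and the demo series are hypothetical and not taken from the appendix). Each lag k pairs Y at time t with the regressor at time t-k, so the usable sample shrinks from T to T-k.

```python
def lagged_pairs(x, y, k):
    """Align y[t] with x[t-k]; the first k observations of y are dropped."""
    if k < 0 or k >= len(x):
        raise ValueError("lag k must satisfy 0 <= k < len(x)")
    return list(x[: len(x) - k]), list(y[k:])

def ols_slope(x, y):
    """Slope of a bivariate OLS regression of y on x with an intercept."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxx = sum((a - mx) ** 2 for a in x)
    return sum((a - mx) * (b - my) for a, b in zip(x, y)) / sxx

# Synthetic demo: y responds to x with a two-period delay (illustration only).
x_demo = [float(v) for v in [1, 4, 2, 6, 3, 7, 5, 9, 6, 10, 8, 12, 9, 13, 11]]
y_demo = [0.0, 0.0] + [3.0 * v for v in x_demo[:-2]]

slopes = {k: ols_slope(*lagged_pairs(x_demo, y_demo, k)) for k in range(4)}
for k, b in slopes.items():
    print(f"k={k}: n={len(x_demo) - k}, slope={b:.3f}")
```

In this constructed example the k=2 slope recovers the built-in delay exactly; with real data the pattern across lags is only suggestive, as the wide intervals in F4.1 indicate.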

T4.1 · Lagged OLS — TFR, Marriage, Syphilis
F4.1 · Coeff by Lag — TFR
Error bars = ±1.96×HC1 SE. Wide intervals reflect T=15 minus k observations. Sign consistently negative across lags.
F4.2 · Coeff by Lag — Syphilis
Contemporaneous effect strongest for Syphilis, consistent with rapid epidemic transmission dynamics. Contrast with TFR where effect persists across lags.
Section 5

Granger Causality Tests

F-test: does past DatingRev predict Y beyond past Y? Effective T=11 after 1 lag. Severely underpowered — results are directional signals only. F-statistic now uses HC1-consistent denominator where feasible.

Power caveat: F(1,9) critical at p=0.10 ≈ 3.36. With T_eff=11, these tests have <40% power to detect moderate effects. Non-significance implies insufficient sample, not absence of Granger causation. For validation, see R code: lmtest::grangertest().
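A minimal sketch of a one-lag Granger F-statistic is given below, under stated assumptions: it uses the classical (homoskedastic) F rather than an HC1-consistent variant, the function names are hypothetical, and the data are synthetic. The unrestricted fit is obtained with the Frisch-Waugh-Lovell partialling-out step so that only bivariate regressions are needed.

```python
def _residuals(x, y):
    """Residuals from a bivariate OLS regression of y on x with an intercept."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxx = sum((a - mx) ** 2 for a in x)
    b = sum((a - mx) * (c - my) for a, c in zip(x, y)) / sxx
    a0 = my - b * mx
    return [c - a0 - b * a for a, c in zip(x, y)]

def granger_f_one_lag(x, y):
    """Classical F for H0: x[t-1] adds nothing to y[t] beyond y[t-1].

    Restricted:   y[t] = a + b*y[t-1] + e
    Unrestricted: y[t] = a + b*y[t-1] + c*x[t-1] + e
    """
    y_now, y_lag, x_lag = y[1:], y[:-1], x[:-1]
    n = len(y_now)
    ry = _residuals(y_lag, y_now)   # restricted-model residuals
    rx = _residuals(y_lag, x_lag)   # x[t-1] with y[t-1] partialled out
    rss_r = sum(e * e for e in ry)
    # Frisch-Waugh-Lovell: regressing ry on rx reproduces the unrestricted RSS.
    gamma = sum(a * b for a, b in zip(rx, ry)) / sum(a * a for a in rx)
    rss_u = sum((b - gamma * a) ** 2 for a, b in zip(rx, ry))
    df_resid = n - 3                # intercept, y[t-1], x[t-1]
    return (rss_r - rss_u) / (rss_u / df_resid), df_resid

# Synthetic demo in which x leads y by one period (illustration only).
x_demo = [1, 3, 2, 5, 4, 6, 3, 7, 5, 8, 6, 9, 7, 10, 8, 11]
noise = [0.0, 0.1, -0.1, 0.05, -0.05, 0.1, -0.1, 0.0,
         0.05, -0.05, 0.1, -0.1, 0.0, 0.05, -0.05, 0.1]
y_demo = [1.0]
for t in range(1, len(x_demo)):
    y_demo.append(0.5 * y_demo[t - 1] + 2.0 * x_demo[t - 1] + noise[t])

f_forward, df_resid = granger_f_one_lag(x_demo, y_demo)
print(f"F(1,{df_resid}) = {f_forward:.1f}")
```

Swapping the arguments, `granger_f_one_lag(y_demo, x_demo)`, gives the reverse-direction statistic used for the asymmetry comparison.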
T5.1 · Granger F-Tests
F5.1 · Forward vs. Reverse F-statistics
Asymmetry finding: The forward direction (DatingRev → Y) produces larger F-statistics than the reverse (Y → DatingRev) for STI outcomes, consistent with — though not proving — a directional mechanism. The asymmetry is most pronounced for Syphilis.
Section 6

Stationarity & Unit Root Assessment

Bug fixed: ADF τ-statistic no longer compared to Student-t distribution (which is incorrect). Now compared to MacKinnon (1996) finite-sample critical values: CV(T) = β∞ + β₁/T + β₂/T² where T = observations in the ADF regression. No p-value is reported — only threshold classification.

MacKinnon (1996) formula, ADF with constant: CV(T) = β∞ + β₁/T + β₂/T², with β∞ = [−3.4335, −2.8621, −2.5668] for [1%, 5%, 10%]; β₁ = [−5.999, −2.738, −1.438]; β₂ = [−29.25, −8.36, −4.48]. For our T=14 (levels): 5% CV ≈ −3.10. For T=13 (FD): 5% CV ≈ −3.12. These are more negative than the asymptotic value (−2.86), making rejection harder in small samples, which is the correct direction for conservatism.
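The response-surface arithmetic can be checked with a short sketch that uses only the coefficients quoted above (the names `MACKINNON_COEFS` and `adf_critical_value` are hypothetical).

```python
# Coefficients as quoted in the text for the ADF regression with a constant.
MACKINNON_COEFS = {
    "1%": (-3.4335, -5.999, -29.25),
    "5%": (-2.8621, -2.738, -8.36),
    "10%": (-2.5668, -1.438, -4.48),
}

def adf_critical_value(n_obs, level="5%"):
    """Finite-sample critical value CV(T) = b_inf + b1/T + b2/T^2."""
    b_inf, b1, b2 = MACKINNON_COEFS[level]
    return b_inf + b1 / n_obs + b2 / n_obs ** 2

for n_obs in (14, 13):
    print(n_obs, round(adf_critical_value(n_obs, "5%"), 2))
```

For T=14 the 5% value works out to −2.8621 − 0.1956 − 0.0427 ≈ −3.10, and for T=13 to about −3.12, in line with the thresholds stated above.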
v2 error impact: The v2 appendix reported p-values from tCDF(τ, df), which is incorrect for ADF. A τ of −2.5 gives p≈0.03 under t-distribution but τ < −3.10 is needed to reject at 5% under MacKinnon. Several "stationary" classifications in v2 should have been "not stationary" at 5%. Corrected results follow.
T6.1 · ADF Test Statistics — MacKinnon (1996) Critical Values
τ-statistic from OLS regression Δy = α + ρ·y_{t-1} + ε (OLS SEs, not HC1 — standard for ADF). MacKinnon (1996) CVs with finite-sample correction. No p-value reported (ADF τ has non-standard distribution). ✓ = reject H₀ (unit root) at threshold; ✗ = fail to reject.
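A minimal sketch of the τ-statistic from the regression described above, with classical OLS standard errors and no lagged-difference terms, might look like this (the function name `adf_tau` and both demo series are assumptions for illustration).

```python
import math

def adf_tau(y):
    """tau = rho_hat / SE(rho_hat) from dy[t] = alpha + rho*y[t-1] + e.

    Uses classical OLS standard errors. Returns (tau, n_obs), where n_obs is
    the number of observations in the ADF regression (len(y) - 1).
    """
    dy = [b - a for a, b in zip(y, y[1:])]
    y_lag = y[:-1]
    n = len(dy)
    mx, my = sum(y_lag) / n, sum(dy) / n
    sxx = sum((a - mx) ** 2 for a in y_lag)
    rho = sum((a - mx) * (d - my) for a, d in zip(y_lag, dy)) / sxx
    alpha = my - rho * mx
    rss = sum((d - alpha - rho * a) ** 2 for a, d in zip(y_lag, dy))
    se_rho = math.sqrt(rss / (n - 2) / sxx)
    return rho / se_rho, n

# Synthetic series (illustration only): one trending, one mean-reverting.
trending = [0, 1, 3, 4, 6, 9, 10, 12, 15, 16, 18, 21, 22, 24, 27]
reverting = [1.0, -1.0, 1.2, -0.8, 0.9, -1.1, 1.0, -0.9,
             1.1, -1.0, 0.8, -1.2, 1.0, -1.0, 1.1]

tau_trend, n_trend = adf_tau(trending)
tau_rev, n_rev = adf_tau(reverting)
cv_5pct = -3.10  # 5% threshold for T=14 as stated in the text
print(f"trending: tau={tau_trend:.2f}, reject={tau_trend < cv_5pct}")
print(f"reverting: tau={tau_rev:.2f}, reject={tau_rev < cv_5pct}")
```

The classification is a threshold comparison only: τ below the critical value rejects the unit-root null, and no p-value is derived from the t-distribution.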
F6.1 · ADF τ vs. MacKinnon CVs
Horizontal dashed lines = MacKinnon CVs for 5% (stronger) and 10% (weaker). Bars extending below the dashed line = stationary at that level. Most levels series fail to cross 5% threshold; first differences achieve stationarity.
F6.2 · DatingRev — Levels vs. First Differences
DatingRev in levels shows clear trend (non-stationary). ΔDatingRev oscillates around zero (stationary). Same visual pattern as v2 but classification now uses correct MacKinnon thresholds.
Section 7

Multiple Regression — Controlling for Smartphones

Bug fixed: ols2() now computes full HC1 sandwich SEs via: V_HC1 = (n/(n-2)) × (X'X)⁻¹ × [Σei²xixi'] × (X'X)⁻¹. §7 now reports t-statistics and p-values for all models. Multicollinearity warning: r(DatingRev, Smartphones) ≈ 0.96 — coefficient split in Model B is unstable.

Model A: Y = α + β₁·DatingRev + ε  |  Model B: Y = α + β₁·DatingRev + β₂·Smartphones + ε  |  Model C (Δ): ΔY = α + β₁·ΔDatingRev + β₂·ΔSmartphones + ε
T7.1 · Multiple Regression — TFR & Marriage
Key result: In Model C (first differences + smartphone control), β(DatingRev) retains negative sign for TFR and Marriage, though significance is fragile given multicollinearity with ΔSmartphones. Model C is the preferred specification. The coefficient split in Model B is unreliable (severe VIF due to r=0.96) — use partial correlations (§8) as the more interpretable test of dating-specificity.
F7.1 · β(DatingRev) Stability — TFR Outcome
Point estimates ± 1.96×HC1 SE. Wide CI in Model B reflects multicollinearity. Consistent negative sign across all three models is the relevant finding, not the point estimate magnitude.
Section 8

Partial Correlations Controlling for Smartphones

ρ(DatingRev, Y | Smartphones): the residual correlation between dating app revenue and outcomes after removing the general digitisation component. Primary test of dating-specificity.
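For a single control variable, the partial correlation follows from the three pairwise correlations. The sketch below shows that calculation; the function name is hypothetical, and apart from r(DatingRev, Smartphones) ≈ 0.96, which is quoted in §7, the input values are invented for illustration.

```python
import math

def partial_r(r_xy, r_xz, r_yz):
    """First-order partial correlation of x and y controlling for z.

    r_xy.z = (r_xy - r_xz * r_yz) / sqrt((1 - r_xz^2) * (1 - r_yz^2))
    """
    return (r_xy - r_xz * r_yz) / math.sqrt((1 - r_xz ** 2) * (1 - r_yz ** 2))

# r_xz = 0.96 comes from the text; r_xy and r_yz are hypothetical inputs.
r_partial_demo = partial_r(r_xy=0.90, r_xz=0.96, r_yz=0.90)
print(f"zero-order r = 0.90, partial r = {r_partial_demo:.3f}")
```

Because the denominator contains 1 − r_xz², a control correlated at 0.96 with the regressor leaves little independent variation, so small changes in the pairwise correlations move the partial r substantially.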

T8.1 · Partial Correlations
Dating-specific signal: After controlling for smartphone penetration, DatingRev retains positive residual correlation with Syphilis (~+0.43) and Gonorrhea (~+0.38). TFR and Marriage show negative residual correlations (~−0.28 and ~−0.24 respectively). Depression and SSRI residual correlations are weak (<0.20), consistent with their being shared-trend artifacts.
F8.1 · Zero-order vs. Partial r — DatingRev
Blue = zero-order r(DatingRev, Y). Green = partial r after controlling Smartphones. The gap (blue minus green) = portion explained by general digitisation. Residual green = dating-specific signal.
Section 9

Cross-Country Analysis & Event Study

N=9 countries, 2023. Descriptive context only — N too small for reliable inference. Hudson & Moscoso-Boedo (2026) event study replication: SSRN 6676839, University of Cincinnati, April 2026.

F9.1 · SP% vs. TFR (N=9, 2023)
T9.1 · Cross-Country OLS
N=9 caveat: Sufficient for descriptive pattern only. Severe collinearity with income, urbanisation, education. Do not cite as causal evidence.
F9.2 · Event Study — Smartphone Take-off × TFR Inflection
Visual replication: Hudson & Moscoso-Boedo (SSRN 6676839, 2026) and FT/Burn-Murdoch (May 2026). Five country groups aligned by smartphone mass-adoption year. All show downward TFR inflection 1–3 years post-adoption. Formal statistical test requires sub-national data.
Section 10

Evidence Strength Summary Map

T10.1 · Consolidated Evidence Matrix (v3)
F10.1 · Evidence Radar
Hierarchy (unchanged from v2): STI channel most consistent → Marriage/TFR sign-consistent but not significant → Depression/SSRI no signal after detrending. Bug fixes (HC1, ADF) moderately reduce STI significance but do not change the ordering. Multiple testing results (§13) are the critical update: even the STI channel does not survive Bonferroni correction for 8 simultaneous tests.
Section 11

COVID Robustness — Excluding 2020–2022

2020–2022 are structurally anomalous: STI surveillance disrupted, marriage rates collapsed (court closures), dating-app revenue had atypical patterns (COVID boom then correction). If the key result (ΔDatingRev→ΔSyphilis) is driven by these years, it may not be a secular signal. This section re-estimates all first-difference specifications excluding 2020, 2021, and 2022. T_eff = 11 (2011–2019 + 2023–2024).
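The sample restriction can be sketched as a simple year filter on the first-difference observations (a hedged illustration; `drop_years` and the placeholder values are hypothetical, not the appendix's data).

```python
def drop_years(years, values, excluded):
    """Remove observations whose year label is in `excluded`."""
    kept = [(yr, v) for yr, v in zip(years, values) if yr not in excluded]
    return [yr for yr, _ in kept], [v for _, v in kept]

# First-difference sample: one observation per year, 2011..2024 (T=14).
fd_years = list(range(2011, 2025))
fd_values = [0.0] * len(fd_years)  # placeholder values, illustration only

kept_years, kept_values = drop_years(fd_years, fd_values, {2020, 2021, 2022})
print(len(kept_years), kept_years)
```

This leaves 9 observations for 2011–2019 plus 2 for 2023–2024, giving T_eff = 11. One caveat of filtering after differencing: the 2023 observation is still computed from the 2022 level.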

Why this test matters most: The plan identifies this as the most critical available robustness check. If the STI result survives exclusion of 2020–2022, it is genuinely secular, driven by the 2013–2019 expansion of dating apps rather than the pandemic disruption. If it disappears, the paper needs to be honest about that finding. This is a falsification test, not a sensitivity analysis.
T11.1 · First-Difference OLS — Excluding 2020–2022 (T=11)
Robustness to COVID Exclusion — Full vs. Non-COVID Sample
T=11 after excluding 2020, 2021, 2022 from first-difference sample. HC1 SEs. Comparison: Full sample (T=14) vs. Non-COVID sample (T=11). Critical question: does sign and magnitude survive?
F11.1 · ΔDatingRev → ΔSyphilis — Full vs. Non-COVID
F11.2 · ΔDatingRev → ΔTFR — Full vs. Non-COVID
F11.3 · β Comparison — All Outcomes, Full vs. Non-COVID Sample
Section 12

Placebo Tests — Falsification

If ΔDatingRev is genuinely associated with STI transmission through the proposed mechanism (expanded sexual networks), it should NOT be associated with outcomes that have no theoretical connection to dating markets. We test three placebo outcomes: WTI crude oil price (EIA), US new auto sales (Ward's/FRED), and US full-service restaurant revenue index (Census/BEA). If these show similar significance to the STI results, the model is producing spurious associations, not mechanism-specific signals.

Interpretation key. Desired result: placebos non-significant (p > 0.20) while STI outcomes remain significant; this pattern would support the claim of a dating-specific mechanism. Concerning result: placebos significant at similar levels to STI outcomes, which would suggest the model captures a general time-series relationship, not a mechanism-specific one.
T12.1 · Placebo Tests — ΔDatingRev → Δ(placebo outcome)
Data sources: WTI oil price: EIA monthly averages, annual mean ($/barrel). Auto sales: Ward's Automotive Research/FRED (millions of units). Restaurant revenue: US Census Bureau/BEA full-service restaurant industry revenue index (2010=100). All series 2010–2024.
F12.1 · ΔDatingRev → ΔOil Price
F12.2 · ΔDatingRev → ΔAuto Sales
F12.3 · ΔDatingRev → ΔRestaurant Rev
F12.4 · β comparison: STI outcomes vs. Placebos
Standardised β coefficients (per 1 SD of each outcome) allowing comparison across scales. Green = STI outcomes (theory-consistent); grey = placebos (should be near zero if model is mechanism-specific).
Section 13

Multiple Testing Correction — Bonferroni & BH-FDR

Testing 8 outcomes simultaneously inflates the probability of at least one false positive. With α=0.10 and m=8 independent tests, the probability of at least one false positive by chance is 1−(0.90)⁸ ≈ 57%. We report all 8 first-difference p-values with Bonferroni and Benjamini-Hochberg (1995) corrections. This is the most honest section in the appendix.

Bonferroni αadj = 0.10/8 = 0.0125  |  BH-FDR: reject H(k) if p(k) ≤ (k/m)×q, where q=0.10 and m=8
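Both decision rules above, together with the family-wise error arithmetic, can be sketched in a few lines. The p-values in the demo are hypothetical placeholders, not the results in T13.1, and the function names are assumptions.

```python
def bonferroni(pvals, alpha=0.10):
    """Reject H(i) if p(i) <= alpha / m."""
    m = len(pvals)
    return [p <= alpha / m for p in pvals]

def benjamini_hochberg(pvals, q=0.10):
    """BH step-up: find the largest k with p(k) <= (k/m)*q, reject ranks 1..k."""
    m = len(pvals)
    order = sorted(range(m), key=lambda i: pvals[i])
    k_max = 0
    for rank, i in enumerate(order, start=1):
        if pvals[i] <= rank / m * q:
            k_max = rank
    reject = [False] * m
    for rank, i in enumerate(order, start=1):
        if rank <= k_max:
            reject[i] = True
    return reject

# Chance of at least one false positive across 8 independent tests at alpha=0.10.
fwer = 1 - (1 - 0.10) ** 8

# HYPOTHETICAL p-values for illustration only (not the appendix's estimates).
p_demo = [0.004, 0.020, 0.030, 0.08, 0.15, 0.22, 0.40, 0.75]
bonf_demo = bonferroni(p_demo)
bh_demo = benjamini_hochberg(p_demo)
print(f"FWER = {fwer:.3f}; Bonferroni passes = {sum(bonf_demo)}; BH passes = {sum(bh_demo)}")
```

The step-up rule rejects every hypothesis ranked at or below the largest passing rank, which is why BH-FDR can admit results that Bonferroni excludes.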
What this section may show. Expectation: after Bonferroni correction for 8 tests, it is likely that no individual result survives at α=0.10, because the Bonferroni threshold is α/m = 0.0125 and most FD p-values are in the 0.05–0.20 range. BH-FDR at q=0.10 is less conservative and may allow 1–2 results through. This is the correct finding if true: not a failure of the paper, but an honest statement that an aggregate time series with T=14 and 8 outcomes does not have sufficient power for robust multiple inference. The theoretical framework and proposed identification strategy (county-level DiD) remain the central contribution.
T13.1 · Multiple Testing Results — All 8 FD Outcomes
F13.1 · p-value Plot with Thresholds
Sorted raw p-values (dots) with Bonferroni threshold (red dashed) and BH-FDR thresholds (orange stepped line). Points below the threshold line pass that correction level. The visual makes the multiple-testing situation immediately apparent.
What remains true after multiple testing correction: the ranking of evidence strength. Even if no individual result survives Bonferroni correction, the relative ordering (STI > Marriage > TFR > Depression/SSRI) is meaningful and consistent with theoretical predictions about mechanism proximity. The multiple testing correction tells us we cannot claim statistical significance at conventional thresholds for any single outcome from this T=14 aggregate dataset; it does not tell us the associations are absent or zero. It tells us we need the county-level event study (§5 of the working paper) to achieve adequate identification and power.
Section 15 · State-Level Panel · Real Google Trends + CDC Data

State GT Analysis — Real Data, Honest Results

Actual Google Trends "Tinder" export by US state (2012–2023) × CDC P&S Syphilis Surveillance Reports (2008–2023). N=51 states, 550 FD observations. Primary regressor: GT Tinder index (0–100 relative). Outcome: log syphilis rate per 100k. HC1 SEs throughout.

Central finding: GT is an inverse urbanisation proxy. Cross-section r = −0.734*** (N=51, 2023). States with highest GT Tinder interest (ND=100, ME=98, WY=92) have the lowest syphilis rates. States with lowest GT (TX=48, GA=52, DC=49) have the highest. Google Trends normalises within each state, so rural states with fewer competing searches produce higher relative Tinder indices without higher absolute adoption. The GT index is structurally invalid as a cross-state treatment variable for outcomes correlated with urbanisation.
F15.1 · Cross-Section: GT Tinder Index vs. P&S Syphilis Rate — 2023 (N=51)
Google Trends "Tinder" Index vs. Syphilis — r = −0.734*** · Negative slope reveals the confound
The scatter tells the whole story. Every state with GT index above 80 (ND, ME, WY, AK, SD, VT, MT) has syphilis below 22/100k. Every state with syphilis above 60/100k (TX, GA, DC, LA, SC, NV, NY, NC, FL) has GT index below 70. This is the urbanisation confound visualised in actual data: rural states dominate the relative GT index; urban/Southern states dominate the STI burden.
Source: Google Trends "Tinder" by US subregion, 2023 (actual export). CDC STD Surveillance Report 2023, P&S Syphilis by State. OLS line: β=−1.33 (SE=0.14, t=−9.69***). R²=0.538. HC1 SE.
F15.2 · GT Tinder Trend — Rural vs Urban States 2012–2023
GT Index Over Time — Rural states persistently lead the index
Rural states (MT, ME) consistently score higher on GT Tinder index than urban states (TX, CA, GA) across all 12 years — confirming the pattern is structural, not year-specific. Urban states peaked in 2014–2016 and declined as users fragmented to Hinge/Bumble/Grindr.
F15.3 · GT Top vs Bottom — State Rankings 2023
GT "Tinder" 2023 — Top 10 (rural) vs Bottom 10 (urban/Southern)
Red = top 10 GT states (all rural/low-density). Blue = bottom 10 (urban/Southern). The divide maps almost perfectly onto the USDA Economic Research Service Rural-Urban Continuum Codes.
T15.1 · FD-OLS: Δlog(Syphilis) ~ ΔTinder_GT — Real Data, Four Specifications (HC1)
Sign reversal confirms the confound. Pre-peak FD (2012–2018): β=−0.00064***; during diffusion, states gaining GT interest were lower-syphilis rural states. Ex-COVID FD: β≈0 (n.s.); the FD signal collapses when the pandemic years are removed. Only the post-peak period (2019–2023) shows positive β, likely reflecting GT becoming more uniform across states rather than a causal signal. The full-sample positive β is driven by COVID-period co-movements.
F15.4 · Year-by-Year FD Correlation r(ΔGT, Δlog_syph) — 2013–2023
Annual r between ΔTinder GT and Δlog(Syphilis) — No stable positive direction
No consistent causal signal. The correlation alternates between positive and negative with no monotone pattern. 2015 (r=+0.37) and 2023 (r=+0.40) are the strongest positive years; 2013 (r=−0.08) and 2022 (r=−0.22) are negative. This volatility is inconsistent with a stable causal mechanism — further evidence that GT Tinder is not a valid FD treatment variable.

What Does Work — Pure CDC Data

Robust finding (no GT proxy required): Within-state syphilis growth accelerated by +7.6pp/yr*** (t=15.2) post-2013. The estimate compares each state's own annual growth rate before (2009–2012) and after (2013–2019) the Tinder window, pooled across 51 states. It uses no cross-state comparison, no treatment proxy, and no parallel trends assumption: each state is its own control. The acceleration is universal across all state groups and is inconsistent with a pure secular-trend explanation, since it represents a break from the pre-existing trend.
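The within-state before-after comparison can be sketched for a single state as follows. This is a hedged illustration: the function names and the synthetic rate series are assumptions, and the pooled t-test across 51 states is not reproduced here.

```python
import math

def mean_log_growth(rates_by_year, start, end):
    """Mean annual dlog(rate)*100 over years start..end inclusive.

    Each year's growth uses the previous year's rate as its base.
    """
    growth = [
        100 * (math.log(rates_by_year[yr]) - math.log(rates_by_year[yr - 1]))
        for yr in range(start, end + 1)
    ]
    return sum(growth) / len(growth)

def acceleration(rates_by_year, pre=(2009, 2012), post=(2013, 2019)):
    """Post-period mean growth minus pre-period mean growth, in pp per year."""
    return mean_log_growth(rates_by_year, *post) - mean_log_growth(rates_by_year, *pre)

# Synthetic single-state series (illustration only): 2%/yr log growth through
# 2012, then 10%/yr from 2013 onward.
demo_rates = {2008: 5.0}
for yr in range(2009, 2020):
    step = 0.02 if yr <= 2012 else 0.10
    demo_rates[yr] = demo_rates[yr - 1] * math.exp(step)

demo_accel = acceleration(demo_rates)
print(f"acceleration = {demo_accel:.1f} pp/yr")
```

Applying `acceleration` to each state's own series and then testing whether the mean across states differs from zero would mirror the pooled comparison described above.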
F15.5 · P&S Syphilis — All 51 States 2008–2023 · Universal Post-2012 Acceleration (Real CDC Data)
Every state shows a growth inflection at 2012–2013 — the Tinder launch window
The most important visual in this dataset. All 51 lines show a visible acceleration in slope after 2012–2013 regardless of baseline level, geographic region, or GT Tinder index. This universal pattern is the within-state before-after evidence in visual form: it does not depend on any treatment proxy and cannot be explained by the urbanisation confound.
Source: CDC STD Surveillance Reports 2008–2023. Vertical dashed = 2012 (Tinder launch). Highlighted: DC (purple, highest), LA/TX/GA (orange/amber), MT/ND/VT (grey, lowest). All other states = faint white lines.
T15.2 · Within-State Before-After — Syphilis Growth Rate by Period (Real CDC Data Only)
F15.6 · Growth Rate Before vs After Tinder by Group
Mean annual Δlog(syphilis)×100 = %/yr. Green = post-2013; grey = pre-2013. The acceleration (green taller) is consistent across all groups.
F15.7 · β Stability — GT FD Across Four Specifications
β(ΔTinder×10) — Sign reversal pre vs post peak
Pre-peak coefficient is significantly negative (confirms confound). Ex-COVID collapses to zero. Only post-peak is positive. This instability rules out the GT FD as a reliable causal estimate.
§15 Summary — What Real GT Data Establishes

Correct treatment variable for future work: Absolute app adoption data — Sensor Tower DMA-level downloads, Match Group MAU by geography, or a composite GT index (Tinder + Hinge + Bumble + Grindr) — would avoid the relative-index normalisation problem and enable valid cross-state FD identification.