Results
Multi-Variable Validation: 14-Year Forecasting Horizon
We tested value forecasting across 16 GSS variables using gpt-3.5-turbo-instruct (September 2021 training cutoff) to predict 2024 values from a 2010 baseline, a 14-year forecasting horizon.
Table 1: Multi-Variable Forecasting Results (2010→2024, all 16 variables tested)
| Variable | 2010 | 2024 Actual | Predicted | Error | Dir |
|---|---|---|---|---|---|
| PRAYER | 44% | 46% | 46% | 0 | ✓ |
| FEPOL | 79% | 82% | 81% | -1 | ✓ |
| NATEDUC | 72% | 76% | 75% | -1 | ✓ |
| GUNLAW | 74% | 70% | 72% | +2 | ✓ |
| POLVIEWS | 29% | 29% | 31% | +2 | ✗ |
| NATENVIR | 57% | 66% | 60% | -6 | ✓ |
| NATHEAL | 60% | 74% | 80% | +6 | ✓ |
| TRUST | 33% | 25% | 31% | +6 | ✓ |
| FAIR | 38% | 46% | 40% | -6 | ✓ |
| GRASS | 48% | 68% | 60% | -8 | ✓ |
| PREMARSX | 53% | 65% | 56% | -9 | ✓ |
| EQWLTH | 42% | 54% | 45% | -9 | ✓ |
| HELPPOOR | 28% | 39% | 30% | -9 | ✓ |
| CAPPUN | 32% | 40% | 30% | -10 | ✗ |
| ABANY | 44% | 60% | 46% | -14 | ✓ |
| NATRACE | 34% | 51% | 37% | -14 | ✓ |
Key findings across 16 variables:
MAE: 6.4 percentage points vs 9.2 for naive baseline
Improvement: 1.44× over simply predicting the last observed value
Direction correct: 88% (14/16 variables; the misses are POLVIEWS and CAPPUN in Table 1)
Bias: -4.4 points (mean signed error), meaning the model tended to under-predict the magnitude of change
The model correctly captured the direction of change in nearly all cases, including the decline in social trust (TRUST: 33%→25%) and the modest decline in gun permit support (GUNLAW: 74%→70%). The largest errors occurred on the two fastest-moving variables, NATRACE and ABANY, where the model under-predicted the 2024 value by 14 points each.
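The summary statistics can be recomputed directly from the rows of Table 1. The following is a minimal sketch, not the authors' evaluation code: the function and variable names are illustrative, the naive baseline is taken to be "predict the 2010 value", and direction is scored with a strict sign-match rule (an assumption under which a flat actual, such as POLVIEWS, counts as a miss). That rule reproduces the ✓/✗ column of Table 1.

```python
# Values transcribed from Table 1: (2010 baseline, 2024 actual, prediction), in percent.
TABLE_1 = {
    "PRAYER": (44, 46, 46), "FEPOL": (79, 82, 81), "NATEDUC": (72, 76, 75),
    "GUNLAW": (74, 70, 72), "POLVIEWS": (29, 29, 31), "NATENVIR": (57, 66, 60),
    "NATHEAL": (60, 74, 80), "TRUST": (33, 25, 31), "FAIR": (38, 46, 40),
    "GRASS": (48, 68, 60), "PREMARSX": (53, 65, 56), "EQWLTH": (42, 54, 45),
    "HELPPOOR": (28, 39, 30), "CAPPUN": (32, 40, 30), "ABANY": (44, 60, 46),
    "NATRACE": (34, 51, 37),
}


def sign(x):
    """Return -1, 0, or +1 according to the sign of x."""
    return (x > 0) - (x < 0)


def forecast_metrics(rows):
    """Compute MAE, naive-baseline MAE, bias, and direction hits."""
    n = len(rows)
    # Signed error of the model: prediction minus actual.
    errors = [pred - actual for _, actual, pred in rows.values()]
    # Naive baseline (assumption): carry the 2010 value forward unchanged.
    naive_errors = [base - actual for base, actual, _ in rows.values()]
    mae = sum(abs(e) for e in errors) / n
    naive_mae = sum(abs(e) for e in naive_errors) / n
    # Strict sign-match rule (assumption): predicted and actual change
    # relative to the baseline must have the same sign.
    hits = sum(
        sign(pred - base) == sign(actual - base)
        for base, actual, pred in rows.values()
    )
    return {
        "mae": mae,
        "naive_mae": naive_mae,
        "improvement": naive_mae / mae,
        "bias": sum(errors) / n,
        "direction_hits": hits,
        "n": n,
    }


metrics = forecast_metrics(TABLE_1)
print(metrics)
```

Because the inputs are the rounded percentages shown in the table, results computed from unrounded GSS estimates could differ in the last digit.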
Clean Test: GPT-4o Predicting GSS 2024
For a methodologically rigorous test, we used GPT-4o (training cutoff October 2023) to predict GSS 2024 data (collected April-December 2024). This ensures the model could not have seen the target values.
Table 2: GPT-4o Predictions vs. GSS 2024 Actual
| Variable | Prediction | 90% CI | Actual | Error (pp) |
|---|---|---|---|---|
| HOMOSEX | 69% | [66, 72] | 54.7% | +14.3 |
| GRASS | 73% | [70, 76] | 68.5% | +4.5 |
The model missed a major reversal. HOMOSEX (acceptance of same-sex relationships) had increased steadily for decades: 13% (1990) → 27% (2000) → 42% (2010) → 62% (2021). GPT-4o extrapolated this trend, predicting 69% for 2024.
Instead, the actual value was 54.7%—a 7 percentage point drop from 2021 and the first reversal in over 30 years.
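The size of each miss relative to the stated interval can be checked with a few lines of arithmetic. This is a small illustrative sketch using only the numbers in Table 2; the helper name `score` and the "distance outside the interval" measure are assumptions introduced here, not part of the original analysis.

```python
# Values transcribed from Table 2: (prediction, CI lower, CI upper, actual), in percent.
TABLE_2 = {
    "HOMOSEX": (69.0, 66.0, 72.0, 54.7),
    "GRASS": (73.0, 70.0, 76.0, 68.5),
}


def score(pred, lo, hi, actual):
    """Return the signed error and how far the actual value lies outside [lo, hi]."""
    # Distance is 0 when the actual value falls inside the interval.
    outside = max(lo - actual, actual - hi, 0.0)
    return {
        "error": round(pred - actual, 1),
        "inside_ci": lo <= actual <= hi,
        "distance_outside": round(outside, 1),
    }


results = {name: score(*row) for name, row in TABLE_2.items()}
for name, res in results.items():
    print(name, res)
```

Both actual values fall below the lower bound of their 90% interval, by 11.3 points for HOMOSEX and 1.5 points for GRASS.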
Multi-Variable Analysis
The reversal was not isolated. We analyzed six GSS variables:
Table 3: GSS 2024 Results Across Variables
| Variable | 2018 | 2021 | 2022 | 2024 | Pattern |
|---|---|---|---|---|---|
| HOMOSEX | 57% | 62% | 61% | 55% | ↓ Reversal |
| PREMARSX | 62% | 66% | 69% | 65% | ↓ Peaked |
| NATRACE | 56% | 52% | 56% | 51% | ↓ Declining |
| ABANY | 50% | 56% | 59% | 60% | ↑ Rising |
| GUNLAW | 72% | 67% | 71% | 70% | → Stable |
| CAPPUN | 37% | 44% | 40% | 40% | → Stable |
Values did not move in lockstep. While ABANY (abortion) continued rising post-Dobbs, HOMOSEX and NATRACE (spending on racial issues) reversed. This divergence would be missed by any model assuming “liberalization” as a general pattern.
Demographic Decomposition
We analyzed HOMOSEX by party identification:
Table 4: HOMOSEX by Party (% “Not Wrong at All”)
| Year | Democrat | Independent | Republican |
|---|---|---|---|
| 2018 | 62% | 63% | 45% |
| 2021 | 76% | 59% | 43% |
| 2024 | 71% | 57% | 36% |
| Change 2021→24 | -5 | -2 | -7 |
The reversal occurred across all party groups but was largest among Republicans (-7 points). This is consistent with backlash dynamics triggered by political mobilization.
By age group:
Table 5: HOMOSEX by Age (% “Not Wrong at All”)
| Year | 18-29 | 30-44 | 45-64 | 65+ |
|---|---|---|---|---|
| 2021 | 79% | 68% | 61% | 53% |
| 2024 | 69% | 61% | 51% | 45% |
| Change | -10 | -7 | -10 | -8 |
The largest drops were among the youngest (18-29) and middle-aged (45-64) groups. This contradicts simple generational replacement models where younger cohorts drive liberalization.
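The subgroup changes in Tables 4 and 5 are simple differences between the 2021 and 2024 rows. A short sketch of that calculation follows; the dictionary and function names are illustrative and not taken from the original analysis.

```python
# (2021, 2024) values transcribed from Tables 4 and 5, in percent.
HOMOSEX_BY_PARTY = {
    "Democrat": (76, 71),
    "Independent": (59, 57),
    "Republican": (43, 36),
}
HOMOSEX_BY_AGE = {
    "18-29": (79, 69),
    "30-44": (68, 61),
    "45-64": (61, 51),
    "65+": (53, 45),
}


def changes(groups):
    """Percentage-point change from 2021 to 2024 for each group."""
    return {name: after - before for name, (before, after) in groups.items()}


def largest_drops(groups):
    """Return every group tied for the most negative change."""
    delta = changes(groups)
    worst = min(delta.values())
    return [name for name, d in delta.items() if d == worst]


print(changes(HOMOSEX_BY_PARTY), largest_drops(HOMOSEX_BY_PARTY))
print(changes(HOMOSEX_BY_AGE), largest_drops(HOMOSEX_BY_AGE))
```

Returning all tied groups matters for the age table, where the 18-29 and 45-64 groups both fall by 10 points.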
Long-Term Forecasts with Calibrated Uncertainty
We generated long-term forecasts using quantile elicitation and EMOS-style (ensemble model output statistics) calibration (Gneiting et al., 2005). For each variable, we elicited five quantiles (the 10th, 25th, 50th, 75th, and 90th percentiles) and calibrated the uncertainty by optimizing the continuous ranked probability score (CRPS) on the 2024 holdout data.
Calibration Results (17 variables, 2021→2024):
Optimal spread multiplier: 1.21 (CIs need 21% widening)
Raw 50% interval coverage: 47% (target: 50%)
Raw 80% interval coverage: 59% (target: 80%)
Mean CRPS: 3.15 points
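One way to obtain a spread multiplier of this kind is sketched below. It is an illustration under stated assumptions, not the procedure used to produce the numbers above: CRPS is approximated as twice the mean pinball loss over the five elicited quantile levels, the multiplier scales each quantile's distance from the median, and the optimum is found by grid search. The forecasts and outcomes in the example are hypothetical placeholders, not GSS data.

```python
LEVELS = (0.10, 0.25, 0.50, 0.75, 0.90)  # the five elicited quantile levels


def pinball(y, q, tau):
    """Pinball (quantile) loss of quantile forecast q at level tau for outcome y."""
    diff = y - q
    return tau * diff if diff >= 0 else (tau - 1.0) * diff


def quantile_crps(y, quantiles):
    """Approximate CRPS as twice the mean pinball loss over LEVELS (assumption)."""
    total = sum(pinball(y, q, tau) for q, tau in zip(quantiles, LEVELS))
    return 2.0 * total / len(LEVELS)


def widen(quantiles, s):
    """Scale each quantile's distance from the median by the multiplier s."""
    median = quantiles[2]
    return tuple(median + s * (q - median) for q in quantiles)


def mean_crps(forecasts, actuals, s=1.0):
    """Mean approximate CRPS after applying spread multiplier s."""
    scores = [quantile_crps(y, widen(q, s)) for q, y in zip(forecasts, actuals)]
    return sum(scores) / len(scores)


def coverage(forecasts, actuals, lo_idx, hi_idx):
    """Share of outcomes inside the interval between two quantile positions."""
    hits = sum(q[lo_idx] <= y <= q[hi_idx] for q, y in zip(forecasts, actuals))
    return hits / len(actuals)


def fit_spread(forecasts, actuals, grid=None):
    """Grid-search the multiplier that minimizes mean approximate CRPS."""
    grid = grid or [i / 100 for i in range(50, 301)]  # 0.50 to 3.00
    return min(grid, key=lambda s: mean_crps(forecasts, actuals, s))


# Hypothetical quantile forecasts and outcomes (NOT the paper's data).
forecasts = [(60, 63, 66, 69, 72), (30, 32, 34, 36, 38),
             (50, 52, 55, 58, 60), (20, 22, 25, 27, 30)]
actuals = [58.0, 35.0, 61.0, 24.0]

best = fit_spread(forecasts, actuals)
print("multiplier:", best)
print("raw 50% coverage:", coverage(forecasts, actuals, 1, 3))
print("raw 80% coverage:", coverage(forecasts, actuals, 0, 4))
print("CRPS raw vs calibrated:", mean_crps(forecasts, actuals),
      mean_crps(forecasts, actuals, best))
```

A multiplier above 1 widens every interval around an unchanged median, so this kind of calibration can repair overconfidence but cannot correct a median that is biased.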
Table 6: Calibrated Long-Term Forecasts (GPT-4o)
| Variable | 2024 Actual | 2030 | 2050 | 2100 |
|---|---|---|---|---|
| HOMOSEX | 55% | 66% [57,75] | 75% [64,86] | 80% [69,91] |
| GRASS | 68% | 72% [57,87] | 80% [57,103] | 80% [57,103] |
| PREMARSX | 65% | 70% [59,81] | 80% [69,91] | 80% [69,91] |
| ABANY | 60% | 60% [51,69] | 60% [42,78] | 60% [37,83] |
| CAPPUN | 40% | 42% [33,51] | 45% [34,56] | 55% [32,78] |
| TRUST | 25% | 28% [21,35] | 27% [18,36] | 27% [18,36] |
| POLVIEWS | 29% | 30% [25,35] | 30% [23,37] | 31% [24,38] |
Brackets show calibrated 80% confidence intervals.
Key observations:
HOMOSEX: Model predicts recovery to 66% by 2030, 80% by 2100. But 2024 actual (55%) is already below the 2030 lower bound (57%), suggesting the model underestimates reversal risk.
ABANY: Predicted stable at ~60%, unlike the continued rise the model predicted pre-2024.
TRUST: Predicted to stay low (28% in 2030, 27% in 2050 and 2100), slightly above the 2024 actual of 25%, so the forecast shows no recovery from the long-term erosion of social trust.
Uncertainty widens with horizon: 2100 intervals are wider than 2030 intervals for six of the seven variables (the PREMARSX interval stays 22 points wide).
These forecasts should be treated as registered predictions subject to future validation, not reliable projections. The 2024 calibration shows models are overconfident; longer horizons likely involve even greater uncertainty than shown.
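The interval widths behind the horizon comparison can be read off Table 6 programmatically. The sketch below uses only the bracketed bounds from the table; the variable names and the range check are illustrative additions, not part of the original analysis.

```python
# 80% interval bounds transcribed from Table 6, in percent.
TABLE_6 = {
    "HOMOSEX": {2030: (57, 75), 2050: (64, 86), 2100: (69, 91)},
    "GRASS": {2030: (57, 87), 2050: (57, 103), 2100: (57, 103)},
    "PREMARSX": {2030: (59, 81), 2050: (69, 91), 2100: (69, 91)},
    "ABANY": {2030: (51, 69), 2050: (42, 78), 2100: (37, 83)},
    "CAPPUN": {2030: (33, 51), 2050: (34, 56), 2100: (32, 78)},
    "TRUST": {2030: (21, 35), 2050: (18, 36), 2100: (18, 36)},
    "POLVIEWS": {2030: (25, 35), 2050: (23, 37), 2100: (24, 38)},
}

# Width of each interval, per variable and horizon.
widths = {
    var: {year: hi - lo for year, (lo, hi) in by_year.items()}
    for var, by_year in TABLE_6.items()
}

# Variables whose 2100 interval is strictly wider than the 2030 interval.
widened = [var for var, w in widths.items() if w[2100] > w[2030]]

# Bounds that fall outside the feasible 0-100% range for a percentage.
out_of_range = [
    (var, year)
    for var, by_year in TABLE_6.items()
    for year, (lo, hi) in by_year.items()
    if lo < 0 or hi > 100
]

print(widths)
print("wider at 2100:", widened)
print("outside 0-100%:", out_of_range)
```

The range check flags the GRASS upper bound of 103% at the 2050 and 2100 horizons, which shows that the scaled intervals are not truncated to the 0-100% range.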
Income-Values Relationship
We found a strong gradient between income and values in GSS 2024:
Table 7: HOMOSEX by Income Quartile (2024)
| Income Quartile | % Accept | Median Income |
|---|---|---|
| Q1 (lowest) | 43% | $7,700 |
| Q2 | 54% | $31,000 |
| Q3 | 61% | $56,000 |
| Q4 (highest) | 67% | $139,000 |
This 24-point gap suggests economic conditions may influence values. Under AI-driven growth scenarios, rising incomes could shift values—though the direction and causality are uncertain.