Related Work


Moral Change and Axiological Futurism

Philosophers have long studied how moral views evolve. Singer (1981) proposed that the “expanding circle” of moral concern—from kin to tribe to nation to humanity—represents a consistent pattern in moral progress. Pinker (2011) documented declining violence across history, attributing it partly to expanding empathy and reason.

More recently, Danaher (2021) introduced “axiological futurism” as a systematic field studying future human values. This work asks: What are the best predictors of value change? Can we identify patterns that inform expectations about future values? Our work operationalizes these questions empirically.

MacAskill (2014) developed decision theory for acting under “normative uncertainty”—uncertainty about which moral framework is correct. Value forecasting can be seen as quantifying this uncertainty: rather than philosophical argument about which values are correct, we estimate probability distributions over future values.

AI Alignment Approaches

Current alignment approaches typically assume known values. RLHF (Christiano et al., 2017; Ouyang et al., 2022) trains systems to match human feedback, implicitly treating current preferences as the target. Constitutional AI (Bai et al., 2022) specifies principles (be helpful, harmless, honest) that systems should follow.

Gabriel (2020) argues that the “central challenge for theorists is not to identify ‘true’ moral principles for AI; rather, it is to identify fair principles for alignment” given reasonable disagreement. Our approach sidesteps this by forecasting the distribution of values rather than selecting one framework.

Dafoe et al. (2021) highlight that alignment alone is insufficient—systems must also cooperate. Value forecasting could identify which value systems facilitate cooperation, informing both what to align toward and how to get there.

LLMs for Survey Prediction

Recent work demonstrates that LLMs can reproduce survey response patterns. Argyle et al. (2023) showed that LLMs fine-tuned on demographic information can simulate human subpopulations (“silicon samples”), reproducing voting patterns at 85%+ accuracy. This suggests LLMs learn genuine patterns about human attitudes, not just surface-level text statistics.

Hewitt et al. (2024) found GPT-4 can predict experimental treatment effects in social science studies with a correlation of r = 0.85 to actual results. This indicates LLMs have learned causal patterns in human behavior, not just correlations.

The SubPOP project fine-tuned LLMs on GSS data, achieving 69% accuracy on opinion prediction (SubPOP Team, 2025). However, this work focused on cross-sectional prediction (predicting opinions at a given time), not temporal forecasting (predicting how opinions will change).

Our work extends this line by testing whether LLMs can predict opinion trajectories—not just what people think now, but how that will change over years and decades.

AI Forecasting and Calibration

A growing literature evaluates LLMs as forecasters against human prediction benchmarks. Halawi et al. (2024) developed a retrieval-augmented LLM system that approaches human forecaster accuracy on competitive platforms like Metaculus and Good Judgment Open. Their system achieves an RMS calibration error of 0.042, compared to 0.038 for the human crowd aggregate—near parity. Critically, this required both fine-tuning and ensemble aggregation; base models under zero-shot prompting were poorly calibrated.

Forecasting platforms like Metaculus provide benchmarks for evaluating AI predictions using proper scoring rules. The Brier score—where lower is better and 0.25 represents random guessing—has become standard. Early GPT-4 models achieved ~0.25 (the random baseline), while GPT-3-level models performed worse than random due to overconfidence (Metaculus, 2024). Recent models like o1 and o3 show significant improvements.
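As a concrete illustration of the metric, the Brier score is simply the mean squared error between forecast probabilities and binary outcomes. The forecasts below are made-up numbers for illustration, not results from any cited system:

```python
def brier_score(probs, outcomes):
    """Mean squared error between probabilities in [0, 1] and 0/1 outcomes.

    0.0 is a perfect score; always answering 0.5 scores exactly 0.25,
    which is why 0.25 serves as the random-guessing baseline.
    """
    return sum((p - o) ** 2 for p, o in zip(probs, outcomes)) / len(probs)

# An uninformative forecaster that always says 0.5 scores 0.25
# regardless of how the questions resolve.
uninformative = brier_score([0.5, 0.5, 0.5, 0.5], [1, 0, 1, 1])

# A sharp, well-calibrated forecaster scores far lower.
sharp = brier_score([0.9, 0.1, 0.8, 0.95], [1, 0, 1, 1])
```

Note that the Brier score is a proper scoring rule: a forecaster minimizes its expected score by reporting its true probability, so it rewards calibration rather than hedging toward 0.5.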

The FOReCAst benchmark (2024) explicitly evaluates both forecasting accuracy and confidence calibration, and is built entirely from Metaculus questions with clear resolution criteria. A key finding is that aggregating forecasts from multiple sources substantially improves performance: median predictions achieve Brier scores (~0.12) comparable to the best individual AI systems.

For uncertainty quantification, the literature suggests temperature sampling alone does not yield calibrated probabilities. More principled approaches include: (1) ensemble aggregation across models or prompts, (2) fine-tuning on proper scoring rules, and (3) post-hoc calibration using held-out data. We adopt model ensembling as the most practical approach for expressing uncertainty over long-term value forecasts.

Backlash and Counter-Mobilization

Political scientists have documented that social progress often triggers backlash. Luker (1984) showed how abortion rights mobilized counter-movements. Fetner (2008) documented how LGBTQ+ visibility provoked organized opposition.

More recently, Public Religion Research Institute (2024) found declining support for LGBTQ+ rights across multiple measures after years of steady increase, with the sharpest drops among Republicans and young people. PBS NewsHour (2024) attributed this to counter-mobilization as LGBTQ+ people “identify more publicly and assert their rights.”

This literature suggests that value trajectories may be non-monotonic: progress in one direction can trigger resistance that reverses or slows change. A key question for value forecasting is whether LLMs can predict these inflection points.

References
  1. Singer, P. (1981). The expanding circle: Ethics and sociobiology. Farrar, Straus and Giroux.
  2. Pinker, S. (2011). The better angels of our nature: Why violence has declined. Viking.
  3. Danaher, J. (2021). Axiological futurism: The systematic study of the future of values. Futures, 132, 102780.
  4. MacAskill, W. (2014). Normative uncertainty [PhD thesis]. University of Oxford.
  5. Christiano, P. F., Leike, J., Brown, T., Martic, M., Legg, S., & Amodei, D. (2017). Deep reinforcement learning from human preferences. Advances in Neural Information Processing Systems, 30.
  6. Ouyang, L., Wu, J., Jiang, X., Almeida, D., Wainwright, C., Mishkin, P., Zhang, C., Agarwal, S., Slama, K., Ray, A., & others. (2022). Training language models to follow instructions with human feedback. Advances in Neural Information Processing Systems, 35, 27730–27744.
  7. Bai, Y., Kadavath, S., Kundu, S., Askell, A., Kernion, J., Jones, A., Chen, A., Goldie, A., Mirhoseini, A., McKinnon, C., & others. (2022). Constitutional AI: Harmlessness from AI feedback. arXiv Preprint arXiv:2212.08073.
  8. Gabriel, I. (2020). Artificial intelligence, values, and alignment. Minds and Machines, 30(3), 411–437.
  9. Dafoe, A., Bachrach, Y., Hadfield, G., Horvitz, E., Larson, K., & Graepel, T. (2021). Cooperative AI: machines must learn to find common ground. Nature, 593(7857), 33–36.
  10. Argyle, L. P., Busby, E. C., Fulda, N., Gubler, J. R., Rytting, C., & Wingate, D. (2023). Out of one, many: Using language models to simulate human samples. Political Analysis, 31(3), 337–351.
  11. Hewitt, L., & others. (2024). Predicting results of social science experiments using large language models. arXiv Preprint.
  12. SubPOP Team. (2025). SubPOP: LLMs fine-tuned on GSS for opinion prediction.
  13. Luker, K. (1984). Abortion and the politics of motherhood. University of California Press.
  14. Fetner, T. (2008). How the religious right shaped lesbian and gay activism. University of Minnesota Press.
  15. Public Religion Research Institute. (2024). Views on LGBTQ rights in all 50 states: Findings from PRRI’s 2023 American Values Atlas. https://prri.org/research/views-on-lgbtq-rights-in-all-50-states/