
Tue 07 Oct 12:00: Feedback Forensics: Measuring AI Personality By Comparing Observed Behaviour
Many personality tests ask participants hypothetical questions to predict their own behaviour. Yet, as with humans, self-predicted AI behaviour does not always match observed behaviour. In this talk, I will introduce Feedback Forensics: a toolkit to measure AI traits related to personality directly from observed behaviour data. By comparing models' responses to the same input relative to each other, our toolkit can measure a diverse set of traits related to the underlying personality, manner, and style of AI responses. I will share results describing traits exhibited by popular AI models, as well as traits encouraged by human feedback. The talk will feature a live demo of our personality visualisation tool, and attendees are invited to follow along via our online platform https://feedbackforensics.com/ (laptops are encouraged).
Bio: Arduin is currently a PhD student in the Department of Computer Science and Technology at Cambridge, working on AI model evaluation. His work focuses on understanding which desirable and undesirable model behaviours are reinforced by human and AI feedback. Prior to joining his current PhD programme, Arduin completed an MPhil in Machine Learning and Machine Intelligence in Cambridge's Engineering Department. Recently, Arduin also worked on model evaluation within Apple's Foundation Models team as an intern.
If you are interested in attending the talk online, please email the organiser and ask for a Teams invite.
- Speaker: Arduin Findeis, Department of Computer Science and Technology, University of Cambridge
- Tuesday 07 October 2025, 12:00-13:00
- Venue: S3.04, Simon Sainsbury Centre, Cambridge Judge Business School.
- Series: Cambridge Psychometrics Centre Seminars; organiser: Luning Sun.
Wed 17 Sep 12:00: LLM Social Simulation is a Promising Research Method
The emergence of large language models as in-silico subjects for social science poses a central question: can they genuinely simulate diverse human behavior, or do they merely produce plausible, homogenized artifacts? This talk demonstrates that LLMs are powerful but imperfect simulators by presenting three core contributions. First, we establish the “Persona Effect,” showing that persona-prompting a 70B model captures 81% of explainable variance in subjective tasks, creating a strong baseline for individual-level simulation. Second, to address data scarcity, we introduce iNews, a large-scale dataset of personalized affective responses to news, enriched with persona information. Finally, we introduce SimBench, the first large-scale benchmark for group-level simulation, which reveals the strengths and critical weaknesses of current models. I conclude by arguing for the specialized datasets and training required to advance the frontier of high-fidelity human simulation.
The talk is available online. Please get in touch to ask for a Teams invite.
- Speaker: Tiancheng Hu (University of Cambridge)
- Wednesday 17 September 2025, 12:00-13:00
- Venue: S3.04, Simon Sainsbury Centre, Cambridge Judge Business School.
- Series: Cambridge Psychometrics Centre Seminars; organiser: Luning Sun.