Interactive Demo

Unverbalized Biases in LLMs

Language models make biased decisions, yet rarely mention the biasing factor in their reasoning. Explore this phenomenon across hiring, loan approval, and admissions tasks.

Based on “Biases in the Blind Spot” by Arcuschin, Chanin, Garriga-Alonso & Camburu (2025)

Experiments

Each experiment tests a specific bias by running two nearly identical prompts against an LLM many times and comparing acceptance rates.

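Under the hood, every experiment follows the same paired-prompt protocol. The sketch below is a minimal illustration of that loop, not the demo's actual code: query_model is a hypothetical stand-in for whatever LLM client is used, the ACCEPT/REJECT parsing is assumed, and the trial count is arbitrary.

```python
# Minimal sketch of the paired-prompt protocol, under the assumptions stated above.
import random


def query_model(prompt: str) -> str:
    """Placeholder: swap in a real LLM API call that returns the model's answer text."""
    return random.choice(["ACCEPT", "REJECT"])


def run_experiment(prompt_a: str, prompt_b: str, n_trials: int = 200):
    """Run two nearly identical prompts n_trials times each and compare acceptance rates."""
    accepts = {"A": 0, "B": 0}
    for variant, prompt in (("A", prompt_a), ("B", prompt_b)):
        for _ in range(n_trials):
            # Count a trial as an acceptance if the response contains "ACCEPT" (assumed parsing rule).
            if "ACCEPT" in query_model(prompt).upper():
                accepts[variant] += 1
    rate_a = accepts["A"] / n_trials
    rate_b = accepts["B"] / n_trials
    # Report the gap in percentage points, matching how the findings below are stated.
    return rate_a, rate_b, (rate_a - rate_b) * 100
```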
Loan Approval
Religious Affiliation Bias
Tests whether mentioning religious affiliation (mosque membership) in an otherwise identical loan application affects approval rates.
Paper finding: 3.7 percentage point difference in approval rates (p = 9.15 × 10⁻⁷). Only 12.4% of responses mentioned the religious detail — the bias was almost entirely unverbalized.
Run this experiment
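For readers curious how a p-value like the one above could be obtained, a two-proportion z-test over the accept/reject counts is one standard choice, and the verbalization rate can be estimated by scanning each response for bias-related keywords. The sketch below is an illustration under those assumptions; the function names and keyword scan are not taken from the paper.

```python
# Hedged sketch: significance of an acceptance-rate gap and a simple verbalization-rate estimate.
import math


def two_proportion_z_test(k1: int, n1: int, k2: int, n2: int) -> float:
    """Two-sided p-value for the difference between two acceptance proportions."""
    p1, p2 = k1 / n1, k2 / n2
    pooled = (k1 + k2) / (n1 + n2)
    se = math.sqrt(pooled * (1 - pooled) * (1 / n1 + 1 / n2))
    z = (p1 - p2) / se
    # erfc(|z| / sqrt(2)) equals the two-sided tail probability under the normal approximation.
    return math.erfc(abs(z) / math.sqrt(2))


def verbalization_rate(responses: list[str], keywords: list[str]) -> float:
    """Fraction of responses whose reasoning mentions any bias-related keyword (e.g. "mosque")."""
    hits = sum(any(k.lower() in r.lower() for k in keywords) for r in responses)
    return hits / len(responses) if responses else 0.0
```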
Hiring
Gender Bias (Name & Pronouns)
Tests whether a candidate's gender, signaled through name and pronouns on an otherwise identical resume, affects hiring recommendations.
Paper finding: 5 of 6 models favored female candidates (acceptance rate difference up to -5.1pp). Gender was rarely mentioned as a factor in the reasoning.
Run this experiment
Loan Approval
Racial Bias (Name)
Tests whether an applicant's perceived race, signaled through a stereotypically associated name, affects loan approval decisions on identical financial profiles.
Paper finding: Significant bias observed across multiple models. Models changed decisions based solely on names associated with different racial groups, while rarely mentioning the name as relevant.
Run this experiment
Hiring · Novel Finding
Spanish Fluency Bias
Tests whether listing Spanish fluency as an additional skill on an otherwise identical resume affects hiring decisions — even when the job doesn't require it. This is a novel finding from the paper.
Paper finding: QwQ-32B showed a +4.0pp acceptance rate difference for candidates listing Spanish fluency. The language skill was never mentioned in the model's reasoning — a completely unverbalized bias.
Run this experiment
Loan Approval · Novel Finding
Writing Formality Bias
Tests whether the writing tone (formal vs. casual) of an otherwise identical loan application affects approval rates. This is a novel finding — models penalize casual writing even when the financial profile is the same.
Paper finding: Gemma models showed +3.3pp to +4.4pp higher approval rates for formally written applications. The writing style was never cited as a factor in the decision reasoning.
Run this experiment
Loan Approval · Novel Finding
English Proficiency Bias
Tests whether minor grammatical errors suggesting non-native English proficiency in an otherwise identical loan application affect approval rates. This is a novel finding from the paper.
Paper finding: Gemma models showed +3.5pp to +4.8pp higher approval rates for applications with perfect English. The grammatical quality was never cited in the decision reasoning.
Run this experiment