Preference Shielding for Human-Robot Interaction (MSc thesis)
MSc thesis. A web-based HRI study comparing four shielding conditions for a Q-learning agent on a 7x7 grid: no shielding, standard preference shielding, Adaptive Shielding (confidence gate), and Hard/Soft per-object Shielding. Participants will watch the agent navigate, express directional preferences, and answer questionnaires.
Thesis hypothesis
Does adding a confidence gate (Adaptive Shielding) or a Hard/Soft per-object enforcement split to the existing Preference Shielding mechanism improve how transparent and trustworthy a learning robot looks to a human observer, without slowing down how quickly it learns the task?
4
Study conditions
240
Pre-study runs
30
Seeds per condition
7×7
Grid size
What participants will compare
Each participant sees one of four shielding regimes for a Q-learning agent on a 7x7 grid. The two new contributions (Adaptive and Hard / Soft) are the heart of the thesis; Baseline and Standard PS are the controls.
Q-learning agent with no shielding. The agent learns from environmental reward only; participant preferences are recorded but never enforced.
The original mechanism from the literature. Every preference is enforced unconditionally near matching objects, regardless of how confident the agent is or how important the object is to the participant.
New contribution. The shield defers to the agent once its Q-value confidence crosses a threshold. Early in learning the participant's preferences carry full weight; as the agent becomes confident, it earns autonomy.
New contribution. Participants tag each object as Strict or Flexible. Strict objects get unconditional override; Flexible objects let the agent learn freely. Users keep guarantees on what matters most without strangling task performance everywhere.
Algorithm benchmark before the humans arrive
Before opening the experiment to participants, I ran a 2-cubed factorial algorithmic pre-study (8 conditions across 30 seeds = 240 runs) to confirm the extensions behave as designed. In particular: the all-Strict configuration of Hard / Soft Shielding reproduces the classic safety-performance tradeoff (perfect preference alignment, lost task success), which is exactly the failure mode the Strict / Flexible split is designed to escape from.




Where participants will meet the agent
The study runs in a browser. A FastAPI backend streams the agent's training over a WebSocket while a React 18 frontend paints the 7x7 grid live, frame by frame. Participants express directional preferences with an arrow grid, watch the agent adapt (or not), and answer questionnaires after each session.
Backend
- FastAPI with async WebSocket training loop
- aiosqlite for participant data + questionnaires
- Per-condition Q-table runs, async streaming
- Admin REST API for analytics, settings, and live monitoring
- Docker-deployable on Fly.io
Frontend
- React 18 + Vite + Tailwind CSS v4
- Live animated 7x7 GridCanvas, looping AnimatedPath
- PreferencePanel (arrow grid) and EnforcementPanel (Hard / Soft per object)
- Mid-training popup questionnaires + post-session screens
- Consent gate + onboarding + debrief flow
Gated on data-collection permission
Algorithms and the web app are complete; the algorithmic pre-study above confirms the contributions behave as designed. The participant experiment opens once permission to collect human-subjects data is granted by the relevant institutional review.
Once participants come through, the headline analysis will compare the four conditions on three outcome families: perceived robot transparency (questionnaire scales), perceived trust (Likert + qualitative), and learning speed (objective task-success time). I'll wire the live results into a dashboard here once the data is in.