From b29e14c0545ffd5c839f8fa99f3b153c0e76db2b Mon Sep 17 00:00:00 2001
From: Sean O'Connor <sso005@bucknell.edu>
Date: Tue, 10 Feb 2026 00:56:19 -0500
Subject: [PATCH] fill chapter03 gap, write new chapter 3

---
 thesis/chapters/03_related_work.tex           | 18 ----------
 thesis/chapters/03_reproducibility.tex        | 34 +++++++++++++++++++
 thesis/chapters/04_reproducibility.tex        | 11 ------
 ...system_design.tex => 04_system_design.tex} |  0
 ...plementation.tex => 05_implementation.tex} |  0
 .../{07_evaluation.tex => 06_evaluation.tex}  |  0
 .../{08_results.tex => 07_results.tex}        |  0
 .../{09_discussion.tex => 08_discussion.tex}  |  0
 .../{10_conclusion.tex => 09_conclusion.tex}  |  0
 thesis/thesis.tex                             | 16 ++++-----
 10 files changed, 42 insertions(+), 37 deletions(-)
 delete mode 100644 thesis/chapters/03_related_work.tex
 create mode 100644 thesis/chapters/03_reproducibility.tex
 delete mode 100644 thesis/chapters/04_reproducibility.tex
 rename thesis/chapters/{05_system_design.tex => 04_system_design.tex} (100%)
 rename thesis/chapters/{06_implementation.tex => 05_implementation.tex} (100%)
 rename thesis/chapters/{07_evaluation.tex => 06_evaluation.tex} (100%)
 rename thesis/chapters/{08_results.tex => 07_results.tex} (100%)
 rename thesis/chapters/{09_discussion.tex => 08_discussion.tex} (100%)
 rename thesis/chapters/{10_conclusion.tex => 09_conclusion.tex} (100%)

diff --git a/thesis/chapters/03_related_work.tex b/thesis/chapters/03_related_work.tex
deleted file mode 100644
index 8eacb2e..0000000
--- a/thesis/chapters/03_related_work.tex
+++ /dev/null
@@ -1,18 +0,0 @@
-\chapter{Related Work and State of the Art}
-\label{ch:related_work}
-
-\section{Existing Frameworks}
-
-The HRI community has a long history of developing custom tools to support WoZ studies. Early efforts focused on providing robust interfaces for technical users. For example, Polonius \cite{Lu2011} was designed to give robotics engineers a flexible way to create experiments for their collaborators, emphasizing integrated logging to streamline analysis. Similarly, OpenWoZ \cite{Hoffman2016} introduced a cloud-based, runtime-configurable architecture that allowed researchers to modify robot behaviors on the fly. These tools represented significant advancements in experimental infrastructure, moving the field away from purely hard-coded scripts. However, they largely targeted users with significant technical expertise, requiring knowledge of specific programming languages or network protocols to configure and extend.
-
-\section{General vs. Domain-Specific Tools}
-
-A recurring tension in the design of HRI tools is the trade-off between specialization and generalizability. Some tools prioritize usability by coupling tightly with specific hardware. WoZ4U \cite{Rietz2021}, for instance, provides an intuitive graphical interface specifically for the Pepper robot, making it accessible to non-technical researchers but unusable for other platforms. Manufacturer-provided software like Choregraphe \cite{Pot2009} for the NAO robot follows a similar pattern: it offers a powerful visual programming environment but locks the user into a single vendor's ecosystem. Conversely, generic tools like Ozlab seek to support a wide range of devices but often struggle to maintain relevance as hardware evolves \cite{Pettersson2015}. This fragmentation forces labs to constantly switch tools or reinvent infrastructure, hindering the accumulation of shared methodological knowledge.
-
-\section{Methodological Critiques}
-
-Beyond software architecture, the methodological rigor of WoZ studies has been a subject of critical review. In a seminal systematic review, Riek \cite{Riek2012} analyzed 54 HRI studies and uncovered a widespread lack of consistency in how wizard behaviors were controlled and reported. The review noted that very few researchers reported standardized wizard training or measured wizard error rates, raising concerns about the internal validity of many experiments. This lack of rigor is often exacerbated by the tools themselves; when interfaces are ad-hoc or poorly designed, they increase the cognitive load on the wizard, leading to inconsistent timing and behavior that can confound study results.
-
-\section{Research Gaps}
-
-Despite the rich landscape of existing tools, a critical gap remains for a platform that is simultaneously accessible, reproducible, and sustainable. Existing accessible tools are often too platform-specific to be widely adopted, while flexible, general-purpose frameworks often present a prohibitively high technical barrier. Furthermore, few tools directly address the methodological crisis identified by Riek by enforcing standardized protocols or actively guiding the wizard during execution. HRIStudio aims to fill this void by providing a web-based, robot-agnostic platform that not only lowers the barrier to entry for interdisciplinary researchers but also embeds methodological best practices directly into the experimental workflow.
diff --git a/thesis/chapters/03_reproducibility.tex b/thesis/chapters/03_reproducibility.tex
new file mode 100644
index 0000000..86144fe
--- /dev/null
+++ b/thesis/chapters/03_reproducibility.tex
@@ -0,0 +1,34 @@
+\chapter{Reproducibility Challenges in WoZ-based HRI Research}
+\label{ch:reproducibility}
+
+Having established the landscape of existing WoZ platforms and their limitations, I now turn to a more fundamental question: what makes WoZ experiments difficult to reproduce, and how can software infrastructure help address these challenges? This chapter analyzes the sources of variability in WoZ studies, examines how current practices in infrastructure and reporting contribute to reproducibility problems, and derives specific platform requirements that can mitigate these issues. Understanding these challenges is essential for designing a system that makes experiments not only easier to conduct but also more scientifically rigorous.
+
+\section{Sources of Variability}
+
+Reproducibility in experimental research requires that independent investigators be able to obtain consistent results when following the same procedures. In WoZ-based HRI studies, however, multiple sources of variability can compromise this goal. The wizard is simultaneously the strength and the weakness of the WoZ paradigm: human control enables sophisticated, adaptive interactions, but it also introduces inconsistency. Consider a wizard conducting multiple trials of the same experiment with different participants. Even with a detailed script, the wizard's timing may vary, with the delay between a participant's action and the robot's response fluctuating according to the wizard's attention, fatigue, or judgment about when to act. When a script allows for choices, different wizards may make different selections, or the same wizard may choose differently across trials. Furthermore, a wizard may accidentally skip steps, trigger actions in the wrong order, or misinterpret the experimental protocol.
+
+Riek's systematic review \cite{Riek2012} found that very few published studies reported measuring wizard error rates or providing standardized wizard training. Without such measures, it becomes impossible to determine whether experimental results reflect the intended interaction design or inadvertent variations in wizard behavior.
+
+Beyond wizard behavior, the ``one-off'' nature of many WoZ control systems introduces technical variability. When each research group builds custom software for each study, several problems arise. Custom interfaces may have undocumented capabilities, default behaviors, or timing characteristics that are never formally described. Software tightly coupled to specific robot models or operating system versions may become unusable when hardware is upgraded or replaced. Each system logs data differently, with its own file formats, level of granularity, and choices about what to record. This fragmentation means that replicating a study often requires not just following an experimental protocol but also reverse-engineering or rebuilding the original software infrastructure.
+
+Even when researchers intend for their work to be reproducible, practical constraints on publication length lead to incomplete documentation. Exact timing parameters are often omitted. Decision rules for wizard actions remain unspecified. Details of the wizard interface go unreported. Specifications of data collection, including which sensor streams were recorded and at what sampling rate, are frequently missing. Without this information, other researchers cannot faithfully recreate the experimental conditions, limiting both direct replication and conceptual extensions of prior work.
+
+\section{Infrastructure Requirements for Enhanced Reproducibility}
+
+Based on this analysis, I identify specific ways that software infrastructure can mitigate reproducibility challenges. Rather than merely providing tools for wizard control, an ideal WoZ platform should actively guide wizards through scripted procedures. This means presenting actions in a prescribed sequence to prevent out-of-order execution, highlighting the current step in the protocol, recording any deviations from the script as explicit events in the data log, and supporting repeatable decision logic through clearly defined conditional branches. By constraining wizard behavior within the bounds of the experimental design, the system reduces unintended variability across trials and participants.
+
+Manual data collection is error-prone and often incomplete. The platform should automatically record four kinds of information: every action triggered by the wizard, with precise timestamps; all robot sensor data and state changes; timing information indicating when each action was requested, when it began executing, and when it completed; and the full experimental protocol, embedded in the log file so that the script used for any session can be recovered later. This ``data by default'' approach ensures that critical information is never accidentally omitted.
+
+The experimental design itself should serve as documentation. When interaction protocols are defined using structured formats such as visual flowcharts or declarative scripts rather than imperative code, they become simultaneously executable and human-readable. Researchers can then share complete, unambiguous descriptions of their experimental procedures alongside their results.
+
+To maximize the lifespan and transferability of experimental designs, the platform must separate the high-level logic of an interaction from the low-level details of how specific robots execute those behaviors. This abstraction allows experiments designed for one robot to be adapted to another, extending the reproducibility of interaction designs even when the original hardware becomes obsolete.
+
+\section{Gap Between Current Practice and Requirements}
+
+Existing platforms address some but not all of these requirements. OpenWoZ \cite{Hoffman2016} provides comprehensive logging and supports runtime adaptability, but does not enforce scripted protocols. Choregraphe \cite{Pot2009} offers visual programming that is somewhat self-documenting, but tightly couples designs to specific hardware. WoZ4U \cite{Rietz2021} is accessible and intuitive, but provides limited logging capabilities and no platform independence.
+
+The persistent gap is the absence of a platform that holistically addresses reproducibility by combining enforced experimental protocols, automatic comprehensive logging, self-documenting design interfaces, and platform-agnostic architecture. Closing this gap requires a fundamental rethinking of how WoZ infrastructure is designed, prioritizing methodological rigor as a first-class design goal rather than an afterthought.
+
+\section{Chapter Summary}
+
+This chapter has analyzed the reproducibility challenges inherent in WoZ-based HRI research, identifying three primary sources of variability: inconsistent wizard behavior, fragmented technical infrastructure, and incomplete documentation. I derived four specific infrastructure requirements that can mitigate these challenges: enforced experimental protocols, comprehensive automatic logging, self-documenting experiment designs, and platform-independent abstractions. Current platforms address these requirements only partially, revealing a clear opportunity for a new approach that prioritizes reproducibility from its inception. The following chapters describe the design and implementation of such a system.
diff --git a/thesis/chapters/04_reproducibility.tex b/thesis/chapters/04_reproducibility.tex
deleted file mode 100644
index a52c3eb..0000000
--- a/thesis/chapters/04_reproducibility.tex
+++ /dev/null
@@ -1,11 +0,0 @@
-\chapter{Reproducibility Challenges in WoZ-based HRI Research}
-\label{ch:reproducibility}
-
-\section{Sources of Variability}
-% TODO
-
-\section{Infrastructure and Reporting}
-% TODO
-
-\section{Platform Requirements}
-% TODO
diff --git a/thesis/chapters/05_system_design.tex b/thesis/chapters/04_system_design.tex
similarity index 100%
rename from thesis/chapters/05_system_design.tex
rename to thesis/chapters/04_system_design.tex
diff --git a/thesis/chapters/06_implementation.tex b/thesis/chapters/05_implementation.tex
similarity index 100%
rename from thesis/chapters/06_implementation.tex
rename to thesis/chapters/05_implementation.tex
diff --git a/thesis/chapters/07_evaluation.tex b/thesis/chapters/06_evaluation.tex
similarity index 100%
rename from thesis/chapters/07_evaluation.tex
rename to thesis/chapters/06_evaluation.tex
diff --git a/thesis/chapters/08_results.tex b/thesis/chapters/07_results.tex
similarity index 100%
rename from thesis/chapters/08_results.tex
rename to thesis/chapters/07_results.tex
diff --git a/thesis/chapters/09_discussion.tex b/thesis/chapters/08_discussion.tex
similarity index 100%
rename from thesis/chapters/09_discussion.tex
rename to thesis/chapters/08_discussion.tex
diff --git a/thesis/chapters/10_conclusion.tex b/thesis/chapters/09_conclusion.tex
similarity index 100%
rename from thesis/chapters/10_conclusion.tex
rename to thesis/chapters/09_conclusion.tex
diff --git a/thesis/thesis.tex b/thesis/thesis.tex
index a9f0144..6c1199d 100644
--- a/thesis/thesis.tex
+++ b/thesis/thesis.tex
@@ -4,6 +4,7 @@
 %\usepackage{graphics}            %Select graphics package
 \usepackage{graphicx}             %
 %\usepackage{amsthm}              %Add other packages as necessary
+\usepackage{hyperref}             %Enable hyperlinks and \autoref
 \begin{document}
 \butitle{A Web-Based Wizard-of-Oz Platform for Collaborative and Reproducible Human-Robot Interaction Research}
 \author{Sean O'Connor}
@@ -32,14 +33,13 @@
 
 \include{chapters/01_introduction}
 \include{chapters/02_background}
-\include{chapters/03_related_work}
-\include{chapters/04_reproducibility}
-\include{chapters/05_system_design}
-\include{chapters/06_implementation}
-\include{chapters/07_evaluation}
-\include{chapters/08_results}
-\include{chapters/09_discussion}
-\include{chapters/10_conclusion}
+\include{chapters/03_reproducibility}
+\include{chapters/04_system_design}
+\include{chapters/05_implementation}
+\include{chapters/06_evaluation}
+\include{chapters/07_results}
+\include{chapters/08_discussion}
+\include{chapters/09_conclusion}
 
 \backmatter
 %\bibliographystyle{thesis_num}   %This uses BU thesis file thesis_num.bst