Update thesis content and improve reproducibility framework
- Refine introduction and background chapters for clarity and coherence.
- Enhance reproducibility chapter by connecting challenges to infrastructure requirements.
- Add new references to support the thesis arguments.
- Update .gitignore to include IDE files.
- Modify hyperref package usage to hide colored boxes in the document.
\chapter{Reproducibility Challenges in WoZ-based HRI Research}
\label{ch:reproducibility}
Having established the landscape of existing WoZ platforms and their limitations, I now examine the factors that make WoZ experiments difficult to reproduce and how software infrastructure can address them. This chapter analyzes the sources of variability in WoZ studies, examines how current practices in infrastructure and reporting contribute to reproducibility problems, and derives specific platform requirements that can mitigate these issues. Understanding these challenges is essential for designing a system that supports experimentation at scale while remaining scientifically rigorous.
\section{Sources of Variability}
Based on this analysis, I identify specific ways that software infrastructure can mitigate reproducibility challenges. Rather than merely providing tools for wizard control, an ideal WoZ platform should actively guide wizards through scripted procedures. This means presenting actions in a prescribed sequence to prevent out-of-order execution, highlighting the current step in the protocol, recording any deviations from the script as explicit events in the data log, and supporting repeatable decision logic through clearly defined conditional branches. By constraining wizard behavior within the bounds of the experimental design, the system reduces unintended variability across trials and participants.
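One way to realize this kind of guided execution is sketched below in Python. The class and method names are illustrative assumptions, not the platform's actual API; the sketch shows only the core idea of presenting steps in a prescribed sequence while recording out-of-order actions as explicit deviation events rather than silently allowing them.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class ProtocolRunner:
    """Guides a wizard through a scripted sequence of steps,
    recording any deviation from the script as an explicit event."""
    steps: List[str]
    current: int = 0
    events: List[dict] = field(default_factory=list)

    def next_step(self) -> str:
        """Return the step the wizard should perform now."""
        return self.steps[self.current]

    def execute(self, step: str) -> None:
        if step == self.steps[self.current]:
            # In-order execution: log the step and advance the protocol.
            self.events.append({"type": "step", "name": step})
            self.current += 1
        else:
            # Out-of-order action: permit it, but log the deviation
            # so the data record reflects what actually happened.
            self.events.append({"type": "deviation",
                                "expected": self.steps[self.current],
                                "actual": step})


runner = ProtocolRunner(["greet", "ask_question", "farewell"])
runner.execute("greet")
runner.execute("farewell")  # "ask_question" was expected; logged as a deviation
```

Allowing the deviant action while logging it, rather than blocking it outright, preserves the wizard's ability to react to unforeseen situations without losing the record of what diverged from the script.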
Manual data collection is error-prone and often incomplete. The platform should automatically record every action triggered by the wizard with precise timestamps, all robot sensor data and state changes, and timing information indicating when each action was requested, when it began executing, and when it completed. The full experimental protocol should also be embedded in the log file, so that the script used for any session can be recovered later. This approach of recording data by default ensures that critical information is never accidentally omitted.
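A minimal sketch of this data-by-default pattern, assuming hypothetical names (`log_action`, the shape of the session record) that do not correspond to any actual platform schema:

```python
import time
from typing import Callable, List


def log_action(events: List[dict], name: str, run: Callable[[], None]) -> dict:
    """Record an action with requested/started/completed timestamps.
    Logging happens as a side effect of execution, so it cannot be skipped."""
    entry = {"action": name, "requested": time.time()}
    entry["started"] = time.time()  # request and dispatch coincide in this sketch
    run()                           # execute the robot behavior
    entry["completed"] = time.time()
    events.append(entry)
    return entry


# The protocol travels with the data, so the script used in a session
# can always be recovered from its log file.
session = {"protocol": ["greet", "wave", "farewell"], "events": []}
log_action(session["events"], "wave", lambda: time.sleep(0.01))
```

Because the timestamps and the protocol are captured by the same call that executes the behavior, a session log can never be produced without them.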
The experimental design itself should serve as documentation. When interaction protocols are defined using structured formats such as visual flowcharts or declarative scripts rather than imperative code, they become simultaneously executable and human-readable. Researchers can then share complete, unambiguous descriptions of their experimental procedures alongside their results.
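To illustrate, a declarative script of this kind might look as follows. The format shown is hypothetical, not the platform's actual schema; the point is that a single structured specification can be both executed by the system and rendered as human-readable documentation.

```python
# A hypothetical declarative protocol: data, not imperative code.
protocol = {
    "name": "greeting_study",
    "steps": [
        {"id": "greet", "action": "say", "args": {"text": "Hello!"}},
        {"id": "branch", "action": "wait_for_response",
         "branches": {"smiles": "wave", "ignores": "farewell"}},
        {"id": "wave", "action": "gesture", "args": {"name": "wave"}},
        {"id": "farewell", "action": "say", "args": {"text": "Goodbye."}},
    ],
}


def describe(p: dict) -> str:
    """Render the same specification as human-readable documentation."""
    lines = [f"Protocol: {p['name']}"]
    for s in p["steps"]:
        lines.append(f"  {s['id']}: {s['action']}")
    return "\n".join(lines)


print(describe(protocol))
```

Because the branches are explicit fields rather than buried in control flow, another researcher can read the conditional logic directly from the shared artifact.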
To maximize the lifespan and transferability of experimental designs, the platform must separate the high-level logic of an interaction from the low-level details of how specific robots execute those behaviors. This abstraction allows experiments designed for one robot to be adapted to another, extending the reproducibility of interaction designs even when the original hardware becomes obsolete.
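This separation can be sketched as an abstract interface with per-robot adapters. The adapter classes below are illustrative assumptions (the string payloads stand in for real robot commands), not an actual implementation; the sketch shows only that the experiment logic stays unchanged while the hardware binding varies.

```python
from abc import ABC, abstractmethod
from typing import List


class Robot(ABC):
    """Hardware-agnostic interface: experiments invoke high-level
    behaviors; adapters translate them for a specific robot."""

    @abstractmethod
    def say(self, text: str) -> str: ...

    @abstractmethod
    def move(self, x: float, y: float) -> str: ...


class NaoAdapter(Robot):
    """Illustrative binding for a NAO-style robot."""
    def say(self, text: str) -> str:
        return f"nao.tts:{text}"

    def move(self, x: float, y: float) -> str:
        return f"nao.walk:{x},{y}"


class TurtleBotAdapter(Robot):
    """Illustrative binding for a TurtleBot-style robot."""
    def say(self, text: str) -> str:
        return f"tb.speaker:{text}"

    def move(self, x: float, y: float) -> str:
        return f"tb.nav:{x},{y}"


def run_trial(robot: Robot) -> List[str]:
    """The same experiment logic runs unchanged on either platform."""
    return [robot.say("Hello"), robot.move(1.0, 0.0)]
```

If the original hardware becomes obsolete, only a new adapter needs to be written; the experiment definition itself survives.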
\section{Connecting Reproducibility Challenges to Infrastructure Requirements}
Existing platforms address some but not all of these requirements. OpenWoZ provides comprehensive logging and supports runtime adaptability, but does not enforce scripted protocols. Choregraphe offers visual programming that is somewhat self-documenting, but tightly couples designs to specific hardware. WoZ4U is accessible and intuitive, but provides limited logging capabilities and no platform independence.
The reproducibility challenges identified above directly motivate the infrastructure requirements established in Chapter~\ref{ch:background}. Inconsistent wizard behavior motivates the requirements for enforced experimental protocols and comprehensive automatic logging. The absence of standardized logging formats and sensor specifications underscores both the automated-logging and self-documenting-design requirements. Technical fragmentation drives the platform-agnostic requirement, as bespoke systems become obsolete when hardware evolves. Incomplete documentation reflects a failure to treat experiment designs as executable, self-documenting specifications. No existing platform simultaneously satisfies all six requirements: most critically, the trade-off between accessibility and flexibility remains unresolved, and few tools embed methodological best practices directly into their design. As Chapter~\ref{ch:background} demonstrated, this gap has persisted across a decade of platform development. Addressing it requires a fundamental rethinking of how WoZ infrastructure is designed, prioritizing reproducibility and methodological rigor as first-class design goals rather than afterthoughts.
\section{Chapter Summary}
This chapter has analyzed the reproducibility challenges inherent in WoZ-based HRI research, identifying three primary sources of variability: inconsistent wizard behavior, fragmented technical infrastructure, and incomplete documentation. Rather than treating these challenges as inherent to the WoZ paradigm, I showed how each stems from gaps in current infrastructure. Software design can systematically mitigate these challenges through enforced experimental protocols, comprehensive automatic logging, self-documenting experiment designs, and platform-independent abstractions. These design goals directly address the six infrastructure requirements identified in Chapter~\ref{ch:background}. The following chapters describe the design, implementation, and empirical evaluation of a system that prioritizes reproducibility as a foundational design principle from inception.