From 1bb935243627748a87e1f9a7265c052be5ffef7b Mon Sep 17 00:00:00 2001 From: Sean O'Connor Date: Sun, 25 Jan 2026 01:13:15 -0500 Subject: [PATCH] add context --- .gitignore | 8 + context/hristudio-lbr.tex | 223 +++++++++++++++++++++++++++ context/hristudio-sp2025.tex | 287 +++++++++++++++++++++++++++++++++++ context/proposal.tex | 141 +++++++++++++++++ context/study_draft.md | 145 ++++++++++++++++++ irbapplication.tex | 154 ++++++++++--------- 6 files changed, 887 insertions(+), 71 deletions(-) create mode 100644 .gitignore create mode 100644 context/hristudio-lbr.tex create mode 100644 context/hristudio-sp2025.tex create mode 100644 context/proposal.tex create mode 100644 context/study_draft.md diff --git a/.gitignore b/.gitignore new file mode 100644 index 0000000..6419bb9 --- /dev/null +++ b/.gitignore @@ -0,0 +1,8 @@ +*.aux +*.fls +*.log +*.synctex.gz +*.pdf +*.fdb_latexmk + +*.DS_Store \ No newline at end of file diff --git a/context/hristudio-lbr.tex b/context/hristudio-lbr.tex new file mode 100644 index 0000000..d49fbef --- /dev/null +++ b/context/hristudio-lbr.tex @@ -0,0 +1,223 @@ +% Standard Paper +\documentclass[letterpaper, 10 pt, conference]{ieeeconf} + +% A4 Paper +%\documentclass[a4paper, 10pt, conference]{ieeeconf} + +% Only needed for \thanks command +\IEEEoverridecommandlockouts + +% Needed to meet printer requirements. +\overrideIEEEmargins + +%In case you encounter the following error: +%Error 1010 The PDF file may be corrupt (unable to open PDF file) OR +%Error 1000 An error occurred while parsing a contents stream. Unable to analyze the PDF file. +%This is a known problem with pdfLaTeX conversion filter. The file cannot be opened with acrobat reader +%Please use one of the alternatives below to circumvent this error by uncommenting one or the other +%\pdfobjcompresslevel=0 +%\pdfminorversion=4 + +% See the \addtolength command later in the file to balance the column lengths +% on the last page of the document + +% The following packages can be found on http:\\www.ctan.org +\usepackage{graphicx} % for pdf, bitmapped graphics files +%\usepackage{epsfig} % for postscript graphics files +%\usepackage{mathptmx} % assumes new font selection scheme installed +%\usepackage{times} % assumes new font selection scheme installed +%\usepackage{amsmath} % assumes amsmath package installed +%\usepackage{amssymb} % assumes amsmath package installed +\usepackage{url} +\usepackage{float} + +\hyphenation{analysis} + +\title{\LARGE \bf HRIStudio: A Framework for Wizard-of-Oz Experiments in Human-Robot Interaction Studies} + +\author{Sean O'Connor and L. Felipe Perrone$^{*}$ + \thanks{$^{*}$Both authors are with the Department of Computer Science at + Bucknell University in Lewisburg, PA, USA. They can be reached at {\tt\small sso005@bucknell.edu} and {\tt\small perrone@bucknell.edu}}% +} + +\begin{document} + +\maketitle +\thispagestyle{empty} +\pagestyle{empty} + + +%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%% +\begin{abstract} + +Human-robot interaction (HRI) research plays a pivotal role in shaping how robots communicate and collaborate with humans. However, conducting HRI studies, particularly those employing the Wizard-of-Oz (WoZ) technique, can be challenging. WoZ user studies can have complexities at the technical and methodological levels that may render the results irreproducible. We propose to address these challenges with HRIStudio, a novel web-based platform designed to streamline the design, execution, and analysis of WoZ experiments. 
HRIStudio offers an intuitive interface for experiment creation, real-time control and monitoring during experimental runs, and comprehensive data logging and playback tools for analysis and reproducibility. By lowering technical barriers, promoting collaboration, and offering methodological guidelines, HRIStudio aims to make human-centered robotics research easier, and at the same time, empower researchers to develop scientifically rigorous user studies. + +\end{abstract} + + +%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%% + +%% TODO: Update mockup pictures with photo of subject and robot + +\section{Introduction} + +Human-robot interaction (HRI) is an essential field of study for understanding how robots should communicate, collaborate, and coexist with people. The development of autonomous behaviors in social robot applications, however, offers a number of challenges. The Wizard-of-Oz (WoZ) technique has emerged as a valuable experimental paradigm to address these difficulties, as it allows experimenters to simulate a robot's autonomous behaviors. With WoZ, a human operator (the \emph{``wizard''}) can operate the robot remotely, essentially simulating its autonomous behavior during user studies. This enables the rapid prototyping and continuous refinement of human-robot interactions postponing to later the full development of complex robot behaviors. + +While WoZ is a powerful paradigm, it does not eliminate all experimental challenges. Researchers may face barriers related to the use of specialized tools and methodologies involved in WoZ user studies and also find difficulties in creating fully reproducible experiments. Existing solutions often rely on low-level robot operating systems, limited proprietary platforms, or require extensive custom coding, which can restrict their use to domain experts with extensive technical backgrounds. + +Through a comprehensive review of current literature, we have identified a pressing need for a platform that simplifies the process of designing, executing, analyzing, and recording WoZ-based user studies. To address this gap, we are developing \emph{HRIStudio}, a novel web-based platform that enables the intuitive configuration and operation of WoZ studies for HRI research. Our contribution leverages the \emph{Robot Operating System} (ROS) to handle the complexities of interfacing with different robotics platforms. HRIStudio presents users with a high-level, user-friendly interface for experimental design, live control and monitoring during execution runs (which we call \emph{live experiment sessions}), and comprehensive post-study analysis. The system offers drag-and-drop visual programming for describing experiments without extensive coding, real-time control and observation capabilities during live experiment sessions, as well as comprehensive data logging and playback tools for analysis and enhanced reproducibility. We expect that with these features, HRIStudio will make the application of the WoZ paradigm more systematic thereby increasing the scientific rigor of this type of HRI experiment. The following sections present a brief review of the relevant literature, outline the design of HRIStudio and its experimental workflow, and offer implementation details and future directions for this work. + +\section{State-of-the-Art} + +The importance of the WoZ paradigm for user studies in social robotics is illustrated by the several frameworks that have been developed to support it. 
We describe some of the most notable as follows. + +\emph{Polonius}~\cite{Lu2011}, which is based on the modular ROS platform, offers a graphical user interface for wizards to define finite state machine scripts that drive the behavior of robots during experiments. \emph{NottReal}~\cite{Porcheron2020} was designed for WoZ studies of voice user interfaces. It provides scripting capabilities and visual feedback to simulate autonomous behavior for participants. \emph{WoZ4U}~\cite{Rietz2021} presents a user-friendly GUI that makes HRI studies more accessible to non-programmers. The tight hardware focus on Aldebaran's Pepper, however, constrains the tool's applicability. \emph{OpenWoZ}~\cite{Hoffman2016} proposes a runtime-configurable framework with a multi-client architecture, enabling evaluators to modify robot behaviors during experiments. The platform allows one with programming expertise to create standard, customized robot behaviors for user studies. + +In addition to the aforementioned frameworks, we considered Riek's systematic analysis of published WoZ experiments, which stresses the need for increased methodological rigor, transparency and reproducibility of WoZ studies.~\cite{Riek2012} Altogether, the literature inspired us to design HRIStudio as a platform that offers comprehensive support for WoZ studies in social robotics. Our design goals include offering a platform that is as ``robot-agnostic'' as possible and which offers its users guidance to specify and execute WoZ studies that are methodologically sound and maximally reproducible. As such, HRIStudio aims to offer an easy user interface that allows for experiments to be scripted and executed easily and which allows for the aggregation of experimental data and other assets generated in a study. + +\section{Overarching Design Goals} + +We have identified several guiding design principles to maximize HRIStudio's effectiveness, usefulness, and usability. Foremost, we want HRIStudio to be accessible to users with and without deep robot programming expertise so that we may lower the barrier to entry for those conducting HRI studies. The platform should provide an intuitive graphical user interface that obviates the need for describing robot behaviors in a programming language. The user should be able to focus on describing sequences of robot behaviors without getting bogged down by all the details of specific robots. To this end, we determined that the framework should offer users the means by which to describe experiments and robot behaviors, while capturing and storing all data generated including text-based logs, audio, video, IRB materials, and user consent forms. + +Furthermore, we determined that the framework should also support multiple user accounts and data sharing to enable collaborations between the members of a team and the dissemination of experiments across different teams. By incorporating these design goals, HRIStudio prioritizes experiment design, collaborative workflows, methodological rigor, and scientific reproducibility. + +\section{Design of the Experimental Workflow} + +\subsection{Organization of a user study} + +With HRIStudio, we define a hierarchical organization of elements to express WoZ user studies for HRI research. 
An experimenter starts by creating and configuring a \emph{study} element, which comprises multiple instantiations of one and the same experimental script. Each instantiation, encapsulated in an element called \emph{experiment}, captures the experiences of a specific human subject with the robot designated in the script. + +Each \emph{experiment} comprises a sequence of one or more \emph{step} elements. Each \emph{step} models a phase of the experiment and aggregates a sequence of \emph{action} elements, which are fine-grained, specific tasks to be executed either by the wizard or by the robot. An \emph{action} targeted at the wizard provides guidance and maximizes the chances of consistent behavior. An \emph{action} targeted at the robot causes it to execute movements or verbal interactions, or causes it to wait for a human subject's input or response. + +The system executes the \emph{actions} in an experimental script asynchronously and in an event-driven manner, guiding the wizard's behavior and allowing them to simulate the robot's autonomous intelligence by responding to the human subject in real time based on the human's actions and reactions. This event-driven approach allows for flexible and spontaneous reactions by the wizard, enabling a more natural and intelligent interaction with the human subject. In contrast, a time-driven script with rigid, imposed timing would show a lack of intelligence and autonomy on the part of the robot. + +In order to enforce consistency across multiple runs of the \emph{experiment}, HRIStudio uses specifications encoded in the \emph{study} element to inform the wizard on how to constrain their behavior to a set of possible types of interventions. Although every experiment is potentially unique due to the unlikely perfect match of reactions between human subjects, this mechanism allows for annotating the data feed and capturing the nuances of each unique interaction. + +Figure~\ref{fig:userstudy} illustrates this hierarchy of elements with a practical example. We argue that this hierarchical structure for the experimental procedure in a user study benefits methodological rigor and reproducibility, affording the researcher the ability to design complex HRI studies while guiding the wizard to follow a consistent set of instructions. + +\begin{figure}[ht] + \vskip -0.4cm + \begin{center} + \includegraphics[width=0.4\paperwidth]{assets/diagrams/userstudy} + \vskip -0.5cm + \caption{A sample user study.} + \label{fig:userstudy} + \end{center} +\vskip -0.7cm +\end{figure} + +\subsection{System interfaces} + +HRIStudio features a user-friendly graphical interface for designing WoZ experiments. This interface provides a visual programming system that allows researchers to build their experiments using a drag-and-drop approach. The core of the experiment creation process offers a library of actions, including common tasks and behaviors executed in the experiment, such as robot movements, speech synthesis, and instructions for the wizard. One can drag and drop action components onto a canvas and arrange them into sequences that define the study, experiment, step, and action elements. The interface provides configuration options that allow researchers to customize parameters in each element. This configuration system offers contextual help and documentation to guide researchers through the process while providing examples and best practices for designing studies.
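+
+To make this hierarchy and the sequences built in the designer concrete, the sketch below shows one possible encoding of the \emph{study}, \emph{experiment}, \emph{step}, and \emph{action} elements as TypeScript types, together with a small example step. The type and field names are illustrative assumptions for exposition only and do not represent HRIStudio's actual data model.
+
+\begin{verbatim}
+// Illustrative sketch only: names are assumptions,
+// not HRIStudio's actual data model.
+type ActionTarget = "wizard" | "robot";
+
+interface Action {
+  id: string;
+  target: ActionTarget;   // who carries out the action
+  kind: "move" | "speak" | "wait" | "instruct";
+  parameters: Record<string, unknown>;
+}
+
+interface Step {
+  name: string;           // one phase of the experiment
+  actions: Action[];      // executed in sequence
+}
+
+interface Experiment {
+  robot: string;          // robot designated in the script
+  participantId: string;  // the specific human subject
+  steps: Step[];
+}
+
+interface Study {
+  title: string;
+  experiments: Experiment[];  // instantiations of the script
+}
+
+// A minimal example step mixing robot and wizard actions.
+const greeting: Step = {
+  name: "Greeting",
+  actions: [
+    { id: "a1", target: "robot", kind: "speak",
+      parameters: { text: "Hello! Nice to meet you." } },
+    { id: "a2", target: "wizard", kind: "instruct",
+      parameters: { note: "Wait for a response." } },
+  ],
+};
+\end{verbatim}
+
+In such a scheme, sharing a \emph{study} amounts to sharing a single serializable object, which keeps the aggregation and dissemination of experimental assets straightforward.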
+ +\subsection{Live experiment operation} + +During live experiment sessions, HRIStudio offers multiple synchronized views for experiment execution, observation, and data collection. The wizard's \emph{Execute} view gives the wizard control over the robot's actions and behaviors. Displaying the current step of the experiment along with associated actions, this interface facilitates intuitive navigation through the structural elements of the experiment and allows for the creation of annotations on a timeline. The wizard can advance through actions sequentially or manually trigger specific actions based on contextual cues or responses from the human subject. During the execution of an experiment, the interface gives the wizard manual controls to insert unscripted robot movements, speech synthesis, and other functions dynamically. These events are recorded in persistent media within the sequence of actions in the experimental script. + +The observer's \emph{Execute} view supports live monitoring, note-taking, and potential interventions by additional researchers involved in the experiment. This feature ensures the option of continuous oversight without disrupting the experience of human subjects or the wizard's control. Collaboration on an experiment is made possible by allowing multiple observers to concurrently access the \emph{Execute} view. + +\subsection{Data logging, playback, and annotation} + +Throughout the live experiment session, the platform automatically logs various data streams, including timestamped records of all executed actions and experimental events, exposed robot sensor data, and audio and video recordings of the participant's interactions with the robot. Logged data is stored as encrypted JavaScript Object Notation (JSON) files in secure storage, which protects the privacy of human subjects while enabling efficient post-experiment data analysis. + +After a live experiment session, researchers may use a \emph{Playback} view to inspect the recorded data streams and develop a holistic understanding of the experiment's progression. This interface supports playback of recorded audio, video, and sensor data streams; scrubbing of recorded data with the ability to mark and note significant events or observations; and export options for selected data segments or annotations. + +\section{Implementation} + +The realization of the proposed platform is a work in progress. So far, we have made significant advances in the design of the overall framework and of its several components while exploring underlying technologies, wireframing user views and interfaces, and establishing a development roadmap. + +\subsection{Core technologies used} + +We are leveraging the \emph{Next.js React} framework \cite{next} to build HRIStudio as a web application. Next.js provides server-side rendering, improved performance, and enhanced security. By making HRIStudio a web application, we achieve independence from hardware and operating system. We are building into the framework support for API routes and integration with \emph{TypeScript Remote Procedure Call} (tRPC), which simplifies the development of the APIs that communicate with the ROS interface component. + +For the robot control layer, we utilize ROS as the communication and control interface. ROS offers a modular and extensible architecture, enabling seamless integration with a multitude of consumer and research robotics platforms.
Thanks to the widespread adoption of ROS in the robotics community, HRIStudio will be able to support a wide range of robots out-of-the-box by leveraging the efforts of the ROS community for new robot platforms. +\vspace{-0.3cm} +\subsection{High-level architecture} + +We have designed our system as a full-stack web application. The frontend handles user interface components such as the experiment \emph{Design} view, the experiment \emph{Execute} view, and the \emph{Playback} view. The backend API logic manages experiment data, user authentication, and communication with a ROS interface component. In turn, the ROS interface is implemented as a separate C++ node and translates high-level actions from the web application into low-level robot commands, sensor data, and protocols, abstracting the complexities of different robotics platforms. This modular architecture leverages the benefits of Next.js' server-side rendering, improved performance, and security, while enabling integration with various robotic platforms through ROS. Fig.~\ref{fig:systemarch} shows the structure of the application. + +\begin{figure} + \begin{center} + \includegraphics[width=0.35\paperwidth]{assets/diagrams/systemarch} + \vskip -0.5cm + \caption{The high-level system architecture of HRIStudio.} + \label{fig:systemarch} + \vskip -0.8cm + \end{center} +\end{figure} + +\subsection{User interface mockups} + +A significant portion of our efforts has been dedicated to designing intuitive and user-friendly interface mockups for the platform's key components. We have created wireframes and prototypes for the study \emph{Dashboard}, \emph{Design} view, \emph{Execute} view, and the \emph{Playback} view. + +The study \emph{Dashboard} mockups (see Figure~\ref{fig:dashboard}) display an intuitive overview of a project's status, including platform information, collaborators, completed and upcoming trials, subjects, and a list of pending issues. This will allow a researcher to quickly see what needs to be done, or easily navigate to a previous trial's data for analysis. + +\begin{figure} +% \vskip -.2cm + \centering + \includegraphics[width=0.35\paperwidth]{assets/mockups/dashboard} + \vskip -0.3cm + \caption{A sample project's \emph{Dashboard} view within HRIStudio.} + \label{fig:dashboard} + \vskip -.2cm +\end{figure} + +The \emph{Design} view mockups depicted in Figure~\ref{fig:design} feature a visual programming canvas where researchers can construct their experiments by dragging and dropping pre-defined action components. These components represent common tasks and behaviors, such as robot movements, speech synthesis, and instructions for the wizard. The mockups also include configuration panels for customizing the parameters of each action component. + +\begin{figure} +\vskip -0.1cm + \centering + \includegraphics[width=0.35\paperwidth]{assets/mockups/design} + \vskip -0.3cm + \caption{A sample project's \emph{Design} view in HRIStudio.} + \label{fig:design} + \vskip -.3cm +\end{figure} + +For the \emph{Execute} view, we have designed mockups that provide synchronized views for the wizard and observers. The wizard's view (see Figure~\ref{fig:execute}) presents an intuitive step-based interface that walks the wizard through the experiment as specified by the designer, triggering actions and controlling the robot, while the observer view facilitates real-time monitoring and note-taking.
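+
+As an illustration of the records produced during a live experiment session, a single wizard-triggered event could be logged as a timestamped entry along the lines of the following TypeScript sketch. The field names are assumptions for exposition and do not represent HRIStudio's actual log schema.
+
+\begin{verbatim}
+// Hypothetical log entry for one wizard-triggered action;
+// the field names are illustrative, not the actual schema.
+interface TrialEvent {
+  timestamp: string;   // ISO 8601 time of the event
+  step: string;        // step active when it occurred
+  actionId?: string;   // scripted action, if any
+  source: "wizard" | "robot" | "system";
+  scripted: boolean;   // false for unscripted interventions
+  payload: Record<string, unknown>;
+}
+
+const example: TrialEvent = {
+  timestamp: "2026-01-25T14:03:07.412Z",
+  step: "Introduction",
+  source: "wizard",
+  scripted: false,
+  payload: { speech: "Sorry, could you repeat that?" },
+};
+\end{verbatim}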
+ +\begin{figure} +\vskip -0.3cm + \centering + \includegraphics[width=0.35\paperwidth]{assets/mockups/execute} + \vskip -0.3cm + \caption{The wizard's \emph{Execute} view during a live experiment.} + \label{fig:execute} +% \vskip -0.9cm +\end{figure} + +Fig.~\ref{fig:playback} shows \emph{Playback} mockups for synchronized playback of recorded data streams, including audio, video, and applicable sensor data. The features include visual and textual annotations, scrubbing capabilities, and data export options to support comprehensive post-experiment analysis and reproducibility. + +\begin{figure} + \centering + \includegraphics[width=0.35\paperwidth]{assets/mockups/playback} + \vskip -0.3cm + \caption{The \emph{Playback} view of an experiment within a study.} + \label{fig:playback} + \vskip -0.4cm +\end{figure} + +\subsection{Development roadmap} + +While the UI mockups have laid a solid foundation, we anticipate challenges in transforming these designs into a fully functional platform, such as integrating the Next.js web application with the ROS interface and handling bi-directional communication between the two. We plan to leverage tRPC for real-time data exchange and robot control. + +Another key challenge is developing the \emph{Design} view's visual programming environment, and encoding procedures into a shareable format. We will explore existing visual programming libraries and develop custom components for intuitive experiment construction. + +Implementing robust data logging and synchronized playback of audio, video, and sensor data while ensuring efficient storage and retrieval is also crucial. + +To address these challenges, our development roadmap includes: +\begin{itemize} + \item Establishing a stable Next.js codebase with tRPC integration, + \item Implementing a ROS interface node for robot communication, + \item Developing the visual experiment designer, + \item Integrating data logging for capturing experimental data streams, + \item Building playback and annotation tools with export capabilities, + \item Creating tutorials and documentation for researcher adoption. +\end{itemize} + +This roadmap identifies some of the challenges ahead. We expect that this plan will fully realize HRIStudio into a functional and accessible tool for conducting WoZ experiments. We hope for this tool to become a significant aid in HRI research, empowering researchers and fostering collaboration within the community. + +\bibliography{refs} +\bibliographystyle{plain} + +\end{document} diff --git a/context/hristudio-sp2025.tex b/context/hristudio-sp2025.tex new file mode 100644 index 0000000..ff26844 --- /dev/null +++ b/context/hristudio-sp2025.tex @@ -0,0 +1,287 @@ +% Standard Paper +\documentclass[letterpaper, 10 pt, conference]{subfiles/ieeeconf} + +% A4 Paper +%\documentclass[a4paper, 10pt, conference]{ieeeconf} + +% Only needed for \thanks command +\IEEEoverridecommandlockouts + +% Needed to meet printer requirements. +\overrideIEEEmargins + +%In case you encounter the following error: +%Error 1010 The PDF file may be corrupt (unable to open PDF file) OR +%Error 1000 An error occurred while parsing a contents stream. Unable to analyze the PDF file. +%This is a known problem with pdfLaTeX conversion filter. 
The file cannot be opened with acrobat reader +%Please use one of the alternatives below to circumvent this error by uncommenting one or the other +%\pdfobjcompresslevel=0 +%\pdfminorversion=4 + +% See the \addtolength command later in the file to balance the column lengths +% on the last page of the document + +% The following packages can be found on http:\\www.ctan.org +\usepackage{graphicx} % for pdf, bitmapped graphics files +%\usepackage{epsfig} % for postscript graphics files +%\usepackage{mathptmx} % assumes new font selection scheme installed +%\usepackage{times} % assumes new font selection scheme installed +%\usepackage{amsmath} % assumes amsmath package installed +%\usepackage{amssymb} % assumes amsmath package installed +\usepackage{url} +\usepackage{float} + +\hyphenation{analysis} + +\title{\LARGE \bf A Web-Based Wizard-of-Oz Platform for Collaborative and Reproducible Human-Robot Interaction Research} + +\author{Sean O'Connor and L. Felipe Perrone$^{*}$ + \thanks{$^{*}$Both authors are with the Department of Computer Science at + Bucknell University in Lewisburg, PA, USA. They can be reached at {\tt\small sso005@bucknell.edu} and {\tt\small perrone@bucknell.edu}}% +} + +\begin{document} + + + +\maketitle +\thispagestyle{empty} +\pagestyle{empty} + + +%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%% +\begin{abstract} + +Human-robot interaction (HRI) research plays a pivotal role in shaping how robots communicate and collaborate with humans. However, conducting HRI studies can be challenging, particularly those employing the Wizard-of-Oz (WoZ) technique. WoZ user studies can have technical and methodological complexities that may render the results irreproducible. We propose to address these challenges with HRIStudio, a modular web-based platform designed to streamline the design, the execution, and the analysis of WoZ experiments. HRIStudio offers an intuitive interface for experiment creation, real-time control and monitoring during experimental runs, and comprehensive data logging and playback tools for analysis and reproducibility. By lowering technical barriers, promoting collaboration, and offering methodological guidelines, HRIStudio aims to make human-centered robotics research easier and empower researchers to develop scientifically rigorous user studies. + +\end{abstract} + + +%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%% + +\section{Introduction} + +Human-robot interaction (HRI) is an essential field of study for understanding how robots should communicate, collaborate, and coexist with people. The development of autonomous behaviors in social robot applications, however, offers a number of challenges. The Wizard-of-Oz (WoZ) technique has emerged as a valuable experimental paradigm to address these difficulties, as it allows experimenters to simulate a robot's autonomous behaviors. With WoZ, a human operator (the \emph{``wizard''}) can operate the robot remotely, essentially simulating its autonomous behavior during user studies. This enables the rapid prototyping and continuous refinement of human-robot interactions postponing to later the full development of complex robot behaviors. + +While WoZ is a powerful paradigm, it does not eliminate all experimental challenges. The paradigm is centered on the wizard who must carry out scripted sequences of actions. Ideally, the wizard should execute their script identically across runs of the experiment with different participants. 
Deviations from the script in one run or another may change experimental conditions, significantly decreasing the methodological rigor of the larger study. This kind of problem can be minimized by instrumenting the wizard with a system that prevents deviations from the prescribed interactions with the participant. In addition to the variability that can be introduced by wizard behaviors, WoZ studies can be undermined by technical barriers related to the use of specialized equipment and tools. Different robots may be controlled or programmed through different systems, requiring expertise with a range of technologies such as programming languages, development environments, and operating systems. + +The elaboration and the execution of rigorous and reproducible WoZ experiments can be challenging for HRI researchers. Although there do exist solutions to support this kind of endeavor, they often rely on low-level robot operating systems, limited proprietary platforms, or require extensive custom coding, which can restrict their use to domain experts with extensive technical backgrounds. The development of our work was motivated by the desire to offer a platform that would lower the barriers to entry in HRI research with the WoZ paradigm. + +Through the literature review described in the next section, we identified six categories of desirables to be included in a modern system that streamlines the WoZ experimental process: an environment that integrates all the functionalities of the system; mechanisms for the description of WoZ experiments that require minimal to no coding expertise; fine-grained, real-time control of scripted experimental runs with a variety of robotic platforms; comprehensive data collection and logging; a platform-agnostic approach to support a wide range of robot hardware; and collaborative features that allow research teams to work together effectively. + +The design and development of HRIStudio were driven by the desirables enumerated above and described in \cite{OConnor2024}, our preliminary report. In this work, our main contribution is to demonstrate how a system such as the one we are developing has significant potential to make WoZ experiments easier to carry out, more rigorous, and ultimately reproducible. The remainder of this paper is structured as follows. In Section~\ref{sota}, we establish the context for our contribution through a review of recent literature. In Section~\ref{repchallenges}, we discuss the aspects of the WoZ paradigm that can lead to reproducibility challenges, and in Section~\ref{arch} we propose solutions to address these challenges. Subsequently, in Section~\ref{workflow}, we describe our solution to create a structure for the experimental workflow. Finally, in Section~\ref{conclusion}, we conclude the paper with a summary of our contributions, a reflection on the current state of our project, and directions for the future. + +\section{Assessment of the State-of-the-Art} +\label{sota} + +Over the last two decades, multiple frameworks to support and automate the WoZ paradigm have been reported in the literature. These frameworks can be categorized according to how they focus on four primary areas of interest, which we discuss below as we expose some of the most important contributions to the field. + +\subsection{Technical Infrastructure and Architectures} + +The foundation of any WoZ framework lies in its technical infrastructure and architectural design.
These elements determine not only the system's capabilities but also its longevity and adaptability to different research needs. Several frameworks have focused on providing robust technical infrastructures for WoZ experiments. + +\emph{Polonius}~\cite{Lu2011} utilizes the modular \emph{Robot Operating System} (ROS) platform as its foundation, offering a graphical user interface for wizards to define finite-state machine scripts that drive robot behaviors. A notable feature is its integrated logging system that eliminates the need for post-experiment video coding, allowing researchers to record human-robot interactions in real-time as they occur. Polonius was specifically designed to be accessible to non-programming collaborators, addressing an important accessibility gap in HRI research tools. + +\emph{OpenWoZ}~\cite{Hoffman2016} takes a different approach with its runtime-configurable framework and multi-client architecture, enabling evaluators to modify robot behaviors during experiments without interrupting the flow. This flexibility allows for dynamic adaptation to unexpected participant responses, though it requires programming expertise to create customized robot behaviors. The system's architecture supports distributed operation, where multiple operators can collaborate during an experiment. + +%% Runtime configuration- how much does this ruin reproducibility? + +\subsection{Interface Design and User Experience} + +The design of an interface for the wizard to control the execution of an experiment is important. The qualities of the interface can significantly impact both the quality of data collected and the longevity of the tool itself. \emph{NottReal}~\cite{Porcheron2020} exemplifies careful attention to interface design in its development for voice user interface studies. The system makes it easier for the wizard to play their role featuring tabbed lists of pre-scripted messages, slots for customization, message queuing capabilities, and comprehensive logging. Its visual feedback mechanisms mimic commercial voice assistants, providing participants with familiar interaction cues such as dynamic ``orbs'' that indicate the system is listening and processing states. + +\emph{WoZ4U}~\cite{Rietz2021} prioritizes usability with a GUI specifically designed to make HRI studies accessible to non-programmers. While its tight integration with Aldebaran's Pepper robot constrains generalizability, it demonstrates how specialized interfaces can lower barriers to entry for conducting WoZ studies with specific platforms. + +\subsection{Domain Specialization vs. Generalizability} + +A key tension in WoZ framework development exists between domain specialization and generalizability. Some systems are designed for specific types of interactions or robot platforms, offering deep functionality within a narrow domain. Others aim for broader applicability across various robots and interaction scenarios, potentially sacrificing depth of functionalities for breadth. + +Pettersson and Wik's~\cite{Pettersson2015} systematic review identified this tension as central to the longevity of WoZ tools, that is, their ability to remain operational despite changes in underlying technologies. Their analysis of 24 WoZ systems revealed that most general-purpose tools have a lifespan of only 2-3 years. 
Their own tool, Ozlab, achieved exceptional longevity (15+ years) through three factors: (1) a truly general-purpose approach from inception, (2) integration into HCI curricula ensuring institutional support, and (3) a flexible wizard interface design that adapts to specific experimental needs rather than forcing standardization. + +\subsection{Standardization Efforts and Methodological Approaches} + +The tension between specialization and generalizability has led to increased interest in developing standardized approaches to WoZ experimentation. Recent efforts have focused on developing standards for HRI research methodology and interaction specification. Porfirio et al.~\cite{Porfirio2023} proposed guidelines for an \emph{interaction specification language} (ISL), emphasizing the need for standardized ways to define and communicate robot behaviors across different platforms. Their work introduces the concept of \emph{Application Development Environments} (ADEs) for HRI and details how hierarchical modularity and formal representations can enhance the reproducibility of robot behaviors. These ADEs would provide structured environments for creating robot behaviors with varying levels of expressiveness while maintaining platform independence. + +This standardization effort addresses a critical gap identified in Riek's~\cite{Riek2012} systematic analysis of published WoZ experiments. Riek's work revealed concerning methodological deficiencies: only 24.1\% of papers clearly described their WoZ simulation as part of an iterative design process, only 5.4\% described wizard training procedures, and only 11\% constrained what the wizard could recognize. This lack of methodological transparency hinders reproducibility and, therefore, scientific progress in the field. + +Methodological considerations extend beyond wizard protocols to the fundamental approaches in HRI evaluation. Steinfeld et al.~\cite{Steinfeld2009} introduced a complementary framework to the traditional WoZ method, which they termed ``the Oz of Wizard.'' While WoZ uses human experimenters to simulate robot capabilities, the Oz of Wizard approach employs simplified human models to evaluate robot behaviors and technologies. Their framework systematically describes various permutations of real versus simulated components in HRI experiments, establishing that both approaches serve valid research objectives. They contend that technological advances in HRI constitute legitimate research even when using simplified human models rather than actual participants, provided certain conditions are met. This framework establishes an important lesson for the development of new WoZ platforms like HRIStudio, which must balance standardization with flexibility in experimental design. + +The interdisciplinary nature of HRI creates methodological inconsistencies that Belhassein et al.~\cite{Belhassein2019} examine in depth. Their analysis identifies recurring challenges in HRI user studies: limited participant pools, insufficient reporting of wizard protocols, and barriers to experiment replication. They note that self-assessment measures like questionnaires, though commonly employed, often lack proper validation for HRI contexts and may not accurately capture the participants' experiences. Our platform's design goals align closely with their recommendations to combine multiple evaluation approaches, thoroughly document procedures, and develop validated HRI-specific assessment tools.
Complementing these theoretical frameworks, Fraune et al.~\cite{Fraune2022} provide practical methodological guidance from an HRI workshop focused on study design. Their work organizes expert insights into themes covering study design improvement, participant interaction strategies, management of technical limitations, and cross-field collaboration. Key recommendations include pre-testing with pilot participants and ensuring robot behaviors are perceived as intended. Their discussion of participant expectations and the ``novelty effect'' in first-time robot interactions is particularly relevant for WoZ studies, as these factors can significantly influence experimental outcomes. + +\subsection{Challenges and Research Gaps} + +Despite these advances, significant challenges remain in developing accessible and rigorous WoZ frameworks that can remain usable over non-trivial periods of time. Many existing frameworks require significant programming expertise, constraining their usability by interdisciplinary teams. While technical capabilities have advanced, methodological standardization lags behind, resulting in inconsistent experimental practices. Few platforms provide comprehensive data collection and sharing capabilities that enable robust meta-analyses across multiple studies. We are challenged to create tools that provide sufficient structure for reproducibility while allowing the flexibility needed for the pursuit of answers to diverse research questions. + +HRIStudio aims to address these challenges with a platform that is robot-agnostic, methodologically rigorous, and eminently usable by those with less honed technological skills. By incorporating lessons from previous frameworks and addressing the gaps identified in this section, we designed a system that supports the full lifecycle of WoZ experiments, from design through execution to analysis, with an emphasis on usability, reproducibility, and collaboration. + +\section{Reproducibility Challenges in WoZ Studies} +\label{repchallenges} + +Reproducibility is a cornerstone of scientific research, yet it remains a significant challenge in human-robot interaction studies, particularly those centered on the Wizard-of-Oz methodology. Before detailing our platform design, we first examine the critical reproducibility issues that have informed our approach. + +The reproducibility challenges affecting many scientific fields are particularly acute in HRI research employing WoZ techniques. Human wizards may respond differently to similar situations across experimental trials, introducing inconsistency that undermines reproducibility and the integrity of collected data. Published studies often provide insufficient details about wizard protocols, decision-making criteria, and response timing, making replication by other researchers nearly impossible. Without standardized tools, research teams create custom setups that are difficult to recreate, and ad-hoc changes during experiments frequently go unrecorded. Different data collection methodologies and metrics further complicate cross-study comparisons. + +As previously discussed, Riek's~\cite{Riek2012} systematic analysis of WoZ research exposed significant methodological transparency issues in the literature. These documented deficiencies in reporting experimental procedures make replication challenging, undermining the scientific validity of findings and slowing progress in the field as researchers cannot effectively build upon previous work.
We have identified five key requirements for enhancing reproducibility in WoZ studies. First, standardized terminology and structure provide a common vocabulary for describing experimental components, reducing ambiguity in research communications. Second, wizard behavior formalization establishes clear guidelines for wizard actions that balance consistency with flexibility, enabling reproducible interactions while accommodating the natural variations in human-robot exchanges. Third, comprehensive data capture through time-synchronized recording of all experimental events with precise timestamps allows researchers to accurately analyze interaction patterns. Fourth, experiment specification sharing capabilities enable researchers to package and distribute complete experimental designs, facilitating replication by other teams. Finally, procedural documentation through automatic logging of experimental parameters and methodological details preserves critical information that might otherwise be omitted in publications. These requirements directly informed HRIStudio's architecture and design principles, ensuring that reproducibility is built into the platform rather than treated as an afterthought. + +\section{The Design and Architecture of HRIStudio} +\label{arch} + +Informed by our analysis of both existing WoZ frameworks and the reproducibility challenges identified in the previous section, we have developed several guiding design principles for HRIStudio. Our primary goal is to create a platform that enhances the scientific rigor of WoZ studies while remaining accessible to researchers with varying levels of technical expertise. We have been driven by the goal of prioritizing ``accessibility'' in the sense that the platform should be usable by researchers without deep robot programming expertise so as to lower the barrier to entry for HRI studies. Through abstraction, users can focus on experimental design without getting bogged down by the technical details of specific robot platforms. Comprehensive data management enables the system to capture and store all generated data, including logs, audio, video, and study materials. To facilitate teamwork, the platform provides collaboration support through multiple user accounts, role-based access control, and data sharing capabilities that enable effective knowledge transfer while restricting access to sensitive data. Finally, methodological guidance is embedded throughout the platform, directing users toward scientifically sound practices through its design and documentation. These principles directly address the reproducibility requirements identified earlier, particularly the need for standardized terminology, wizard behavior formalization, and comprehensive data capture. + +We have implemented HRIStudio as a modular web application with explicit separation of concerns in accordance with these design principles. Structuring the application into client and server components creates a clear separation of responsibilities and functionalities. While the client exposes interactive elements to users, the server handles data processing, storage, and access control. This architecture provides a foundation for implementing data security through role-based interfaces in which different members of a team have tailored views of the same experimental session. + +As shown in Figure~\ref{fig:system-architecture}, the architecture consists of three main functional layers that work in concert to provide a comprehensive experimental platform.
The \emph{User Interface Layer} provides intuitive, browser-based interfaces for three components: an \emph{Experiment Designer} with visual programming capabilities for specifying experimental details, a \emph{Wizard Interface} that grants real-time control over the execution of a trial, and a \emph{Playback \& Analysis} module that supports data exploration and visualization. + +The \emph{Data Management Layer} provides database functionality to organize, store, and retrieve experiment definitions, metadata, and media assets generated throughout an experiment. Since HRIStudio is a web-based application, users can access this database remotely through an access control system that defines roles such as \emph{researcher}, \emph{wizard}, and \emph{observer}, each with appropriate capabilities and constraints. This fine-grained access control protects sensitive participant data while enabling appropriate sharing within research teams, with flexible deployment options either on-premise or in the cloud depending on one's needs. The layer enables collaboration among the parties involved in conducting a user study while keeping information compartmentalized and secure according to each party's requirements. + +The third major component is the \emph{Robot Integration Layer}, which is responsible for translating our standardized abstractions for robot control to the specific commands accepted by different robot platforms. HRIStudio relies on the assumption that at least one of three different mechanisms is available for communication with a robot: a RESTful API, standard communication structures provided by ROS, or a plugin that is custom-made for that platform. The \emph{Robot Integration Layer} serves as an intermediary between the \emph{Data Management Layer} and \emph{External Systems} such as robot hardware, external sensors, and analysis tools. This layer allows the main components of the system to remain ``robot-agnostic'' pending the identification or the creation of the correct communication method and changes to a configuration file. + +% System architecture figure +\begin{figure}[ht] + \centering + \includegraphics[width=1\columnwidth]{assets/diagrams/system-architecture.pdf} + \caption{HRIStudio's three-layer architecture.} + \label{fig:system-architecture} +\end{figure} + +In order to facilitate the deployment of our application, we leverage containerization with Docker to ensure that every component of HRIStudio is deployed together with its system dependencies across different environments. This is an important step toward extending the longevity of the tool and toward guaranteeing that experimental environments remain consistent across different platforms. Furthermore, it allows researchers to share not only experimental designs, but also their entire execution environment should a third party wish to reproduce an experimental study. + +\section{Experimental Workflow Support} +\label{workflow} + +The experimental workflow in HRIStudio directly addresses the reproducibility challenges identified in Section~\ref{repchallenges} by providing standardized structures, explicit wizard guidance, and comprehensive data capture. This section details how the platform's workflow components implement solutions for each key reproducibility requirement. + +\subsection{Embracing a Hierarchical Structure for WoZ Studies} + +HRIStudio defines its own standard terminology with a hierarchical organization of the elements in WoZ studies as follows.
+\begin{itemize} +\item At the top level, an experiment designer defines a \emph{study} element, which comprises one or more \emph{experiment} elements. + +\item Each \emph{experiment} specifies the experimental protocol for a discrete subcomponent of the overall study and comprises one or more \emph{step} elements, each representing a distinct phase in the execution sequence. The \emph{experiment} functions as a parameterized template. + +\item Defining all the parameters in an \emph{experiment}, one creates a \emph{trial}, which is an executable instance involving a specific participant and conducted under predefined conditions. The data generated by each \emph{trial} is recorded by the system so that later one can examine how the experimental protocol was applied to each participant. The distinction between experiment and trial enables a clear separation between the abstract protocol specification and its concrete instantiation and execution. + +\item Each \emph{step} encapsulates instructions that are meant either for the wizard or for the robot thereby creating the concept of ``type'' for this element. The \emph{step} is a container for a sequence of one or more \emph{action} elements. + +\item Each \emph{action} represents a specific, atomic task for either the wizard or the robot, according to the nature of the \emph{step} element that contains it. An \emph{action} for the robot may represent commands for input gathering, speech, waiting, movement, etc., and may be configured by parameters specific for the \emph{trial}. +\end{itemize} + +Figure~\ref{fig:experiment-architecture} illustrates this hierarchical structure through a fictional study. In the diagram, we see a ``Social Robot Greeting Study'' containing an experiment with a specific robot platform, steps containing actions, and a trial with a participant. Note that each trial event is a traceable record of the sequence of actions defined in the experiment. HRIStudio enables researchers to collect the same data across multiple trials while adhering to consistent experimental protocols and recording any reactions the wizard may inject into the process. + +% Experiment architecture figure showing hierarchical organization +\begin{figure}[ht] + \centering + \includegraphics[width=1\columnwidth]{assets/diagrams/experiment-architecture.pdf} + \caption{Hierarchical organization of a sample user study in HRIStudio.} + \label{fig:experiment-architecture} +\end{figure} + +This standardized hierarchical structure creates a common vocabulary for experimental elements, eliminating ambiguity in descriptions and enabling clearer communication among researchers. Our approach aligns with the guidelines proposed by Porfirio et al.~\cite{Porfirio2023} for an HRI specification language, particularly in regards to standardized formal representations and hierarchical modularity. Our system uses the formal study definitions to create comprehensive procedural documentation requiring no additional effort by the researcher. Beyond this documentation, a study definition can be shared with other researchers for the faithful reproduction of experiments. + +Figure~\ref{fig:study-details} shows how the system displays the data of an experimental study in progress. 
In this view, researchers can inspect summary data about the execution of a study and its trials, find a list of human subjects (``participants'') and go on to see data and documents associated with them such as consent forms, find the list of teammates collaborating in this study (``members''), read descriptive information on the study (``metadata''), and inspect an audit log that records work that has been done toward the completion of the study (``activity''). + +% Study details figure showing hierarchical organization +\begin{figure}[t] + \centering +\includegraphics[width=1\columnwidth]{assets/mockups/study-details.png} +\caption{Summary view of system data on an example study.} +\label{fig:study-details} +\end{figure} + +\subsection{Collaboration and Knowledge Sharing} + +Experiments are reproducible when they are thoroughly documented and when that documentation is easily disseminated. To support this, HRIStudio includes features that enable collaborative experiment design and streamlined sharing of assets generated during experimental studies. The platform provides a dashboard that offers an overview of project status, details about collaborators, a timeline of completed and upcoming trials, and a list of pending tasks. + +As previously noted, the \emph{Data Management Layer} incorporates a role-based access control system that defines distinct user roles aligned with specific responsibilities within a study. This role structure enforces a clear separation of duties and enables fine-grained, need-to-know access to study-related information. This design supports various research scenarios, including double-blind studies where certain team members have restricted access to information. The pre-defined roles are as follows: +\begin{itemize} +\item \emph{Administrator}, a ``super user'' who can manage the installation and the configuration of the system, +\item \emph{Researcher}, a user who can create and configure studies and experiments, +\item \emph{Observer}, a user role with read-only access, allowing inspection of experiment assets and real-time monitoring of experiment execution, and +\item \emph{Wizard}, a user role that allows one to execute an experiment. +\end{itemize} +For maximum flexibility, the system allows additional roles with different sets of permissions to be created by the administrator as needed. + +The collaboration system allows multiple researchers to work together on experiment designs, review each other's work, and build shared knowledge about effective methodologies. This approach also enables the packaging and dissemination of complete study materials, including experimental designs, configuration parameters, collected data, and analysis results. By making all aspects of the research process shareable, HRIStudio facilitates replication studies and meta-analyses, enhancing the cumulative nature of scientific knowledge in HRI. + +\subsection{Visual Experiment Design} + +HRIStudio implements an \emph{Experiment Development Environment} (EDE) that builds on Porfirio et al.'s~\cite{Porfirio2023} concept of Application Development Environment. Figure~\ref{fig:experiment-designer} shows how this EDE is implemented as a visual programming, drag-and-drop canvas for sequencing steps and actions. In this example, we see a progression of steps (``Welcome'' and ``Robot Approach'') where each step is customized with specific actions. 
Robot actions issue abstract commands, which are then translated into platform-specific concrete commands by components known as \emph{plugins}; these plugins are tailored to each type of robot and are discussed later in this section.
+
+% Experiment designer figure
+\begin{figure}[ht]
+    \centering
+\includegraphics[width=1\columnwidth]{assets/mockups/experiment-designer.png}
+\caption{View of experiment designer.}
+\label{fig:experiment-designer}
+\end{figure}
+
+Our EDE was inspired by Choregraphe~\cite{Pot2009} and enables researchers without coding expertise to build the steps and actions of an experiment visually as flow diagrams. The robot control components shown in the interface are automatically added to the inventory of options according to the experiment configuration, which specifies the robot to be used. We expect that this will make experiment design more accessible to those with limited programming experience while maintaining the expressivity required for sophisticated studies. Likewise, to support those unfamiliar with best practices for WoZ studies, the EDE offers contextual help and documentation to keep experiment designs on track.
+
+\subsection{The Wizard Interface and Experiment Execution}
+
+We built into HRIStudio an interface for the wizard to execute experiments and to interact with them in real time. In the development of this component, we drew on lessons from the work of Pettersson and Wik~\cite{Pettersson2015} on WoZ tool longevity. From their work, we learned that a significant factor behind the short lifespan of WoZ tools is the trap of a fixed, one-size-fits-all wizard interface. Following the principle embodied in their Ozlab, we built into our framework functionality that allows the wizard interface to be adapted to the specific needs of each experiment. Researchers can configure wizard controls and visualizations for their specific study, while keeping other elements of the framework unchanged.
+
+Figure~\ref{fig:experiment-runner} shows the wizard interface for the fictional experiment ``Initial Greeting Protocol.'' This view shows the current step with an instruction for the wizard that corresponds to an action they will carry out. These instructions are presented one at a time so as not to overwhelm the wizard, but the ``View More'' button reveals the complete experimental script when needed. The view also includes a window for the captured video feed showing the robot and the participant, a timestamped log of recent events, and various interaction controls for unscripted actions that can be applied in real time (``quick actions''). By following these incrementally presented instructions, the wizard is guided to execute the experimental procedure consistently across trials with different participants. Users in the \emph{observer} role are presented with a similar view for live monitoring, stripped of the controls that might interfere with the execution of an experiment.
+
+When a wizard initiates an action during a trial, the system executes a three-step process to implement the command. First, it translates the high-level action into specific API calls as defined by the relevant plugin, converting abstract experimental actions into concrete robot instructions. Next, the system routes these calls to the robot's control system through the appropriate communication channels. Finally, it processes any feedback received from the robot, logs this information in the experimental record, and updates the experiment state accordingly. This process ensures reliable communication between the wizard interface and the physical robot while maintaining comprehensive records of all interactions.
+
+% Experiment runner figure
+\begin{figure}[ht]
+    \centering
+\includegraphics[width=1\columnwidth]{assets/mockups/experiment-runner.png}
+\caption{View of HRIStudio's wizard interface during experiment execution.}
+\label{fig:experiment-runner}
+\end{figure}
+
+\subsection{Robot Platform Integration}
+\label{plugin-store}
+
+The three-step process described above relies on a modular, two-tier system for communication between HRIStudio and each specific robot platform. The EDE offers an experiment designer a number of pre-defined action components representing common tasks and behaviors such as robot movements, speech synthesis, and sensor controls. Although these components accept parameters for the configuration of each action, they exist at a higher level of abstraction than the commands understood by any particular robot. When actions are executed, the system translates these abstractions so that they match the commands accepted by the robot selected for the experiment. This translation is achieved by a \emph{plugin} for the specific robot, which serves as the communication channel between HRIStudio and the physical robot.
+
+Each robot plugin contains detailed action definitions with multiple components: action identifiers and metadata such as title, description, and a graphical icon to be presented in the EDE. Additionally, the plugin is programmed with parameter schemas including data types, validation rules, and default values to ensure proper configuration. For robots running ROS2, we support mappings that connect HRIStudio to the robot middleware. This integration approach ensures that HRIStudio can be used with any robot for which a plugin has been built.
+
+As shown in Figure~\ref{fig:plugins-store}, we have developed a \emph{Plugin Store} to aggregate the plugins available for an HRIStudio installation. Currently, it includes a plugin specifically for the TurtleBot3 Burger (illustrated in the figure) as well as a template to support the creation of additional plugins for other robots. Over time, we anticipate that the Plugin Store will expand to include a broader range of plugins, supporting robots of diverse types. To let users of the platform know what to expect of the plugins in the store, we have defined three trust levels:
+
+\begin{itemize}
+\item \emph{Official} plugins will have been created and tested by HRIStudio developers.
+\item \emph{Verified} plugins will have different provenance, but will have undergone a validation process.
+\item \emph{Community} plugins will have been developed by third parties but will not yet have been validated.
+\end{itemize}
+
+The Plugin Store provides access to the version control \emph{repositories} used in the development of each plugin, allowing precise tracking of which plugin versions are used in each experiment. This enables community contributions while maintaining reproducibility, since the exact plugin versions behind any given experiment remain documented.
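+
+For illustration, the listing below sketches what a single action definition inside such a plugin might look like. This TypeScript fragment is a minimal, hypothetical example: the field names, the parameter schema, and the \texttt{ros2} mapping are illustrative assumptions rather than the exact format used by HRIStudio plugins.
+
+\begin{verbatim}
+// Hypothetical sketch of one plugin action definition;
+// field names are illustrative, not the final schema.
+interface ActionDefinition {
+  id: string;            // action identifier
+  title: string;         // label shown in the EDE
+  description: string;
+  icon: string;          // icon shown in the EDE
+  parameters: {
+    name: string;
+    type: "number" | "string" | "boolean";
+    defaultValue?: number | string | boolean;
+    min?: number;        // simple validation rule
+    max?: number;
+  }[];
+  ros2?: {               // optional ROS 2 mapping
+    topic: string;
+    messageType: string;
+  };
+}
+
+// Example: an abstract "move forward" action that a
+// TurtleBot3 Burger plugin could translate into
+// velocity commands for the physical robot.
+const moveForward: ActionDefinition = {
+  id: "locomotion.move_forward",
+  title: "Move Forward",
+  description: "Drive straight for a given duration.",
+  icon: "arrow-up",
+  parameters: [
+    { name: "speed", type: "number",
+      defaultValue: 0.1, min: 0, max: 0.22 },
+    { name: "duration", type: "number",
+      defaultValue: 2, min: 0, max: 30 },
+  ],
+  ros2: { topic: "/cmd_vel",
+          messageType: "geometry_msgs/msg/Twist" },
+};
+\end{verbatim}
+
+Keeping action definitions declarative in this way lets the EDE present the same inventory entry regardless of the robot in use, while the plugin alone decides how the abstract parameters are mapped onto platform-specific commands.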
+ +% Plugin store figure +\begin{figure}[ht] + \centering +\includegraphics[width=1\columnwidth]{assets/mockups/plugins-store.png} +\caption{The Plugin Store for plugin selection.} +\label{fig:plugins-store} +\end{figure} + +\subsection{Comprehensive Data Capture and Analysis} + +We have designed HRIStudio to create detailed logs of experiment executions and to capture and place in persistent storage all the data generated during each trial. The system keeps timestamped records of all executed actions and experimental events so that it is able to create an accurate timeline of the study. It collects robot sensor data including position, orientation, and various sensor readings that provide context about the robot's state throughout the experiment. + +The platform records audio and video of interactions between a robot and participant, enabling post-hoc analysis of verbal and non-verbal behaviors. The system also records wizard decisions and interventions, including any unplanned actions that deviate from the experimental protocol. Finally, it saves with the experiment the observer notes and annotations, capturing qualitative insights from researchers monitoring the study. Together, these synchronized data streams provide a complete record of experimental sessions. + +Experimental data is stored in structured formats to support long-term preservation and seamless integration with analysis tools. Sensitive participant data is encrypted at the database level to safeguard participant privacy while retaining comprehensive records for research use. To facilitate analysis, the platform allows trials to be studied with ``playback'' functionalities that allow one to review the steps in a trial and to annotate any significant events identified. + +\section{Conclusion and Future Directions} +\label{conclusion} + +Although Wizard-of-Oz (WoZ) experiments are a powerful method for developing human-robot interaction applications, they demand careful attention to procedural details. Trials involving different participants require wizards to consistently execute the same sequence of events, accurately log any deviations from the prescribed script, and systematically manage all assets associated with each participant. The reproducibility of WoZ experiments depends on the thoroughness of their documentation and the ease with which their experimental setup can be disseminated. + +To support these efforts, we drew on both existing literature and our own experience to develop HRIStudio, a modular platform designed to ease the burden on wizards while enhancing the reproducibility of experiments. \mbox{HRIStudio} maintains detailed records of experimental designs and results, facilitating dissemination and helping third parties interested in replication. The platform offers a hierarchical framework for experiment design and a visual programming interface for specifying sequences of events. By minimizing the need for programming expertise, it lowers the barrier to entry and broadens access to WoZ experimentation. + +HRIStudio is built using a variety of web application and database technologies, which introduce certain dependencies for host systems. To simplify deployment, we are containerizing the platform and developing comprehensive, interface-integrated documentation to guide users through installation and operation. Our next development phase focuses on enhancing execution and analysis capabilities, including advanced wizard guidance, dynamic adaptation, and improved real-time feedback. 
We are also implementing playback functionality for reviewing synchronized data streams and expanding integration with hardware commonly used in HRI research.
+
+Ongoing engagement with the research community has played a key role in shaping HRIStudio. Feedback from the reviewers of our RO-MAN 2024 late-breaking report and from conference participants directly influenced our design choices, particularly around integration with existing research infrastructures and workflows. We look forward to creating more systematic opportunities to engage researchers to guide and refine our development as we prepare for an open beta release.
+
+\bibliography{subfiles/refs}
+\bibliographystyle{plain}
+
+\end{document}
diff --git a/context/proposal.tex b/context/proposal.tex
new file mode 100644
index 0000000..ef239a7
--- /dev/null
+++ b/context/proposal.tex
@@ -0,0 +1,141 @@
+% Thesis Proposal
+%\documentclass{buthesis_p} %Default is author-year citation style
+\documentclass[numbib]{buthesis_p} %Gives numerical citation style
+%\documentclass[twoadv, numbib]{buthesis_p} %Allows entry of second advisor
+\usepackage{graphics} %Select graphics package
+%\usepackage{graphicx} %
+\usepackage{amsthm} %Add other packages as necessary
+\usepackage{setspace} %For double spacing
+\usepackage{geometry} %For margin control
+\usepackage{tabularx}
+\geometry{
+    left=1in,
+    right=1in,
+    top=1in,
+    bottom=1in
+}
+\begin{document}
+\butitle{A Web-Based Wizard-of-Oz Platform for Collaborative and Reproducible Human-Robot Interaction Research}
+\author{Sean O'Connor}
+\degree{Bachelor of Science}
+\department{Computer Science}
+\adviser{L. Felipe Perrone}
+%\adviserb{Jane Doe} %Second adviser if necessary
+\secondreader{Brian King}
+\maketitle
+
+\doublespacing
+
+\section{Introduction}
+
+To build the social robots of tomorrow, researchers must find ways to convincingly simulate them today. The process of designing and optimizing interactions between human and robot is essential to the Human-Robot Interaction (HRI) field, a discipline dedicated to ensuring these technologies are safe, effective, and accepted by the public. Yet, conducting rigorous research in social robotics remains hindered by complex technical requirements and inconsistent methodologies.
+
+In a typical social robotics interaction, a robot operates autonomously based on pre-programmed behaviors. However, human interaction can be unpredictable. When a robot fails to respond appropriately to a social cue, the interaction can degrade, causing the human partner to lose trust or disengage.
+
+To overcome the limitations of pre-programmed autonomy, researchers often use the Wizard-of-Oz (WoZ) technique to test prototypes of robot behaviors before the underlying technology is fully developed. In this method, a human operator (the ``wizard'') observes the interaction from a separate room via cameras and microphones, controlling the robot's actions in real time. To the person interacting with the robot, it appears fully autonomous, creating a convincing simulation that is helpful for rapid prototyping and testing of interaction designs.
+
+Despite its conceptual simplicity, conducting WoZ research presents two challenges. The first is a technical barrier that prevents many non-programmers, such as experts in psychology or sociology, from conducting their own studies. This accessibility problem is compounded by a second challenge: a fragmented hardware landscape.
Because different labs use different robot platforms, researchers often must build their own custom control tools for each study. These bespoke systems are rarely shared, making it difficult for scientists to replicate and build upon each other's findings, which hinders the development of a reliable and verifiable body of knowledge.
+
+To address these challenges, I am developing HRIStudio, a web-based platform for designing, executing, and analyzing WoZ experiments in social robotics. I argue that by lowering technical barriers and providing a common experimental platform, a web-based framework can significantly improve both the disciplinary accessibility and scientific reproducibility of research in social robotics.
+
+\section{Context}
+
+The challenges of disciplinary accessibility and scientific reproducibility in WoZ research have been explored in the HRI literature. In a foundational systematic review of 54 HRI studies, Riek \cite{Riek2012} found a widespread lack of methodological consistency, noting that very few researchers reported standardized wizard training or measurement of wizard error. This stems from a landscape of specialized, ``in-house'' systems, in which individual labs develop their own custom software for each study; these tools are rarely shared with other researchers. This forces labs to constantly reinvent control interfaces, hindering the replication and verification of scientific findings.
+
+In response, the research community has developed several specialized WoZ platforms. A first wave of tools focused on creating powerful, flexible architectures. Polonius was designed as a robust interface for robotics engineers to create experiments for their non-programmer collaborators, featuring an integrated logging system to streamline data analysis \cite{Lu2011}. Similarly, OpenWoZ introduced an adaptable framework that used web protocols to allow different control interfaces to easily connect to the robot, empowering technical users to create deviations from the pre-programmed interaction scripts in real time \cite{Hoffman2016}. While architecturally sophisticated, these tools still required significant technical expertise to set up and configure, keeping the accessibility barrier high.
+
+A second wave of tools shifted focus to prioritize usability for a broader audience. WoZ4U was explicitly designed to be an ``easy-to-use tool for the Pepper robot'' that makes it easier for ``non-technical researchers to conduct Wizard-of-Oz experiments'' \cite{Rietz2021}. WoZ4U successfully lowered the accessibility barrier with an intuitive graphical interface. However, this usability was achieved by tightly coupling the software to a single type of robot. This approach creates a significant risk to platform longevity. As Pettersson and Wik note in a review of generic WoZ tools, systems that are too specialized often fall out of use as hardware becomes obsolete \cite{Pettersson2015}. This trade-off between capability, usability, and sustainability reveals a critical gap in the literature: no available tool is simultaneously flexible, accessible, and able to endure over time.
+
+In response to this gap, I designed HRIStudio by combining an intuitive web-based interface with a flexible architecture that allows it to support a wide range of current and future robots. The result is a single, sustainable platform that is both powerful enough for complex experiments and accessible enough for interdisciplinary research teams.
+
+\section{Description}
+
+I created HRIStudio as an integrated, web-based platform designed to manage the entire lifecycle of a WoZ experiment in social robotics: from interaction design, through live execution, to final analysis. I designed the platform around three core principles: making research accessible to non-programmers, ensuring that experiments are reproducible, and providing a time-enduring tool for the HRI community.
+
+To solve the challenge of accessibility, I provide researchers with tools to visually map out an experiment's flow, much like creating a storyboard for a film. This intuitive approach allows a social scientist, for example, to design a complex human-robot interaction without writing a single line of code. The platform provides different interfaces to facilitate collaboration between the members of a team: a researcher gets a design canvas to build the study, the wizard gets a streamlined control panel to run the experiment, and an observer gets a tool for taking timestamped notes.
+
+To enable experiment reproducibility, I designed HRIStudio to mitigate key methodological challenges inherent in WoZ research. The first challenge is inconsistent wizard behavior; a tired or distracted human operator can unintentionally introduce errors, compromising a study's validity. HRIStudio's wizard interface acts as a ``smart co-pilot,'' guiding the operator through the pre-designed script with clear prompts for what to do and say next. This minimizes human error and increases the likelihood of a standardized experience for every participant. The second challenge lies in the complex task of managing experimental data. A typical study generates multiple streams of data that are difficult to synchronize manually, including video, audio, robot sensor logs, and wizard actions. The platform acts as a central recorder, automatically capturing and timestamping every data stream into a single, unified timeline. This simplifies analysis and allows another researcher to ``replay'' the entire experiment to verify and analyze the original findings.
+
+Finally, to ensure the platform will be a time-enduring tool for the community, I designed the system to be robot-agnostic. Rather than being constrained to operate with a single kind of robot, the platform uses a system of standardized ``connectors,'' like a universal remote programmable for any television. This flexible architecture ensures that the platform will remain a valuable tool for the community long after any specific robot becomes obsolete, providing a stable, lasting foundation for future research.
+
+\section{Significance}
+
+This work is significant because it accelerates the foundational research needed to deploy social robots in critical societal roles, such as providing companionship for the elderly in assisted living facilities or acting as classroom aides for children with autism. My tool directly enables the rigorous, human-centered research on which the success and public acceptance of these technologies depend.
+
+The primary significance of HRIStudio is its potential to lower the barrier to entry for HRI research. By allowing for visual programming, the platform removes technical barriers that have traditionally limited this research to engineering disciplines. It invites the domain experts who should be leading these studies to design and execute their own experiments, leading to better research questions and more effective robot behaviors.
+
+My goal with HRIStudio is to elevate the scientific rigor of the field. By promoting a common structure for experiment design and standardized support for data collection, HRIStudio allows researchers to more easily replicate, verify, and build upon each other's work, supporting the ongoing effort to make HRI a more cumulative science.
+
+Ultimately, my work contributes a piece of critical, open-source infrastructure to the HRI community that directly addresses the documented challenges of accessibility, reproducibility, and sustainability. Beyond its immediate utility, the platform's architecture also serves as a tangible blueprint for web-based scientific tools, demonstrating a successful model for bridging the gap between an intuitive user interface and the complexity of controlling live robotic hardware.
+
+The foundational concepts of this work have already been reported in two peer-reviewed publications at the IEEE International Conference on Robot and Human Interactive Communication \cite{OConnor2024, OConnor2025}. This work represents the culmination of that research, delivering the platform's full implementation, a critical evaluation by real users, and its release as a tool for the community.
+
+\section{Independent Contribution}
+
+This work builds upon a foundational collaboration with my adviser that led to two publications and the initial high-level design of the HRIStudio platform. For this work, my primary intellectual contribution is the independent execution of the project; I am the sole developer responsible for the complete software implementation and for the design and execution of the user study.
+
+\section{Methods}
+
+The foundational concepts and early architecture of HRIStudio have been established in prior work \cite{OConnor2024, OConnor2025}. The primary goal of this work is to translate that foundation into a complete, stable, and usable platform, and then rigorously evaluate its success. Therefore, the work is divided into two key phases: first, the final implementation of the platform's core features as outlined in the project timeline, and second, a formal user study to validate its impact on experimental consistency and efficiency.
+
+The study will involve recruiting approximately 10-12 participants from non-engineering fields (e.g., Psychology, Education) who have experience designing experiments but little to no programming background. The core task will be to recreate a well-documented experiment from the HRI literature using the NAO6 robot. To ensure a level playing field, all participants will first attend a workshop on the software package to which they are assigned. The participants will be divided into two groups: a control group will use the manufacturer-provided Choregraphe software \cite{Pot2009}, and an experimental group will use HRIStudio.
+
+My evaluation will focus on two primary outcomes. The first is methodological consistency: I will quantitatively assess the accuracy of each group's recreated experiment by comparing their final implementation against the original study's protocol. This will involve a detailed scoring rubric that measures discrepancies in robot behaviors, trigger logic, and dialogue. The second outcome is user experience: after the task, participants will complete a survey to provide qualitative and quantitative feedback on their assigned software.
This mixed-methods approach will provide robust evidence to assess HRIStudio's effectiveness in making HRI research more accessible and reproducible.
+
+A detailed project schedule, outlining all key milestones and deadlines, is provided in Appendix A.
+
+\section{Conclusion}
+
+This work addresses a significant bottleneck in HRI research. By creating HRIStudio, a web-based platform for Wizard-of-Oz experimentation, this work confronts the interconnected challenges of disciplinary accessibility and scientific reproducibility. The platform provides publicly available infrastructure that empowers non-technical domain experts to conduct rigorous HRI studies. Ultimately, a common, accessible, and sustainable tool does more than just simplify experiments. It fosters a more collaborative and scientifically robust approach to the entire field of HRI.
+\newpage
+\bibliography{refs}
+\bibliographystyle{plain}
+
+\newpage
+\appendix
+\section*{Appendix A: Project Timeline}
+\label{app:timeline}
+
+\begin{table}[h!]
+\centering
+\renewcommand{\arraystretch}{1.5}
+\begin{tabularx}{\textwidth}{|l|X|}
+\hline
+\textbf{Timeframe} & \textbf{Milestones \& Key Tasks} \\
+\hline
+\multicolumn{2}{|l|}{\textbf{Fall 2025: Development and Preparation}} \\
+\hline
+September & Finalize and submit this proposal (Due: Sept. 20).

+Submit IRB application for the user study. \\
+\hline
+Oct -- Nov & Complete final implementation of core HRIStudio features.

+Conduct extensive testing and bug-fixing to ensure platform stability. \\
+\hline
+December & Finalize all user study materials (consent forms, protocols, etc.).

+Begin recruiting participants. \\
+\hline
+\multicolumn{2}{|l|}{\textbf{Spring 2026: Execution, Analysis, and Writing}} \\
+\hline
+Jan -- Feb & Upon receiving IRB approval, conduct all user study sessions. \\
+\hline
+March & Analyze all data from the user study.

+Draft Results and Discussion sections.

+Submit ``Intent to Defend'' form (Due: March 1). \\
+\hline
+April & Submit completed thesis draft to the defense committee (Due: April 1).

+Prepare for and complete the oral defense (Due: April 20). \\
+\hline
+May & Incorporate feedback from the defense committee.

+Submit the final, approved thesis by the university deadline. \\
+\hline
+\end{tabularx}
+\end{table}
+
+\end{document}
diff --git a/context/study_draft.md b/context/study_draft.md
new file mode 100644
index 0000000..0d4bfa4
--- /dev/null
+++ b/context/study_draft.md
@@ -0,0 +1,145 @@
+Study Protocol Draft
+
+# Study Information
+
+**Principal Investigator:** Sean O'Connor
+**Thesis Adviser:** L. Felipe Perrone
+**Robot Platform:** Aldebaran NAO6
+**Date:** January 23, 2026
+
+# **1. Experimental Overview**
+
+**Objective:** To quantitatively and qualitatively compare two Wizard-of-Oz (WoZ) interfaces, **Choregraphe** (Industry Standard) and **HRIStudio** (Proposed Platform), to determine their impact on:
+
+1. **Disciplinary Accessibility:** Can non-technical domain experts successfully design a robot interaction?
+2. **Scientific Reproducibility:** Does the tool minimize human error and data loss during experiment execution?
+
+## **Experimental Design:**
+
+- **Method:** Between-Subjects User Study.
+- **Participants:** N=20 (10 Wizards, 10 Subjects).
+    - *Wizards:* Faculty or Graduate Students from non-Computer Science disciplines (e.g., Psychology, Education).
+    - *Subjects:* Undergraduate students (confederates or recruited volunteers).
+- **Conditions:** + - **Group A (Control):** Choregraphe + - **Group B (Experimental):** HRIStudio + +# **2. Task Specification ("Paper Spec")** + +*This is the document handed to the Wizard at the start of the Design Phase. It serves as the "Independent Variable" control—every participant attempts to build this exact scenario.* + +## **Scenario: "The Geography Quiz Proctor"** + +**Goal:** You must program the robot to act as a quiz proctor for a single student. The robot will ask a question, wait for an answer, and provide feedback based on whether the student is right or wrong. + +### **Script & Logic Flow:** + +1. **Start State:** + - Robot is standing. + - Robot LED eyes are Blue. +2. **Interaction Step 1: Introduction** + - **Robot Speech:** "Hello. I am the Quiz Bot. Please sit down so we can begin.” + - **Action:** Robot waves its right hand. +3. **Interaction Step 2: The Question** + - **Robot Speech:** "Question One. What is the capital city of Pennsylvania?" + - **Wizard Action:** Wait for the student to speak. +4. **Interaction Step 3: Branching Logic (The Wizard's Choice)** + - *If Student says "Harrisburg" (Correct):* + - **Robot Speech:** "That is correct. Well done." + - **Action:** Robot nods head. + - *If Student says anything else (Incorrect):* + - **Robot Speech:** "I am sorry, that is incorrect. The answer is Harrisburg." + - **Action:** Robot shakes head. +5. **Interaction Step 4: Conclusion** + - **Robot Speech:** "Thank you for participating. Goodbye." + - **System Action:** Save the log of the interaction. + +# **3. Study Procedure (Timeline: 75 Minutes)** + +## **Phase 1: Training (15 Minutes)** + +- Participant (Wizard) is seated at the control computer. +- **Script:** "You have 15 minutes to learn this software. I will show you how to: 1) Make the robot speak, 2) Make the robot move, and 3) Create a 'Trigger' button." +- *Researcher demonstrates these three core functions only.* + +## **Phase 2: Design Challenge (30 Minutes)** + +- **Script:** "Here is the 'Paper Spec' for the Geography Quiz. Your task is to implement this interaction in the software. You have 30 minutes. If you get completely stuck, you may ask for a hint, but I will record it." +- **Data Collection:** + - Start Timer. + - Count **Help Requests**. + - Stop Timer when they say "I'm finished" or when 30 mins elapses. + +## **Phase 3: The Live Trial (15 Minutes)** + +- A "Student Subject" enters the room. +- **Script:** "Please use the interface you just built to run the experiment with this student. Do your best to follow the script exactly." +- **Data Collection:** + - Video record the robot and the student. + - Researcher notes **Wizard Errors** (e.g., Wizard clicks 'Correct' when student was 'Wrong'). + +## **Phase 4: Data Hand-off & Survey (15 Minutes)** + +- **Task:** Ask the Wizard to export the video/logs from the session. +- **Survey:** Administer the Post-Study Questionnaire. + +# **4. Measurement Rubrics** + +*These rubrics generate the quantitative data for the Results section.* + +## **A. Design Fidelity Score (Max 10 Points)** + +| Component | Criteria | Points | +| --- | --- | --- | +| Completeness | All 4 steps of the script are present in the interface. | 0-3 | +| Logic | Interface allows for *both* Correct and Incorrect paths (Branching). | 0-3 | +| Behaviors | Robot speech and gestures match the Paper Spec exactly. | 0-2 | +| Usability | The interface is labeled clearly (e.g., buttons named "Correct" vs "Button 1"). | 0-2 | +| Total | | / 10 | + +## **B. 
Execution Reliability Score (Max 10 Points)**
+
+| Component | Criteria | Points |
+| --- | --- | --- |
+| Timing | Robot responded to the student within 3 seconds. | 0-3 |
+| Accuracy | Wizard triggered the correct branch (Correct vs Incorrect). | 0-3 |
+| Stability | No software crashes, freezes, or connection losses. | 0-2 |
+| Fidelity | Robot did not improvise or say things outside the script. | 0-2 |
+| Total | | / 10 |
+
+# **5. Post-Study Questionnaire**
+
+## **Part 1: System Usability Scale (SUS)**
+
+*Participants rate the following on a scale of 1 (Strongly Disagree) to 5 (Strongly Agree).*
+
+1. I think that I would like to use this system frequently.
+2. I found the system unnecessarily complex.
+3. I thought the system was easy to use.
+4. I think that I would need the support of a technical person to be able to use this system.
+5. I found the various functions in this system were well integrated.
+6. I thought there was too much inconsistency in this system.
+7. I would imagine that most people would learn to use this system very quickly.
+8. I found the system very cumbersome to use.
+9. I felt very confident using the system.
+10. I needed to learn a lot of things before I could get going with this system.
+
+## **Part 2: Reproducibility & Confidence**
+
+*Specific to the thesis goals; rated on the same 1-5 agreement scale.*
+
+1. **Data Confidence:** "If I had to analyze the data from this session tomorrow, I am confident I could find the logs and video." (1-5)
+2. **Sharing:** "If I sent this project file to a colleague at another university, I am confident they could run it without my help." (1-5)
+3. **Role Fit:** "I felt that I could focus on the *student* rather than fighting the *software*." (1-5)
+
+# **6. Required Equipment & Setup**
+
+- **Room:** Quiet lab space with two desks (one for Wizard, one for Subject).
+- **Hardware:**
+    - 1x SoftBank NAO6 Robot.
+    - 1x Wizard Laptop (Pre-loaded with HRIStudio Server).
+    - 1x Control Laptop (Pre-loaded with Choregraphe).
+    - 1x External Camera (for recording the session).
+- **Software:**
+    - **Group A:** Choregraphe 2.8.
+    - **Group B:** HRIStudio (v1.0 Release Candidate).
\ No newline at end of file
diff --git a/irbapplication.tex b/irbapplication.tex
index 28954c7..1afb7d4 100644
--- a/irbapplication.tex
+++ b/irbapplication.tex
@@ -21,31 +21,19 @@
 \begin{enumerate}[start=1]
 \item[\textbf{1.}] The research \textbf{will not} involve prisoners, individuals with impaired decision-making capacity, or economically or educationally disadvantaged persons.
-
 \item[\textbf{2.}] The research \textbf{will not} involve subjects under the age of 18.
-
 \item[\textbf{3.}] The research \textbf{will not} involve collection of information regarding sensitive aspects of the subjects' lives.
-
-\item[\textbf{4.}] The information obtained \textbf{will} be recorded by the investigator in such a manner that the identity of the subjects can readily be ascertained either directly or through identifiers linked to the subjects.
-
+\item[\textbf{4.}] The information obtained \textbf{will} be recorded by the investigator in such a manner that the identity of the subjects can readily be ascertained either directly or through identifiers linked to the subjects (Video recordings).
+% CHANGE: Deception is now NO because the student is a real volunteer.
 \item[\textbf{5.}] The research \textbf{will not} involve either deception or incomplete disclosure of the purpose, methods, or other relevant aspects of the research.
- -\item[\textbf{5a.}] Subjects \textbf{will not} be informed prior to participating that they will be deceived, mislead, or otherwise not fully informed of all relevant aspects of the study. \textit{Only necessary if the answer to \textbf{5} is \textbf{will}.} - -\item[\textbf{6.}] \textbf{The procedures of this research present no more than minimal risk to the subject} (where minimal risk means that the probability and magnitude of harm or discomfort anticipated in the proposed research are no greater than those ordinarily encountered in daily life or during the performance of routine physical/psychological examinations or tests). - -\item[\textbf{7.}] The research \textbf{will not} be conducted in established or commonly accepted educational settings and will involve normal educational practices (e.g., research on regular and special education instructional strategies, research on instructional techniques, curricula, or classroom management methods). - -\item[\textbf{8.}] The research \textbf{will} involve survey or interview procedures, observation of public behavior (including visual or auditory recording), or educational tests (e.g., cognitive, diagnostic, aptitude, or achievement tests). - -\item[\textbf{9.}] The research \textbf{will} involve benign behavioral interventions in conjunction with the collection of information from an adult subject through verbal or written responses (including data entry) or audiovisual recording if the subject prospectively agrees to the intervention and information collection. - -\item[\textbf{10.}] The research (including demonstration projects) \textbf{will not} involve already existing identifiable private information that has been collected or will be collected solely for non-research purposes. This information may include documents, records, or biological specimens (including pathological or diagnostic specimens). - -\item[\textbf{11.}] The research \textbf{will not} be designed to study, evaluate, improve, or otherwise examine public benefit or service programs, including procedures for obtaining benefits or services under those programs, possible changes in or alternatives to those programs or procedures, or possible changes in methods or levels of payment for benefits or services under those programs where such research is conducted or supported by a Federal department or agency, or otherwise subject to the approval of department or agency heads (or the approval of the heads of bureaus or other subordinate agencies that have been delegated authority to conduct the research and demonstration projects). - -\item[\textbf{12.}] The research \textbf{will not} involve taste and food quality evaluation and consumer acceptance studies. - +\item[\textbf{5a.}] N/A +\item[\textbf{6.}] \textbf{The procedures of this research present no more than minimal risk to the subject.} +\item[\textbf{7.}] The research \textbf{will not} be conducted in established or commonly accepted educational settings. +\item[\textbf{8.}] The research \textbf{will} involve survey or interview procedures, observation of public behavior, or educational tests. +\item[\textbf{9.}] The research \textbf{will} involve benign behavioral interventions. +\item[\textbf{10.}] The research \textbf{will not} involve already existing identifiable private information. +\item[\textbf{11.}] The research \textbf{will not} be designed to study public benefit or service programs. +\item[\textbf{12.}] The research \textbf{will not} involve taste and food quality evaluation. 
\item[\textbf{13.}] The data collected in this research \textbf{will not} involve secondary analysis for which broad consent is required. \end{enumerate} @@ -54,89 +42,113 @@ \begin{enumerate}[start=1] \item[\textbf{1.}] \textbf{Study Purpose and Research Question} -Please describe in some detail the purpose of the proposed study (including, as appropriate, information about the research question and relevant hypothesis or, if the research is exploratory, what the researchers hope to learn). - \vspace{0.3cm} -For my honors thesis project, I am evaluating HRIStudio, a web-based platform I developed for designing and executing Wizard-of-Oz experiments in Human-Robot Interaction research. My research question asks whether HRIStudio improves methodological consistency and user experience for non-technical researchers compared to traditional robot programming environments that come with common robots, like Choregraphe for the NAO robot. I hypothesize that participants using HRIStudio will achieve higher accuracy when recreating published HRI experiments, report better usability experiences, and complete tasks more efficiently than those using manufacturer-provided tools. This study will provide empirical evidence about whether visual programming interfaces can lower technical barriers in HRI research while maintaining experimental rigor. +For my honors thesis project, I am evaluating HRIStudio, a web-based platform I developed for designing and executing Wizard-of-Oz experiments in Human-Robot Interaction (HRI). +My research question compares HRIStudio against the industry standard (Choregraphe) to determine their impact on: +\begin{enumerate} + \item \textbf{Disciplinary Accessibility:} Can non-technical domain experts successfully design a robot interaction? + \item \textbf{Scientific Reproducibility:} Does the tool minimize human error and data loss during experiment execution? +\end{enumerate} +I hypothesize that participants using HRIStudio will achieve higher Design Fidelity Scores and Execution Reliability Scores than those using the control software. \item[\textbf{2.}] \textbf{Subject Sample Description} -Describe the proposed subject sample. If subjects under the age of 18 will participate in your research, indicate the expected age range of the samples. If your research involves a category of subjects that is vulnerable to coercion or undue influence, such as children, prisoners, individuals with impaired decision-making capacity, or economically or educationally disadvantaged persons, you must indicate clearly why the use of these subjects is scientifically necessary. - \vspace{0.3cm} -I will recruit 10-12 students and faculty from non-engineering departments at Bucknell University, specifically Psychology, Education, and Sociology. All participants will be adults aged 18-65 with experience designing behavioral experiments but little to no programming background. I will exclude anyone with engineering or computer science background, prior experience with Choregraphe or similar robot programming tools, or undergraduate students to avoid any perceived academic coercion. This population is scientifically necessary because my research specifically evaluates tools designed to make HRI research accessible to non-technical domain experts, so using participants with programming backgrounds would not test the platform's core value proposition of lowering technical barriers for interdisciplinary researchers. 
+I will recruit a total of \textbf{N=20 participants}, divided into two distinct groups:
+
+\textbf{Group A: The ``Wizards'' (N=10)}
+\begin{itemize}
+    \item \textbf{Population:} Faculty or Graduate Students from non-Computer Science disciplines (e.g., Psychology, Education, Sociology) at Bucknell University.
+    \item \textbf{Exclusion Criteria:} Must have no prior experience with the NAO robot or Choregraphe software.
+\end{itemize}
+
+\textbf{Group B: The ``Test Subjects'' (N=10)}
+\begin{itemize}
+    \item \textbf{Population:} Undergraduate students at Bucknell University.
+    \item \textbf{Role:} They will serve as the ``student'' taking the geography quiz administered by the Wizard.
+    \item \textbf{Exclusion Criteria:} Must not be enrolled in a course taught by the ``Wizard'' participant they are paired with (to prevent coercion).
+\end{itemize}

 \item[\textbf{3.}] \textbf{Recruitment and Selection Methods}

-How will subjects be recruited and selected?
-
 \vspace{0.3cm}
-I will recruit participants through email outreach to department chairs in Psychology, Education, and Sociology, requesting permission to contact faculty and graduate students. I will also use faculty referrals and graduate student networks in relevant departments. All recruitment materials will clearly describe the study purpose, 3-hour time commitment, and emphasize that no prior technical experience is required. Interested participants will complete a brief screening survey to confirm eligibility criteria, then be randomly assigned to either control group (Choregraphe) or experimental group (HRIStudio). Sessions will be scheduled based on participant availability with no deception involved - participants will know they are evaluating research software tools. Participants will be entered into a raffle to win one of four \$25 gift cards. All recruitment materials will emphasize that participation is entirely voluntary, can be withdrawn at any time without consequence, and has no impact on academic standing or department relationships.
+\textbf{Wizards:} Recruited via email outreach to Department Chairs in the target disciplines and physical flyers in academic buildings.
+
+\textbf{Test Subjects:} Recruited via general campus announcements (Message Center, flyers) calling for volunteers to ``Interact with a robot for 15 minutes.''
+
+Recruitment materials will clearly state the time commitment (75 minutes for Wizards, 15 minutes for Test Subjects) and compensation. Participants will be screened via email for eligibility and randomly assigned to the Control or Experimental condition (for Wizards).

 \item[\textbf{4.}] \textbf{Detailed Research Methods and Procedures}

-Describe fully the following:
-
 \vspace{0.3cm}
-\textbf{a) Research methods and procedures that will be employed in this study}
+\textbf{a) Research methods and procedures}

-I will conduct a randomized controlled trial comparing two software platforms for robot programming. Each participant will attend a 2-hour training workshop on their assigned platform (HRIStudio or Choregraphe), then complete a 3-hour individual session where they recreate a published HRI experiment using the NAO6 humanoid robot. The task involves programming the robot to perform a standardized greeting interaction sequence. I will measure completion accuracy against the original published protocol and collect user experience data through post-task surveys and brief interviews.
+This is a between-subjects user study comparing two software interfaces.
+
+\textbf{For ``Wizard'' Participants (75 minutes):}
+\begin{enumerate}
+    \item \textbf{Training (15 mins):} Participant receives a standardized tutorial on their assigned software (HRIStudio or Choregraphe) covering speech, motion, and triggers.
+    \item \textbf{Design Challenge (30 mins):} Participant is given a ``Paper Specification'' (a storyboard for a `Geography Quiz Proctor' scenario) and must implement it on the robot. The researcher tracks time-to-completion and help requests.
+    \item \textbf{Live Trial (15 mins):} A ``Test Subject'' (Group B participant) enters the room. The Wizard uses their interface to administer the quiz to the Test Subject.
+    \item \textbf{Debrief (15 mins):} Wizard exports the data and completes the System Usability Scale (SUS) survey.
+\end{enumerate}
+
+\textbf{For ``Test Subject'' Participants (15 minutes):}
+
+The Test Subject enters the lab, consents to participate, and interacts with the robot for the duration of the ``Geography Quiz'' (approx. 5-10 minutes). They answer the robot's questions naturally.

 \vspace{0.2cm}
-\textbf{b) Approximately how much time each subject is expected to devote to the research}
-
-Each participant will spend approximately 5 hours total: 2 hours in a group training workshop and 3 hours in an individual task session, scheduled within one week of each other.
+\textbf{b) Time Commitment}
+\begin{itemize}
+    \item Wizards: 75 minutes (one session).
+    \item Test Subjects: 15 minutes (one session).
+\end{itemize}

 \vspace{0.2cm}
-\textbf{c) How data will be collected and recorded}
-
-Data will be collected with participant identifiers initially (for scheduling and compensation) but will be de-identified for analysis using numerical codes. I will use: (1) Screen recording software to capture participant interactions with the programming interfaces, (2) Timestamped logs from both software platforms showing programming actions, (3) Post-task questionnaires measuring usability, confidence, and satisfaction (Likert scales and open-ended questions), (4) Brief semi-structured interviews (10-15 minutes) about their experience, and (5) Objective scoring rubrics comparing their final robot programs to the target protocol. No audio/video recording of participants themselves, only screen capture of their computer interactions.
+\textbf{c) Data Collection}
+\begin{itemize}
+    \item \textbf{Screen Recording:} Captures the Wizard's workflow and design errors.
+    \item \textbf{Video Recording:} Captures the robot and the Test Subject during the Live Trial (to measure response timing and script adherence).
+    \item \textbf{Surveys:} SUS scores and confidence ratings from the Wizards.
+    \item \textbf{Rubrics:} ``Design Fidelity Score'' (scored from the saved project file) and ``Execution Reliability Score'' (scored from the video).
+\end{itemize}

 \vspace{0.2cm}
-\textbf{d) Methods for obtaining and documenting informed consent of subjects}
-
-I will obtain written informed consent at the beginning of each participant's first session (the training workshop). Participants will receive consent forms via email 24 hours prior to review, and I will review all elements verbally before obtaining signatures. The consent form will clearly explain the study purpose, procedures, time commitment, voluntary nature, right to withdraw, data handling, and compensation details.
+\textbf{d) Informed Consent}
+
+Written informed consent will be obtained from all participants (Wizards and Test Subjects) prior to their involvement.
+\textit{Note:} Wizards will consent to the full 75-minute procedure. Test Subjects will sign a simplified consent form specific to the 15-minute interaction and video recording.

 \vspace{0.2cm}
-\textbf{e) Any use of deception in the proposed study and justification for its use}
-
-No deception will be used. Participants will be fully informed that they are evaluating robot programming software tools as part of a comparative study for my honors thesis research.
+\textbf{e) Deception}
+
+No deception is used in this study. Both the Wizard and the Test Subject are aware of their roles in the experiment.

 \vspace{0.2cm}
-\textbf{f) Methods for preserving confidentiality}
-
-All data will be stored on password-protected, encrypted university computers in locked offices. Participant names will be replaced with numerical codes within 48 hours of data collection. Screen recordings and interview notes will be stored separately from identifying information. Data will be retained for 3 years following thesis completion, then permanently deleted. Only I and my thesis advisor will have access to identifiable data during the active research period.
+\textbf{f) Confidentiality}
+
+All video and screen data will be stored on an encrypted drive kept in the PI's locked lab space. Participant names will be replaced with numerical codes (W-01 for Wizards, S-01 for Subjects). Data will be retained for 3 years following thesis completion.

 \item[\textbf{5.}] \textbf{Benefits and Payment Arrangements}

-Indicate any benefits that are expected to accrue to subjects as a result of their participation in the research. In the event that subjects will be paid, describe all payment arrangements, including how much subjects will be paid should they choose to withdraw from the study prior to completion of the research.
+\vspace{0.3cm}
+\textbf{Wizards:} Will receive a \$15 Amazon gift card for the 75-minute session.
+
+\textbf{Test Subjects:} Will receive a \$5 Amazon gift card (or equivalent snack/token) for the 15-minute session.
+
+Compensation is provided immediately upon completion. If a participant withdraws early, they still receive full compensation.
+
+\item[\textbf{6.}] \textbf{Researcher-Subject Relationships}

 \vspace{0.3cm}
-Participants will be entered into a raffle to win one of four \$25 gift cards as compensation for their time and effort. All participants who complete the informed consent process and attend at least one session will be eligible for the raffle, regardless of whether they complete the full study. The raffle will be conducted after all data collection is complete, and winners will be notified within one week of the final session. The compensation is intended to acknowledge their time contribution rather than serve as an incentive to participate or continue in the study. Participants may also gain exposure to robot programming concepts and tools that could be useful in their own research, though this is not a guaranteed outcome. There are no direct personal benefits beyond the potential raffle compensation and potential learning experience.
+The PI is an undergraduate student. To mitigate coercion:
+\begin{enumerate}
+    \item Faculty Wizards are recruited from unrelated departments.
+    \item Test Subjects are screened to ensure they are not current students of the Faculty Wizard they are paired with.
+    \item Participation is strictly voluntary with no impact on academic standing.
+\end{enumerate}
-\item[\textbf{6.}] \textbf{Researcher-Subject Relationships and Coercion Mitigation} - -Describe any pre-existing relationships between researcher and subjects --- such as teacher--student, superintendent--principal--teacher, employer--employee --- that might impact subjects' ability to participate in the research voluntarily. How will any potential for coercion be mitigated by the researchers? - -\vspace{0.3cm} -I do not have direct supervisory, teaching, or grading relationships with potential participants since I am recruiting from departments outside of Computer Science. However, as a fellow student at Bucknell University, there may be informal academic or social connections. To mitigate any potential coercion: (1) I will emphasize in all communications that participation is entirely voluntary with no academic or professional consequences for declining or withdrawing, (2) I will not recruit through my own classes or research groups, (3) I will use department chairs and faculty as intermediaries for initial contact rather than direct personal outreach, (4) I will ensure participants understand that their decision to participate will not be shared with faculty in their departments, and (5) I will remind participants at each session that they can withdraw at any time while still remaining eligible for the raffle compensation. \end{enumerate} \section{PART III - Supporting Documents} \begin{itemize} -\item \textbf{Informed Consent Form} (unless not required) -\item \textbf{Debriefing} (required if using deception) -\item \textbf{Research Materials} +\item \textbf{Informed Consent Form (Wizard)} +\item \textbf{Informed Consent Form (Test Subject)} +\item \textbf{Recruitment Materials} +\item \textbf{Paper Specification (The "Geography Quiz")} +\item \textbf{Post-Study Questionnaire (SUS)} +\item \textbf{CITI Completion Report} \end{itemize} -Please note that all materials to which subjects will be exposed should be included (this includes surveys, interview or focus group outlines, visual stimuli, and so on). If using online surveys, please include as a PDF or a text document rather than simply providing a URL. You may upload multiple supporting documents using the 'Add Another' button below. - -\begin{itemize} -\item \textbf{Your CITI Completion Report} -\end{itemize} - -For more information, see Training Requirements. -Please upload a CITI report for any additional research personnel (PIs, co-PIs, research assistants, and so on). - -\end{document} +\end{document} \ No newline at end of file