\chapter{Technical Documentation}
\label{app:tech_docs}
This appendix documents the technical implementation details of HRIStudio for researchers who wish to deploy, extend, or build upon the platform. The main text focuses on the conceptual framework and architectural decisions; this appendix preserves the implementation specifics.

\section{System Architecture Overview}

HRIStudio consists of three primary components:
\begin{enumerate}
    \item \textbf{Web Application}: A Next.js application written in TypeScript, providing the browser-based user interface for experiment design, execution, and analysis.
    \item \textbf{Application Server}: A Node.js server handling API requests, session management, and orchestration.
    \item \textbf{Data Layer}: PostgreSQL for structured data (studies, experiments, trials) and MinIO (S3-compatible) for unstructured media files.
\end{enumerate}
Communication between the web application and the robot is mediated through a rosbridge WebSocket server, which translates between the browser's WebSocket protocol and ROS topics and services~\cite{Quigley2009}.

\section{Deployment}

HRIStudio is distributed as Docker containers, enabling reproducible deployment across computing environments. The deployment stack consists of three services defined in \texttt{docker-compose.yml}, described in the subsections below.
\subsection{Database Service}

PostgreSQL stores all structured data: user accounts, study metadata, experiment protocols, trial sessions, and event logs. The database schema follows a hierarchical structure matching the Study/Experiment/Trial/Step/Action data model described in Chapter~\ref{ch:design}.

The service is configured with persistent storage to preserve data across restarts:

\begin{verbatim}
services:
  db:
    image: postgres:15
    environment:
      POSTGRES_USER: postgres
      POSTGRES_PASSWORD: postgres
      POSTGRES_DB: hristudio
    volumes:
      - postgres_data:/var/lib/postgresql/data
\end{verbatim}

\subsection{File Storage Service}

MinIO provides S3-compatible object storage for media files (video recordings, audio captures). Video and audio are stored separately from the database to keep query performance high while preserving complete multimedia records. The separation of concerns between database and file storage reflects the architectural principle that structured queries and unstructured binary data have different access patterns.

\subsection{Robot Communication}

Robot control flows through a rosbridge WebSocket connection (\texttt{ws://localhost:9090}). The web client connects directly to rosbridge, which handles translation to ROS-specific protocols. This design means HRIStudio itself does not need to be ROS-aware; it speaks the standard rosbridge JSON protocol over WebSocket.

For deployment without physical robot hardware, a mock robot server provides simulated sensor data and action responses, enabling development and testing of experiment protocols.

\section{Rosbridge-WebSocket Protocol}

HRIStudio communicates with robots using the rosbridge protocol, a JSON-based WebSocket specification for ROS communication. The protocol defines several operations, of which HRIStudio uses the subset described below for robot control.

\subsection{Subscribe}

Subscribe to a ROS topic to receive sensor data:
\begin{verbatim}
{
  "op": "subscribe",
  "topic": "/joint_states",
  "type": "sensor_msgs/JointState",
  "id": "sub_1"
}
\end{verbatim}
HRIStudio subscribes to sensor topics including joint states, battery status, bumpers, touch sensors, and sonar readings. Each message received updates the robot state displayed in the wizard interface.
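
As a sketch, the client side of this operation reduces to two pieces: framing the subscribe request and routing incoming \texttt{publish} frames to per-topic handlers. The TypeScript below is illustrative; the class and function names are not HRIStudio's actual API.

\begin{verbatim}
// Sketch: framing rosbridge subscribe operations and routing incoming
// frames to per-topic handlers. Names are illustrative, not the
// actual HRIStudio API.
type RosbridgeFrame = { op: string; topic?: string; msg?: unknown };

function makeSubscribeOp(topic: string, type: string, id: string): string {
  return JSON.stringify({ op: "subscribe", topic, type, id });
}

// Routes "publish" frames received from rosbridge to registered handlers.
class TopicRouter {
  private handlers = new Map<string, (msg: unknown) => void>();

  on(topic: string, handler: (msg: unknown) => void): void {
    this.handlers.set(topic, handler);
  }

  // Returns true if the frame was a publish op with a registered handler.
  dispatch(raw: string): boolean {
    const frame = JSON.parse(raw) as RosbridgeFrame;
    if (frame.op !== "publish" || !frame.topic) return false;
    const handler = this.handlers.get(frame.topic);
    if (!handler) return false;
    handler(frame.msg);
    return true;
  }
}
\end{verbatim}

In a real client the router's \texttt{dispatch} would be wired to the WebSocket's message event; here it is kept free of network code so the framing logic stands alone.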

\subsection{Publish}

Send commands to robot topics:
\begin{verbatim}
{
  "op": "publish",
  "topic": "/speech",
  "type": "std_msgs/String",
  "msg": { "data": "Hello, how are you?" }
}
\end{verbatim}
Robot actions are published to the appropriate topics based on the action type: speech uses \texttt{/speech}, movement uses \texttt{/cmd\_vel}, and joint positions use \texttt{/joint\_angles}.
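
A minimal sketch of this routing, assuming a simple lookup table from action category to topic. The \texttt{geometry\_msgs/Twist} and joint message types are conventional ROS choices assumed here for illustration, not taken from the plugin files:

\begin{verbatim}
// Sketch: mapping abstract action categories to rosbridge publish frames.
// The topic names mirror the mapping described above; message types for
// movement and joints are assumptions.
const actionTopics: Record<string, { topic: string; type: string }> = {
  speech:   { topic: "/speech",       type: "std_msgs/String" },
  movement: { topic: "/cmd_vel",      type: "geometry_msgs/Twist" },
  joints:   { topic: "/joint_angles", type: "naoqi_bridge_msgs/JointAnglesWithSpeed" },
};

// Frame a rosbridge publish op for a given action category.
function buildPublishOp(category: string, msg: unknown): string {
  const route = actionTopics[category];
  if (!route) throw new Error(`unknown action category: ${category}`);
  return JSON.stringify({ op: "publish", topic: route.topic, type: route.type, msg });
}
\end{verbatim}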

\subsection{Service Calls}

Request robot information via ROS services:
\begin{verbatim}
{
  "op": "call_service",
  "service": "/naoqi_driver/get_robot_info",
  "args": {},
  "id": "call_1"
}
\end{verbatim}
Service calls are used for queries like battery level or joint names that require a request-response pattern rather than continuous streaming.
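
The request-response pattern can be sketched as a pending-call table keyed by the \texttt{id} field, which rosbridge echoes back in its \texttt{service\_response} frame. The class below is illustrative, not the platform's actual code:

\begin{verbatim}
// Sketch: correlating rosbridge service responses with their requests
// via the "id" field. Illustrative, not HRIStudio's implementation.
type ServiceCallback = (values: unknown, ok: boolean) => void;

class ServiceCallTracker {
  private pending = new Map<string, ServiceCallback>();
  private seq = 0;

  // Frame a call_service op and remember who is waiting on the reply.
  call(service: string, args: object, cb: ServiceCallback): string {
    const id = `call_${++this.seq}`;
    this.pending.set(id, cb);
    return JSON.stringify({ op: "call_service", service, args, id });
  }

  // Feed incoming "service_response" frames here; returns true if matched.
  handleResponse(raw: string): boolean {
    const frame = JSON.parse(raw) as
      { op: string; id?: string; values?: unknown; result?: boolean };
    if (frame.op !== "service_response" || !frame.id) return false;
    const cb = this.pending.get(frame.id);
    if (!cb) return false;
    this.pending.delete(frame.id);
    cb(frame.values, frame.result ?? true);
    return true;
  }
}
\end{verbatim}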

\section{Plugin System}

The plugin architecture enables HRIStudio to support different robot platforms without modifying core code. Each robot is described by a JSON plugin file that maps abstract actions to platform-specific commands.

\subsection{Plugin Structure}

A plugin file defines:
\begin{itemize}
    \item \textbf{Metadata}: Robot identifier, name, manufacturer, model, and version compatibility
    \item \textbf{Topic Configuration}: Default ROS topic names for the robot's sensors and actuators
    \item \textbf{Actions}: Available behaviors, each with a parameter schema and ROS topic mapping
    \item \textbf{Sensors}: Available sensor streams with their ROS topics and message types
    \item \textbf{Specifications}: Physical properties (dimensions, weight, degrees of freedom)
\end{itemize}
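
Putting these five parts together, a plugin file's top-level shape might look like the following skeleton. Only the section names above and the fields appearing in the action excerpt in the next subsection are grounded in the source; the concrete field names and values here are illustrative.

\begin{verbatim}
{
  "id": "nao6-mock",
  "name": "NAO6 (Mock)",
  "manufacturer": "SoftBank Robotics",
  "version": "1.0.0",
  "topics": { "speech": "/speech" },
  "actions": [],
  "sensors": [
    { "topic": "/joint_states", "messageType": "sensor_msgs/JointState" }
  ],
  "specifications": { "degreesOfFreedom": 25 }
}
\end{verbatim}

The \texttt{actions} array holds entries like the \texttt{say\_text} definition shown next.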

\subsection{Action Definition Example}

The following excerpt shows how a ``Say Text'' action is defined for the NAO6 mock robot:
\begin{verbatim}
{
  "id": "say_text",
  "name": "Say Text",
  "category": "speech",
  "parameterSchema": {
    "type": "object",
    "properties": {
      "text": {
        "type": "string",
        "description": "Text to speak",
        "default": "Hello"
      }
    },
    "required": ["text"]
  },
  "ros2": {
    "messageType": "std_msgs/String",
    "topic": "/speech",
    "payloadMapping": {
      "type": "template",
      "payload": {
        "data": "{{text}}"
      }
    }
  }
}
\end{verbatim}
The plugin specifies that executing the \texttt{say\_text} action should publish to the \texttt{/speech} topic with a \texttt{std\_msgs/String} message containing the text parameter. The template syntax (\texttt{\{\{text\}\}}) enables parameter substitution at runtime.
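
Such substitution can be sketched as a recursive walk over the payload template; the function name and the exact-placeholder rule are illustrative assumptions, not the platform's actual code. A string consisting of exactly one placeholder keeps the parameter's native type, which matters for numeric fields.

\begin{verbatim}
// Sketch: runtime parameter substitution for plugin payload templates.
// Walks a payload object and replaces "{{name}}" placeholders with
// values from the action's parameters. Illustrative implementation.
function applyTemplate(payload: unknown, params: Record<string, unknown>): unknown {
  if (typeof payload === "string") {
    // A string that is exactly one placeholder keeps the parameter's type.
    const exact = payload.match(/^\{\{(\w+)\}\}$/);
    if (exact) return params[exact[1]];
    // Otherwise interpolate placeholders into the string.
    return payload.replace(/\{\{(\w+)\}\}/g, (_, name) => String(params[name]));
  }
  if (Array.isArray(payload)) return payload.map((v) => applyTemplate(v, params));
  if (payload !== null && typeof payload === "object") {
    const out: Record<string, unknown> = {};
    for (const [k, v] of Object.entries(payload as Record<string, unknown>)) {
      out[k] = applyTemplate(v, params);
    }
    return out;
  }
  return payload;
}
\end{verbatim}

For the example above, applying \texttt{\{ text: "Hello" \}} to the template payload yields \texttt{\{ data: "Hello" \}}, which is published as the message body.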

\subsection{Supported Actions}

The NAO6 plugin defines the following action categories:
\begin{description}
    \item[Speech:] Say text with optional emotion markup
    \item[Movement:] Walk forward/backward, turn left/right, stop
    \item[Gestures:] Wave, point, custom animations
    \item[Sensors:] Get battery level, read joint states
\end{description}
The mock robot plugin implements these same actions with simulated responses, enabling testing without physical hardware.

\section{Event Logging}

Every action during a trial is logged with precise timestamps. The event log captures:
\begin{itemize}
    \item Action executions: what was commanded, when, and with what result
    \item Wizard inputs: button clicks, step advancement, manual overrides
    \item Robot state changes: joint positions and sensor readings at key moments
    \item Timing metadata: when actions were requested, when they began, and when they completed
\end{itemize}
The logging system is event-driven: rather than polling for state, the system responds to ROS topic messages and user interface events, writing each to the log with a millisecond-precision timestamp. This approach ensures comprehensive capture without introducing artificial delays into the real-time control loop.
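
The log entry shape can be sketched as follows; the field and type names are illustrative rather than HRIStudio's actual schema.

\begin{verbatim}
// Sketch: an event-driven trial log. Each UI event or ROS message is
// appended with a millisecond timestamp. Shape is illustrative.
interface TrialEvent {
  timestampMs: number; // milliseconds since the Unix epoch
  source: "wizard" | "robot" | "system";
  type: string;        // e.g. "action_executed", "step_advanced"
  payload: unknown;
}

class TrialLog {
  private events: TrialEvent[] = [];

  append(source: TrialEvent["source"], type: string, payload: unknown): TrialEvent {
    const event: TrialEvent = { timestampMs: Date.now(), source, type, payload };
    this.events.push(event);
    return event;
  }

  // The full trace is serialized and stored with the trial record.
  toJSON(): TrialEvent[] {
    return [...this.events];
  }
}
\end{verbatim}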

The complete event log for a trial is stored as part of the trial record, making the entire execution trace available for analysis and verification.

\section{WebSocket Connection Management}

The wizard interface maintains a persistent WebSocket connection to rosbridge throughout a trial session. Connection management includes:
\begin{itemize}
    \item \textbf{Automatic reconnection}: If the connection drops, the system attempts to reconnect with exponential backoff, up to a maximum of 5 attempts
    \item \textbf{Connection state tracking}: The interface displays the current connection status (connected, connecting, disconnected)
    \item \textbf{Simulation mode}: When enabled, the client simulates robot responses without requiring rosbridge, useful for development and training
\end{itemize}
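
The backoff schedule can be sketched as follows; the one-second base delay and the cap are illustrative assumptions, and only the five-attempt limit comes from the behavior described above.

\begin{verbatim}
// Sketch: exponential backoff for rosbridge reconnection. Base delay
// and cap are assumptions; the 5-attempt limit matches the text above.
const MAX_RECONNECT_ATTEMPTS = 5;

// Delay before reconnect attempt n (1-based): base * 2^(n-1), capped.
function reconnectDelayMs(attempt: number, baseMs = 1000, capMs = 30000): number {
  if (attempt < 1 || attempt > MAX_RECONNECT_ATTEMPTS) {
    throw new Error("reconnect attempts exhausted");
  }
  return Math.min(baseMs * 2 ** (attempt - 1), capMs);
}
\end{verbatim}

With these parameters the schedule is 1\,s, 2\,s, 4\,s, 8\,s, 16\,s, after which the client reports a disconnected state rather than retrying forever.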
The simulation mode is particularly useful for wizard training: new operators can practice with the mock robot before conducting live sessions with participants.

\section{Repository Structure}

The HRIStudio source code is organized as follows:
\begin{verbatim}
hristudio/
├── docker-compose.yml          # Production deployment
├── src/
│   ├── app/                    # Next.js pages and API routes
│   ├── lib/
│   │   └── ros/                # ROS communication library
│   │       └── wizard-ros-service.ts
│   └── components/             # React UI components
└── robot-plugins/
    ├── plugins/                # Robot plugin definitions
    │   ├── nao6-mock.json
    │   ├── nao6-ros2.json
    │   └── turtlebot3-*.json
    └── package.json
\end{verbatim}
The separation between the main application and robot plugins enables the platform to be extended for new robots without modifying the core codebase.