Human-to-Robot Handovers
Toolkit

The evaluation toolkit is available at the following GitHub repository: rgmc2025-handover-track.


Scoring and ranking
  • The benchmark is scored out of a total of 100 points.
  • The final score is computed as the aggregation (weighted average) of the scores across the task configurations.
  • The overall ranking is based on the final score.
  • A point-based score is assigned to each task configuration based on the level of difficulty. There are four levels of difficulty. An easy configuration has 10 points. A medium configuration has 15 points. A difficult configuration has 20 points. A hard configuration has 25 points. If the handover fails (i.e., the robot cannot grasp and/or hold the container during the delivery phase or the object falls after the robot places it on the table), the configuration receives 0 points.
  • Points are awarded for a configuration only if it is performed within 5 seconds and the container is delivered within the predefined area.
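The point rules in the list above can be sketched in a few lines of Python. This is only an illustration, not part of the evaluation toolkit: the names `POINTS` and `configuration_points` are hypothetical, and treating a time of exactly 5 seconds as still valid is an assumption.

```python
# Points per difficulty level, as listed in the scoring rules.
POINTS = {"easy": 10, "medium": 15, "difficult": 20, "hard": 25}

def configuration_points(difficulty, handover_ok, time_s, inside_area,
                         max_time_s=5.0):
    """Return the points available for one task configuration.

    A failed handover, a delivery outside the predefined area, or an
    execution longer than max_time_s yields 0 points.
    Assumption: a time of exactly max_time_s still counts as valid.
    """
    if not handover_ok or not inside_area or time_s > max_time_s:
        return 0
    return POINTS[difficulty]
```

For example, `configuration_points("hard", True, 3.2, True)` gives the full 25 points, while the same configuration with a failed handover gives 0.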

Performance score

Note that the following score is a revised version of the performance scores provided in the CORSMAL Benchmark.

Teams are ranked based on a 100-point score \(S\) that results from the aggregation (weighted average) across all task configurations. Each task configuration \(i\) has a pre-assigned score \(\omega_i\) and accounts for the delivery location \(d_i\), the total execution time \(t_i\), and the final mass of the delivered container \(m_i\). The final score is computed as follows:
$$ S = \frac{1}{3} \sum_i \left[ \omega_i \Psi_i \frac{\delta(d_i, \rho) + \gamma(t_i, \tau, \eta) + \mu(m_i, \hat{m}_i)}{3}\right],$$ where \( [ \cdot ]\) is the rounding operation to the nearest integer. The score of a configuration is set to 0 (not computed) if the container is not delivered within the area or within the maximum allowed time (indicator function \( \Psi_i \in \{0,1\} \)). The pre-assigned scores of all configurations sum up to 300 points and hence we divide by 3 to obtain the 100-point score \(S\).
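As a minimal sketch of this aggregation, the Python function below takes the three normalised component scores of each configuration as already-computed inputs. The function names and the dictionary keys are hypothetical, and rounding ties (values ending in .5) upwards is an assumption, since the text only specifies rounding to the nearest integer.

```python
import math

def round_nearest(x):
    # Round to the nearest integer; ties are rounded up (an assumption).
    return math.floor(x + 0.5)

def final_score(configs):
    """Aggregate per-configuration scores into the 100-point score S.

    Each entry of configs is a dict with hypothetical keys:
      omega : pre-assigned points of the configuration
      valid : True if delivered within the area and the time limit (Psi = 1)
      delta, gamma, mu : normalised component scores in [0, 1]
    """
    total = 0
    for c in configs:
        psi = 1 if c["valid"] else 0
        mean_component = (c["delta"] + c["gamma"] + c["mu"]) / 3.0
        total += round_nearest(c["omega"] * psi * mean_component)
    # The pre-assigned scores sum to 300, so divide by 3 for a 100-point scale.
    return total / 3.0
```

For instance, a 10-point configuration with all components equal to 1 contributes 10, a 20-point configuration with all components equal to 0.5 contributes 10, and any invalid configuration contributes 0.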

For the delivery location, we define \(d_i\) as the distance (in millimetres, mm) between the centre of the base of the container at the end of the task and the target position. We compute a score based on the following normalisation function: $$ \delta(d_i, \rho) = \begin{cases} \displaystyle 1 - \frac{d_i}{\rho} & \text{if} \quad d_i < \rho \\ \displaystyle 0 & \text{otherwise,} \end{cases} $$ where \( \rho \) is a threshold that defines when an algorithm is unsuccessful for that measure and here represents the radius of the concentric delivery area where the object must be delivered. Specifically, we use \( \rho=500 \) mm.
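A direct Python transcription of this function could look as follows. The name `delta` and the argument names are illustrative choices, and rejecting negative distances is an added assumption.

```python
def delta(d_mm, rho_mm=500.0):
    """Delivery-location score: 1 at the target, decreasing linearly to 0 at rho."""
    if d_mm < 0:
        # Assumption: a distance cannot be negative.
        raise ValueError("distance must be non-negative")
    if d_mm < rho_mm:
        return 1.0 - d_mm / rho_mm
    return 0.0
```

With the default radius of 500 mm, a container placed 250 mm from the target obtains a score of 0.5.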

For the total execution time, we define \(t_i\) (in milliseconds, ms) as the time from the moment the subject is instructed to grasp the container to the moment the robot releases the gripper at the delivery location to place the container after the handover (unless the handover failed). We compute a score based on a similar normalisation function that additionally accounts for a minimum expected time: $$ \gamma(t_i, \tau, \eta) = \begin{cases} \displaystyle 1 - \frac{\max(t_i, \eta) - \eta}{\tau - \eta} & \text{if} \quad t_i < \tau \\ \displaystyle 0 & \text{otherwise,} \end{cases} $$ where \( \tau \) is a threshold that defines when an algorithm is unsuccessful for that measure and here represents the maximum allowed execution time, and \( \eta \) is the minimum expected time to perform a handover. Specifically, we use \(\tau=5000\) ms and \(\eta=1000\) ms.
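The same function can be written in Python as a short sketch; the name `gamma` and the argument names are illustrative. Any time at or below \(\eta\) obtains the full score, because \(\max(t_i, \eta) - \eta\) is then zero.

```python
def gamma(t_ms, tau_ms=5000.0, eta_ms=1000.0):
    """Execution-time score: 1 up to eta, decreasing linearly to 0 at tau."""
    if t_ms >= tau_ms:
        return 0.0
    return 1.0 - (max(t_ms, eta_ms) - eta_ms) / (tau_ms - eta_ms)
```

With the default thresholds, a handover completed in 3000 ms obtains 1 - 2000/4000 = 0.5.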

For the final mass of the delivered container, we use a different normalisation function that accounts for the measured mass of the filled container (if not empty) before (\(\hat{m}_i\)) and after (\(m_i\)) the execution of the task: $$ \mu(m_i, \hat{m}_i) = \begin{cases} \displaystyle 1 - \frac{|m_i - \hat{m}_i|}{\hat{m}_i} & \text{if} \quad |m_i - \hat{m}_i| < \hat{m}_i \\ \displaystyle 0 & \text{otherwise.} \end{cases} $$ This score assesses the amount of content (in grams, g) that was spilled due to an inaccurate robot grasp or an unstable delivery phase.
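A possible Python version of the mass score is sketched below. The name `mu` and the argument names are illustrative. The text does not say how an empty container is scored, so the sketch simply rejects a non-positive initial mass, which is an assumption.

```python
def mu(m_g, m_hat_g):
    """Mass score: 1 if nothing was spilled, decreasing with the relative mass change."""
    if m_hat_g <= 0:
        # Assumption: the initial mass must be positive to normalise by it.
        raise ValueError("initial mass must be positive")
    diff = abs(m_g - m_hat_g)
    if diff < m_hat_g:
        return 1.0 - diff / m_hat_g
    return 0.0
```

For example, if a container weighing 100 g before the task weighs 75 g after delivery, the score is 0.75.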