BRINGING ROBOTICS TAXONOMIES TO CONTINUOUS DOMAINS VIA GPLVM ON HYPERBOLIC MANIFOLDS

Abstract

Robotic taxonomies have appeared as high-level hierarchical abstractions that classify how humans move and interact with their environment. They have proven useful to analyse grasps, manipulation skills, and whole-body support poses. Despite the efforts devoted to designing their hierarchy and underlying categories, their use in application fields remains scarce. This may be attributed to the lack of computational models that fill the gap between the discrete hierarchical structure of the taxonomy and the high-dimensional heterogeneous data associated with its categories. To overcome this problem, we propose to model taxonomy data via hyperbolic embeddings that capture the associated hierarchical structure. To do so, we formulate a Gaussian process hyperbolic latent variable model and enforce the taxonomy structure through graph-based priors on the latent space and distance-preserving back constraints. We test our model on the whole-body support pose taxonomy to learn hyperbolic embeddings that comply with the original graph structure. We show that our model properly encodes unseen poses from existing or new taxonomy categories, that it can be used to generate trajectories between the embeddings, and that it outperforms its Euclidean counterparts.

1. INTRODUCTION

Roboticists are often inspired by biological insights to create robotic systems that exhibit human- or animal-like capabilities (Siciliano & Khatib, 2016). In particular, it is first necessary to understand how humans move and interact with their environment in order to generate biologically-inspired motions and behaviors of robotic hands, arms, or humanoids. In this endeavor, researchers proposed to structure and categorize human hand postures and body poses into hierarchical classifications known as taxonomies. Their structure depends on the variables considered to categorize human motions and their interactions with the environment, as well as on associated qualitative measures. Different taxonomies have been proposed in the area of human and robot grasping (Cutkosky, 1989; Feix et al., 2016; Abbasi et al., 2016; Stival et al., 2019). Feix et al. (2016) introduced a taxonomy of hand grasps whose structure was mainly defined by the hand pose and the type of contact with the object. Later, Stival et al. (2019) claimed that the taxonomy designed by Feix et al. (2016) heavily depended on subjective qualitative measures, and proposed a quantitative tree-like taxonomy of hand grasps based on muscular and kinematic patterns. A similar data-driven approach was used by Abbasi et al. (2016) to design a grasp taxonomy based on sensed contact forces. Robotic manipulation also gave rise to various taxonomies. Bullock et al. (2013) introduced a hand-centric manipulation taxonomy that classifies manipulation skills according to the type of contact with the objects and the object motion imparted by the hand. A different strategy was developed by Paulius et al. (2019), who designed a manipulation taxonomy based on a categorization of contacts and motion trajectories. Humanoid robotics also made significant efforts to analyze human motions, thus proposing taxonomies as high-level abstractions of human motion configurations. Borràs et al.
(2017) analyzed the contacts of the human limbs with the environment and designed a taxonomy of whole-body support poses. In addition to being used for analysis purposes in robotics or biomechanics, some of the aforementioned taxonomies were leveraged for modeling grasp actions (Romero et al., 2010; Lin & Sun, 2015), for planning contact-aware whole-body pose sequences (Mandery et al., 2016), and for learning manipulation skill embeddings (Paulius et al., 2020). However, although most taxonomies carry a well-defined hierarchical structure, this structure is often overlooked. First, these taxonomies are usually exploited for classification tasks whose target classes are mainly the tree leaves, disregarding the full taxonomy structure (Feix et al., 2016; Abbasi et al., 2016). Second, the discrete representation of the taxonomy categories hinders their use for motion generation (Romero et al., 2010). We believe that the difficulty of leveraging robotic taxonomies is due to the lack of computational models that exploit (i) the domain knowledge encoded in the hierarchy, and (ii) the information contained in the high-dimensional data associated with the taxonomy categories. We tackle this problem from a representation learning perspective by modeling taxonomy data as embeddings that capture the associated hierarchical structure. Inspired by recent advances in word embeddings (Nickel & Kiela, 2017; 2018; Mathieu et al., 2019), we propose to leverage the hyperbolic manifold (Ratcliffe, 2019) to learn such embeddings. An important property of the hyperbolic manifold is that distances grow exponentially when moving away from the origin, and shortest paths between distant points tend to pass through it, resembling a continuous hierarchical structure. Therefore, we hypothesize that the geometry of the hyperbolic manifold allows us to learn embeddings that comply with the original graph structure of robotic taxonomies.
Specifically, we propose a Gaussian process hyperbolic latent variable model (GPHLVM) to learn embeddings of taxonomy data on the hyperbolic manifold. To do so, we impose a hyperbolic geometry on the latent space of the well-known GPLVM (Lawrence, 2003; Titsias & Lawrence, 2010). This requires reformulating the Gaussian distribution, the kernel, and the optimization process of the vanilla GPLVM to account for the geometry of the hyperbolic latent space. To this end, we leverage the hyperbolic wrapped Gaussian distribution (Nagano et al., 2019), and provide a positive-definite-guaranteed approximation of the hyperbolic kernel proposed by McKean (1970). Moreover, we resort to Riemannian optimization (Absil et al., 2007; Boumal, 2022) to optimize the GPHLVM parameters. We enforce the taxonomy graph structure in the learned embeddings through graph-based priors on the latent space and via graph-distance-preserving back constraints (Lawrence & Quiñonero Candela, 2006; Urtasun et al., 2008). Our GPHLVM is conceptually similar to the GPLVM for Lie groups introduced by Jensen et al. (2020), which also imposes geometric properties on the GPLVM latent space. However, our formulation is specifically designed for the hyperbolic manifold and fully built on tools from Riemannian geometry. Moreover, unlike Tosi et al. (2014) and Jørgensen & Hauberg (2021), where the latent space was endowed with a pullback Riemannian metric learned via the GPLVM mapping, we impose the hyperbolic geometry on the GPHLVM latent space as an inductive bias adapted to our targeted applications. We test our approach on graphs extracted from the whole-body support pose taxonomy (Borràs et al., 2017). The proposed GPHLVM learns hyperbolic embeddings of the body support poses that comply with the original graph structure, and properly encodes unseen poses from existing or new taxonomy nodes.
Moreover, we show how the continuous geometry of the hyperbolic manifold can be exploited to generate trajectories between different pairs of embeddings that comply with the taxonomy graph structure. To the best of our knowledge, this paper is the first to leverage the hyperbolic manifold for robotic applications.
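The distance property of the hyperbolic manifold mentioned above can be checked numerically. The following is a minimal sketch, not the implementation used in this paper: it assumes the Poincaré disk model with its standard closed-form distance, and the names `poincare_distance`, `a`, `b`, `direct`, and `via_origin` are chosen here purely for illustration.

```python
import math


def poincare_distance(x, y):
    """Geodesic distance between two points of the open unit disk (Poincaré model)."""
    sq_norm_x = sum(a * a for a in x)
    sq_norm_y = sum(b * b for b in y)
    sq_diff = sum((a - b) ** 2 for a, b in zip(x, y))
    arg = 1.0 + 2.0 * sq_diff / ((1.0 - sq_norm_x) * (1.0 - sq_norm_y))
    return math.acosh(arg)


origin = (0.0, 0.0)

# The distance to the origin blows up as the Euclidean radius approaches 1.
for radius in (0.5, 0.9, 0.99, 0.999):
    print(radius, poincare_distance(origin, (radius, 0.0)))

# Two points close to the boundary, lying on different "branches".
a = (0.9, 0.0)
b = (0.0, 0.9)
direct = poincare_distance(a, b)
via_origin = poincare_distance(a, origin) + poincare_distance(origin, b)

# Fraction of the detour length covered by the direct path. The same two
# points in the Euclidean plane give sqrt(2) / 2, i.e. about 0.71.
print(direct / via_origin)
```

In this example the detour through the origin is only slightly longer than the geodesic itself, whereas in the Euclidean plane it would be markedly longer. This is the sense in which the origin behaves like the root of a continuous tree.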



Figure 1: Left: Illustration of the Lorentz $\mathcal{L}^2$ and Poincaré $\mathcal{P}^2$ models of the hyperbolic manifold. The former is depicted as the gray hyperboloid, while the latter is represented by the blue circle. Both models show a geodesic between two points $x_1$ and $x_2$. The vector $u$ lies in the tangent space at $x_1$ such that $u = \mathrm{Log}_{x_1}(x_2)$. Right: Subset of the whole-body support pose taxonomy (Borràs et al., 2017) used in our experiments. Each node is a support pose defined by the type of contacts (foot F, hand H, knee K). The lines represent graph transitions between the taxonomy nodes. Contacts are depicted by grey dots.
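The objects named in the caption of Figure 1 can be reproduced with a few lines of code. This is a minimal sketch under the standard textbook formulas for the Lorentz model with curvature -1, not the code used for the figure; the helpers `lift`, `lorentz_inner`, `distance`, `log_map`, `exp_map`, and `to_poincare`, as well as the two example points, are hypothetical names and values introduced here.

```python
import math


def lorentz_inner(x, y):
    """Lorentzian inner product: -x0*y0 + x1*y1 + x2*y2."""
    return -x[0] * y[0] + sum(a * b for a, b in zip(x[1:], y[1:]))


def lift(v):
    """Place a vector of R^2 on the hyperboloid by solving <x, x>_L = -1 for x0 > 0."""
    return (math.sqrt(1.0 + sum(a * a for a in v)),) + tuple(v)


def distance(x, y):
    """Geodesic distance between two points of the Lorentz model."""
    return math.acosh(max(1.0, -lorentz_inner(x, y)))


def log_map(x, y):
    """Tangent vector u at x such that the geodesic leaving x along u reaches y."""
    alpha = max(1.0, -lorentz_inner(x, y))
    dist = math.acosh(alpha)
    if dist < 1e-12:
        return tuple(0.0 for _ in x)
    scale = dist / math.sqrt(alpha * alpha - 1.0)
    return tuple(scale * (b - alpha * a) for a, b in zip(x, y))


def exp_map(x, u):
    """Point reached by following the geodesic from x along the tangent vector u."""
    norm_u = math.sqrt(max(0.0, lorentz_inner(u, u)))
    if norm_u < 1e-12:
        return tuple(x)
    return tuple(
        math.cosh(norm_u) * a + math.sinh(norm_u) * b / norm_u
        for a, b in zip(x, u)
    )


def to_poincare(x):
    """Project a point of the hyperboloid onto the Poincaré disk."""
    return tuple(a / (1.0 + x[0]) for a in x[1:])


x1 = lift((0.5, -0.2))
x2 = lift((-1.0, 1.5))
u = log_map(x1, x2)

# Sample the geodesic from x1 to x2 and view it in the Poincaré disk.
geodesic = [exp_map(x1, tuple(t / 10.0 * c for c in u)) for t in range(11)]
for point in geodesic:
    print(to_poincare(point))
```

The Lorentzian norm of `u` equals the geodesic distance between `x1` and `x2`, and `exp_map(x1, u)` recovers `x2`. Scaling `u` by a factor in [0, 1] therefore traces the geodesic drawn in both models of the figure.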

