AIR-Act2Act: Human–human interaction dataset for teaching non-verbal social behaviors to robots
To interact well with users, a social robot should understand their behavior, infer their intentions, and respond appropriately. Machine learning is one way to implement such robot intelligence: it lets the robot learn and improve from experience automatically instead of being told explicitly what to do. Social skills can likewise be learned by watching videos of human–human interaction. However, human–human interaction datasets are relatively scarce, which makes it difficult to learn interactions that occur in varied situations. Moreover, we aim to use service robots in the elderly care domain, yet no interaction dataset has been collected for this domain. For this reason, we introduce a human–human interaction dataset for teaching non-verbal social behaviors to robots. It is the only interaction dataset in which elderly people have participated as performers. We recruited 100 elderly people and two college students to perform 10 interactions in an indoor environment. The entire dataset has 5,000 interaction samples, each of which contains depth maps, body indexes, and 3D skeletal data captured with three Microsoft Kinect v2 sensors. In addition, we provide the joint angles of a humanoid NAO robot, converted from the human behaviors that robots need to learn. The dataset, together with useful Python scripts, is available for download at https://github.com/ai4r/AIR-Act2Act and can be used not only to teach social skills to robots but also to benchmark action recognition algorithms.
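To give a concrete sense of how 3D skeletal data can relate to joint angles, the following minimal sketch computes the angle at one joint from three 3D joint positions. It is an illustration under stated assumptions, not the conversion used to produce the NAO joint angles in the dataset: the function name `joint_angle`, the sample coordinates, and the choice of the elbow are all hypothetical, and mapping such an angle onto a specific robot's joint conventions and limits is left out.

```python
import math


def joint_angle(a, b, c):
    """Return the angle at point b (in radians) formed by the 3D points a-b-c."""
    # Vectors pointing from the middle joint to its two neighbors.
    v1 = [a[i] - b[i] for i in range(3)]
    v2 = [c[i] - b[i] for i in range(3)]
    dot = sum(x * y for x, y in zip(v1, v2))
    n1 = math.sqrt(sum(x * x for x in v1))
    n2 = math.sqrt(sum(x * x for x in v2))
    if n1 == 0 or n2 == 0:
        raise ValueError("degenerate limb: two joints coincide")
    # Clamp to [-1, 1] to guard against floating-point drift before acos.
    cos_theta = max(-1.0, min(1.0, dot / (n1 * n2)))
    return math.acos(cos_theta)


# Hypothetical joint positions in meters (made-up values, not dataset samples).
shoulder = (0.0, 0.4, 2.0)
elbow = (0.0, 0.1, 2.0)
wrist = (0.3, 0.1, 2.0)

elbow_angle = joint_angle(shoulder, elbow, wrist)
print(math.degrees(elbow_angle))
```

With these made-up coordinates the upper arm and forearm are perpendicular, so the computed elbow angle is a right angle; a fully extended arm would give an angle of pi radians.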