Several studies have found a delay in the development of facial emotion recognition and expression in children with an autism spectrum condition (ASC). Several interventions have been designed to help children to fill this gap. Most of them adopt technological devices (i.e., robots, computers, and avatars) as social mediators and reported evidence of improvement. Few interventions have aimed at promoting emotion recognition and expression abilities and, among these, most have focused on emotion recognition. Moreover, a crucial point is the generalization of the ability acquired during treatment to naturalistic interactions. This study aimed to evaluate the effectiveness of two technological-based interventions focused on the expression of basic emotions comparing a robot-based type of training with a “hybrid” computer-based one. Furthermore, we explored the engagement of the hybrid technological device introduced in the study as an intermediate step to facilitate the generalization of the acquired competencies in naturalistic settings. A two-group pre-post-test design was applied to a sample of 12 children (M = 9.33; ds = 2.19) with autism. The children were included in one of the two groups: group 1 received a robot-based type of training (n = 6); and group 2 received a computer-based type of training (n = 6). Pre- and post-intervention evaluations (i.e., time) of facial expression and production of four basic emotions (happiness, sadness, fear, and anger) were performed. Non-parametric ANOVAs found significant time effects between pre- and post-interventions on the ability to recognize sadness [t(1) = 7.35, p = 0.006; pre: M (ds) = 4.58 (0.51); post: M (ds) = 5], and to express happiness [t(1) = 5.72, p = 0.016; pre: M (ds) = 3.25 (1.81); post: M (ds) = 4.25 (1.76)], and sadness [t(1) = 10.89, p < 0; pre: M (ds) = 1.5 (1.32); post: M (ds) = 3.42 (1.78)]. The group*time interactions were significant for fear [t(1) = 1.019, p = 0.03] and anger expression [t(1) = 1.039, p = 0.03]. However, Mann–Whitney comparisons did not show significant differences between robot-based and computer-based training. Finally, no difference was found in the levels of engagement comparing the two groups in terms of the number of voice prompts given during interventions. Albeit the results are preliminary and should be interpreted with caution, this study suggests that two types of technology-based training, one mediated via a humanoid robot and the other via a pre-settled video of a peer, perform similarly in promoting facial recognition and expression of basic emotions in children with an ASC. The findings represent the first step to generalize the abilities acquired in a laboratory-trained situation to naturalistic interactions.