A discriminative learning framework with pairwise constraints for video object classification

2006, Vol 28 (4), pp. 578-593
Author(s): Rong Yan, Jian Zhang, Jie Yang, A.G. Hauptmann

2022, Vol 18 (1), pp. 1-27
Author(s): Ran Xu, Rakesh Kumar, Pengcheng Wang, Peter Bai, Ganga Meghanath, ...

Video takes a long time to transport over the network, so running analytics on live video directly on embedded or mobile devices has become an important system driver. Such devices, e.g., surveillance cameras or AR/VR gadgets, are resource constrained. Although there has been significant work on creating lightweight deep neural networks (DNNs) for such clients, none of these networks can adapt to changing runtime conditions, e.g., changes in resource availability on the device, the content characteristics, or requirements from the user. In this article, we introduce ApproxNet, a video object classification system for embedded or mobile clients. It enables novel dynamic approximation techniques to achieve a desired trade-off between inference latency and accuracy under changing runtime conditions. It achieves this by enabling two approximation knobs within a single DNN model rather than creating and maintaining an ensemble of models, e.g., MCDNN [MobiSys-16]. We show that ApproxNet can adapt seamlessly at runtime to these changes, provides low and stable latency for the image and video frame classification problems, and improves accuracy and latency over ResNet [CVPR-16], MCDNN [MobiSys-16], MobileNets [Google-17], NestDNN [MobiCom-18], and MSDNet [ICLR-18].
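
The runtime adaptation described above can be illustrated with a minimal sketch: given an offline profile of latency and accuracy for each combination of the two knobs, pick the most accurate setting that fits the current latency budget. This is not ApproxNet's actual scheduler. The knob names (`knob_a`, `knob_b`), the `Setting` type, the `pick_setting` function, and every number in `PROFILE` are hypothetical and chosen only for illustration.

```python
from dataclasses import dataclass
from typing import Optional, Sequence


@dataclass(frozen=True)
class Setting:
    """One approximation setting with its profiled cost (all values hypothetical)."""
    knob_a: int        # first approximation knob (placeholder name)
    knob_b: int        # second approximation knob (placeholder name)
    latency_ms: float  # profiled inference latency
    accuracy: float    # profiled classification accuracy


def pick_setting(settings: Sequence[Setting], budget_ms: float) -> Optional[Setting]:
    """Return the most accurate setting whose latency fits the budget, or None."""
    feasible = [s for s in settings if s.latency_ms <= budget_ms]
    if not feasible:
        return None
    return max(feasible, key=lambda s: s.accuracy)


# Hypothetical offline profile; a real system would measure these on the device.
PROFILE = [
    Setting(1, 1, 10.0, 0.60),
    Setting(1, 2, 20.0, 0.70),
    Setting(2, 1, 25.0, 0.72),
    Setting(2, 2, 40.0, 0.80),
]

if __name__ == "__main__":
    # The budget shrinks as the device gets busier, so the choice changes per frame.
    for budget in (50.0, 30.0, 5.0):
        print(budget, pick_setting(PROFILE, budget))
```

Calling `pick_setting` again whenever the budget changes is one simple way to switch settings at runtime without keeping several separate models around.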


Sensors, 2021, Vol 21 (18), pp. 6108
Author(s): Sukhan Lee, Yongjun Yang

Deep learning approaches to estimating the full 3D orientation of objects, in addition to object classes, are limited in accuracy because it is difficult to learn the continuous nature of three-axis orientation variations by regression or classification with sufficient generalization. This paper presents a novel progressive deep learning framework, herein referred to as 3D POCO Net, that estimates orientations about three rotational axes with high accuracy while keeping network complexity low. The proposed 3D POCO Net is configured using four PointNet-based networks that independently represent the object class and the three individual axes of rotation. The four independent networks are linked by in-between association subnetworks that are trained to progressively map the global features learned by the individual networks, one after another, for fine-tuning the independent networks. In 3D POCO Net, high accuracy is achieved by combining a high-precision classification over a large number of orientation classes with a regression based on a weighted sum of the classification outputs, while high efficiency is maintained by the progressive framework, in which the large number of orientation classes is grouped into independent networks linked by association subnetworks. We implemented 3D POCO Net for full three-axis orientation variations and trained it with about 146 million orientation variations augmented from the ModelNet10 dataset. The test results show an orientation regression error of about 2.5° with about 90% accuracy in object classification for general three-axis orientation estimation. Furthermore, we demonstrate that a pre-trained 3D POCO Net can serve as an orientation representation platform on which the orientations and object classes of partial point clouds from occluded objects are learned by transfer learning.
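
The idea of a regression based on a weighted sum of classification outputs can be sketched for a single rotation axis as follows. This is a minimal illustration, not the paper's implementation: it assumes the classifier emits one logit per orientation bin, that each bin has a known center angle, and that the angle range does not wrap around (a real three-axis system would need to handle periodicity). The function names and the example numbers are hypothetical.

```python
import math
from typing import Sequence


def softmax(logits: Sequence[float]) -> list:
    """Numerically stable softmax over a list of logits."""
    m = max(logits)
    exps = [math.exp(x - m) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]


def expected_angle(logits: Sequence[float], bin_centers_deg: Sequence[float]) -> float:
    """Estimate a continuous angle as the probability-weighted sum of bin centers."""
    if len(logits) != len(bin_centers_deg):
        raise ValueError("need one bin center per logit")
    probs = softmax(logits)
    return sum(p * c for p, c in zip(probs, bin_centers_deg))


# Hypothetical example: four 10-degree bins, with the classifier torn between
# the 10-degree and 20-degree bins.
centers = [0.0, 10.0, 20.0, 30.0]
logits = [0.0, 2.0, 2.0, 0.0]
angle = expected_angle(logits, centers)

if __name__ == "__main__":
    print(angle)
```

Because the two middle bins receive equal probability here, the estimate lands between their centers, which shows how the weighted sum can recover an angle finer than the bin width.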


1969, Vol 12 (1), pp. 185-192
Author(s): John L. Locke

Ten children with high scores on an auditory memory span task were significantly better at imitating three non-English phones than 10 children with low auditory memory span scores. An additional 10 children with high scores on an oral stereognosis task were significantly better at imitating two of the three phones than 10 children with low oral stereognosis scores. Auditory memory span and oral stereognosis appear to be important subskills in the learning of new articulations, perhaps explaining their appearance in the literature as “etiologies” of disordered articulation. Although articulation development and the experimental acquisition of non-English phones have certain obvious differences, they seem to share some common processes, suggesting that the sound learning framework may be an efficacious technique for revealing otherwise inaccessible information.

