CNN Architectures: AlexNet, LeNet, VGG, GoogLeNet, ResNet

Convolutional Neural Networks (CNNs) are a thriving area of Deep Learning. Nowadays, CNNs are used in the majority of object recognition tasks, and they are applied in diverse areas such as speech recognition, pattern recognition, computer vision, object detection, and other image-processing applications. A CNN classifies data according to a probability value. In this paper, an in-depth assessment of CNN structure and applications is presented, and a comparative study of different variants of CNN is also described.

Global development and progress in scientific instrumentation and technology is the fundamental reason for the rapid increase in data volume. Several significant techniques for image processing and object detection have been introduced owing to this advancement. The promising features and transfer-learning capability of the Convolutional Neural Network (CNN) have gained much attention from researchers and the computer vision community around the globe, and as a result several remarkable breakthroughs have been achieved. This paper comprehensively reviews data classification, the history and architecture of CNNs, and well-known techniques along with their strengths and shortcomings. Finally, a discussion of applying CNNs to object detection for effective results, based on their critical analysis and performance, is presented.


2018 ◽  
Vol 7 (2.7) ◽  
pp. 614 ◽  
Author(s):  
M Manoj krishna ◽  
M Neelima ◽  
M Harshali ◽  
M Venu Gopala Rao

Image classification is a classical problem of image processing, computer vision, and machine learning. In this paper we study image classification using deep learning, employing the AlexNet architecture with convolutional neural networks for this purpose. Four test images are selected from the ImageNet database for classification. We cropped the images to various portion areas and conducted experiments. The results show the effectiveness of deep-learning-based image classification using AlexNet.
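The cropping experiment described above can be sketched as a simple center-crop over varying portion areas. This is a minimal illustration with NumPy, not the authors' code; the function name and the choice of crop fractions are assumptions.

```python
import numpy as np

def center_crop(image: np.ndarray, fraction: float) -> np.ndarray:
    """Keep the central `fraction` of an image along both spatial axes.

    `image` is an H x W x C array; `fraction` lies in (0, 1].
    """
    if not 0.0 < fraction <= 1.0:
        raise ValueError("fraction must be in (0, 1]")
    h, w = image.shape[:2]
    new_h, new_w = max(1, int(h * fraction)), max(1, int(w * fraction))
    top = (h - new_h) // 2
    left = (w - new_w) // 2
    return image[top:top + new_h, left:left + new_w]

# Crop a dummy 224x224 RGB image (AlexNet's input size) to several portion areas.
image = np.zeros((224, 224, 3), dtype=np.uint8)
for frac in (1.0, 0.75, 0.5):
    print(center_crop(image, frac).shape)  # (224,224,3), (168,168,3), (112,112,3)
```

Each cropped variant would then be resized back to the network's input resolution before classification.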


Author(s):  
Robinson Jiménez-Moreno ◽  
Javier Orlando Pinzón-Arenas ◽  
César Giovany Pachón-Suescún

This article presents work oriented to assistive robotics, in which a scenario is established for a robot to reach a tool in the hand of a user after the user has verbally requested it by name. For this, three convolutional neural networks are trained: one for recognition of a group of tools (scalpel, screwdriver, and scissors), which obtained an accuracy of 98% in identifying the tools established for the application; one for speech recognition, trained with the names of the tools in Spanish, whose validation accuracy reached 97.5% in word recognition; and another for recognition of the user's hand, considering the classification of two gestures, open and closed hand, where an accuracy of 96.25% was achieved. With these networks, real-time tests were performed, yielding 100% accuracy in the delivery of each tool; that is, the robot correctly identified what the user requested, recognized each tool, and delivered the one needed when the user opened their hand, taking an average of 45 seconds to execute the application.


Computer vision is a scientific field that deals with how computers can acquire high-level understanding from digital images or videos. One of its keystones is object detection, which aims to identify relevant features in a video or image in order to detect objects. The backbone is the first stage of an object detection algorithm and plays a crucial role in detection: object detectors are usually built on backbone networks originally designed for image classification. Detection performance depends heavily on the features extracted by the backbone; for instance, simply replacing a backbone with its extended version can yield a large gain in accuracy metrics. The backbone's importance is also demonstrated by its efficiency in real-time object detection. In this paper, we survey the crucial role of the deep learning era, and of convolutional neural networks in particular, in object detection tasks. We analyze a wide range of convolutional neural networks used as backbones of object detection models, building a review of backbones that researchers and scientists can use as a guideline for their work.


2019 ◽  
Vol 3 (2) ◽  
pp. 31-40 ◽  
Author(s):  
Ahmed Shamsaldin ◽  
Polla Fattah ◽  
Tarik Rashid ◽  
Nawzad Al-Salihi

At present, deep learning is widely used in a broad range of arenas. Convolutional neural networks (CNNs) are becoming the star of deep learning, as they give the best and most precise results when cracking real-world problems. In this work, a brief description of the applications of CNNs in two areas is presented: first, in computer vision generally, that is, scene labeling, face recognition, action recognition, and image classification; second, in natural language processing, that is, the fields of speech recognition and text classification.


2021 ◽  
Author(s):  
Ghassan Dabane ◽  
Laurent Perrinet ◽  
Emmanuel Daucé

Convolutional Neural Networks have been considered the go-to option for object recognition in computer vision for the last several years. However, their invariance to object translations is still deemed a weak point and remains limited to small translations only, via their max-pooling layers. One bio-inspired approach considers the What/Where pathway separation in mammals to overcome this limitation. This approach works as a nature-inspired attention mechanism; another classical approach of this kind is the Spatial Transformer, which allows adaptive end-to-end learning of different classes of spatial transformations throughout training. In this work, we review Spatial Transformers as an attention-only mechanism and compare them with the What/Where model. We show that attention-restricted or "foveated" Spatial Transformer Networks, coupled with a curriculum learning training scheme and an efficient log-polar visual input space, provide better performance than the What/Where model, all without the need for any extra supervision whatsoever.
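The log-polar visual input mentioned above resamples the image on a retina-like grid: sampling is dense near the center (the fovea) and exponentially sparser toward the periphery. The following NumPy sketch shows nearest-neighbour log-polar sampling; it is an illustration of the general idea, not the authors' implementation, and the grid sizes are assumptions.

```python
import numpy as np

def log_polar_coords(h, w, n_rho=16, n_theta=16):
    """(row, col) sample coordinates of an h x w image on a log-polar grid
    centred on the image, as in foveated, retina-like input mappings."""
    cy, cx = (h - 1) / 2.0, (w - 1) / 2.0
    max_r = min(cy, cx)
    # Radii grow exponentially: dense near the centre, sparse at the edge.
    rhos = max_r ** (np.arange(1, n_rho + 1) / n_rho)
    thetas = np.linspace(0.0, 2.0 * np.pi, n_theta, endpoint=False)
    rows = cy + rhos[:, None] * np.sin(thetas)[None, :]
    cols = cx + rhos[:, None] * np.cos(thetas)[None, :]
    return rows, cols

def log_polar_sample(image, n_rho=16, n_theta=16):
    rows, cols = log_polar_coords(*image.shape[:2], n_rho, n_theta)
    r = np.clip(np.round(rows).astype(int), 0, image.shape[0] - 1)
    c = np.clip(np.round(cols).astype(int), 0, image.shape[1] - 1)
    return image[r, c]  # nearest-neighbour sampling on the log-polar grid

img = np.arange(64 * 64, dtype=float).reshape(64, 64)
print(log_polar_sample(img).shape)  # (16, 16)
```

A translation of the input becomes, approximately, a shift in this (rho, theta) grid, which is one reason log-polar entries pair well with attention mechanisms that first locate the object.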


2019 ◽  
Vol 8 (6) ◽  
pp. 258 ◽  
Author(s):  
Yu Feng ◽  
Frank Thiemann ◽  
Monika Sester

Cartographic generalization is a problem that poses interesting challenges to automation. Whereas plenty of algorithms have been developed for the different sub-problems of generalization (e.g., simplification, displacement, aggregation), there are still cases which are not generalized adequately or satisfactorily. The main problem is the interplay between different operators. In those cases the human operator is the benchmark, being able to design an aesthetic and correct representation of the physical reality. Deep learning methods have shown tremendous success on interpretation problems for which algorithmic methods have deficits; a prominent example is the classification and interpretation of images, where deep learning approaches outperform traditional computer vision methods. In both domains, computer vision and cartography, humans are able to produce good solutions. A prerequisite for the application of deep learning is the availability of many representative training examples for the situation to be learned. As this is given in cartography (there are many existing map series), the idea in this paper is to employ deep convolutional neural networks (DCNNs) for cartographic generalization tasks, especially building generalization. Three network architectures, namely U-net, residual U-net, and generative adversarial network (GAN), are evaluated both quantitatively and qualitatively. They are compared based on their performance at target map scales of 1:10,000, 1:15,000, and 1:25,000, respectively. The results indicate that deep learning models can successfully learn cartographic generalization operations implicitly in one single model. The residual U-net outperforms the others and achieved the best generalization performance.


Author(s):  
Chen Xin ◽  
Minh Nguyen ◽  
Wei Qi Yan

Identifying fire flames is based on object recognition, which has valuable applications in intelligent surveillance. This chapter focuses on flame recognition using deep learning and its evaluation. To achieve this goal, the authors design a Multi-Flame Detection (MFD) scheme that utilises Convolutional Neural Networks (CNNs). The authors use TensorFlow with an NVIDIA GPU to train on an image dataset and construct a model for flame recognition. The contributions of this book chapter are: (1) data augmentation for flame recognition, (2) model construction for deep learning, and (3) result evaluation for flame recognition using deep learning.
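Data augmentation of the kind listed as contribution (1) typically enlarges a small image dataset with simple label-preserving transforms. A minimal NumPy sketch, assuming flips and a brightness shift as the augmentations (the chapter does not specify which transforms were used):

```python
import numpy as np

def augment(image: np.ndarray) -> list:
    """Return simple augmented variants of one training image:
    horizontal flip, vertical flip, and a brightness shift."""
    flipped_h = image[:, ::-1]                 # mirror left-right
    flipped_v = image[::-1, :]                 # mirror top-bottom
    brighter = np.clip(image.astype(np.int16) + 30, 0, 255).astype(np.uint8)
    return [flipped_h, flipped_v, brighter]

img = np.random.default_rng(0).integers(0, 256, size=(32, 32, 3), dtype=np.uint8)
variants = augment(img)
print(len(variants))  # 3 extra samples per original image
```

In practice such transforms are applied on the fly during training (e.g., via a data pipeline) rather than materialized as files.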




Author(s):  
M. Karthikeyan ◽  
T. S. Subashini

Mechanical fasteners are widely used in the manufacture of hardware and mechanical components in industries such as automobiles and turbine and power generation. Object detection methods play a vital role in building smart systems for society. The Internet of Things (IoT) enables automation based on sensors and actuators, but sensors alone are not enough to build such systems owing to their limitations; computer vision, using deep learning techniques, is what makes IoT much smarter. Object detection is used to detect, recognize, and localize objects in an image or a real-time video. In industrial settings, a robot arm is used to fit fasteners to automobile components; the proposed system helps the robot detect fasteners such as screws and nails so that they can be fitted to the vehicle moving along the assembly line. The Faster R-CNN deep learning algorithm is used to train a custom dataset, and object detection is used to detect the fasteners. The region-based convolutional neural network (Faster R-CNN) uses a region proposal network (RPN) to train the model efficiently and, with the help of regions of interest, is able to localize screw and nail objects with a mean average precision of 0.72, leading to an object detection accuracy of 95 percent.
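The mean-average-precision figure reported above rests on the intersection-over-union (IoU) criterion for matching predicted boxes to ground-truth annotations. A short sketch of IoU for axis-aligned boxes (the box coordinates below are illustrative, not from the paper):

```python
def iou(box_a, box_b):
    """Intersection over union of two boxes given as (x1, y1, x2, y2)."""
    ix1, iy1 = max(box_a[0], box_b[0]), max(box_a[1], box_b[1])
    ix2, iy2 = min(box_a[2], box_b[2]), min(box_a[3], box_b[3])
    inter = max(0, ix2 - ix1) * max(0, iy2 - iy1)  # overlap area (0 if disjoint)
    area_a = (box_a[2] - box_a[0]) * (box_a[3] - box_a[1])
    area_b = (box_b[2] - box_b[0]) * (box_b[3] - box_b[1])
    union = area_a + area_b - inter
    return inter / union if union else 0.0

# A predicted screw box versus its ground-truth annotation.
print(iou((0, 0, 10, 10), (5, 5, 15, 15)))  # 25 / 175 ≈ 0.143
```

A prediction is commonly counted as a true positive when its IoU with a ground-truth box exceeds a threshold such as 0.5, and average precision is then computed from the resulting precision-recall curve per class.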

