SlideShare uma empresa Scribd logo
1 de 8
Baixar para ler offline
https://iaeme.com/Home/journal/IJECET 11 editor@iaeme.com
International Journal of Electronics and Communication Engineering and
Technology (IJECET)
Volume 12, Issue 3, September–December 2021, pp. 11-18. Article ID: IJECET_12_03_002
Available online at https://iaeme.com/Home/issue/IJECET?Volume=12&Issue=3
ISSN Print: 0976-6464 and ISSN Online: 0976-6472
DOI: https://doi.org/10.34218/IJECET.12.3.2021.002
© IAEME Publication
ACCELERATED DEEP LEARNING INFERENCE
FROM CONSTRAINED EMBEDDED DEVICES
Bhargav Bhat1 and Abhay A Deshpande1
1
Department of Electronics and Communication Engineering,
RV College of Engineering (RVCE), Bengaluru, Karnataka, India
ABSTRACT
Hardware looping is a feature of some processor instruction sets whose hardware
can repeat the body of a loop automatically, rather than requiring software instructions
which take up cycles (and therefore time) to do so. Loop Unrolling is a loop
transformation technique that attempts to advance a program's execution speed to the
detriment of its twofold size, which is a methodology known as space–time tradeoff. A
convolutional neural network is created with simple loops, with hardware looping, with
loop unrolling and with both hardware looping and loop unrolling, and a comparison
is made to evaluate the effectiveness of hardware looping and loop unrolling. The
hardware loops alone will add to a cycle check decline, while the mix of hardware loops
and dot product instructions will decrease the clock cycle tally further. The CNN is
simulated on Xilinx Vivado 2021.1 running on Zync-7000 FPGA.
Keywords: Convolutional Neural Network, Deep Learning, FPGA, Hardware Looping,
Loop Unrolling, Vivado.
Cite this Article: Bhargav Bhat and Abhay A Deshpande, Accelerated Deep Learning
Inference from Constrained Embedded Devices, International Journal of Electronics
and Communication Engineering and Technology, 12(3), 2021, pp. 11-18.
https://iaeme.com/Home/issue/IJECET?Volume=12&Issue=3
1. INTRODUCTION
Profound learning calculations have seen achievement in a wide assortment of uses, for
example, machine interpretation, picture and discourse acknowledgment, and self-driving
vehicles. In any case, these calculations have as of late acquired traction in the installed
frameworks space. Most implanted frameworks depend on modest microcontrollers with
restricted memory limit, and, subsequently, are commonly seen as not equipped for running
profound learning calculations. Nevertheless, we think about that progressions in pressure of
neural organizations and neural organization engineering, combined with an improved guidance
set design, could make microcontroller-grade processors appropriate for explicit low-force
profound learning applications. Such intricacy is substantially a lot for memory obliged
microcontrollers that have memory sizes indicated in kilobytes. Some embedded system
designer’s work around the quandary of restricted resources by processing neural networks in
the cloud. Nonetheless, this arrangement is restricted to regions with access to the Web. Cloud
Accelerated Deep Learning Inference from Constrained Embedded Devices
https://iaeme.com/Home/journal/IJECET 12 editor@iaeme.com
preparing likewise has different burdens, for example, protection concerns, security, high
idleness, correspondence power utilization, and dependability. These calculations perform
massive arithmetic computations. To accelerate these calculations at a sensible cost in
equipment, we can utilize an instruction set extension comprised of two instruction types-
hardware loops and dot product instructions. The primary commitments of this paper are as per
the following:
• We propose an approach for computing neural network functions that are advanced for
the utilization of hardware loops and dot product instructions.
• The effectiveness of hardware loops and dot product instructions for performing deep
learning functions are evaluated, and
• To perform Lenet-5 Neural Network on Zync-7000
There have been different ways to deal with accelerate profound learning capacities. The
methodologies can be sorted into two sets. In the first set, the approaches which which attempt
to enhance the size of the neural organizations, or, all in all, upgrade the software. Approaches
in the second set try to optimize the equipment or hardware on which neural networks are
running. As the set used here deals with hardware optimization, we will focus on the related
approaches for hardware optimization, and only momentarily observe the progress in software
optimizations. A simple instruction set is proposed to evaluate the effectiveness of hardware
loops and dot product instructions with fully upgraded assembly functions for the fully
connected convolutional neural network [1]. A custom instruction set architecture is used for
efficient realization of artificial neural networks and can be parameterized to a subjective fixed-
point design [2]. A CNN –specific Instruction set architecture is used which deploys the
instruction parameters with high flexibility which embeds parallel computation and data reuse
parameters in the instructions [3]. The instruction extensions and micro architectural
advancements to increase computational thickness and to limit the amount of pressure towards
shared memory hierarchy in RISC Processors [4]. A repetitive neural organization into a
convolutional neural network, and the profound provisions of the picture were learnt in equal
utilizing the convolutional neural network and the intermittent neural organization [5]. The
framework of Complex Network Classifier (CNC) is used by integrating network embedding
and convolutional neural network to tackle the problem of network classification. By training
the classifier on synthetic complex network data, they showed that CNC can not only classify
networks with high accuracy and robustness but can also extract the features of the networks
automatically [6]. A valued prediction method is used to exploit the spatial correlation of zero-
valued activations within the CNN output feature maps, thereby saving convolution operations
[7]. The impact of packet loss on data integrity is reduced by taking advantage of the deep
network’s ability to understand neural data and by using a data repair method based on
convolutions neural network [8]. The instruction set simulation process is used soft-core
Reduced Instruction Set Computer (RISC) processor. They provided reliable simulation
platform in creating customizable instruction set for Application Specific Instruction Set
Processor (ASIP) [9]. RISC-V ISA compatible processor and effects of instruction set is
analyzed on the pipeline/micro-architecture design in terms of instructions encoding,
functionality of instructions, instruction types, decoder logic complexity, data hazard detection,
register file organization and access, functioning of pipeline, effect of branch instructions,
control flow, data memory access, operating modes and execution unit hardware resources [10].
Deep learning algorithms are used increasingly in smart applications. Some of them also run in
Internet of Things (IoT) devices. IoT Analytics reports that, by 2025, the number of IoT devices
will rise to 22 billion. The motivation for our work stems from the fact that the rise of the IoT
will increase the need for low-cost devices built around a single microcontroller capable of
Bhargav Bhat and Abhay A Deshpande
https://iaeme.com/Home/journal/IJECET 13 editor@iaeme.com
supporting deep learning algorithms. Accelerating deep learning inference in constrained
embedded devices, presented in this paper, is our attempt in this direction.
The remainder of this paper is organized as follows. Section II presents the related work in
hardware and software enhancements aimed at accelerating the neural network computation.
Section III shows the methodology concerned with the project. Section IV shows the results
and the subsequent discussions regarding the obtained results. Section V presents the
conclusion and plans for additional work for the future.
2. BACKGROUND
2.1. Convolutional Neural Networks
A convolutional neural organization (CNN, or ConvNet) is a class of artificial neural
organization, most generally applied to understand visual imagery. A CNN is a deep learning
algorithm which can can take in an information picture, allot significance, to different
viewpoints/objects in the picture and have the option to separate or differentiate one from the
other. The pre-processing needed in a CNN is a lot of lower when contrasted with other
characterization calculations. While in crude techniques channels are hand-designed, with
enough preparing, CNNs can gain proficiency with these channels/qualities. CNN has a lot of
applications in decoding facial recognition, analyzing documents, historic and environmental
collections, understanding climate, grey areas to see holistic view of what a human sees,
advertising and other fields.
2.2. Lenet
Lenet (also called Lenet-5) is an exemplary convolutional neural network which utilizes
convolutions, pooling and fully connected layers. Lenet is used for handwritten digit
recognition with the MNIST dataset.
2.3. MNSIT Dataset
MNIST is the acronym for Modified National Institute of Standards and Technology database.
The MNIST database contains 60,000 training images and 10,000 testing images. MNIST
dataset pictures have the measurements 28 x 28. To get the MNIST pictures measurement to
the meet the necessities of the input layer, the 28 x 28 pictures are cushioned or padded. Some
of the test pictures from MNIST test dataset are as shown in Fig 1.
Figure 1 Some Test pictures from MNIST dataset
Accelerated Deep Learning Inference from Constrained Embedded Devices
https://iaeme.com/Home/journal/IJECET 14 editor@iaeme.com
2.4. Building Blocks of Convolutional Neural Network
• Convolutional Layer is the core center structure of CNN. In a CNN, the information is
a tensor with a shape: (number of data sources) x (input height) x (input width) x (input
channels). In the wake of going through a convolutional layer, the picture becomes
disconnected to an element map, likewise called an actuation map, with shape: (number
of data sources) x (highlight map height)) x (include map width) x (feature map
channels). Convolutional layers convolve the information and pass its outcome to the
following layer.
• Pooling Layer: Convolutional networks may incorporate local and/or global pooling
layers alongside traditional convolutional layers. Pooling layers diminish the
components of information by combining the outputs of neuron clusters at one layer
into a single neuron in the following layer. Local pooling combines small clusters, tiling
sizes such as 2 x 2 are commonly used. Global pooling acts on all the neurons of the
feature map. There are two common types of pooling in popular use: max and average.
Max pooling uses the maximum value of each local cluster of neurons in the feature
map, while average pooling takes the average value.
• Fully Connected Layers: Fully connected layers connect every neuron in one layer to
each neuron in another layer. It is equivalent to a conventional multi-layer perceptron
neural network (MLP). The smoothed network goes through a fully connected layer to
characterize the pictures.
• Receptive field: In neural networks, each neuron gets input from some number of
locations in the past layer. In a convolutional layer, each neuron receives input from
only a restricted area of the previous layer called the neuron's receptive field. Ordinarily
the area is a square (e.g. 5 by 5 neurons). Whereas, in a fully connected layer, the
receptive field is the entire past layer. Accordingly, in each convolutional layer, each
neuron takes input from a bigger region in the input than previous layers. This is
expected due to applying the convolution over and over, which takes into account the
value of a pixel, as well as its encompassing pixels. When utilizing dilated layers, the
number of pixels in the receptive field remains constant, but the field is more scantily
populated as its measurements grow when combining the impact of several layers.
• Weights: Every neuron in a neural network registers an output value by applying a
particular function to the input values received from the receptive field in the past layer.
The function that is applied to the input values is dictated by a vector of weights and a
bias (ordinarily real numbers). Learning consists of iteratively adjusting these biases
and weights.
• The vector of weights and the bias are called filters and represent specific features of
the input (e.g., a specific shape). A distinctive feature of CNNs is that many neurons
can share the same filter. This decreases the memory footprint because a single bias and
a single vector of weights are utilized across all receptive fields that share that filter, as
opposed to each receptive field having its own bias and vector weighting.
3. LENET ARCHITECTURE
Lenet is a pre-trained Convolutional Neural Network Model used for recognizing handwritten
and machine-printed characters. The organization has 5 layers with learnable boundaries and
thus named Lenet-5. It has three arrangements of convolution layers with a blend of average
pooling. After the convolution and average pooling layers, we have two fully connected layers.
Finally, a Softmax classifier is used to characterize the pictures into separate class. The lenet
layers are depicted in the following fig 2.
Bhargav Bhat and Abhay A Deshpande
https://iaeme.com/Home/journal/IJECET 15 editor@iaeme.com
Figure 2 Architecture of Lenet-5 Model
The following table is used to understand the architecture in more detail.
Table 1
Layer #filters/
Neuron
Filter
Size
Stride Size of
Feature map
Activation
Function
Input - - - 32x32x1
Conv 1 6 5 x 5 1 28x28x6 tanh
Avg Pooling 1 2 x 2 2 14x14x6
Conv 2 16 5 x 5 1 10x10x16 tanh
Avg Pooling 2 2 x 2 2 5x5x16
Conv 3 120 5 x 5 1 120 tanh
Fully Connected 1 - - - 84 tanh
Fully Connected 2 - - - 10 Softmax
The first layer is the input layer with highlight map size 32 x 32 x 1.
Then, at that point, we have the first convolution layer with 6 channels of size 5 x 5 and
step is 1. The initiation work utilized at his layer is hyperbolic tangent (tanh). The output feature
map is 28 x 28 x 6.
Then, we have an average pooling layer with channel size 2 x 2 and step 1. The subsequent
component map is 14 x 14 x 6. Since the pooling layer doesn't influence the quantity of
channels.
After this comes the second convolution layer with 16 filters of 5 x 5 and step 1. Also, the
activation function is tanh. Now the output size is 10 x 10 x 16.
Again comes the other average pooling layer of 2 x 2 with step 2. As a result, the size of the
feature map reduced to 5 x5 x16.
The final pooling layer has 120 filters of 5 x5 with stride 1 and activation function tanh.
Now the output size is 120.
The next is a fully connected layer with 84 neurons that result in the output to 84 values and
the activation function used here is again tanh.
The last layer is the output layer with 10 neurons and Softmax function. The Softmax gives
the likelihood that an information point has a place with a specific class. The most elevated
worth is then anticipated.
Accelerated Deep Learning Inference from Constrained Embedded Devices
https://iaeme.com/Home/journal/IJECET 16 editor@iaeme.com
This is the entire architecture of the Lenet-5 model. The number of trainable parameters of
this architecture is around 60,000.
4. RESULTS
The following table gives the hardware cost occupied by our design on the Zync-7000 board.
It shows that the design occupies 47% of LUTs, 19% of LUTRAMs, 28% of FFs, 59% of
BRAMs, 54% of DSP, and 3% of BUFG. Although, the implemented design can be displayed
to gives us idea on how the design have been distributed, placed and routed on the selected
Zync-7000 board.
Table 2
Resource Utilization Available Utilization %
LUT 25456 53200 47.85
LUTRAM 3478 17400 19.99
FF 30456 106400 28.62
BRAM 83.5 140 59.64
DSP 120 220 54.55
BUFG 1 32 3.13
4.1. Software Model (MATLAB)
In the software model goes through the whole LeNet CNN and give us its prediction. Fig 3
depicts result of the software model on matlab.
Figure 3 Matlab Result
4.2. Hardware Model
For the Hardware model, it gives us 10 outputs for the score of all 10 digits which are digit 0,
1, 2, 3, 4, 5, 6, 7, 8, and 9. The highest score should be the correct answer. So, you can see from
all images the the highest score is the correct value. Fig 4 shows the simulation result from
Xilinx Vivado. In the below example where the number 0 has been taken, the highest score 5
shows that 0 is the predicted number.
Bhargav Bhat and Abhay A Deshpande
https://iaeme.com/Home/journal/IJECET 17 editor@iaeme.com
Figure 4 Behavioral simulation on Xilinx Vivado
4.3. Implementation
Once the implementation is reached the implementation summary that summarizes all
implantation report will be provided. The Fig 5 depicts the implemented design.
Figure 5 Implemented Design of Lenet-5
5. CONCLUSION
In this paper, we proposed Lenet-5 Convolutional Neural Network with optimizations done
using hardware looping and dot product units, which provided high accuracy when recognizing
handwritten data using the MNSIT dataset. The hardware loops alone contribute to 24% cycle
count decrease, while the dot products reduce the cycle count by 27%. As embedded systems
are highly price-sensitive, this is an important consideration. Getting the sizes of neural
networks down is an essential step in expanding the possibilities for neural networks in
embedded systems.
Accelerated Deep Learning Inference from Constrained Embedded Devices
https://iaeme.com/Home/journal/IJECET 18 editor@iaeme.com
An interesting topic for further research is Posit - an alternative floating-point number
format that may offer additional advantages, as it has an increased dynamic range at the same
word size. Because of the improved dynamic range, weights could be stored in lower precision,
thus, again, decreasing the memory requirements. Combining the reduced size requirements
with low-cost ISA improvements could make neural networks more ubiquitous in the price-
sensitive embedded systems market.
REFERENCES
[1] J. Vreca et al., "Accelerating Deep Learning Inference in Constrained Embedded Devices Using
Hardware Loops and a Dot Product Unit," in IEEE Access, vol. 8, pp. 165913-165926, 2020,
doi: 10.1109/ACCESS.2020.3022824.
[2] D. Valencia, S. F. Fard and A. Ali mohammad, "An Artificial Neural Network Processor with
a Custom Instruction Set Architecture for Embedded Applications," in IEEE Transactions on
Circuits and Systems I: Regular Papers, vol. 67, no. 12, pp. 5200-5210, Dec. 2020, doi:
10.1109/TCSI.2020.3003769.
[3] X. Chen and Z. Yu, "A Flexible and Energy-Efficient Convolutional Neural Network
Acceleration with Dedicated ISA and Accelerator," in IEEE Transactions on Very Large Scale
Integration (VLSI) Systems, vol. 26, no. 7, pp. 1408-1412, July 2018, doi:
10.1109/TVLSI.2018.2810831.
[4] M. Gautschi, Pasquale Davide Schiavone, Andreas Traber, Igor Loi, "Near-Threshold RISC-V
Core with DSP Extensions for Scalable IoT Endpoint Devices," in IEEE Transactions on Very
Large Scale Integration (VLSI) Systems, vol. 25, no. 10, pp. 2700-2713, Oct. 2017, doi:
10.1109/TVLSI.2017.2654506.
[5] Y. Tian, "Artificial Intelligence Image Recognition Method Based on Convolutional Neural
Network Algorithm," in IEEE Access, vol. 8, pp. 125731-125744, 2020, doi:
10.1109/ACCESS.2020.3006097.
[6] Xin, J. Zhang and Y. Shao, "Complex network classification with convolutional neural
network," in Tsinghua Science and Technology, vol. 25, no. 4, pp. 447-457, Aug. 2020, doi:
10.26599/TST.2019.9010055.
[7] Shomron and U. Weiser, "Spatial Correlation and Value Prediction in Convolutional Neural
Networks," in IEEE Computer Architecture Letters, vol. 18, no. 1, pp. 10-13, 1 Jan.-June 2019,
doi: 10.1109/LCA.2018.2890236
[8] Y. Qie, P. Song and C. Hao, "Data Repair Without Prior Knowledge Using Deep Convolutional
Neural Networks," in IEEE Access, vol. 8, pp. 105351-105361, 2020, doi:
10.1109/ACCESS.2020.2999960.
[9] A. J. Salim, S. I. M. Salim, N. R. Samsudin and Y. Soo, "Customized instruction set simulation
for soft-core RISC processor," 2012 IEEE Control and System Graduate Research Colloquium,
Shah Alam, Selangor, 2012, pp. 38-42, doi: 10.1109/ICSGRC.2012.6287132.
[10] A. Raveendran, V. B. Patil, D. Selvakumar and V. Desalphine, "A RISC-V instruction set
processor-micro-architecture design and analysis," 2016 International Conference on VLSI
Systems, Architectures, Technology and Applications (VLSI-SATA), Bangalore, 2016, pp. 1-7,
doi: 10.1109/VLSI-SATA.2016.7593047.

Mais conteúdo relacionado

Mais procurados

A simplified design of multiplier for multi layer feed forward hardware neura...
A simplified design of multiplier for multi layer feed forward hardware neura...A simplified design of multiplier for multi layer feed forward hardware neura...
A simplified design of multiplier for multi layer feed forward hardware neura...eSAT Publishing House
 
Implementing Neural Networks Using VLSI for Image Processing (compression)
Implementing Neural Networks Using VLSI for Image Processing (compression)Implementing Neural Networks Using VLSI for Image Processing (compression)
Implementing Neural Networks Using VLSI for Image Processing (compression)IJERA Editor
 
Eee iv-microcontrollers [10 es42]-notes
Eee iv-microcontrollers [10 es42]-notesEee iv-microcontrollers [10 es42]-notes
Eee iv-microcontrollers [10 es42]-notesGopinath.B.L Naidu
 
Implementation of Feed Forward Neural Network for Classification by Education...
Implementation of Feed Forward Neural Network for Classification by Education...Implementation of Feed Forward Neural Network for Classification by Education...
Implementation of Feed Forward Neural Network for Classification by Education...ijsrd.com
 
Ieeepro techno solutions ieee 2014 embedded prokect emb base paper 43
Ieeepro techno solutions  ieee 2014 embedded prokect emb base paper 43Ieeepro techno solutions  ieee 2014 embedded prokect emb base paper 43
Ieeepro techno solutions ieee 2014 embedded prokect emb base paper 43srinivasanece7
 
IRJET- Significant Neural Networks for Classification of Product Images
IRJET- Significant Neural Networks for Classification of Product ImagesIRJET- Significant Neural Networks for Classification of Product Images
IRJET- Significant Neural Networks for Classification of Product ImagesIRJET Journal
 
The pattern and realization of zigbee wi-fi
The pattern and realization of zigbee  wi-fiThe pattern and realization of zigbee  wi-fi
The pattern and realization of zigbee wi-fieSAT Publishing House
 
IRJET-A Review on Trends in Multicore Processor Based on Cache and Power Diss...
IRJET-A Review on Trends in Multicore Processor Based on Cache and Power Diss...IRJET-A Review on Trends in Multicore Processor Based on Cache and Power Diss...
IRJET-A Review on Trends in Multicore Processor Based on Cache and Power Diss...IRJET Journal
 
Smartphone fpga based balloon payload using cots components
Smartphone fpga based balloon payload using cots componentsSmartphone fpga based balloon payload using cots components
Smartphone fpga based balloon payload using cots componentseSAT Journals
 
Seminar report of mems
Seminar report of memsSeminar report of mems
Seminar report of memsGovind kumawat
 
Remote temperature and humidity monitoring system using wireless sensor networks
Remote temperature and humidity monitoring system using wireless sensor networksRemote temperature and humidity monitoring system using wireless sensor networks
Remote temperature and humidity monitoring system using wireless sensor networkseSAT Journals
 
Design of Tele command SOC-IP by AES Cryptographic Method Using VHDL
Design of Tele command SOC-IP by AES Cryptographic Method Using VHDLDesign of Tele command SOC-IP by AES Cryptographic Method Using VHDL
Design of Tele command SOC-IP by AES Cryptographic Method Using VHDLdbpublications
 
LT Pillar Auomation_modifieddocx
LT Pillar Auomation_modifieddocxLT Pillar Auomation_modifieddocx
LT Pillar Auomation_modifieddocxGaurav Patwa
 
An energy efficient protocol with static clustering for wsn
An energy efficient protocol with static clustering for wsnAn energy efficient protocol with static clustering for wsn
An energy efficient protocol with static clustering for wsnambitlick
 

Mais procurados (18)

A simplified design of multiplier for multi layer feed forward hardware neura...
A simplified design of multiplier for multi layer feed forward hardware neura...A simplified design of multiplier for multi layer feed forward hardware neura...
A simplified design of multiplier for multi layer feed forward hardware neura...
 
Implementing Neural Networks Using VLSI for Image Processing (compression)
Implementing Neural Networks Using VLSI for Image Processing (compression)Implementing Neural Networks Using VLSI for Image Processing (compression)
Implementing Neural Networks Using VLSI for Image Processing (compression)
 
Eee iv-microcontrollers [10 es42]-notes
Eee iv-microcontrollers [10 es42]-notesEee iv-microcontrollers [10 es42]-notes
Eee iv-microcontrollers [10 es42]-notes
 
Implementation of Feed Forward Neural Network for Classification by Education...
Implementation of Feed Forward Neural Network for Classification by Education...Implementation of Feed Forward Neural Network for Classification by Education...
Implementation of Feed Forward Neural Network for Classification by Education...
 
61 66
61 6661 66
61 66
 
Ieeepro techno solutions ieee 2014 embedded prokect emb base paper 43
Ieeepro techno solutions  ieee 2014 embedded prokect emb base paper 43Ieeepro techno solutions  ieee 2014 embedded prokect emb base paper 43
Ieeepro techno solutions ieee 2014 embedded prokect emb base paper 43
 
IRJET- Significant Neural Networks for Classification of Product Images
IRJET- Significant Neural Networks for Classification of Product ImagesIRJET- Significant Neural Networks for Classification of Product Images
IRJET- Significant Neural Networks for Classification of Product Images
 
neural-control-drone
neural-control-droneneural-control-drone
neural-control-drone
 
The pattern and realization of zigbee wi-fi
The pattern and realization of zigbee  wi-fiThe pattern and realization of zigbee  wi-fi
The pattern and realization of zigbee wi-fi
 
IRJET-A Review on Trends in Multicore Processor Based on Cache and Power Diss...
IRJET-A Review on Trends in Multicore Processor Based on Cache and Power Diss...IRJET-A Review on Trends in Multicore Processor Based on Cache and Power Diss...
IRJET-A Review on Trends in Multicore Processor Based on Cache and Power Diss...
 
E010412941
E010412941E010412941
E010412941
 
Smartphone fpga based balloon payload using cots components
Smartphone fpga based balloon payload using cots componentsSmartphone fpga based balloon payload using cots components
Smartphone fpga based balloon payload using cots components
 
G010334554
G010334554G010334554
G010334554
 
Seminar report of mems
Seminar report of memsSeminar report of mems
Seminar report of mems
 
Remote temperature and humidity monitoring system using wireless sensor networks
Remote temperature and humidity monitoring system using wireless sensor networksRemote temperature and humidity monitoring system using wireless sensor networks
Remote temperature and humidity monitoring system using wireless sensor networks
 
Design of Tele command SOC-IP by AES Cryptographic Method Using VHDL
Design of Tele command SOC-IP by AES Cryptographic Method Using VHDLDesign of Tele command SOC-IP by AES Cryptographic Method Using VHDL
Design of Tele command SOC-IP by AES Cryptographic Method Using VHDL
 
LT Pillar Auomation_modifieddocx
LT Pillar Auomation_modifieddocxLT Pillar Auomation_modifieddocx
LT Pillar Auomation_modifieddocx
 
An energy efficient protocol with static clustering for wsn
An energy efficient protocol with static clustering for wsnAn energy efficient protocol with static clustering for wsn
An energy efficient protocol with static clustering for wsn
 

Semelhante a ACCELERATED DEEP LEARNING INFERENCE FROM CONSTRAINED EMBEDDED DEVICES

A SURVEY OF NEURAL NETWORK HARDWARE ACCELERATORS IN MACHINE LEARNING
A SURVEY OF NEURAL NETWORK HARDWARE ACCELERATORS IN MACHINE LEARNING A SURVEY OF NEURAL NETWORK HARDWARE ACCELERATORS IN MACHINE LEARNING
A SURVEY OF NEURAL NETWORK HARDWARE ACCELERATORS IN MACHINE LEARNING mlaij
 
Compact optimized deep learning model for edge: a review
Compact optimized deep learning model for edge: a reviewCompact optimized deep learning model for edge: a review
Compact optimized deep learning model for edge: a reviewIJECEIAES
 
An octa core processor with shared memory and message-passing
An octa core processor with shared memory and message-passingAn octa core processor with shared memory and message-passing
An octa core processor with shared memory and message-passingeSAT Journals
 
International Journal of Engineering Research and Development (IJERD)
 International Journal of Engineering Research and Development (IJERD) International Journal of Engineering Research and Development (IJERD)
International Journal of Engineering Research and Development (IJERD)IJERD Editor
 
IRJET- Survey on SDN based Network Intrusion Detection System using Machi...
IRJET-  	  Survey on SDN based Network Intrusion Detection System using Machi...IRJET-  	  Survey on SDN based Network Intrusion Detection System using Machi...
IRJET- Survey on SDN based Network Intrusion Detection System using Machi...IRJET Journal
 
IRJET-AI Neural Network Disaster Recovery Cloud Operations Systems
IRJET-AI Neural Network Disaster Recovery Cloud Operations SystemsIRJET-AI Neural Network Disaster Recovery Cloud Operations Systems
IRJET-AI Neural Network Disaster Recovery Cloud Operations SystemsIRJET Journal
 
TASK & RESOURCE SELF-ADAPTIVE EMBEDDED REAL-TIME OPERATING SYSTEM MICROKERNEL...
TASK & RESOURCE SELF-ADAPTIVE EMBEDDED REAL-TIME OPERATING SYSTEM MICROKERNEL...TASK & RESOURCE SELF-ADAPTIVE EMBEDDED REAL-TIME OPERATING SYSTEM MICROKERNEL...
TASK & RESOURCE SELF-ADAPTIVE EMBEDDED REAL-TIME OPERATING SYSTEM MICROKERNEL...cscpconf
 
Task & resource self adaptive
Task & resource self adaptiveTask & resource self adaptive
Task & resource self adaptivecsandit
 
LEARNING-BASED ORCHESTRATOR FOR INTELLIGENT SOFTWARE-DEFINED NETWORKING CONTR...
LEARNING-BASED ORCHESTRATOR FOR INTELLIGENT SOFTWARE-DEFINED NETWORKING CONTR...LEARNING-BASED ORCHESTRATOR FOR INTELLIGENT SOFTWARE-DEFINED NETWORKING CONTR...
LEARNING-BASED ORCHESTRATOR FOR INTELLIGENT SOFTWARE-DEFINED NETWORKING CONTR...ijseajournal
 
Learning-based Orchestrator for Intelligent Software-defined Networking Contr...
Learning-based Orchestrator for Intelligent Software-defined Networking Contr...Learning-based Orchestrator for Intelligent Software-defined Networking Contr...
Learning-based Orchestrator for Intelligent Software-defined Networking Contr...ijseajournal
 
REVIEW ON OBJECT DETECTION WITH CNN
REVIEW ON OBJECT DETECTION WITH CNNREVIEW ON OBJECT DETECTION WITH CNN
REVIEW ON OBJECT DETECTION WITH CNNIRJET Journal
 
Data Analysis In The Cloud
Data Analysis In The CloudData Analysis In The Cloud
Data Analysis In The CloudMonica Carter
 
Lecture_IIITD.pptx
Lecture_IIITD.pptxLecture_IIITD.pptx
Lecture_IIITD.pptxachakracu
 
SENSOR SIGNAL PROCESSING USING HIGH-LEVEL SYNTHESIS AND INTERNET OF THINGS WI...
SENSOR SIGNAL PROCESSING USING HIGH-LEVEL SYNTHESIS AND INTERNET OF THINGS WI...SENSOR SIGNAL PROCESSING USING HIGH-LEVEL SYNTHESIS AND INTERNET OF THINGS WI...
SENSOR SIGNAL PROCESSING USING HIGH-LEVEL SYNTHESIS AND INTERNET OF THINGS WI...pijans
 
SENSOR SIGNAL PROCESSING USING HIGH-LEVEL SYNTHESIS AND INTERNET OF THINGS WI...
SENSOR SIGNAL PROCESSING USING HIGH-LEVEL SYNTHESIS AND INTERNET OF THINGS WI...SENSOR SIGNAL PROCESSING USING HIGH-LEVEL SYNTHESIS AND INTERNET OF THINGS WI...
SENSOR SIGNAL PROCESSING USING HIGH-LEVEL SYNTHESIS AND INTERNET OF THINGS WI...pijans
 
Machine learning based augmented reality for improved learning application th...
Machine learning based augmented reality for improved learning application th...Machine learning based augmented reality for improved learning application th...
Machine learning based augmented reality for improved learning application th...IJECEIAES
 
Medical Herb Identification and It’s Benefits
Medical Herb Identification and It’s BenefitsMedical Herb Identification and It’s Benefits
Medical Herb Identification and It’s BenefitsIRJET Journal
 

Semelhante a ACCELERATED DEEP LEARNING INFERENCE FROM CONSTRAINED EMBEDDED DEVICES (20)

A SURVEY OF NEURAL NETWORK HARDWARE ACCELERATORS IN MACHINE LEARNING
A SURVEY OF NEURAL NETWORK HARDWARE ACCELERATORS IN MACHINE LEARNING A SURVEY OF NEURAL NETWORK HARDWARE ACCELERATORS IN MACHINE LEARNING
A SURVEY OF NEURAL NETWORK HARDWARE ACCELERATORS IN MACHINE LEARNING
 
Compact optimized deep learning model for edge: a review
Compact optimized deep learning model for edge: a reviewCompact optimized deep learning model for edge: a review
Compact optimized deep learning model for edge: a review
 
An octa core processor with shared memory and message-passing
An octa core processor with shared memory and message-passingAn octa core processor with shared memory and message-passing
An octa core processor with shared memory and message-passing
 
International Journal of Engineering Research and Development (IJERD)
 International Journal of Engineering Research and Development (IJERD) International Journal of Engineering Research and Development (IJERD)
International Journal of Engineering Research and Development (IJERD)
 
IRJET- Survey on SDN based Network Intrusion Detection System using Machi...
IRJET-  	  Survey on SDN based Network Intrusion Detection System using Machi...IRJET-  	  Survey on SDN based Network Intrusion Detection System using Machi...
IRJET- Survey on SDN based Network Intrusion Detection System using Machi...
 
IRJET-AI Neural Network Disaster Recovery Cloud Operations Systems
IRJET-AI Neural Network Disaster Recovery Cloud Operations SystemsIRJET-AI Neural Network Disaster Recovery Cloud Operations Systems
IRJET-AI Neural Network Disaster Recovery Cloud Operations Systems
 
TASK & RESOURCE SELF-ADAPTIVE EMBEDDED REAL-TIME OPERATING SYSTEM MICROKERNEL...
TASK & RESOURCE SELF-ADAPTIVE EMBEDDED REAL-TIME OPERATING SYSTEM MICROKERNEL...TASK & RESOURCE SELF-ADAPTIVE EMBEDDED REAL-TIME OPERATING SYSTEM MICROKERNEL...
TASK & RESOURCE SELF-ADAPTIVE EMBEDDED REAL-TIME OPERATING SYSTEM MICROKERNEL...
 
Task & resource self adaptive
Task & resource self adaptiveTask & resource self adaptive
Task & resource self adaptive
 
LEARNING-BASED ORCHESTRATOR FOR INTELLIGENT SOFTWARE-DEFINED NETWORKING CONTR...
LEARNING-BASED ORCHESTRATOR FOR INTELLIGENT SOFTWARE-DEFINED NETWORKING CONTR...LEARNING-BASED ORCHESTRATOR FOR INTELLIGENT SOFTWARE-DEFINED NETWORKING CONTR...
LEARNING-BASED ORCHESTRATOR FOR INTELLIGENT SOFTWARE-DEFINED NETWORKING CONTR...
 
Learning-based Orchestrator for Intelligent Software-defined Networking Contr...
Learning-based Orchestrator for Intelligent Software-defined Networking Contr...Learning-based Orchestrator for Intelligent Software-defined Networking Contr...
Learning-based Orchestrator for Intelligent Software-defined Networking Contr...
 
M010237578
M010237578M010237578
M010237578
 
REVIEW ON OBJECT DETECTION WITH CNN
REVIEW ON OBJECT DETECTION WITH CNNREVIEW ON OBJECT DETECTION WITH CNN
REVIEW ON OBJECT DETECTION WITH CNN
 
50120140505008
5012014050500850120140505008
50120140505008
 
Data Analysis In The Cloud
Data Analysis In The CloudData Analysis In The Cloud
Data Analysis In The Cloud
 
Lecture_IIITD.pptx
Lecture_IIITD.pptxLecture_IIITD.pptx
Lecture_IIITD.pptx
 
SENSOR SIGNAL PROCESSING USING HIGH-LEVEL SYNTHESIS AND INTERNET OF THINGS WI...
SENSOR SIGNAL PROCESSING USING HIGH-LEVEL SYNTHESIS AND INTERNET OF THINGS WI...SENSOR SIGNAL PROCESSING USING HIGH-LEVEL SYNTHESIS AND INTERNET OF THINGS WI...
SENSOR SIGNAL PROCESSING USING HIGH-LEVEL SYNTHESIS AND INTERNET OF THINGS WI...
 
SENSOR SIGNAL PROCESSING USING HIGH-LEVEL SYNTHESIS AND INTERNET OF THINGS WI...
SENSOR SIGNAL PROCESSING USING HIGH-LEVEL SYNTHESIS AND INTERNET OF THINGS WI...SENSOR SIGNAL PROCESSING USING HIGH-LEVEL SYNTHESIS AND INTERNET OF THINGS WI...
SENSOR SIGNAL PROCESSING USING HIGH-LEVEL SYNTHESIS AND INTERNET OF THINGS WI...
 
Confidential Computing in Edge- Cloud Hierarchy
Confidential Computing in Edge- Cloud HierarchyConfidential Computing in Edge- Cloud Hierarchy
Confidential Computing in Edge- Cloud Hierarchy
 
Machine learning based augmented reality for improved learning application th...
Machine learning based augmented reality for improved learning application th...Machine learning based augmented reality for improved learning application th...
Machine learning based augmented reality for improved learning application th...
 
Medical Herb Identification and It’s Benefits
Medical Herb Identification and It’s BenefitsMedical Herb Identification and It’s Benefits
Medical Herb Identification and It’s Benefits
 

Mais de IAEME Publication

IAEME_Publication_Call_for_Paper_September_2022.pdf
IAEME_Publication_Call_for_Paper_September_2022.pdfIAEME_Publication_Call_for_Paper_September_2022.pdf
IAEME_Publication_Call_for_Paper_September_2022.pdfIAEME Publication
 
MODELING AND ANALYSIS OF SURFACE ROUGHNESS AND WHITE LATER THICKNESS IN WIRE-...
MODELING AND ANALYSIS OF SURFACE ROUGHNESS AND WHITE LATER THICKNESS IN WIRE-...MODELING AND ANALYSIS OF SURFACE ROUGHNESS AND WHITE LATER THICKNESS IN WIRE-...
MODELING AND ANALYSIS OF SURFACE ROUGHNESS AND WHITE LATER THICKNESS IN WIRE-...IAEME Publication
 
A STUDY ON THE REASONS FOR TRANSGENDER TO BECOME ENTREPRENEURS
A STUDY ON THE REASONS FOR TRANSGENDER TO BECOME ENTREPRENEURSA STUDY ON THE REASONS FOR TRANSGENDER TO BECOME ENTREPRENEURS
A STUDY ON THE REASONS FOR TRANSGENDER TO BECOME ENTREPRENEURSIAEME Publication
 
BROAD UNEXPOSED SKILLS OF TRANSGENDER ENTREPRENEURS
BROAD UNEXPOSED SKILLS OF TRANSGENDER ENTREPRENEURSBROAD UNEXPOSED SKILLS OF TRANSGENDER ENTREPRENEURS
BROAD UNEXPOSED SKILLS OF TRANSGENDER ENTREPRENEURSIAEME Publication
 
DETERMINANTS AFFECTING THE USER'S INTENTION TO USE MOBILE BANKING APPLICATIONS
DETERMINANTS AFFECTING THE USER'S INTENTION TO USE MOBILE BANKING APPLICATIONSDETERMINANTS AFFECTING THE USER'S INTENTION TO USE MOBILE BANKING APPLICATIONS
DETERMINANTS AFFECTING THE USER'S INTENTION TO USE MOBILE BANKING APPLICATIONSIAEME Publication
 
ANALYSE THE USER PREDILECTION ON GPAY AND PHONEPE FOR DIGITAL TRANSACTIONS
ANALYSE THE USER PREDILECTION ON GPAY AND PHONEPE FOR DIGITAL TRANSACTIONSANALYSE THE USER PREDILECTION ON GPAY AND PHONEPE FOR DIGITAL TRANSACTIONS
ANALYSE THE USER PREDILECTION ON GPAY AND PHONEPE FOR DIGITAL TRANSACTIONSIAEME Publication
 
VOICE BASED ATM FOR VISUALLY IMPAIRED USING ARDUINO
VOICE BASED ATM FOR VISUALLY IMPAIRED USING ARDUINOVOICE BASED ATM FOR VISUALLY IMPAIRED USING ARDUINO
VOICE BASED ATM FOR VISUALLY IMPAIRED USING ARDUINOIAEME Publication
 
IMPACT OF EMOTIONAL INTELLIGENCE ON HUMAN RESOURCE MANAGEMENT PRACTICES AMONG...
IMPACT OF EMOTIONAL INTELLIGENCE ON HUMAN RESOURCE MANAGEMENT PRACTICES AMONG...IMPACT OF EMOTIONAL INTELLIGENCE ON HUMAN RESOURCE MANAGEMENT PRACTICES AMONG...
IMPACT OF EMOTIONAL INTELLIGENCE ON HUMAN RESOURCE MANAGEMENT PRACTICES AMONG...IAEME Publication
 
VISUALISING AGING PARENTS & THEIR CLOSE CARERS LIFE JOURNEY IN AGING ECONOMY
VISUALISING AGING PARENTS & THEIR CLOSE CARERS LIFE JOURNEY IN AGING ECONOMYVISUALISING AGING PARENTS & THEIR CLOSE CARERS LIFE JOURNEY IN AGING ECONOMY
VISUALISING AGING PARENTS & THEIR CLOSE CARERS LIFE JOURNEY IN AGING ECONOMYIAEME Publication
 
A STUDY ON THE IMPACT OF ORGANIZATIONAL CULTURE ON THE EFFECTIVENESS OF PERFO...
A STUDY ON THE IMPACT OF ORGANIZATIONAL CULTURE ON THE EFFECTIVENESS OF PERFO...A STUDY ON THE IMPACT OF ORGANIZATIONAL CULTURE ON THE EFFECTIVENESS OF PERFO...
A STUDY ON THE IMPACT OF ORGANIZATIONAL CULTURE ON THE EFFECTIVENESS OF PERFO...IAEME Publication
 
GANDHI ON NON-VIOLENT POLICE
GANDHI ON NON-VIOLENT POLICEGANDHI ON NON-VIOLENT POLICE
GANDHI ON NON-VIOLENT POLICEIAEME Publication
 
A STUDY ON TALENT MANAGEMENT AND ITS IMPACT ON EMPLOYEE RETENTION IN SELECTED...
A STUDY ON TALENT MANAGEMENT AND ITS IMPACT ON EMPLOYEE RETENTION IN SELECTED...A STUDY ON TALENT MANAGEMENT AND ITS IMPACT ON EMPLOYEE RETENTION IN SELECTED...
A STUDY ON TALENT MANAGEMENT AND ITS IMPACT ON EMPLOYEE RETENTION IN SELECTED...IAEME Publication
 
ATTRITION IN THE IT INDUSTRY DURING COVID-19 PANDEMIC: LINKING EMOTIONAL INTE...
ATTRITION IN THE IT INDUSTRY DURING COVID-19 PANDEMIC: LINKING EMOTIONAL INTE...ATTRITION IN THE IT INDUSTRY DURING COVID-19 PANDEMIC: LINKING EMOTIONAL INTE...
ATTRITION IN THE IT INDUSTRY DURING COVID-19 PANDEMIC: LINKING EMOTIONAL INTE...IAEME Publication
 
INFLUENCE OF TALENT MANAGEMENT PRACTICES ON ORGANIZATIONAL PERFORMANCE A STUD...
INFLUENCE OF TALENT MANAGEMENT PRACTICES ON ORGANIZATIONAL PERFORMANCE A STUD...INFLUENCE OF TALENT MANAGEMENT PRACTICES ON ORGANIZATIONAL PERFORMANCE A STUD...
INFLUENCE OF TALENT MANAGEMENT PRACTICES ON ORGANIZATIONAL PERFORMANCE A STUD...IAEME Publication
 
A STUDY OF VARIOUS TYPES OF LOANS OF SELECTED PUBLIC AND PRIVATE SECTOR BANKS...
A STUDY OF VARIOUS TYPES OF LOANS OF SELECTED PUBLIC AND PRIVATE SECTOR BANKS...A STUDY OF VARIOUS TYPES OF LOANS OF SELECTED PUBLIC AND PRIVATE SECTOR BANKS...
A STUDY OF VARIOUS TYPES OF LOANS OF SELECTED PUBLIC AND PRIVATE SECTOR BANKS...IAEME Publication
 
EXPERIMENTAL STUDY OF MECHANICAL AND TRIBOLOGICAL RELATION OF NYLON/BaSO4 POL...
EXPERIMENTAL STUDY OF MECHANICAL AND TRIBOLOGICAL RELATION OF NYLON/BaSO4 POL...EXPERIMENTAL STUDY OF MECHANICAL AND TRIBOLOGICAL RELATION OF NYLON/BaSO4 POL...
EXPERIMENTAL STUDY OF MECHANICAL AND TRIBOLOGICAL RELATION OF NYLON/BaSO4 POL...IAEME Publication
 
ROLE OF SOCIAL ENTREPRENEURSHIP IN RURAL DEVELOPMENT OF INDIA - PROBLEMS AND ...
ROLE OF SOCIAL ENTREPRENEURSHIP IN RURAL DEVELOPMENT OF INDIA - PROBLEMS AND ...ROLE OF SOCIAL ENTREPRENEURSHIP IN RURAL DEVELOPMENT OF INDIA - PROBLEMS AND ...
ROLE OF SOCIAL ENTREPRENEURSHIP IN RURAL DEVELOPMENT OF INDIA - PROBLEMS AND ...IAEME Publication
 
OPTIMAL RECONFIGURATION OF POWER DISTRIBUTION RADIAL NETWORK USING HYBRID MET...
OPTIMAL RECONFIGURATION OF POWER DISTRIBUTION RADIAL NETWORK USING HYBRID MET...OPTIMAL RECONFIGURATION OF POWER DISTRIBUTION RADIAL NETWORK USING HYBRID MET...
OPTIMAL RECONFIGURATION OF POWER DISTRIBUTION RADIAL NETWORK USING HYBRID MET...IAEME Publication
 
APPLICATION OF FRUGAL APPROACH FOR PRODUCTIVITY IMPROVEMENT - A CASE STUDY OF...
APPLICATION OF FRUGAL APPROACH FOR PRODUCTIVITY IMPROVEMENT - A CASE STUDY OF...APPLICATION OF FRUGAL APPROACH FOR PRODUCTIVITY IMPROVEMENT - A CASE STUDY OF...
APPLICATION OF FRUGAL APPROACH FOR PRODUCTIVITY IMPROVEMENT - A CASE STUDY OF...IAEME Publication
 
A MULTIPLE – CHANNEL QUEUING MODELS ON FUZZY ENVIRONMENT
A MULTIPLE – CHANNEL QUEUING MODELS ON FUZZY ENVIRONMENTA MULTIPLE – CHANNEL QUEUING MODELS ON FUZZY ENVIRONMENT
A MULTIPLE – CHANNEL QUEUING MODELS ON FUZZY ENVIRONMENTIAEME Publication
 

Mais de IAEME Publication (20)

IAEME_Publication_Call_for_Paper_September_2022.pdf
IAEME_Publication_Call_for_Paper_September_2022.pdfIAEME_Publication_Call_for_Paper_September_2022.pdf
IAEME_Publication_Call_for_Paper_September_2022.pdf
 
MODELING AND ANALYSIS OF SURFACE ROUGHNESS AND WHITE LATER THICKNESS IN WIRE-...
MODELING AND ANALYSIS OF SURFACE ROUGHNESS AND WHITE LATER THICKNESS IN WIRE-...MODELING AND ANALYSIS OF SURFACE ROUGHNESS AND WHITE LATER THICKNESS IN WIRE-...
MODELING AND ANALYSIS OF SURFACE ROUGHNESS AND WHITE LATER THICKNESS IN WIRE-...
 
A STUDY ON THE REASONS FOR TRANSGENDER TO BECOME ENTREPRENEURS
A STUDY ON THE REASONS FOR TRANSGENDER TO BECOME ENTREPRENEURSA STUDY ON THE REASONS FOR TRANSGENDER TO BECOME ENTREPRENEURS
A STUDY ON THE REASONS FOR TRANSGENDER TO BECOME ENTREPRENEURS
 
BROAD UNEXPOSED SKILLS OF TRANSGENDER ENTREPRENEURS
BROAD UNEXPOSED SKILLS OF TRANSGENDER ENTREPRENEURSBROAD UNEXPOSED SKILLS OF TRANSGENDER ENTREPRENEURS
BROAD UNEXPOSED SKILLS OF TRANSGENDER ENTREPRENEURS
 
DETERMINANTS AFFECTING THE USER'S INTENTION TO USE MOBILE BANKING APPLICATIONS
DETERMINANTS AFFECTING THE USER'S INTENTION TO USE MOBILE BANKING APPLICATIONSDETERMINANTS AFFECTING THE USER'S INTENTION TO USE MOBILE BANKING APPLICATIONS
DETERMINANTS AFFECTING THE USER'S INTENTION TO USE MOBILE BANKING APPLICATIONS
 
ANALYSE THE USER PREDILECTION ON GPAY AND PHONEPE FOR DIGITAL TRANSACTIONS
ANALYSE THE USER PREDILECTION ON GPAY AND PHONEPE FOR DIGITAL TRANSACTIONSANALYSE THE USER PREDILECTION ON GPAY AND PHONEPE FOR DIGITAL TRANSACTIONS
ANALYSE THE USER PREDILECTION ON GPAY AND PHONEPE FOR DIGITAL TRANSACTIONS
 
VOICE BASED ATM FOR VISUALLY IMPAIRED USING ARDUINO
VOICE BASED ATM FOR VISUALLY IMPAIRED USING ARDUINOVOICE BASED ATM FOR VISUALLY IMPAIRED USING ARDUINO
VOICE BASED ATM FOR VISUALLY IMPAIRED USING ARDUINO
 
IMPACT OF EMOTIONAL INTELLIGENCE ON HUMAN RESOURCE MANAGEMENT PRACTICES AMONG...
IMPACT OF EMOTIONAL INTELLIGENCE ON HUMAN RESOURCE MANAGEMENT PRACTICES AMONG...IMPACT OF EMOTIONAL INTELLIGENCE ON HUMAN RESOURCE MANAGEMENT PRACTICES AMONG...
IMPACT OF EMOTIONAL INTELLIGENCE ON HUMAN RESOURCE MANAGEMENT PRACTICES AMONG...
 
VISUALISING AGING PARENTS & THEIR CLOSE CARERS LIFE JOURNEY IN AGING ECONOMY
VISUALISING AGING PARENTS & THEIR CLOSE CARERS LIFE JOURNEY IN AGING ECONOMYVISUALISING AGING PARENTS & THEIR CLOSE CARERS LIFE JOURNEY IN AGING ECONOMY
VISUALISING AGING PARENTS & THEIR CLOSE CARERS LIFE JOURNEY IN AGING ECONOMY
 
A STUDY ON THE IMPACT OF ORGANIZATIONAL CULTURE ON THE EFFECTIVENESS OF PERFO...
A STUDY ON THE IMPACT OF ORGANIZATIONAL CULTURE ON THE EFFECTIVENESS OF PERFO...A STUDY ON THE IMPACT OF ORGANIZATIONAL CULTURE ON THE EFFECTIVENESS OF PERFO...
A STUDY ON THE IMPACT OF ORGANIZATIONAL CULTURE ON THE EFFECTIVENESS OF PERFO...
 
GANDHI ON NON-VIOLENT POLICE
GANDHI ON NON-VIOLENT POLICEGANDHI ON NON-VIOLENT POLICE
GANDHI ON NON-VIOLENT POLICE
 
A STUDY ON TALENT MANAGEMENT AND ITS IMPACT ON EMPLOYEE RETENTION IN SELECTED...
A STUDY ON TALENT MANAGEMENT AND ITS IMPACT ON EMPLOYEE RETENTION IN SELECTED...A STUDY ON TALENT MANAGEMENT AND ITS IMPACT ON EMPLOYEE RETENTION IN SELECTED...
A STUDY ON TALENT MANAGEMENT AND ITS IMPACT ON EMPLOYEE RETENTION IN SELECTED...
 
ATTRITION IN THE IT INDUSTRY DURING COVID-19 PANDEMIC: LINKING EMOTIONAL INTE...
ATTRITION IN THE IT INDUSTRY DURING COVID-19 PANDEMIC: LINKING EMOTIONAL INTE...ATTRITION IN THE IT INDUSTRY DURING COVID-19 PANDEMIC: LINKING EMOTIONAL INTE...
ATTRITION IN THE IT INDUSTRY DURING COVID-19 PANDEMIC: LINKING EMOTIONAL INTE...
 
INFLUENCE OF TALENT MANAGEMENT PRACTICES ON ORGANIZATIONAL PERFORMANCE A STUD...
INFLUENCE OF TALENT MANAGEMENT PRACTICES ON ORGANIZATIONAL PERFORMANCE A STUD...INFLUENCE OF TALENT MANAGEMENT PRACTICES ON ORGANIZATIONAL PERFORMANCE A STUD...
INFLUENCE OF TALENT MANAGEMENT PRACTICES ON ORGANIZATIONAL PERFORMANCE A STUD...
 
A STUDY OF VARIOUS TYPES OF LOANS OF SELECTED PUBLIC AND PRIVATE SECTOR BANKS...
A STUDY OF VARIOUS TYPES OF LOANS OF SELECTED PUBLIC AND PRIVATE SECTOR BANKS...A STUDY OF VARIOUS TYPES OF LOANS OF SELECTED PUBLIC AND PRIVATE SECTOR BANKS...
A STUDY OF VARIOUS TYPES OF LOANS OF SELECTED PUBLIC AND PRIVATE SECTOR BANKS...
 
EXPERIMENTAL STUDY OF MECHANICAL AND TRIBOLOGICAL RELATION OF NYLON/BaSO4 POL...
EXPERIMENTAL STUDY OF MECHANICAL AND TRIBOLOGICAL RELATION OF NYLON/BaSO4 POL...EXPERIMENTAL STUDY OF MECHANICAL AND TRIBOLOGICAL RELATION OF NYLON/BaSO4 POL...
EXPERIMENTAL STUDY OF MECHANICAL AND TRIBOLOGICAL RELATION OF NYLON/BaSO4 POL...
 
ROLE OF SOCIAL ENTREPRENEURSHIP IN RURAL DEVELOPMENT OF INDIA - PROBLEMS AND ...
ROLE OF SOCIAL ENTREPRENEURSHIP IN RURAL DEVELOPMENT OF INDIA - PROBLEMS AND ...ROLE OF SOCIAL ENTREPRENEURSHIP IN RURAL DEVELOPMENT OF INDIA - PROBLEMS AND ...
ROLE OF SOCIAL ENTREPRENEURSHIP IN RURAL DEVELOPMENT OF INDIA - PROBLEMS AND ...
 
OPTIMAL RECONFIGURATION OF POWER DISTRIBUTION RADIAL NETWORK USING HYBRID MET...
OPTIMAL RECONFIGURATION OF POWER DISTRIBUTION RADIAL NETWORK USING HYBRID MET...OPTIMAL RECONFIGURATION OF POWER DISTRIBUTION RADIAL NETWORK USING HYBRID MET...
OPTIMAL RECONFIGURATION OF POWER DISTRIBUTION RADIAL NETWORK USING HYBRID MET...
 
APPLICATION OF FRUGAL APPROACH FOR PRODUCTIVITY IMPROVEMENT - A CASE STUDY OF...
APPLICATION OF FRUGAL APPROACH FOR PRODUCTIVITY IMPROVEMENT - A CASE STUDY OF...APPLICATION OF FRUGAL APPROACH FOR PRODUCTIVITY IMPROVEMENT - A CASE STUDY OF...
APPLICATION OF FRUGAL APPROACH FOR PRODUCTIVITY IMPROVEMENT - A CASE STUDY OF...
 
A MULTIPLE – CHANNEL QUEUING MODELS ON FUZZY ENVIRONMENT
A MULTIPLE – CHANNEL QUEUING MODELS ON FUZZY ENVIRONMENTA MULTIPLE – CHANNEL QUEUING MODELS ON FUZZY ENVIRONMENT
A MULTIPLE – CHANNEL QUEUING MODELS ON FUZZY ENVIRONMENT
 

Último

Beautiful Sapna Call Girls CP 9711199012 ☎ Call /Whatsapps
Beautiful Sapna Call Girls CP 9711199012 ☎ Call /WhatsappsBeautiful Sapna Call Girls CP 9711199012 ☎ Call /Whatsapps
Beautiful Sapna Call Girls CP 9711199012 ☎ Call /Whatsappssapnasaifi408
 
定制(UI学位证)爱达荷大学毕业证成绩单原版一比一
定制(UI学位证)爱达荷大学毕业证成绩单原版一比一定制(UI学位证)爱达荷大学毕业证成绩单原版一比一
定制(UI学位证)爱达荷大学毕业证成绩单原版一比一ss ss
 
RBS学位证,鹿特丹商学院毕业证书1:1制作
RBS学位证,鹿特丹商学院毕业证书1:1制作RBS学位证,鹿特丹商学院毕业证书1:1制作
RBS学位证,鹿特丹商学院毕业证书1:1制作f3774p8b
 
NO1 WorldWide kala jadu Love Marriage Black Magic Punjab Powerful Black Magic...
NO1 WorldWide kala jadu Love Marriage Black Magic Punjab Powerful Black Magic...NO1 WorldWide kala jadu Love Marriage Black Magic Punjab Powerful Black Magic...
NO1 WorldWide kala jadu Love Marriage Black Magic Punjab Powerful Black Magic...Amil baba
 
Dubai Call Girls O525547819 Spring Break Fast Call Girls Dubai
Dubai Call Girls O525547819 Spring Break Fast Call Girls DubaiDubai Call Girls O525547819 Spring Break Fast Call Girls Dubai
Dubai Call Girls O525547819 Spring Break Fast Call Girls Dubaikojalkojal131
 
Call Girls in Dwarka Sub City 💯Call Us 🔝8264348440🔝
Call Girls in Dwarka Sub City 💯Call Us 🔝8264348440🔝Call Girls in Dwarka Sub City 💯Call Us 🔝8264348440🔝
Call Girls in Dwarka Sub City 💯Call Us 🔝8264348440🔝soniya singh
 
Real Sure (Call Girl) in I.G.I. Airport 8377087607 Hot Call Girls In Delhi NCR
Real Sure (Call Girl) in I.G.I. Airport 8377087607 Hot Call Girls In Delhi NCRReal Sure (Call Girl) in I.G.I. Airport 8377087607 Hot Call Girls In Delhi NCR
Real Sure (Call Girl) in I.G.I. Airport 8377087607 Hot Call Girls In Delhi NCRdollysharma2066
 
Vip Udupi Call Girls 7001305949 WhatsApp Number 24x7 Best Services
Vip Udupi Call Girls 7001305949 WhatsApp Number 24x7 Best ServicesVip Udupi Call Girls 7001305949 WhatsApp Number 24x7 Best Services
Vip Udupi Call Girls 7001305949 WhatsApp Number 24x7 Best Servicesnajka9823
 
1:1原版定制美国加州州立大学东湾分校毕业证成绩单pdf电子版制作修改#真实留信入库#永久存档#真实可查#diploma#degree
1:1原版定制美国加州州立大学东湾分校毕业证成绩单pdf电子版制作修改#真实留信入库#永久存档#真实可查#diploma#degree1:1原版定制美国加州州立大学东湾分校毕业证成绩单pdf电子版制作修改#真实留信入库#永久存档#真实可查#diploma#degree
1:1原版定制美国加州州立大学东湾分校毕业证成绩单pdf电子版制作修改#真实留信入库#永久存档#真实可查#diploma#degreeyuu sss
 
existing product research b2 Sunderland Culture
existing product research b2 Sunderland Cultureexisting product research b2 Sunderland Culture
existing product research b2 Sunderland CultureChloeMeadows1
 
Hifi Babe North Delhi Call Girl Service Fun Tonight
Hifi Babe North Delhi Call Girl Service Fun TonightHifi Babe North Delhi Call Girl Service Fun Tonight
Hifi Babe North Delhi Call Girl Service Fun TonightKomal Khan
 
专业一比一美国加州州立大学东湾分校毕业证成绩单pdf电子版制作修改#真实工艺展示#真实防伪#diploma#degree
专业一比一美国加州州立大学东湾分校毕业证成绩单pdf电子版制作修改#真实工艺展示#真实防伪#diploma#degree专业一比一美国加州州立大学东湾分校毕业证成绩单pdf电子版制作修改#真实工艺展示#真实防伪#diploma#degree
专业一比一美国加州州立大学东湾分校毕业证成绩单pdf电子版制作修改#真实工艺展示#真实防伪#diploma#degreeyuu sss
 
Hifi Defence Colony Call Girls Service WhatsApp -> 9999965857 Available 24x7 ...
Hifi Defence Colony Call Girls Service WhatsApp -> 9999965857 Available 24x7 ...Hifi Defence Colony Call Girls Service WhatsApp -> 9999965857 Available 24x7 ...
Hifi Defence Colony Call Girls Service WhatsApp -> 9999965857 Available 24x7 ...srsj9000
 
Papular No 1 Online Istikhara Amil Baba Pakistan Amil Baba In Karachi Amil B...
Papular No 1 Online Istikhara Amil Baba Pakistan  Amil Baba In Karachi Amil B...Papular No 1 Online Istikhara Amil Baba Pakistan  Amil Baba In Karachi Amil B...
Papular No 1 Online Istikhara Amil Baba Pakistan Amil Baba In Karachi Amil B...Authentic No 1 Amil Baba In Pakistan
 
Erfurt FH学位证,埃尔福特应用技术大学毕业证书1:1制作
Erfurt FH学位证,埃尔福特应用技术大学毕业证书1:1制作Erfurt FH学位证,埃尔福特应用技术大学毕业证书1:1制作
Erfurt FH学位证,埃尔福特应用技术大学毕业证书1:1制作f3774p8b
 
Call Girls In Munirka>༒9599632723 Incall_OutCall Available
Call Girls In Munirka>༒9599632723 Incall_OutCall AvailableCall Girls In Munirka>༒9599632723 Incall_OutCall Available
Call Girls In Munirka>༒9599632723 Incall_OutCall AvailableCall Girls in Delhi
 
vip Krishna Nagar Call Girls 9999965857 Call or WhatsApp Now Book
vip Krishna Nagar Call Girls 9999965857 Call or WhatsApp Now Bookvip Krishna Nagar Call Girls 9999965857 Call or WhatsApp Now Book
vip Krishna Nagar Call Girls 9999965857 Call or WhatsApp Now Bookmanojkuma9823
 

Último (20)

Beautiful Sapna Call Girls CP 9711199012 ☎ Call /Whatsapps
Beautiful Sapna Call Girls CP 9711199012 ☎ Call /WhatsappsBeautiful Sapna Call Girls CP 9711199012 ☎ Call /Whatsapps
Beautiful Sapna Call Girls CP 9711199012 ☎ Call /Whatsapps
 
定制(UI学位证)爱达荷大学毕业证成绩单原版一比一
定制(UI学位证)爱达荷大学毕业证成绩单原版一比一定制(UI学位证)爱达荷大学毕业证成绩单原版一比一
定制(UI学位证)爱达荷大学毕业证成绩单原版一比一
 
RBS学位证,鹿特丹商学院毕业证书1:1制作
RBS学位证,鹿特丹商学院毕业证书1:1制作RBS学位证,鹿特丹商学院毕业证书1:1制作
RBS学位证,鹿特丹商学院毕业证书1:1制作
 
NO1 WorldWide kala jadu Love Marriage Black Magic Punjab Powerful Black Magic...
NO1 WorldWide kala jadu Love Marriage Black Magic Punjab Powerful Black Magic...NO1 WorldWide kala jadu Love Marriage Black Magic Punjab Powerful Black Magic...
NO1 WorldWide kala jadu Love Marriage Black Magic Punjab Powerful Black Magic...
 
Dubai Call Girls O525547819 Spring Break Fast Call Girls Dubai
Dubai Call Girls O525547819 Spring Break Fast Call Girls DubaiDubai Call Girls O525547819 Spring Break Fast Call Girls Dubai
Dubai Call Girls O525547819 Spring Break Fast Call Girls Dubai
 
Call Girls in Dwarka Sub City 💯Call Us 🔝8264348440🔝
Call Girls in Dwarka Sub City 💯Call Us 🔝8264348440🔝Call Girls in Dwarka Sub City 💯Call Us 🔝8264348440🔝
Call Girls in Dwarka Sub City 💯Call Us 🔝8264348440🔝
 
Low rate Call girls in Delhi Justdial | 9953330565
Low rate Call girls in Delhi Justdial | 9953330565Low rate Call girls in Delhi Justdial | 9953330565
Low rate Call girls in Delhi Justdial | 9953330565
 
Real Sure (Call Girl) in I.G.I. Airport 8377087607 Hot Call Girls In Delhi NCR
Real Sure (Call Girl) in I.G.I. Airport 8377087607 Hot Call Girls In Delhi NCRReal Sure (Call Girl) in I.G.I. Airport 8377087607 Hot Call Girls In Delhi NCR
Real Sure (Call Girl) in I.G.I. Airport 8377087607 Hot Call Girls In Delhi NCR
 
Vip Udupi Call Girls 7001305949 WhatsApp Number 24x7 Best Services
Vip Udupi Call Girls 7001305949 WhatsApp Number 24x7 Best ServicesVip Udupi Call Girls 7001305949 WhatsApp Number 24x7 Best Services
Vip Udupi Call Girls 7001305949 WhatsApp Number 24x7 Best Services
 
9953330565 Low Rate Call Girls In Jahangirpuri Delhi NCR
9953330565 Low Rate Call Girls In Jahangirpuri  Delhi NCR9953330565 Low Rate Call Girls In Jahangirpuri  Delhi NCR
9953330565 Low Rate Call Girls In Jahangirpuri Delhi NCR
 
1:1原版定制美国加州州立大学东湾分校毕业证成绩单pdf电子版制作修改#真实留信入库#永久存档#真实可查#diploma#degree
1:1原版定制美国加州州立大学东湾分校毕业证成绩单pdf电子版制作修改#真实留信入库#永久存档#真实可查#diploma#degree1:1原版定制美国加州州立大学东湾分校毕业证成绩单pdf电子版制作修改#真实留信入库#永久存档#真实可查#diploma#degree
1:1原版定制美国加州州立大学东湾分校毕业证成绩单pdf电子版制作修改#真实留信入库#永久存档#真实可查#diploma#degree
 
CIVIL ENGINEERING
CIVIL ENGINEERINGCIVIL ENGINEERING
CIVIL ENGINEERING
 
existing product research b2 Sunderland Culture
existing product research b2 Sunderland Cultureexisting product research b2 Sunderland Culture
existing product research b2 Sunderland Culture
 
Hifi Babe North Delhi Call Girl Service Fun Tonight
Hifi Babe North Delhi Call Girl Service Fun TonightHifi Babe North Delhi Call Girl Service Fun Tonight
Hifi Babe North Delhi Call Girl Service Fun Tonight
 
专业一比一美国加州州立大学东湾分校毕业证成绩单pdf电子版制作修改#真实工艺展示#真实防伪#diploma#degree
专业一比一美国加州州立大学东湾分校毕业证成绩单pdf电子版制作修改#真实工艺展示#真实防伪#diploma#degree专业一比一美国加州州立大学东湾分校毕业证成绩单pdf电子版制作修改#真实工艺展示#真实防伪#diploma#degree
专业一比一美国加州州立大学东湾分校毕业证成绩单pdf电子版制作修改#真实工艺展示#真实防伪#diploma#degree
 
Hifi Defence Colony Call Girls Service WhatsApp -> 9999965857 Available 24x7 ...
Hifi Defence Colony Call Girls Service WhatsApp -> 9999965857 Available 24x7 ...Hifi Defence Colony Call Girls Service WhatsApp -> 9999965857 Available 24x7 ...
Hifi Defence Colony Call Girls Service WhatsApp -> 9999965857 Available 24x7 ...
 
Papular No 1 Online Istikhara Amil Baba Pakistan Amil Baba In Karachi Amil B...
Papular No 1 Online Istikhara Amil Baba Pakistan  Amil Baba In Karachi Amil B...Papular No 1 Online Istikhara Amil Baba Pakistan  Amil Baba In Karachi Amil B...
Papular No 1 Online Istikhara Amil Baba Pakistan Amil Baba In Karachi Amil B...
 
Erfurt FH学位证,埃尔福特应用技术大学毕业证书1:1制作
Erfurt FH学位证,埃尔福特应用技术大学毕业证书1:1制作Erfurt FH学位证,埃尔福特应用技术大学毕业证书1:1制作
Erfurt FH学位证,埃尔福特应用技术大学毕业证书1:1制作
 
Call Girls In Munirka>༒9599632723 Incall_OutCall Available
Call Girls In Munirka>༒9599632723 Incall_OutCall AvailableCall Girls In Munirka>༒9599632723 Incall_OutCall Available
Call Girls In Munirka>༒9599632723 Incall_OutCall Available
 
vip Krishna Nagar Call Girls 9999965857 Call or WhatsApp Now Book
vip Krishna Nagar Call Girls 9999965857 Call or WhatsApp Now Bookvip Krishna Nagar Call Girls 9999965857 Call or WhatsApp Now Book
vip Krishna Nagar Call Girls 9999965857 Call or WhatsApp Now Book
 

ACCELERATED DEEP LEARNING INFERENCE FROM CONSTRAINED EMBEDDED DEVICES

  • 1. https://iaeme.com/Home/journal/IJECET 11 editor@iaeme.com International Journal of Electronics and Communication Engineering and Technology (IJECET) Volume 12, Issue 3, September–December 2021, pp. 11-18. Article ID: IJECET_12_03_002 Available online at https://iaeme.com/Home/issue/IJECET?Volume=12&Issue=3 ISSN Print: 0976-6464 and ISSN Online: 0976-6472 DOI: https://doi.org/10.34218/IJECET.12.3.2021.002 © IAEME Publication ACCELERATED DEEP LEARNING INFERENCE FROM CONSTRAINED EMBEDDED DEVICES Bhargav Bhat1 and Abhay A Deshpande1 1 Department of Electronics and Communication Engineering, RV College of Engineering (RVCE), Bengaluru, Karnataka, India ABSTRACT Hardware looping is a feature of some processor instruction sets whose hardware can repeat the body of a loop automatically, rather than requiring software instructions which take up cycles (and therefore time) to do so. Loop Unrolling is a loop transformation technique that attempts to advance a program's execution speed to the detriment of its twofold size, which is a methodology known as space–time tradeoff. A convolutional neural network is created with simple loops, with hardware looping, with loop unrolling and with both hardware looping and loop unrolling, and a comparison is made to evaluate the effectiveness of hardware looping and loop unrolling. The hardware loops alone will add to a cycle check decline, while the mix of hardware loops and dot product instructions will decrease the clock cycle tally further. The CNN is simulated on Xilinx Vivado 2021.1 running on Zync-7000 FPGA. Keywords: Convolutional Neural Network, Deep Learning, FPGA, Hardware Looping, Loop Unrolling, Vivado. Cite this Article: Bhargav Bhat and Abhay A Deshpande, Accelerated Deep Learning Inference from Constrained Embedded Devices, International Journal of Electronics and Communication Engineering and Technology, 12(3), 2021, pp. 11-18. https://iaeme.com/Home/issue/IJECET?Volume=12&Issue=3 1. INTRODUCTION Profound learning calculations have seen achievement in a wide assortment of uses, for example, machine interpretation, picture and discourse acknowledgment, and self-driving vehicles. In any case, these calculations have as of late acquired traction in the installed frameworks space. Most implanted frameworks depend on modest microcontrollers with restricted memory limit, and, subsequently, are commonly seen as not equipped for running profound learning calculations. Nevertheless, we think about that progressions in pressure of neural organizations and neural organization engineering, combined with an improved guidance set design, could make microcontroller-grade processors appropriate for explicit low-force profound learning applications. Such intricacy is substantially a lot for memory obliged microcontrollers that have memory sizes indicated in kilobytes. Some embedded system designer’s work around the quandary of restricted resources by processing neural networks in the cloud. Nonetheless, this arrangement is restricted to regions with access to the Web. Cloud
  • 2. Accelerated Deep Learning Inference from Constrained Embedded Devices https://iaeme.com/Home/journal/IJECET 12 editor@iaeme.com preparing likewise has different burdens, for example, protection concerns, security, high idleness, correspondence power utilization, and dependability. These calculations perform massive arithmetic computations. To accelerate these calculations at a sensible cost in equipment, we can utilize an instruction set extension comprised of two instruction types- hardware loops and dot product instructions. The primary commitments of this paper are as per the following: • We propose an approach for computing neural network functions that are advanced for the utilization of hardware loops and dot product instructions. • The effectiveness of hardware loops and dot product instructions for performing deep learning functions are evaluated, and • To perform Lenet-5 Neural Network on Zync-7000 There have been different ways to deal with accelerate profound learning capacities. The methodologies can be sorted into two sets. In the first set, the approaches which which attempt to enhance the size of the neural organizations, or, all in all, upgrade the software. Approaches in the second set try to optimize the equipment or hardware on which neural networks are running. As the set used here deals with hardware optimization, we will focus on the related approaches for hardware optimization, and only momentarily observe the progress in software optimizations. A simple instruction set is proposed to evaluate the effectiveness of hardware loops and dot product instructions with fully upgraded assembly functions for the fully connected convolutional neural network [1]. A custom instruction set architecture is used for efficient realization of artificial neural networks and can be parameterized to a subjective fixed- point design [2]. A CNN –specific Instruction set architecture is used which deploys the instruction parameters with high flexibility which embeds parallel computation and data reuse parameters in the instructions [3]. The instruction extensions and micro architectural advancements to increase computational thickness and to limit the amount of pressure towards shared memory hierarchy in RISC Processors [4]. A repetitive neural organization into a convolutional neural network, and the profound provisions of the picture were learnt in equal utilizing the convolutional neural network and the intermittent neural organization [5]. The framework of Complex Network Classifier (CNC) is used by integrating network embedding and convolutional neural network to tackle the problem of network classification. By training the classifier on synthetic complex network data, they showed that CNC can not only classify networks with high accuracy and robustness but can also extract the features of the networks automatically [6]. A valued prediction method is used to exploit the spatial correlation of zero- valued activations within the CNN output feature maps, thereby saving convolution operations [7]. The impact of packet loss on data integrity is reduced by taking advantage of the deep network’s ability to understand neural data and by using a data repair method based on convolutions neural network [8]. The instruction set simulation process is used soft-core Reduced Instruction Set Computer (RISC) processor. They provided reliable simulation platform in creating customizable instruction set for Application Specific Instruction Set Processor (ASIP) [9]. RISC-V ISA compatible processor and effects of instruction set is analyzed on the pipeline/micro-architecture design in terms of instructions encoding, functionality of instructions, instruction types, decoder logic complexity, data hazard detection, register file organization and access, functioning of pipeline, effect of branch instructions, control flow, data memory access, operating modes and execution unit hardware resources [10]. Deep learning algorithms are used increasingly in smart applications. Some of them also run in Internet of Things (IoT) devices. IoT Analytics reports that, by 2025, the number of IoT devices will rise to 22 billion. The motivation for our work stems from the fact that the rise of the IoT will increase the need for low-cost devices built around a single microcontroller capable of
  • 3. Bhargav Bhat and Abhay A Deshpande https://iaeme.com/Home/journal/IJECET 13 editor@iaeme.com supporting deep learning algorithms. Accelerating deep learning inference in constrained embedded devices, presented in this paper, is our attempt in this direction. The remainder of this paper is organized as follows. Section II presents the related work in hardware and software enhancements aimed at accelerating the neural network computation. Section III shows the methodology concerned with the project. Section IV shows the results and the subsequent discussions regarding the obtained results. Section V presents the conclusion and plans for additional work for the future. 2. BACKGROUND 2.1. Convolutional Neural Networks A convolutional neural organization (CNN, or ConvNet) is a class of artificial neural organization, most generally applied to understand visual imagery. A CNN is a deep learning algorithm which can can take in an information picture, allot significance, to different viewpoints/objects in the picture and have the option to separate or differentiate one from the other. The pre-processing needed in a CNN is a lot of lower when contrasted with other characterization calculations. While in crude techniques channels are hand-designed, with enough preparing, CNNs can gain proficiency with these channels/qualities. CNN has a lot of applications in decoding facial recognition, analyzing documents, historic and environmental collections, understanding climate, grey areas to see holistic view of what a human sees, advertising and other fields. 2.2. Lenet Lenet (also called Lenet-5) is an exemplary convolutional neural network which utilizes convolutions, pooling and fully connected layers. Lenet is used for handwritten digit recognition with the MNIST dataset. 2.3. MNSIT Dataset MNIST is the acronym for Modified National Institute of Standards and Technology database. The MNIST database contains 60,000 training images and 10,000 testing images. MNIST dataset pictures have the measurements 28 x 28. To get the MNIST pictures measurement to the meet the necessities of the input layer, the 28 x 28 pictures are cushioned or padded. Some of the test pictures from MNIST test dataset are as shown in Fig 1. Figure 1 Some Test pictures from MNIST dataset
  • 4. Accelerated Deep Learning Inference from Constrained Embedded Devices https://iaeme.com/Home/journal/IJECET 14 editor@iaeme.com 2.4. Building Blocks of Convolutional Neural Network • Convolutional Layer is the core center structure of CNN. In a CNN, the information is a tensor with a shape: (number of data sources) x (input height) x (input width) x (input channels). In the wake of going through a convolutional layer, the picture becomes disconnected to an element map, likewise called an actuation map, with shape: (number of data sources) x (highlight map height)) x (include map width) x (feature map channels). Convolutional layers convolve the information and pass its outcome to the following layer. • Pooling Layer: Convolutional networks may incorporate local and/or global pooling layers alongside traditional convolutional layers. Pooling layers diminish the components of information by combining the outputs of neuron clusters at one layer into a single neuron in the following layer. Local pooling combines small clusters, tiling sizes such as 2 x 2 are commonly used. Global pooling acts on all the neurons of the feature map. There are two common types of pooling in popular use: max and average. Max pooling uses the maximum value of each local cluster of neurons in the feature map, while average pooling takes the average value. • Fully Connected Layers: Fully connected layers connect every neuron in one layer to each neuron in another layer. It is equivalent to a conventional multi-layer perceptron neural network (MLP). The smoothed network goes through a fully connected layer to characterize the pictures. • Receptive field: In neural networks, each neuron gets input from some number of locations in the past layer. In a convolutional layer, each neuron receives input from only a restricted area of the previous layer called the neuron's receptive field. Ordinarily the area is a square (e.g. 5 by 5 neurons). Whereas, in a fully connected layer, the receptive field is the entire past layer. Accordingly, in each convolutional layer, each neuron takes input from a bigger region in the input than previous layers. This is expected due to applying the convolution over and over, which takes into account the value of a pixel, as well as its encompassing pixels. When utilizing dilated layers, the number of pixels in the receptive field remains constant, but the field is more scantily populated as its measurements grow when combining the impact of several layers. • Weights: Every neuron in a neural network registers an output value by applying a particular function to the input values received from the receptive field in the past layer. The function that is applied to the input values is dictated by a vector of weights and a bias (ordinarily real numbers). Learning consists of iteratively adjusting these biases and weights. • The vector of weights and the bias are called filters and represent specific features of the input (e.g., a specific shape). A distinctive feature of CNNs is that many neurons can share the same filter. This decreases the memory footprint because a single bias and a single vector of weights are utilized across all receptive fields that share that filter, as opposed to each receptive field having its own bias and vector weighting. 3. LENET ARCHITECTURE Lenet is a pre-trained Convolutional Neural Network Model used for recognizing handwritten and machine-printed characters. The organization has 5 layers with learnable boundaries and thus named Lenet-5. It has three arrangements of convolution layers with a blend of average pooling. After the convolution and average pooling layers, we have two fully connected layers. Finally, a Softmax classifier is used to characterize the pictures into separate class. The lenet layers are depicted in the following fig 2.
  • 5. Bhargav Bhat and Abhay A Deshpande https://iaeme.com/Home/journal/IJECET 15 editor@iaeme.com Figure 2 Architecture of Lenet-5 Model The following table is used to understand the architecture in more detail. Table 1 Layer #filters/ Neuron Filter Size Stride Size of Feature map Activation Function Input - - - 32x32x1 Conv 1 6 5 x 5 1 28x28x6 tanh Avg Pooling 1 2 x 2 2 14x14x6 Conv 2 16 5 x 5 1 10x10x16 tanh Avg Pooling 2 2 x 2 2 5x5x16 Conv 3 120 5 x 5 1 120 tanh Fully Connected 1 - - - 84 tanh Fully Connected 2 - - - 10 Softmax The first layer is the input layer with highlight map size 32 x 32 x 1. Then, at that point, we have the first convolution layer with 6 channels of size 5 x 5 and step is 1. The initiation work utilized at his layer is hyperbolic tangent (tanh). The output feature map is 28 x 28 x 6. Then, we have an average pooling layer with channel size 2 x 2 and step 1. The subsequent component map is 14 x 14 x 6. Since the pooling layer doesn't influence the quantity of channels. After this comes the second convolution layer with 16 filters of 5 x 5 and step 1. Also, the activation function is tanh. Now the output size is 10 x 10 x 16. Again comes the other average pooling layer of 2 x 2 with step 2. As a result, the size of the feature map reduced to 5 x5 x16. The final pooling layer has 120 filters of 5 x5 with stride 1 and activation function tanh. Now the output size is 120. The next is a fully connected layer with 84 neurons that result in the output to 84 values and the activation function used here is again tanh. The last layer is the output layer with 10 neurons and Softmax function. The Softmax gives the likelihood that an information point has a place with a specific class. The most elevated worth is then anticipated.
  • 6. Accelerated Deep Learning Inference from Constrained Embedded Devices https://iaeme.com/Home/journal/IJECET 16 editor@iaeme.com This is the entire architecture of the Lenet-5 model. The number of trainable parameters of this architecture is around 60,000. 4. RESULTS The following table gives the hardware cost occupied by our design on the Zync-7000 board. It shows that the design occupies 47% of LUTs, 19% of LUTRAMs, 28% of FFs, 59% of BRAMs, 54% of DSP, and 3% of BUFG. Although, the implemented design can be displayed to gives us idea on how the design have been distributed, placed and routed on the selected Zync-7000 board. Table 2 Resource Utilization Available Utilization % LUT 25456 53200 47.85 LUTRAM 3478 17400 19.99 FF 30456 106400 28.62 BRAM 83.5 140 59.64 DSP 120 220 54.55 BUFG 1 32 3.13 4.1. Software Model (MATLAB) In the software model goes through the whole LeNet CNN and give us its prediction. Fig 3 depicts result of the software model on matlab. Figure 3 Matlab Result 4.2. Hardware Model For the Hardware model, it gives us 10 outputs for the score of all 10 digits which are digit 0, 1, 2, 3, 4, 5, 6, 7, 8, and 9. The highest score should be the correct answer. So, you can see from all images the the highest score is the correct value. Fig 4 shows the simulation result from Xilinx Vivado. In the below example where the number 0 has been taken, the highest score 5 shows that 0 is the predicted number.
  • 7. Bhargav Bhat and Abhay A Deshpande https://iaeme.com/Home/journal/IJECET 17 editor@iaeme.com Figure 4 Behavioral simulation on Xilinx Vivado 4.3. Implementation Once the implementation is reached the implementation summary that summarizes all implantation report will be provided. The Fig 5 depicts the implemented design. Figure 5 Implemented Design of Lenet-5 5. CONCLUSION In this paper, we proposed Lenet-5 Convolutional Neural Network with optimizations done using hardware looping and dot product units, which provided high accuracy when recognizing handwritten data using the MNSIT dataset. The hardware loops alone contribute to 24% cycle count decrease, while the dot products reduce the cycle count by 27%. As embedded systems are highly price-sensitive, this is an important consideration. Getting the sizes of neural networks down is an essential step in expanding the possibilities for neural networks in embedded systems.
  • 8. Accelerated Deep Learning Inference from Constrained Embedded Devices https://iaeme.com/Home/journal/IJECET 18 editor@iaeme.com An interesting topic for further research is Posit - an alternative floating-point number format that may offer additional advantages, as it has an increased dynamic range at the same word size. Because of the improved dynamic range, weights could be stored in lower precision, thus, again, decreasing the memory requirements. Combining the reduced size requirements with low-cost ISA improvements could make neural networks more ubiquitous in the price- sensitive embedded systems market. REFERENCES [1] J. Vreca et al., "Accelerating Deep Learning Inference in Constrained Embedded Devices Using Hardware Loops and a Dot Product Unit," in IEEE Access, vol. 8, pp. 165913-165926, 2020, doi: 10.1109/ACCESS.2020.3022824. [2] D. Valencia, S. F. Fard and A. Ali mohammad, "An Artificial Neural Network Processor with a Custom Instruction Set Architecture for Embedded Applications," in IEEE Transactions on Circuits and Systems I: Regular Papers, vol. 67, no. 12, pp. 5200-5210, Dec. 2020, doi: 10.1109/TCSI.2020.3003769. [3] X. Chen and Z. Yu, "A Flexible and Energy-Efficient Convolutional Neural Network Acceleration with Dedicated ISA and Accelerator," in IEEE Transactions on Very Large Scale Integration (VLSI) Systems, vol. 26, no. 7, pp. 1408-1412, July 2018, doi: 10.1109/TVLSI.2018.2810831. [4] M. Gautschi, Pasquale Davide Schiavone, Andreas Traber, Igor Loi, "Near-Threshold RISC-V Core with DSP Extensions for Scalable IoT Endpoint Devices," in IEEE Transactions on Very Large Scale Integration (VLSI) Systems, vol. 25, no. 10, pp. 2700-2713, Oct. 2017, doi: 10.1109/TVLSI.2017.2654506. [5] Y. Tian, "Artificial Intelligence Image Recognition Method Based on Convolutional Neural Network Algorithm," in IEEE Access, vol. 8, pp. 125731-125744, 2020, doi: 10.1109/ACCESS.2020.3006097. [6] Xin, J. Zhang and Y. Shao, "Complex network classification with convolutional neural network," in Tsinghua Science and Technology, vol. 25, no. 4, pp. 447-457, Aug. 2020, doi: 10.26599/TST.2019.9010055. [7] Shomron and U. Weiser, "Spatial Correlation and Value Prediction in Convolutional Neural Networks," in IEEE Computer Architecture Letters, vol. 18, no. 1, pp. 10-13, 1 Jan.-June 2019, doi: 10.1109/LCA.2018.2890236 [8] Y. Qie, P. Song and C. Hao, "Data Repair Without Prior Knowledge Using Deep Convolutional Neural Networks," in IEEE Access, vol. 8, pp. 105351-105361, 2020, doi: 10.1109/ACCESS.2020.2999960. [9] A. J. Salim, S. I. M. Salim, N. R. Samsudin and Y. Soo, "Customized instruction set simulation for soft-core RISC processor," 2012 IEEE Control and System Graduate Research Colloquium, Shah Alam, Selangor, 2012, pp. 38-42, doi: 10.1109/ICSGRC.2012.6287132. [10] A. Raveendran, V. B. Patil, D. Selvakumar and V. Desalphine, "A RISC-V instruction set processor-micro-architecture design and analysis," 2016 International Conference on VLSI Systems, Architectures, Technology and Applications (VLSI-SATA), Bangalore, 2016, pp. 1-7, doi: 10.1109/VLSI-SATA.2016.7593047.