Convolutional Neural Networks Implementation on a Network-on-Chip Platform

Authors

  • Alexandre N. Cardoso
  • Luiza de Macedo Mourelle
  • Nadia Nedjah

Keywords:

Network-on-Chip, Convolutional Neural Networks, Parallel Processing

Abstract

Delivering more throughput to a given computational system or device can be crucial when computational intelligence-based approaches are adopted as a design choice for a growing set of applications. At the same time, these approaches frequently operate under strict constraints on processing time, power consumption, and memory. One of the main topics of interest in computational intelligence is machine learning, which deals with computational methods and models built from observational data. In machine learning, the machine develops the ability to continually learn from data in order to predict and recognize patterns as humans do. Deep neural networks use several hidden layers to achieve pattern recognition; the main difference between traditional and deep neural networks is the number of layers. A convolutional neural network is a deep learning model, usually employed to classify and recognize patterns in image- and video-based applications. One of the best-known convolutional neural network designs is LeNet-5, which performs handwritten character recognition. This kind of network consists of an input layer, which receives the image; a sequence of layers that apply image operations to build feature maps; and a final layer, a classification neural network that takes the feature maps and provides the classification result as output. The structure consists of a series of convolutional layers, each paired with a pooling layer, and the output is classified by a fully connected layer. A convolutional layer maps image features, while a pooling layer reduces matrix dimensionality and data complexity. Our work investigates the use of parallel processing to implement a convolutional neural network on a multiprocessor system-on-chip, exploiting a network-on-chip platform for communication between the processing elements. The core of our approach is to group the network operations into conceptual units called tasks; these tasks form the workload to be distributed among the processing units, which operate in parallel. As a case study, we implement LeNet-5 on the MEMPHIS multiprocessor system-on-chip platform. We demonstrate that distributing the convolutional neural network workload over a set of processing elements yields a significant performance gain over the serial implementation.
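To make the layer operations concrete, the C sketch below shows one convolution-plus-pooling task of the kind that could be mapped onto a single processing element. It is a minimal illustration, not the paper's implementation: the sizes follow the first LeNet-5 stage (32x32 input, 5x5 kernel, 2x2 average pooling), and the function names and tanh activation are assumptions made for the example.

/* Minimal sketch (illustrative, not the authors' code): one
 * convolution-plus-pooling "task" sized for the first LeNet-5 stage. */
#include <math.h>

#define IN    32                /* input image width/height             */
#define K      5                /* convolution kernel size              */
#define CONV  (IN - K + 1)      /* 28x28 feature map after convolution  */
#define POOL  (CONV / 2)        /* 14x14 map after 2x2 pooling          */

/* Convolve one input channel with one kernel and apply the activation. */
static void conv_layer(const float in[IN][IN], const float kernel[K][K],
                       float bias, float out[CONV][CONV])
{
    for (int y = 0; y < CONV; y++)
        for (int x = 0; x < CONV; x++) {
            float acc = bias;
            for (int ky = 0; ky < K; ky++)
                for (int kx = 0; kx < K; kx++)
                    acc += in[y + ky][x + kx] * kernel[ky][kx];
            out[y][x] = tanhf(acc);      /* assumed activation function */
        }
}

/* 2x2 average pooling: reduces the feature-map dimensionality by four. */
static void pool_layer(const float in[CONV][CONV], float out[POOL][POOL])
{
    for (int y = 0; y < POOL; y++)
        for (int x = 0; x < POOL; x++)
            out[y][x] = (in[2*y][2*x]     + in[2*y][2*x + 1] +
                         in[2*y + 1][2*x] + in[2*y + 1][2*x + 1]) / 4.0f;
}

/* One task: take an image, produce one pooled feature map. On the MPSoC,
 * several such tasks would run on different processing elements and
 * exchange their inputs and outputs through the network-on-chip. */
void feature_task(const float img[IN][IN], const float kernel[K][K],
                  float bias, float features[POOL][POOL])
{
    float conv[CONV][CONV];
    conv_layer(img, kernel, bias, conv);
    pool_layer(conv, features);
}

In this view, each such task is one of the conceptual workload units mentioned above, and the parallel gain comes from placing independent tasks on separate processing elements connected by the network-on-chip.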

Published

2024-05-30

Issue

Section

Articles