Neural Network Architecture

34 papers, 3 followers

About this topic

Neural Network Architecture refers to the structured design of artificial neural networks, encompassing the arrangement of layers, types of neurons, and connections between them. It determines how data is processed and learned, influencing the network's performance in tasks such as classification, regression, and pattern recognition.

Key research themes

1. How can neural network architecture be optimized for computational efficiency without sacrificing accuracy?

This research area focuses on designing and scaling neural network architectures to achieve high accuracy on specified tasks while minimizing computational complexity and hardware resource usage. It is critical for deploying neural networks on resource-limited devices and speeding up inference by reducing operations and hardware area.

Finding Storage- and Compute-Efficient Convolutional Neural Networks

by Daniel Becking

2021, Master's Thesis, Technische Universität Berlin

Key finding: Proposed a two-step paradigm integrating compound model scaling (a lightweight NAS approach) and Entropy-Constrained Trained Ternarization (EC2T), a simultaneous pruning and ternary quantization algorithm, which compresses...
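
To make the idea of simultaneous pruning and ternary quantization concrete, here is a minimal sketch. It is not EC2T: it swaps in a much simpler magnitude-threshold ternarization, and the function name `ternarize` and the threshold factor of 0.7 times the mean absolute weight are assumptions chosen for illustration. Weights below the threshold are pruned to zero, and the rest share a single scale `alpha`.

```python
def ternarize(weights, delta_factor=0.7):
    """Map each weight to a code in {-1, 0, +1} plus one shared scale alpha.

    Simple threshold ternarization (an illustrative stand-in, not EC2T).
    """
    mean_abs = sum(abs(w) for w in weights) / len(weights)
    delta = delta_factor * mean_abs  # weights with magnitude <= delta are pruned
    kept = [abs(w) for w in weights if abs(w) > delta]
    alpha = sum(kept) / len(kept) if kept else 0.0  # scale of the surviving weights
    codes = [0 if abs(w) <= delta else (1 if w > 0 else -1) for w in weights]
    return codes, alpha


weights = [0.9, -0.05, 0.4, -0.8, 0.02, -0.3]  # hypothetical layer weights
codes, alpha = ternarize(weights)
sparsity = codes.count(0) / len(codes)  # fraction of pruned weights
reconstructed = [alpha * c for c in codes]  # dequantized approximation
```

Each weight then needs only two bits (or fewer after entropy coding), and the zero codes can be skipped at inference time.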

An Efficient Approach for Neural Network Architecture.pdf

by Kasem Khalil

2019

Key finding: Introduced a neural network hardware design that reduces the number of physical hidden layers by half (from N to N/2) through multiplexing input and output layers while maintaining the same accuracy as traditional N-layer...

Architecture of A Novel Low-Cost Hardware Neural Network

by Kasem Khalil

2020, 2020 IEEE 63rd International Midwest Symposium on Circuits and Systems (MWSCAS)

Key finding: Designed a neural network architecture sharing multipliers and adders between two hidden layers, cutting the number of these critical hardware components by half and reducing hardware cost by 63%. The method maintained...

Finding the Optimal Topology of an Approximating Neural Network

by Stoyan Cheresharov

2023, Mathematics

Key finding: Derived analytical formulas to estimate upper bounds on the number of hidden layers and neurons in networks trained via algorithms using the Jacobi matrix (e.g., Levenberg-Marquardt). These bounds aid in selecting compact yet...

Heuristic Architecture Search Using Network Morphism for Chest X-Ray Classification

by Pavlo Radiuk

2021, Heuristic Architecture Search Using Network Morphism for Chest X-Ray Classification

Key finding: Developed a heuristic architecture search method leveraging network morphism combined with hill-climbing and functional saving, achieving competitive chest X-ray classification accuracy (73.2% validation accuracy, 84.5% AUC)...

2. What methodologies and algorithms enable automated search and optimization of neural network architectures to improve performance and reduce manual design efforts?

This research theme investigates algorithmic frameworks and search strategies such as genetic algorithms, evolutionary methods, modular search spaces, and heuristics to automate the process of neural network architecture design. Automating architecture search accelerates model development, improves generalization, and allows discovering architectures difficult to design manually, helping in diverse tasks from image classification to medical imaging.

A Framework for Exploring and Modelling Neural Architecture Search Methods

by Nadiia Hrypynska

2022

Key finding: Proposed a systematic framework that categorizes and benchmarks NAS methods by summarizing architecture search decisions and strategies, applying quantitative and qualitative metrics for prototyping and comparison. This...

Heuristic Architecture Search Using Network Morphism for Chest X-Ray Classification

by Pavlo Radiuk

2021, Heuristic Architecture Search Using Network Morphism for Chest X-Ray Classification

Key finding: Presented a novel heuristic architecture search using enforced hill-climbing and network morphism to efficiently explore architectures. The method found high-performing architectures within 28 GPU hours on medical image...

Use of genetic algorithms with backpropagation in training of feedforward neural networks

by Michael McInerney

2024, IEEE International Conference on Neural Networks

Key finding: Developed hybrid training algorithms combining genetic algorithms (GA) and backpropagation (BP) that leverage GA’s global search to escape local minima and BP’s efficiency in fine-tuning. The GA-BP hybrids achieved faster...

Modular search space for automated design of neural architecture

by Pavlo Radiuk

2021, Proceedings of the O.S. Popov ONAT

Key finding: Proposed a modularized neural architecture search space composed of parameterized building blocks derived from the NAS-Bench-201 benchmark, represented as multisectoral networks described unambiguously by vectors. Applied to a...

Design of ANN Based Non-Linear Network Using Interconnection of Parallel Processor

by Nitish Pathak

2023, Computer Systems Science and Engineering

Key finding: Explored an ANN design leveraging massive parallelism with many interconnected processing elements distributed over parallel processors, achieving effective optimization for nonlinear resource allocation problems....

3. How do architectural elements and training hyperparameters influence neural network learning dynamics and generalization?

This theme examines the role of architectural design choices, such as the number of layers, neurons, and activation functions, as well as learning hyperparameters like learning rate and regularization, on convergence, error minimization, and avoidance of local minima. Understanding these influences is vital to achieve stable and efficient learning with good generalization while preventing issues like overfitting or chaotic training behavior.

Design and regularization of neural networks: the optimal use of a validation set

by C. Svarer

2024, Neural Networks for Signal Processing VI. Proceedings of the 1996 IEEE Signal Processing Society Workshop

Key finding: Derived novel gradient-based algorithms for estimating regularization parameters and optimizing neural net architectures using a validation set. Proposed iterative schemes jointly optimizing weights and hyperparameters that...

Multilayer neural networks – as determined systems

by Ivan Kuno

2023, Computational Problems of Electrical Engineering

Key finding: Analyzed the effect of learning rate (η) on multilayer neural network training, observing bifurcation and chaotic behavior when η exceeds a critical threshold (~0.62 for a 3-layer network with 4 neurons per layer). Found that...
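
The notion of a critical learning rate can be illustrated on a far simpler system than the network studied in the paper. The sketch below is an assumption-laden toy: it runs gradient descent on the one-dimensional quadratic f(w) = 0.5 * curvature * w^2, for which the update is stable only when η < 2 / curvature. It shows convergence, oscillation and divergence, not the bifurcations or chaos reported for the multilayer network, and the ~0.62 threshold above is specific to that network. The function name `gradient_descent` and all parameter values are hypothetical.

```python
def gradient_descent(eta, curvature=1.0, w0=1.0, steps=50):
    """Run gradient descent on f(w) = 0.5 * curvature * w**2 and return the path."""
    w = w0
    path = [w]
    for _ in range(steps):
        w = w - eta * curvature * w  # gradient of f at w is curvature * w
        path.append(w)
    return path


critical_eta = 2.0 / 1.0  # stability limit for curvature = 1.0

stable = gradient_descent(eta=0.5)       # shrinks monotonically towards 0
oscillating = gradient_descent(eta=1.9)  # sign flips each step, still shrinks
diverging = gradient_descent(eta=2.1)    # above the limit: magnitude grows
```

In a nonlinear network the curvature changes along the trajectory, which is one reason the behaviour past the threshold can be richer than simple divergence.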

Use of genetic algorithms with backpropagation in training of feedforward neural networks

by Michael McInerney

2024, IEEE International Conference on Neural Networks

Key finding: Identified limitations of backpropagation training related to sensitivity to learning rate and momentum and susceptibility to local minima. Showed that integrating GA with BP alleviates these issues by global exploration with...

All papers in Neural Network Architecture

A Framework for Exploring and Modelling Neural Architecture Search Methods

by Nadiia Hrypynska

2020

In recent years, many researchers and engineers have been developing and optimising deep neural networks (DNN). The process of neural architecture design and tuning its hyperparameters remains monotonous, time-consuming, and does not...

Impact of Training Set Batch Size on the Performance of Convolutional Neural Networks for Diverse Datasets

by Pavlo Radiuk

2017, Information Technology and Management Science

A problem of improving the performance of convolutional neural networks is considered. A parameter of the training set is investigated. The parameter is the batch size. The goal is to find an impact of training set batch size on the...

Fig. 1. Several samples of handwritten digit images and their labels from the MNIST dataset. The MNIST dataset has a training set of 60,000 examples and a test set of 10,000 examples.

The MNIST (Modified National Institute of Standards and Technology) database is used extensively for training and testing machine learning models [16]. The database consists of pairs of a handwritten digit image and its label. Digits range from 0 to 9, giving 10 classes in total. The images are greyscale, 28 x 28 pixels in size, and each label is the digit, from 0 to 9, that the image represents (Fig. 1).

Fig. 2. A variety of colour images from the CIFAR-10 dataset, which contains 10 image categories (labelled “airplane”, “automobile”, “bird”, “cat”, “deer”, “dog”, “frog”, “horse”, “ship”, “truck”).

The dataset is divided into five training batches and one test batch, each with 10,000 images. The test batch contains exactly 1,000 randomly selected images from each class. The training batches contain the remaining images in random order, but some training batches may contain more images from one class than another. Between them, the training batches contain exactly 5,000 images from each class (Fig. 2).

Fig. 3. The testing accuracy of the CNN trained with the two batch-size sequences on the MNIST dataset. The larger the batch size, the smoother the curve. The lowest and noisiest curve corresponds to a batch size of 16 examples, the highest and smoothest one to a batch size of 1024 examples.

The selected models were implemented using the machine learning framework TensorFlow v. 1.3.0 [19]. The training results are displayed with its visualisation toolbox TensorBoard. Figures 3–4 visualise the testing accuracy results. The curves that describe the testing accuracy are noisy on the MNIST dataset and smooth on the CIFAR-10 dataset. The curves range from the smallest batch size, 16, to the largest, 1024.

Fig. 4. The testing accuracy of the CNN trained with the two batch-size sequences on the CIFAR-10 dataset. The smoothness of the curves is approximately the same for all batch sizes. The lowest curve corresponds to a batch size of 16 examples, the highest one to 1024 examples.

Fig. 5. The testing accuracy for batch sizes of 32, 50 and 64 on the MNIST (left figure) and CIFAR-10 (right figure) datasets. The best test accuracy is demonstrated by the batch size of 64 examples on both datasets. The lowest testing accuracy corresponds to the batch size of 32 examples. The batch size of 50 gives a result close to that of 64.

First of all, the trend with changing batch size is similar for both datasets. The worst test accuracy is demonstrated by batch sizes of 16, 32, 50 and 64 examples. The best recognition accuracy is obtained with batch sizes of 512 and 1024 examples. Batch sizes of 100, 128, 150, 200, 250 and 256 examples give intermediate testing accuracy. Hence, the larger the batch size, the higher the image recognition accuracy. Batch sizes of similar magnitude from the two sequences are compared in Figs. 5–7.

Fig. 6. The testing accuracy for batch sizes of 100, 128 and 150 on the MNIST (left figure) and CIFAR-10 (right figure) datasets. The trend of the curve growth is similar; however, the batch size of 150 examples predominates.

Fig. 7. The testing accuracy for batch sizes of 200, 250 and 256 on the MNIST (left figure) and CIFAR-10 (right figure) datasets. All three batch sizes show almost identical testing accuracy on both datasets. Nonetheless, at higher iteration counts, the batch size of 256 examples performs slightly better. Hence, at greater values of the batch size, the test accuracy increases.

THE TRAINING TIME EFFICIENCY

As a result of the comparative analysis, the supposition that recognition accuracy depends on the batch size was confirmed: the larger the batch size, the higher the testing accuracy. Another supposition, that the type of the batch size value affects CNN performance, was not confirmed. The training time follows the same trend as the testing accuracy for the MNIST and CIFAR-10 datasets: the larger the batch size, the more time is required to train the network. The final training times are shown in Table II.
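
As a small worked example of what the batch size changes mechanically, the sketch below counts the weight updates needed for one pass over each training set at the batch sizes listed above. The training-set sizes come from the dataset descriptions in this entry (60,000 for MNIST; 10 classes of 5,000 images for CIFAR-10). The helper name `updates_per_epoch` is hypothetical, and the excerpt does not state the training schedule the authors used, so this is only an illustration of the arithmetic.

```python
import math

TRAIN_SIZES = {"MNIST": 60_000, "CIFAR-10": 10 * 5_000}
BATCH_SIZES = [16, 32, 50, 64, 100, 128, 150, 200, 250, 256, 512, 1024]


def updates_per_epoch(num_examples, batch_size):
    """Number of mini-batches (weight updates) in one pass over the data."""
    return math.ceil(num_examples / batch_size)  # last batch may be partial


table = {
    name: {b: updates_per_epoch(n, b) for b in BATCH_SIZES}
    for name, n in TRAIN_SIZES.items()
}

for name, row in table.items():
    print(name, row)
```

With a batch size of 16, one pass over MNIST takes 3750 updates; with 1024 it takes 59, each computed from many more examples.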

Modular search space for automated design of neural architecture

by Pavlo Radiuk

2020, Proceedings of the O.S. Popov ONAT

Past years of research have shown that automated machine learning and neural architecture search are an inevitable future for image recognition tasks. In addition, a crucial aspect of any automated search is the predefined search space. As many studies have demonstrated, the modularization technique may simplify the underlying search space by fostering the reuse of successful blocks. In this regard, the presented research aims to investigate the use of modularization in automated machine learning. In this paper, we propose and examine a modularized space based on a substantial restriction to seeded building blocks for neural architecture search. To make the search space viable, we represented all modules of the space as multisectoral networks. Therefore, each architecture within the search space can be unequivocally described by a vector. In our case, a module is a predetermined number of parameterized layers with information about their relationships. We applied the proposed modular search space to a genetic algorithm and evaluated it on the CIFAR-10 and CIFAR-100 datasets based on modules from the NAS-Bench-201 benchmark. To address the complexity of the search space, we randomly sampled twenty-five modules and included them in the database. Overall, our approach retrieved competitive architectures in 8 GPU hours on average. The final model achieved validation accuracies of 89.1% and 73.2% on the CIFAR-10 and CIFAR-100 datasets, respectively. The learning process required slightly fewer GPU hours than other approaches, and the resulting network contained fewer parameters, indicating a lightweight model. Such an outcome may indicate the considerable potential of sophisticated ranking approaches. The conducted experiments also revealed that a straightforward and transparent search space could address the challenging task of neural architecture search. Further research should be undertaken to explore how a predefined knowledge base of modules could benefit the modular search space.

Abstract (translated from Ukrainian). Research in recent years has confirmed that automated machine learning and neural architecture search are the inevitable future for image recognition tasks. In addition, the predefined search space has proved to be one of the decisive aspects of any automated search. As many computational studies have shown, the modularization technique can simplify the underlying search space by promoting the reuse of successful blocks. In this regard, this paper aims to investigate the use of modularization in automated machine learning. In this paper, we propose and evaluate a modularized space, given the substantial restriction to predefined blocks for architecture search. To make the search space viable, we represented all modules of the space as multisectoral networks. Therefore, each architecture in the search space is unambiguously described by a vector. In our case, a module is...

Results and discussion. The results of the examination based on the random initialization of 5 modules are presented herein. In addition to examining the proposed modular search space, we compared it to the two most recognized approaches to search space design. Table 1 shows the results for CIFAR-10.

As seen from Table 1, our approach’s validation accuracy, which is 89.1%, is below the competition, yet the approach remains efficient considering the search time of almost 8 GPU hours and the restricted number of modules. It is also worth noting the potential impact of small search spaces with learned ratings on performance. Using a few ranked modules can result in increased productivity. Table 2 presents the computational results for CIFAR-100.
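
The abstract above states that each architecture in the modular search space can be described unambiguously by a vector. A minimal sketch of one way such an encoding could work is given below, assuming an architecture is simply an ordered list of modules drawn from a fixed database. The module names, `MODULE_DATABASE`, `encode` and `decode` are all hypothetical and are not taken from the paper or from NAS-Bench-201.

```python
# Hypothetical module database; the paper samples its modules from NAS-Bench-201.
MODULE_DATABASE = ["conv3x3_block", "conv1x1_block", "avg_pool_block", "skip_block"]


def encode(architecture):
    """Turn an ordered list of module names into a vector of database indices."""
    return [MODULE_DATABASE.index(name) for name in architecture]


def decode(vector):
    """Recover the ordered list of module names from an index vector."""
    return [MODULE_DATABASE[i] for i in vector]


architecture = ["conv3x3_block", "skip_block", "conv1x1_block"]
vector = encode(architecture)
restored = decode(vector)
```

Because the mapping is invertible, a genetic algorithm can mutate and recombine the integer vectors directly and decode only the candidates it needs to train.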

Textile Processing

by UMASHANKAR MAURYA

International Journal

for generic signal processing applications. In the proposed paper, analog components such as the Gilbert Cell Multiplier (GCM) and the Neuron Activation Function (NAF) are used to implement an artificial NNA. The analog components used comprise...

Finding Storage- and Compute-Efficient Convolutional Neural Networks

by Daniel Becking

2020, Master's Thesis, Technische Universität Berlin

Convolutional neural networks (CNNs) have taken the spotlight in a variety of machine learning applications. To reach the desired performance, CNNs have become increasingly deeper and larger, which goes along with a tremendous amount of...

Application of a Genetic Algorithm to Search for the Optimal Convolutional Neural Network Architecture with Weight Distribution

by Pavlo Radiuk

2020, Herald of Khmelnytskyi national university

In the past decade, a new direction in neural network research called Network Architectures Search has demonstrated noticeable results in the design of architectures for image segmentation and classification. Despite this considerable success, architecture search remains an unresolved and urgent problem. Moreover, neural architecture search is a highly computationally expensive task. This work proposes a new approach based on a genetic algorithm to search for the optimal convolutional neural network architecture. We integrated a genetic algorithm with standard stochastic gradient descent that implements weight distribution across all architecture solutions. This approach utilises a genetic algorithm to design a sub-graph of a convolution cell, which maximises the accuracy on the validation set. We show the performance of our approach on the CIFAR-10 and CIFAR-100 datasets with a final accuracy of 93.21% and 78.89%, respectively. The main scientific contribution of our work is the combination of a genetic algorithm with weight distribution in architecture search tasks, which achieves results similar to the state of the art on a single GPU.

Keywords: convolutional neural networks, genetic algorithms, weight distribution, ablation study.

Abstract (translated from Ukrainian). P. M. Radiuk, Khmelnytskyi National University. Application of a Genetic Algorithm to Search for the Optimal Convolutional Neural Network Architecture with Weight Distribution. Over the last decade, a new direction in neural network research called "Network Architectures Search" has demonstrated positive results in the design of architectures for image segmentation and classification. Despite the considerable success of architecture search in image segmentation and classification tasks, it is still an unresolved and pressing problem. Moreover, neural architecture search is also very expensive in terms of computational resources. This work proposes a new approach based on a genetic algorithm to search for the optimal convolutional neural network architecture. We integrated a genetic algorithm with standard stochastic gradient descent, which implements weight distribution across all architecture solutions. This approach uses a genetic algorithm to design a part of the graph as a convolutional layer that provides maximum accuracy on the validation set. In this work, we demonstrate the effectiveness of our approach on the CIFAR-10 and CIFAR-100 datasets with final accuracies of 93.21% and 78.89%, respectively. The main scientific contribution of our work is the combination of a genetic algorithm with weight distribution in architecture search tasks, which achieves image classification accuracy close to state-of-the-art results using a single GPU. Keywords: convolutional neural networks, genetic algorithms, weight distribution, ablation study.

Introduction

In recent decades, artificial neural networks have produced outstanding results in computer vision across a wide variety of applied tasks such as object detection, image segmentation and classification. The design of neural network architectures for a specific task or dataset usually requires specific approaches and a large amount of computational resources [1, 2]. Recently, a new direction in neural network research called Network Architectures Search (NAS) has demonstrated noticeable results in the design of architectures for image segmentation and classification. NAS approaches use a recurrent neural network (RNN) controller to generate a candidate network architecture, called a child model, which is then trained to convergence. After the training, researchers measure the performance of the trained network architecture on the desired task or dataset. The RNN controller receives the performance measurement as a signal to explore for a better architecture. This process then repeats over many computationally expensive iterations.

Analysis of recent research

Despite the considerable success of NAS in classification tasks, it is still highly computationally expensive. Recently, many studies have used the idea of parameter sharing [3] across all child models to reduce the need for training each child model from scratch, thereby eliminating most computational costs. Weight distribution, the analogue of parameter sharing for convolutional cells, has shown prominent results with reinforcement-based and gradient-based methods. The first method is Efficient Neural Architecture Search via parameter sharing (ENAS) [4], based on reinforcement learning with an RNN controller that creates the candidate architecture. The second method is Differentiable Architecture Search (DARTS) [5], with its various modifications, where each connection has a probability function updated by gradient descent. However, none of the publications has yet combined GAs with parameter sharing or its analogues in NAS.

A genetic algorithm (GA) is a search method based on natural selection and genetics [6]. GAs consist of four fundamental concepts: selection, cross-over, mutation, and replacement. The population, another essential component of a GA, is utilised to generate new candidate solutions. A new generation is created in each iteration, using a three-step process of selection, cross-over and mutation. The next generation is then inserted into the population through the replacement phase. The algorithm starts with a random set that is evaluated at the beginning of training.
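
The four GA steps described above (selection, cross-over, mutation, replacement) can be sketched on a toy problem. This is not the authors' method: there is no weight distribution or SGD training here, the genome is a plain bit string rather than a convolution-cell sub-graph, and `fitness` is a hypothetical stand-in for validation accuracy. Tournament selection, one-point cross-over and elitist replacement are assumptions chosen for brevity.

```python
import random


def fitness(genome):
    # Hypothetical stand-in for validation accuracy: the number of 1-bits.
    return sum(genome)


def select(population, k=3):
    # Tournament selection: the fittest of k randomly drawn individuals.
    return max(random.sample(population, k), key=fitness)


def crossover(a, b):
    # One-point cross-over: head of parent a joined to tail of parent b.
    point = random.randint(1, len(a) - 1)
    return a[:point] + b[point:]


def mutate(genome, rate=0.05):
    # Flip each bit independently with probability `rate`.
    return [1 - g if random.random() < rate else g for g in genome]


def evolve(pop_size=20, genome_len=16, generations=30, seed=0):
    random.seed(seed)
    # The algorithm starts from a random population.
    population = [[random.randint(0, 1) for _ in range(genome_len)]
                  for _ in range(pop_size)]
    history = []
    for _ in range(generations):
        children = [mutate(crossover(select(population), select(population)))
                    for _ in range(pop_size)]
        # Replacement: keep the fittest pop_size of parents and children.
        population = sorted(population + children, key=fitness, reverse=True)[:pop_size]
        history.append(fitness(population[0]))
    return population[0], history


best, history = evolve()
print(fitness(best), history[:5])
```

In an architecture search setting, evaluating `fitness` means training or partially training a network, which is where most of the cost lies and what weight sharing is meant to reduce.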

Analog VLSI Implementation of Neural Network Architecture for Signal Processing

by International journal of VLSI design & Communication Systems (VLSICS)

With the advent of new technologies and advancement in medical science, we are trying to process the information artificially as our biological system performs inside our body. Artificial intelligence through a biological word is realized...

Single circuit in V1 capable of switching contexts during movement using VIP population as a switch

by Doris Voina and

2020

As animals adapt to their environments, their brains are tasked with processing stimuli in different sensory contexts. Whether these computations are context dependent or independent, they are all implemented in the same neural tissue. A crucial question is what neural architectures can respond flexibly to a range of stimulus conditions and switch between them. This is a particular case of flexible architecture that permits multiple related computations within a single circuit. Here, we address this question in the specific case of the visual system circuitry, focusing on context integration, defined as the integration of feedforward and surround information across visual space. We show that a biologically inspired microcircuit with multiple inhibitory cell types can switch between visual processing of the static context and the moving context. In our model, the VIP population acts as the switch and modulates the visual circuit through a disinhibitory motif. Moreover, the VIP population is efficient, requiring only a relatively small number of neurons to switch contexts. This circuit eliminates noise in videos by using appropriate lateral connections for contextual spatio-temporal surround modulation, having superior denoising performance compared to circuits where only one context is learned. Our findings shed light on a minimally complex architecture that is capable of switching between two naturalistic contexts using few switching units.

Author Summary

The brain processes information at all times and much of that information is context-dependent. The visual system presents an important example: processing is ongoing, but the context changes dramatically when an animal is still vs. running. How is context-dependent information processing achieved? We take inspiration from recent neurophysiology studies on the role of distinct cell types in primary visual cortex (V1). We find that relatively few "switching units" (akin to the VIP neuron type in V1 in that they turn on and off in the running vs. still context and have connections to and from the main population) are sufficient to drive context-dependent image processing. We demonstrate this in a model of feature integration, and in a test of image denoising. The underlying circuit architecture illustrates a concrete computational role for the multiple cell types under increasing study across the brain, and may inspire more flexible neurally inspired computing architectures.

Problem Decomposition and Information Minimization for the Global, Concurrent, On-line Validation of Neutron Noise Signals and Neutron Detector Operation

by International Journal of Artificial Intelligence (IJAIA) and

2020, International Journal of Artificial Intelligence & Applications (IJAIA)

This piece of research introduces a purely data-driven, directly reconfigurable, divide-and-conquer on-line monitoring (OLM) methodology for automatically selecting the minimum number of neutron detectors (NDs) and corresponding neutron...
