TY - GEN
T1 - Training Artificial Neural Networks by Coordinate Search Algorithm
AU - Rokhsatyazdi, Ehsan
AU - Rahnamayan, Shahryar
AU - Miyandoab, Sevil Zanjani
AU - Bidgoli, Azam Asilian
AU - Tizhoosh, H. R.
N1 - Publisher Copyright:
© 2023 IEEE.
PY - 2023
Y1 - 2023
N2 - Training Artificial Neural Networks (ANNs) poses a challenging and critical problem in machine learning. Despite the effectiveness of gradient-based learning methods, such as Stochastic Gradient Descent (SGD), in training neural networks, they have several limitations. For instance, they require differentiable activation functions and cannot simultaneously optimize a model with respect to several independent non-differentiable loss functions; for example, the F1-score, which is normally computed only during testing, could be used directly as a training objective if a gradient-free optimization algorithm were employed. Furthermore, training a DNN (i.e., optimizing its weights) should remain possible even with a small training dataset. To address these concerns, we propose an efficient version of the gradient-free Coordinate Search (CS) algorithm, an instance of General Pattern Search (GPS) methods, for training (i.e., optimizing) neural networks. The proposed algorithm can be used with non-differentiable activation functions and can be tailored to multi-objective/multi-loss problems. Finding the optimal values for the weights of an ANN is a large-scale optimization problem. Therefore, instead of finding the optimal value for each variable, which is the common technique in classical CS, we accelerate optimization and convergence by bundling the variables (i.e., weights). In effect, this strategy is a form of dimension reduction for the optimization problem. Based on the experimental results, the proposed method is comparable to the SGD algorithm and, in some cases, outperforms the gradient-based approach, particularly in situations with insufficient labeled training data. The performance plots demonstrate a high convergence rate, highlighting the capability of the proposed method to find a reasonable solution with fewer function calls. At present, gradient-based algorithms such as SGD and Adam are the only practical and efficient way of training ANNs with hundreds of thousands of weights; in this paper, we introduce an alternative method for training ANNs.
AB - Training Artificial Neural Networks (ANNs) poses a challenging and critical problem in machine learning. Despite the effectiveness of gradient-based learning methods, such as Stochastic Gradient Descent (SGD), in training neural networks, they have several limitations. For instance, they require differentiable activation functions and cannot simultaneously optimize a model with respect to several independent non-differentiable loss functions; for example, the F1-score, which is normally computed only during testing, could be used directly as a training objective if a gradient-free optimization algorithm were employed. Furthermore, training a DNN (i.e., optimizing its weights) should remain possible even with a small training dataset. To address these concerns, we propose an efficient version of the gradient-free Coordinate Search (CS) algorithm, an instance of General Pattern Search (GPS) methods, for training (i.e., optimizing) neural networks. The proposed algorithm can be used with non-differentiable activation functions and can be tailored to multi-objective/multi-loss problems. Finding the optimal values for the weights of an ANN is a large-scale optimization problem. Therefore, instead of finding the optimal value for each variable, which is the common technique in classical CS, we accelerate optimization and convergence by bundling the variables (i.e., weights). In effect, this strategy is a form of dimension reduction for the optimization problem. Based on the experimental results, the proposed method is comparable to the SGD algorithm and, in some cases, outperforms the gradient-based approach, particularly in situations with insufficient labeled training data. The performance plots demonstrate a high convergence rate, highlighting the capability of the proposed method to find a reasonable solution with fewer function calls. At present, gradient-based algorithms such as SGD and Adam are the only practical and efficient way of training ANNs with hundreds of thousands of weights; in this paper, we introduce an alternative method for training ANNs.
KW - Artificial Neural Network (ANN)
KW - Coordinate Search
KW - Expensive Optimization
KW - Gradient-free
KW - Large-Scale Optimization
KW - Stochastic Gradient Descent (SGD)
UR - http://www.scopus.com/inward/record.url?scp=85182928622&partnerID=8YFLogxK
UR - http://www.scopus.com/inward/citedby.url?scp=85182928622&partnerID=8YFLogxK
U2 - 10.1109/SSCI52147.2023.10371958
DO - 10.1109/SSCI52147.2023.10371958
M3 - Conference contribution
AN - SCOPUS:85182928622
T3 - 2023 IEEE Symposium Series on Computational Intelligence, SSCI 2023
SP - 1540
EP - 1546
BT - 2023 IEEE Symposium Series on Computational Intelligence, SSCI 2023
PB - Institute of Electrical and Electronics Engineers Inc.
T2 - 2023 IEEE Symposium Series on Computational Intelligence, SSCI 2023
Y2 - 5 December 2023 through 8 December 2023
ER -