ISSN: 2582-9793

General Cyclical Training of Neural Networks

Perspective (Published On: 30-Mar-2023)


Adv. Artif. Intell. Mach. Learn., 3(1):958-976

Leslie N. Smith: U.S. Naval Research Laboratory


DOI: 10.54364/AAIML.2023.1157

Article History: Received on: 05-Mar-23, Accepted on: 23-Mar-23, Published on: 30-Mar-23

Corresponding Author: Leslie N. Smith


Citation: Leslie N. Smith (2023). General Cyclical Training of Neural Networks. Adv. Artif. Intell. Mach. Learn., 3(1):958-976.



This position paper describes the principle of "General Cyclical Training" in machine learning, in which training starts and ends with "easy training" while the "hard training" happens during the middle epochs. We propose several manifestations of this principle for training neural networks, including algorithmic examples (via hyper-parameters and loss functions), data-based examples, and model-based examples. Specifically, we introduce several novel techniques: cyclical weight decay, cyclical batch size, cyclical focal loss, cyclical softmax temperature, cyclical data augmentation, cyclical gradient clipping, and cyclical semi-supervised learning. In addition, we demonstrate that cyclical weight decay, cyclical softmax temperature, and cyclical gradient clipping (as three examples of this principle) improve the test accuracy of a trained model. Furthermore, we discuss model-based examples (such as pretraining and knowledge distillation) from the perspective of general cyclical training and recommend some changes to the typical training methodology. In summary, this paper defines the general cyclical training concept and discusses several specific ways in which it can be applied to training neural networks. In the spirit of reproducibility, the code used in our experiments is available at \url{}.
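To make the easy-hard-easy idea concrete, the following is a minimal Python sketch of a triangular cyclical schedule. The function name, the linear ramp shape, and the example weight-decay values are illustrative assumptions, not the paper's exact formulation.

    # Minimal sketch of an easy-hard-easy (cyclical) hyper-parameter schedule.
    # The triangular shape and the weight-decay values are assumptions made
    # for illustration, not the paper's exact formulation.
    def cyclical_value(epoch, total_epochs, easy_value, hard_value):
        """Ramp linearly from easy_value to hard_value at mid-training,
        then back down to easy_value by the final epoch."""
        midpoint = total_epochs / 2.0
        # progress is 0.0 at the first and last epoch, 1.0 at mid-training.
        progress = 1.0 - abs(epoch - midpoint) / midpoint
        return easy_value + (hard_value - easy_value) * progress

    # Example: cyclical weight decay that is small (easy) at the start and
    # end of training and largest (hard) during the middle epochs.
    total_epochs = 100
    for epoch in (0, 25, 50, 75, 99):
        wd = cyclical_value(epoch, total_epochs, easy_value=1e-5, hard_value=1e-3)
        print(f"epoch {epoch:3d}: weight_decay = {wd:.2e}")

The same schedule shape could drive any of the hyper-parameters listed above (batch size, softmax temperature, gradient-clipping threshold) by swapping in the appropriate easy and hard values.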

