ISSN: 2582-9793

General Cyclical Training of Neural Networks

Perspective (Published On: 30-Mar-2023)


Adv. Artif. Intell. Mach. Learn., 3(1):958-976

Leslie N. Smith: U.S. Naval Research Laboratory


DOI: 10.54364/AAIML.2023.1157

Article History: Received on: 05-Mar-23, Accepted on: 23-Mar-23, Published on: 30-Mar-23

Corresponding Author: Leslie N. Smith


Citation: Leslie N. Smith (2023). General Cyclical Training of Neural Networks. Adv. Artif. Intell. Mach. Learn., 3(1):958-976.



This position paper describes the principle of "General Cyclical Training" in machine learning, in which training starts and ends with "easy training" while the "hard training" happens during the middle epochs. We propose several manifestations of this principle for training neural networks, including algorithmic examples (via hyper-parameters and loss functions), data-based examples, and model-based examples. Specifically, we introduce several novel techniques: cyclical weight decay, cyclical batch size, cyclical focal loss, cyclical softmax temperature, cyclical data augmentation, cyclical gradient clipping, and cyclical semi-supervised learning. In addition, we demonstrate that cyclical weight decay, cyclical softmax temperature, and cyclical gradient clipping (as three examples of this principle) improve the test accuracy of a trained model. Furthermore, we discuss model-based examples (such as pretraining and knowledge distillation) from the perspective of general cyclical training and recommend some changes to the typical training methodology. In summary, this paper defines the general cyclical training concept and discusses several specific ways in which it can be applied to training neural networks. In the spirit of reproducibility, the code used in our experiments is available at \url{}.
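To make the easy-hard-easy idea concrete, the following is a minimal Python sketch of a triangular cyclical schedule. The function name, the linear ramp shape, and the example weight-decay values are illustrative assumptions, not the paper's exact formulation.

    # Minimal sketch of an easy-hard-easy (cyclical) hyper-parameter schedule.
    # The triangular shape and the weight-decay values are assumptions made
    # for illustration, not the paper's exact formulation.
    def cyclical_value(epoch, total_epochs, easy_value, hard_value):
        """Ramp linearly from easy_value to hard_value at mid-training,
        then back down to easy_value by the final epoch."""
        midpoint = total_epochs / 2.0
        # progress is 0.0 at the first and last epoch, 1.0 at mid-training.
        progress = 1.0 - abs(epoch - midpoint) / midpoint
        return easy_value + (hard_value - easy_value) * progress

    # Example: cyclical weight decay that is small (easy) at the start and
    # end of training and largest (hard) during the middle epochs.
    total_epochs = 100
    for epoch in (0, 25, 50, 75, 99):
        wd = cyclical_value(epoch, total_epochs, easy_value=1e-5, hard_value=1e-3)
        print(f"epoch {epoch:3d}: weight_decay = {wd:.2e}")

The same schedule shape could drive any of the hyper-parameters listed above (batch size, softmax temperature, gradient-clipping threshold) by swapping in the appropriate easy and hard values.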

