Balancing Neural Network Training Convergence with Efficient Computational Boundaries
Neural network training produces black-box models with low explainability. The process itself is numerical, with hyperparameters (such as the learning rate, momentum, and early stopping trigger) chosen ad hoc. During training with the chosen hyperparameters, the total change of weights observed after each update indicates which stage of training the network is currently in. At the same time, neural networks are limited in the data they can model, for reasons including the architecture, the activation functions, the data itself, and the training approach. This limitation manifests as the efficient computational frontier, which apparently cannot be crossed regardless of the network's hyperparameters. This paper addresses the efficient use of information about the total change of weights and the efficient computational frontier to determine when training should be stopped. The results demonstrate that simpler models train more efficiently than more complex ones and show that a model's general weight structure forms very early in training, while the finer details take considerably longer to emerge.
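To illustrate the underlying idea, the following is a minimal sketch (assuming a PyTorch-style training loop; the model, data loader, loss, per-epoch aggregation, and the `delta_threshold` value are illustrative assumptions, not the paper's exact procedure) that monitors the total change of weights and stops training once it falls below a threshold:

```python
# Minimal sketch: stop training once the total change of weights per epoch
# drops below a chosen threshold. Assumes a PyTorch-style model and loader;
# the threshold and optimizer settings are illustrative, not from the paper.
import torch
import torch.nn as nn

def total_weight_change(prev_params, model):
    """Sum of L1 distances between previous and current parameter tensors."""
    return sum((p - q).abs().sum().item()
               for p, q in zip(model.parameters(), prev_params))

def train_with_weight_change_stop(model, loader, epochs=100, lr=0.01,
                                  delta_threshold=1e-3):
    optimizer = torch.optim.SGD(model.parameters(), lr=lr, momentum=0.9)
    loss_fn = nn.MSELoss()
    for epoch in range(epochs):
        # Snapshot weights before this epoch's updates.
        prev_params = [p.detach().clone() for p in model.parameters()]
        for x, y in loader:
            optimizer.zero_grad()
            loss = loss_fn(model(x), y)
            loss.backward()
            optimizer.step()
        # Total change of weights over the epoch.
        delta = total_weight_change(prev_params, model)
        print(f"epoch {epoch}: total weight change = {delta:.6f}")
        if delta < delta_threshold:  # coarse weight structure settled; stop early
            break
    return model
```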