Enhancing the convergence speed of line search methods: Applications in Neural Network training

Jun 30, 2024

José Ángel Martín-Baos

Ricardo García-Ródenas

Luis Rodriguez-Benitez

María Luz López-García

2 min read


Abstract

Training machine learning models such as neural networks relies on optimisation techniques that require large volumes of data. The algorithms that perform well on this task frequently use line searches on subsets of the data. Although the resulting search direction may be of low quality, the computational cost per iteration remains low, which makes the strategy efficient overall. Among these methods, strategies that use a constant learning rate have proven particularly effective. This paper introduces a novel scheme designed to significantly speed up the convergence of line search-based methods. Our approach incorporates additional high-quality line searches derived from the convergence process of the methods and, using an Armijo rule, dynamically adjusts the step size through successive reductions or expansions based on the evaluated quality of the descent direction. This adjustment enables more substantial progress along the search direction, potentially reducing the number of iterations needed to reach an optimal solution. We apply the proposed scheme to accelerate widely used algorithms such as Gradient Descent (GD), Momentum GD, and Adaptive Moment Estimation (Adam). To illustrate its practical implications and effectiveness, we present a comprehensive case study on the training of Deep Neural Networks and Kernel Logistic Regression.
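To make the step-size adjustment concrete, the following is a minimal sketch of a generic Armijo line search that either shrinks or expands a trial step along a descent direction. It is not the authors' scheme: the function name `armijo_step` and the parameters `c`, `shrink`, `grow`, and `max_iter` are assumptions chosen for illustration, and the abstract does not specify how the paper selects them.

```python
import numpy as np


def armijo_step(f, grad, x, direction, step=1.0, c=1e-4,
                shrink=0.5, grow=2.0, max_iter=50):
    """Pick a step size along `direction` using the Armijo rule.

    Hypothetical helper for illustration only. The trial step is
    expanded while larger steps still give sufficient decrease, or
    reduced until sufficient decrease is first achieved.
    """
    fx = f(x)
    # Directional derivative; it must be negative for a descent direction.
    slope = float(np.dot(grad(x), direction))
    if slope >= 0:
        raise ValueError("direction is not a descent direction")

    def sufficient_decrease(t):
        # Armijo condition: f(x + t d) <= f(x) + c * t * grad(x)^T d
        return f(x + t * direction) <= fx + c * t * slope

    if sufficient_decrease(step):
        # Expansion: accept larger steps while the condition still holds.
        for _ in range(max_iter):
            if not sufficient_decrease(grow * step):
                break
            step *= grow
    else:
        # Reduction: shrink until the condition holds. If it never does
        # within max_iter reductions, the last trial step is returned.
        for _ in range(max_iter):
            step *= shrink
            if sufficient_decrease(step):
                break
    return step
```

In a gradient descent loop, one would call this with `direction = -grad(x)` and update `x = x + step * direction`; for Momentum GD or Adam the direction would instead be the one produced by the respective update rule, provided it is still a descent direction.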

Last updated on Jun 30, 2024