# Minimizers
Comparison of gradient-based coordinate descent and gradient descent on continuous bivariate functions.
In machine learning, gradient descent is perhaps the best-known and most widely used iterative optimization method for finding the parameters that minimize an error function, such as the residual sum of squares for (L1-regularized) linear regression, or the negative log-likelihood for logistic regression and neural networks.
However, alternative methods can outperform gradient descent in certain settings. One such method is coordinate descent, which tends to perform better on L1-regularized regression problems and often converges faster in terms of runtime.
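To make the difference concrete, here is a minimal R sketch of both update rules on a simple convex bivariate function. This is illustrative only: the function, step size, and iteration count are assumptions, not the test functions or code used in the notebook. Gradient descent steps along the full negative gradient, while coordinate descent updates one coordinate at a time using its partial derivative.

```r
# Illustrative objective: f(x, y) = (x - 1)^2 + 2 * (y + 2)^2,
# a simple convex function with minimizer (1, -2) (not one of the
# notebook's test functions).
f      <- function(p) (p[1] - 1)^2 + 2 * (p[2] + 2)^2
grad_f <- function(p) c(2 * (p[1] - 1), 4 * (p[2] + 2))

# Gradient descent: step along the full negative gradient.
gradient_descent <- function(p, step = 0.1, iters = 200) {
  for (i in seq_len(iters)) {
    p <- p - step * grad_f(p)
  }
  p
}

# Coordinate descent: cycle through the coordinates, updating each
# one in turn using the current (partially updated) point.
coordinate_descent <- function(p, step = 0.1, iters = 200) {
  for (i in seq_len(iters)) {
    for (j in seq_along(p)) {
      p[j] <- p[j] - step * grad_f(p)[j]
    }
  }
  p
}

p0 <- c(5, 5)
print(gradient_descent(p0))    # approaches the minimizer (1, -2)
print(coordinate_descent(p0))  # approaches the same minimizer
```

On smooth convex functions like this one, both methods reach the same minimizer; the comparison in the notebook looks at how their paths and convergence behavior differ on harder, multi-modal surfaces.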
## Running instructions
You can view the published R notebook here.
If you would like to generate the notebook yourself from the R markdown file in this repository, change the `setwd` command in the first chunk to point to wherever the minimizers directory is stored on your device, then run the notebook and use RStudio's Preview option to generate a `.nb.html` file.
Note: you will also need to uncomment all of the `install.packages` directives in the first chunk. These were commented out in order to suppress the output of the first chunk.
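The first chunk of the R markdown file therefore looks roughly like the sketch below. The path and package names here are placeholders, not the repository's actual values:

```r
# Uncomment these on first run to install any missing dependencies
# (the package name below is a placeholder for whatever the chunk lists):
# install.packages("ggplot2")

# Point this at your local copy of the minimizers directory:
setwd("/path/to/minimizers")
```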
## Results
| | Six-hump camel function (F1) | Peaks function (F2) |
|---|---|---|
| Plot | ![]() | ![]() |
| Equation | ![]() | ![]() |
| Gradient descent optimization | ![]() | ![]() |
| Coordinate descent optimization | ![]() | ![]() |
© 2019 Edwin Onuonga - Released under the MIT License.
Authored and maintained by Edwin Onuonga.










