Master's Thesis: An Optimization Perspective: Understanding the Supervised Learning Landscape
During my master's studies at the University of Bergen (UiB), I wrote this thesis to explore the critical role of optimization in machine learning (ML), particularly in Supervised Learning (SL) methods such as linear regression and Support Vector Machines (SVM). The thesis demonstrates that ML relies heavily on optimization techniques, and its numerical experiments illustrate how different hyperparameter settings can lead to varying results, underscoring the importance of hyperparameter optimization in achieving the best possible performance.
Thesis Overview
This thesis provides a comprehensive and mathematically detailed examination of how optimization is integral to supervised learning. It combines theoretical concepts with practical applications, covering several key themes:
- Theoretical Foundations: The thesis delves into the mathematical principles underlying optimization and ML, laying a strong foundation for understanding the models discussed.
- Algorithmic Analysis: Each chapter methodically breaks down algorithms such as linear regression and SVM, explaining their optimization models in detail.
- Hyperparameter Tuning: A significant portion is dedicated to hyperparameter optimization, highlighting its crucial role in fine-tuning ML models for optimal performance.
- Empirical Validation: Through numerous numerical experiments, the thesis validates the theoretical models, showing how different hyperparameter settings lead to varied results and why hyperparameter optimization matters in practice.
The text is intended to be accessible to readers with a solid background in mathematics and optimization, with an emphasis on exploring ML from those perspectives. It aims to be a useful resource for understanding and applying optimization in ML.
Chapters
Chapter 1: Introduction
This chapter introduces the fundamental concepts of ML and SL, emphasizing the importance of optimization in these fields. It outlines the main goals and scope of the thesis, setting the stage for exploring where and how optimization is used in ML.
Chapter 2: Fundamental Optimization Principles
We review essential optimization concepts, including non-linear and convex optimization. This chapter covers basics like the Lagrangian function and the Karush-Kuhn-Tucker (KKT) conditions, which are crucial for understanding the models discussed later.
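For orientation, here is the standard form these concepts take for a generic constrained problem; the notation below is generic and may differ from the thesis's own:

```latex
% Constrained problem: minimize f(x) subject to g_i(x) <= 0 and h_j(x) = 0.
% Lagrangian and KKT conditions at a candidate optimum x^*:
\begin{align*}
  \mathcal{L}(x,\lambda,\mu) &= f(x) + \sum_i \lambda_i g_i(x) + \sum_j \mu_j h_j(x) \\
  \text{stationarity:} \quad & \nabla_x \mathcal{L}(x^*,\lambda^*,\mu^*) = 0 \\
  \text{primal feasibility:} \quad & g_i(x^*) \le 0, \qquad h_j(x^*) = 0 \\
  \text{dual feasibility:} \quad & \lambda_i^* \ge 0 \\
  \text{complementary slackness:} \quad & \lambda_i^*\, g_i(x^*) = 0
\end{align*}
```

For convex problems satisfying a constraint qualification, these conditions are both necessary and sufficient for optimality, which is why they recur in the SVM derivations later in the thesis.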
Chapter 3: Foundations of Machine Learning (ML)
Here, we delve into the core principles of ML, describing its main subfields: supervised learning, unsupervised learning, and reinforcement learning. We also discuss the ML pipeline and the No-Free-Lunch Theorems, which imply that no single algorithm performs best on all problems and thereby motivate the need for a wide range of ML algorithms.
Chapter 4: Optimization in Linear Regression
This chapter focuses on linear regression, presenting optimization models such as Maximum Likelihood Estimation (MLE) and Maximum a Posteriori (MAP). We discuss how these models are derived, their closed-form solutions, and their application in linear regression.
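Under the usual assumptions (i.i.d. Gaussian noise for MLE, plus a zero-mean Gaussian prior on the weights for MAP), these models reduce to the familiar closed forms below; the symbols are generic and may differ from those used in the thesis:

```latex
% Design matrix X, targets y, weights w, regularization strength \lambda > 0.
\begin{align*}
  \text{MLE (ordinary least squares):} \quad w_{\mathrm{MLE}} &= (X^\top X)^{-1} X^\top y \\
  \text{MAP (ridge regression):} \quad w_{\mathrm{MAP}} &= (X^\top X + \lambda I)^{-1} X^\top y
\end{align*}
```

The MAP solution shows how the prior acts as a regularizer: the λI term shrinks the weights and guarantees that the matrix being inverted is nonsingular.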
Chapter 5: Optimization in Support Vector Machines (SVM)
We explore the optimization techniques used in SVMs, including the primal and dual formulations of hard-margin and soft-margin SVMs. This chapter also covers feature mapping and kernel functions, explaining how kernel SVMs handle datasets that are not linearly separable.
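For reference, a standard statement of the soft-margin dual that kernelization builds on, again in generic notation (labels y_i in {-1, +1}, penalty parameter C, kernel k):

```latex
\begin{align*}
  \max_{\alpha} \quad & \sum_i \alpha_i
    - \frac{1}{2} \sum_i \sum_j \alpha_i \alpha_j \, y_i y_j \, k(x_i, x_j) \\
  \text{s.t.} \quad & 0 \le \alpha_i \le C, \qquad \sum_i \alpha_i y_i = 0
\end{align*}
```

Because the data enter only through k(x_i, x_j), replacing the plain inner product with a kernel yields a nonlinear decision boundary at no extra formal cost (the kernel trick).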
Chapter 6: Hyperparameter Optimization (HPO)
Hyperparameter optimization is crucial for fine-tuning ML algorithms. This chapter discusses HPO algorithms and presents optimization models for tuning hyperparameters. We emphasize the challenges involved, strategies for tuning effectively, and the importance of HPO in achieving optimal results.
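As a minimal illustration of the kind of tuning loop such a chapter discusses, here is a grid search over an SVM's C and gamma using scikit-learn; the dataset and grid are placeholders, not the thesis's actual experimental setup:

```python
from sklearn.datasets import make_classification
from sklearn.model_selection import GridSearchCV
from sklearn.svm import SVC

# Synthetic stand-in data; the thesis's experiments use their own datasets.
X, y = make_classification(n_samples=300, n_features=10, random_state=0)

# Hyperparameter grid: regularization strength C and RBF kernel width gamma.
param_grid = {"C": [0.1, 1, 10, 100], "gamma": [0.001, 0.01, 0.1, 1]}

# Exhaustive grid search with 5-fold cross-validation.
search = GridSearchCV(SVC(kernel="rbf"), param_grid, cv=5)
search.fit(X, y)

print("best hyperparameters:", search.best_params_)
print("best CV accuracy:", round(search.best_score_, 3))
```

Grid search is only the simplest HPO strategy; the same fit-and-score loop underlies random search and more sophisticated methods.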
Chapter 7: Numerical Experiments
This chapter presents practical evaluations of the discussed SL algorithms. Detailed numerical experiments show how linear regression and SVM perform under different hyperparameter settings, illustrating in practice the sensitivity that motivates the HPO methods of Chapter 6.
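To give a flavor of the effect such experiments measure (using toy data, not the thesis's setup), one can sweep a single hyperparameter and watch the held-out error change:

```python
import numpy as np
from sklearn.linear_model import Ridge
from sklearn.model_selection import train_test_split
from sklearn.metrics import mean_squared_error

# Toy regression data; purely illustrative.
rng = np.random.default_rng(0)
X = rng.normal(size=(200, 20))
y = X @ rng.normal(size=20) + 0.5 * rng.normal(size=200)

X_tr, X_te, y_tr, y_te = train_test_split(X, y, random_state=0)

# Sweep the regularization strength alpha and report the test error.
for alpha in [0.01, 0.1, 1.0, 10.0, 100.0]:
    model = Ridge(alpha=alpha).fit(X_tr, y_tr)
    mse = mean_squared_error(y_te, model.predict(X_te))
    print(f"alpha={alpha:>6}: test MSE = {mse:.4f}")
```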
Chapter 8: Conclusion
The conclusion summarizes the key findings, reiterating the importance of optimization in SL. It reflects on how the discussed optimization models and techniques contribute to improving ML algorithms. The study underscores that while the focus was on a subset of ML, the optimization principles are widely applicable across different ML models. It also highlights the importance of hyperparameter optimization and the need for ongoing exploration of advanced optimization methods to keep pace with the evolving field of machine learning.