Stochastic Gradient Descent

This post covers my work on the second lesson of the fast.ai course. The lesson started with data prep and cleaning, then moved on to stochastic gradient descent (SGD).

Data Prep and Cleaning

Drawing on my old days working with data quality software, I appreciated this lesson's emphasis on good data sets, in particular good training sets. A good scripting language, such as Python, is often ideal for the wrangling required to get data ready for your project.

In this example, the fast.ai library provides verification:

and, in this case, visual validation:

Next up, the instructor covered the basic data structure we’d be using: tensors, which are just multi-dimensional arrays. Using regression, we then tried to find a function that could predict a simple data set while minimizing the mean squared error.
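To make the mean squared error idea concrete, here is a minimal sketch in plain Python. It is not the lesson's notebook code: the data points, the `line` helper, and the two candidate parameter sets are made up for illustration. It only shows how the error rewards a function that matches the data and penalizes one that does not.

```python
def mse(predictions, targets):
    """Mean of the squared differences between predictions and targets."""
    return sum((p - t) ** 2 for p, t in zip(predictions, targets)) / len(targets)


def line(x, a, b):
    """Candidate function: a straight line with slope a and intercept b."""
    return a * x + b


# Made-up data generated by y = 2x + 1 (an assumption for this example).
xs = [0.0, 1.0, 2.0, 3.0]
ys = [1.0, 3.0, 5.0, 7.0]

# A line that matches the data exactly has zero error...
good_fit = mse([line(x, 2.0, 1.0) for x in xs], ys)

# ...while a line with the wrong slope and intercept has a larger error.
poor_fit = mse([line(x, 1.0, 0.0) for x in xs], ys)

print(good_fit, poor_fit)
```

Regression then comes down to searching for the slope and intercept that make this number as small as possible.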

The beauty of stochastic gradient descent is its iterative approach to fitting a function to the data. In particular, SGD estimates the gradient of the error from randomly selected (stochastic) samples of the data, then nudges the function's parameters a small step in the direction that decreases the error, repeating until the error stops improving.
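As a rough sketch of that loop, the following plain-Python example fits a straight line with SGD. Everything here is an assumption made for illustration rather than the course's code: the function name `sgd_fit_line`, the learning rate, the batch size, and the noiseless made-up data. The lesson itself used tensors and the fast.ai library.

```python
import random


def sgd_fit_line(xs, ys, lr=0.1, epochs=200, batch_size=4, seed=0):
    """Fit y = a*x + b by stochastic gradient descent on mean squared error."""
    rng = random.Random(seed)
    a, b = 0.0, 0.0  # start from an arbitrary guess
    indices = list(range(len(xs)))
    for _ in range(epochs):
        rng.shuffle(indices)  # the "stochastic" part: visit the data in random order
        for start in range(0, len(indices), batch_size):
            batch = indices[start:start + batch_size]
            grad_a = grad_b = 0.0
            for i in batch:
                error = (a * xs[i] + b) - ys[i]
                grad_a += 2 * error * xs[i]  # d(error^2)/da
                grad_b += 2 * error          # d(error^2)/db
            # Step against the gradient, averaged over the batch.
            a -= lr * grad_a / len(batch)
            b -= lr * grad_b / len(batch)
    return a, b


# Made-up, noiseless data from y = 3x + 2 (an assumption for this example).
xs = [i / 10 for i in range(20)]
ys = [3.0 * x + 2.0 for x in xs]

a, b = sgd_fit_line(xs, ys)
print(a, b)  # should land close to the true slope 3 and intercept 2
```

Each update only looks at a handful of randomly chosen points, which is what keeps the cost per step low on larger data sets.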

Takeaways

In this lesson, we then extended the simple function example to a classifier for teddy bears, using mini-batches to keep the search performant. At the end, we had a decent machine learning algorithm (aka a function) that could reasonably identify teddy bears in an image.

Next in this series…

Lesson 3
