My ML Intensive
Implementing ML algorithms from scratch, additional projects, links, and resources!
It’s been two weeks since I started my sprint on machine learning. This sprint is much more intense than my previous two sprints on design and existentialism. It is currently past midnight on a Friday night, and part of me still wants to work on my Jupyter notebook. The same thing happened last night and the night before. I am quite absorbed in my ML sprint, so it only feels right to extend it. Many amazing people have given me recommendations and advice over the past two weeks, so I want to pass them on and share them here :)
minSkLearn - ML Algo from Scratch
My friend Jeremy (@JvNixon) recommended spending some time on fundamental ML algorithms, so I set the goal of implementing a handful of major ML algorithms from scratch using Python and NumPy. In essence, I am implementing a mini ML library, which I call “minSkLearn.” Check out my notes and the resources that helped me!
Micrograd & neural net layer on top
Linear Regression (gradient descent, least squares, lasso/ridge regularization)
My notes
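The core of gradient-descent linear regression fits in a few lines of NumPy. Here is a minimal sketch of the idea (function name, toy data, and the `l2` ridge knob are my own illustration, not code from my repo):

```python
import numpy as np

def linear_regression_gd(X, y, lr=0.1, steps=2000, l2=0.0):
    """Fit y ~ Xw + b by gradient descent on MSE; l2 > 0 adds a ridge penalty."""
    n, d = X.shape
    w, b = np.zeros(d), 0.0
    for _ in range(steps):
        err = X @ w + b - y                       # residuals
        w -= lr * ((2 / n) * X.T @ err + 2 * l2 * w)
        b -= lr * (2 / n) * err.sum()
    return w, b

# sanity check: recover y = 2x + 1 from noiseless data
X = np.linspace(0, 1, 50).reshape(-1, 1)
y = 2 * X[:, 0] + 1
w, b = linear_regression_gd(X, y, lr=0.5)
```

Setting `l2=0` gives plain least squares by iteration; lasso needs a slightly different update (the L1 penalty isn’t differentiable at zero), which is part of what makes it interesting to implement.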
Logistic Regression
Helpful links: Intro to logistic regression, logistic gradient derivation
My notes
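The satisfying part of the logistic gradient derivation is how cleanly it comes out: the gradient of binary cross-entropy with a sigmoid is just the prediction error times the inputs. A rough sketch (toy data and names are mine):

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def logistic_regression_gd(X, y, lr=0.5, steps=2000):
    """Gradient descent on binary cross-entropy.
    The gradient simplifies to X^T (sigmoid(Xw + b) - y) / n."""
    n, d = X.shape
    w, b = np.zeros(d), 0.0
    for _ in range(steps):
        p = sigmoid(X @ w + b)        # predicted probabilities
        w -= lr * X.T @ (p - y) / n
        b -= lr * (p - y).mean()
    return w, b

# separable toy data: label is 1 exactly when x > 0
X = np.array([[-2.0], [-1.0], [-0.5], [0.5], [1.0], [2.0]])
y = np.array([0.0, 0.0, 0.0, 1.0, 1.0, 1.0])
w, b = logistic_regression_gd(X, y)
preds = (sigmoid(X @ w + b) > 0.5).astype(int)
```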
K-Means & K Nearest Neighbors
My notes
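Both of these are short enough to sketch together: K-means alternates between assigning points to their nearest centroid and moving centroids to the mean of their cluster, while KNN just votes among the nearest training points. A toy version (my own illustration, not my repo code):

```python
import numpy as np

def kmeans(X, k, iters=20, seed=0):
    """Lloyd's algorithm: assign each point to its nearest centroid,
    then move each centroid to the mean of its assigned points."""
    rng = np.random.default_rng(seed)
    centroids = X[rng.choice(len(X), k, replace=False)].astype(float)
    for _ in range(iters):
        dists = ((X[:, None, :] - centroids[None, :, :]) ** 2).sum(-1)
        labels = dists.argmin(1)
        for j in range(k):
            if (labels == j).any():   # guard against empty clusters
                centroids[j] = X[labels == j].mean(0)
    return centroids, labels

def knn_predict(X_train, y_train, x, k=3):
    """k-nearest neighbors: majority vote among the k closest points."""
    dists = ((X_train - x) ** 2).sum(1)
    vals, counts = np.unique(y_train[np.argsort(dists)[:k]], return_counts=True)
    return vals[counts.argmax()]

# two well-separated blobs
X = np.array([[0, 0], [0, 1], [1, 0], [10, 10], [10, 11], [11, 10]], dtype=float)
y = np.array([0, 0, 0, 1, 1, 1])
centroids, labels = kmeans(X, 2)
pred = knn_predict(X, y, np.array([9.0, 9.0]))
```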
Convolutional Neural Network (CNN)
ELI5 explanation with helpful graphics
Backpropagation math explanation
Simple CNN implementation explanation
I finished implementing algorithms 1-4 from scratch; see my GitHub repo here! I realized the tremendous difficulty of implementing a CNN after spending about two days on it. Even the famous Stanford computer vision grad course (CS231n) abstracts away the complexity and provides most of the code for its students in the CNN assignment. I came very close to giving up many times. However, I was somehow magically making progress with every additional hour I put in, so I decided to keep going. As frustrated as I was, I was also more intrigued than I have ever been since starting my meta-learning experiment. This feels like a real challenge: a difficult problem that seduces me step by step. I want to know if there’s a way to implement a CNN on top of Micrograd, as Andrej did for the MLP. I think I am simultaneously afraid of this challenge and unable to just give up.
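For all the difficulty of a full CNN, the forward pass of a single conv filter is conceptually just a sliding dot product. A naive single-channel sketch of that one piece (my own toy code; nothing like the vectorized versions CS231n works with, and the backward pass is where the real pain lives):

```python
import numpy as np

def conv2d(x, kernel):
    """Naive 'valid' cross-correlation, i.e. what a conv layer computes
    in the forward pass: slide the kernel over x and take dot products.
    Single channel, stride 1, no padding; clarity over speed."""
    H, W = x.shape
    kh, kw = kernel.shape
    out = np.zeros((H - kh + 1, W - kw + 1))
    for i in range(H - kh + 1):
        for j in range(W - kw + 1):
            out[i, j] = (x[i:i + kh, j:j + kw] * kernel).sum()
    return out

x = np.arange(16, dtype=float).reshape(4, 4)
out = conv2d(x, np.ones((2, 2)))   # 2x2 box filter over a 4x4 input
```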
ML Applications
Aside from spending more time on the CNN implementation in my extended ML sprint, I also want to dabble in applications of ML. Here’s a list of projects I’ve been working on or want to explore:
Sketchly: I am building a full-stack app that generates illustrations for blog posts
Tech stack: React + Flask
Models: GPT-3, Stable Diffusion
Current progress: With Patricia (@patriciamou_)’s blog post on the artistic furniture selection for the SF Commons as input, my code outputs this image :)
Dreambooth x Stable Diffusion: explore Stable Diffusion with Dreambooth, which fine-tunes text-to-image diffusion models and enables subject-specific image generation.
Notebook by @shivamshrirao that runs on Colab free tier
Current progress: After some debugging, I was able to train the model on pictures of myself and generate new photos of “<amy> in <new setting>”. However, the quality is not great, so I am planning to try 100-200 regularization images and 2k-5k steps.
Makemore (reimplementation): a character-level language model for making more things, such as new names!
Andrej’s tutorial
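The simplest starting point in Andrej’s tutorial is a bigram model: count which character follows which, normalize the counts into probabilities, and sample new names one character at a time. A rough sketch with a made-up four-name corpus (everything here is my own toy illustration):

```python
import numpy as np

names = ["emma", "olivia", "ava", "amy"]   # toy corpus (made up)
chars = sorted(set("".join(names)))
stoi = {c: i + 1 for i, c in enumerate(chars)}
stoi["."] = 0   # '.' marks both the start and the end of a name
itos = {i: c for c, i in stoi.items()}

# count bigram transitions, then row-normalize into probabilities
N = np.zeros((len(stoi), len(stoi)))
for name in names:
    seq = ["."] + list(name) + ["."]
    for a, b in zip(seq, seq[1:]):
        N[stoi[a], stoi[b]] += 1
P = N / N.sum(1, keepdims=True).clip(min=1)

def sample(rng):
    """Walk the bigram chain from the start token until it emits '.'."""
    out, ix = [], 0
    while True:
        ix = rng.choice(len(P), p=P[ix])
        if ix == 0:
            return "".join(out)
        out.append(itos[ix])

name = sample(np.random.default_rng(0))
```

Makemore then replaces this counting table with progressively fancier neural networks, but the sampling loop stays essentially the same.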
Check out Practical Deep Learning: for SWEs who want to explore ML applications
Recommended to me by Ben (@8enmann), one of the amazing researchers behind GPT-3, at an event!
It’s hard to scope exactly how long it will take me to build and explore everything on my list, but I am excited to keep going. I feel like I am maybe, finally, falling into a rabbit hole and happily digging deeper into it. In the end, that’s the feeling I’ve been searching for, so I will hang on to it :)
Finally, I am starting an autodidact accountability group with Benny (@BennyRubanov) and Amaan (@amaan_eth) - message me if you want to join :)