End of W6

  • Paired with @Victoria Ritvo (she) (W2'26) on her Survivor ML elimination model where we tried swapping out her logistic regression model with a random forest classifier. With some tuning it had decent results. I had lots of fun coding without LLMs with @Victoria Ritvo (she) (W2'26) and revisiting some classical ML concepts plus I think her overall project idea is a fun one.

  • Paired with @Shalini Pyapali (she) (W2'26) on her Nora app which is now live! We spent the time going over the flow of her web and phone app to figure out how we could make the UX more intuitive. And I personally enjoyed seeing the progress of the app over time and being part of the journey.

  • Paired with @Tawfiq Hamid (he) (SP1'26) on fine-tuning an image generator in the style of XKCD comics. Before training anything, we tested the idea on nanobanana by prompting it to generate a comic in that style and it certainly did a decent job with some inconsistency in the background of the comic panels. So we are hoping fine-tuning an image generator would give more consistent results. We spent the rest of the time fetching the comics from XKCD's JSON API and putting together our dataset. And also setting up the environment to use this trainer from Huggingface.

  • Had an amazing chat with @Jasmine Kim (she) (SP1'26) over end of batch reflections and atproto. I think I get the atproto hype now. Because it is an open-source and decentralized network, users have control over their data, their app view, and can easily switch to a different app of a similar functionality whenever.

  • Also really enjoyed chatting with @Evan Gedrich Pintado (he) (W2'26) over neuroscience and his tvOS project.

  • During demo day, I loved playing Othello on a sphere with @cyrene zhang (any) (SP2'25) that @Rafal Dittwald (he) (SP1'26) made and seeing @Miguel Conner (he) (SP1'26) and @Joshua Cohen (he) (SP2'26) 's progress on training a mini-LLM. I wish I had also checked out the remote demos.

  • Clojure workshop with @Rafal Dittwald (he) (SP1'26) was a blast. I love that I got to learn more about functional languages and then immediately apply it to an advent of code problem by taking turns to drive via mob programming which was mildly stressful but mostly just fun because it was so collaborative and @Rafal Dittwald (he) (SP1'26) is truly a Clojure expert who helped us along.

  • On the model parallelism front, I spent the past two weeks fixing some of my original RL fine-tuning model logic and parameters and will continue working on getting the model distributed. These are some of the changes I made so far.

    • Improved context management by switching from just resetting the llm responses to a sliding window approach.

    • While I was unable to get the RL agent to learn against a greedy agent. It shows signs of learning against a much easier opponent like a random agent which is a start.

    • Played around with batch sizes. Smaller batch sizes for GRPO is better for memory footprint but less ideal for getting training signal per batch. And larger batch sizes increase GPU memory usage and make it intractable especially on 1 GPU.

    • Lowered learning rate as I noticed that later in the training episodes, I was getting bad responses from the LLM that did not follow my requested format in the system prompt which was a result of my policy changing too quickly.

  • 3D print with @c stavridis (they) (W2'26), @Evan Gedrich Pintado (he) (W2'26), and @Mike Cugini (they/he) (SP1'26). I just casually mentioned wanting to do this before the end of batch and @c stavridis (they) (W2'26) immediately showed me how to do it in Womp. And when I had issues getting the 3D printer to print properly, both @Mike Cugini (they/he) (SP1'26) and @c stavridis (they) (W2'26) helped me troubleshoot. Thank you both for being so generous with your time!

  • End of batch celebration with @Will Wang (he) (W2'26) @Shalini Pyapali (she) (W2'26) @Hugh Tipping (he) (SP1'26) @Isha Bhand (she/they) (W2'26) 

It may be my last official day in batch, but I will continue to be around and I am so glad to have spent the last six weeks at RC building and learning alongside everyone!

Next
Next

End of W5