My Current Approach in Learning and Experimenting with LLMs and Deep Learning

March 18, 2026 · 319 words

Tags: LLMs, Learning

I do not know if it is right or wrong, but my approach over the past month has been to come across a topic and then go deep into it.

For example, I picked up the OLMo 3 blog post and went deep into it. That was my first exposure to the terms involved in the complete training lifecycle of an LLM: Base Model, Thinking SFT, DPO, RL, SFT, cold start, and so on.

Then I came across the famous Anthropic distillation post and learned the terms hard distillation and soft distillation. Distillation is the fastest way to get closer to frontier-level models when compute or capital is the bottleneck, and also when you want a lighter, smaller, and cheaper version of your big daddy model.
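To make the hard vs. soft distinction concrete, here is a minimal sketch of a soft-distillation loss in NumPy. This is my own illustrative version, not code from the Anthropic post: the function names, the temperature value, and the T² scaling convention are assumptions based on the standard knowledge-distillation recipe. Hard distillation would instead just train the student with ordinary cross-entropy on the teacher's generated (or argmax) outputs.

```python
import numpy as np

def softmax(logits, temperature=1.0):
    # Temperature-scaled softmax; higher T softens the distribution.
    z = logits / temperature
    z = z - z.max(axis=-1, keepdims=True)  # subtract max for numerical stability
    e = np.exp(z)
    return e / e.sum(axis=-1, keepdims=True)

def soft_distillation_loss(teacher_logits, student_logits, temperature=2.0):
    """KL(teacher || student) on temperature-softened distributions.

    The student is trained to match the teacher's full probability
    distribution, not just its top prediction.
    """
    p = softmax(teacher_logits, temperature)   # soft teacher targets
    q = softmax(student_logits, temperature)   # student predictions
    kl = np.sum(p * (np.log(p + 1e-12) - np.log(q + 1e-12)), axis=-1)
    # The T^2 factor keeps gradient magnitudes comparable across temperatures.
    return float(np.mean(kl) * temperature ** 2)
```

If the student's logits exactly match the teacher's, the loss is zero; the further its distribution drifts from the teacher's, the larger the KL term grows.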

I am actually liking this way much better than starting from scratch at the mathematics level, because the breadth I was able to cover with this strategy in just a single month was substantial.

Through the OLMo 3 blog by Allen AI, I was able to get a rough idea of how these AI labs go about things and of the major steps involved in training an AI model: the different phases of pre-training, mid-training, and post-training.

The most interesting thing to me was the Thinking SFT part, where you take a base pre-trained model and teach it to emit reasoning traces, those think tags you see the model produce. That led to my first experimental project, "Teach a 3B Base Model to Emit Reasoning Traces": basically a cold-start SFT run on a base model with a targeted dataset, using the LoRA technique.
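The data-preparation step for that cold-start SFT can be sketched as below. The template, the `<think>` tag names, and the field names are my assumptions for illustration, not the exact format any particular lab uses; the idea is just that each training example shows the base model the full pattern of opening a think block, reasoning, closing it, then answering.

```python
def format_reasoning_example(question, reasoning, answer):
    """Wrap one (question, reasoning, answer) triple into a single
    SFT training string that teaches the base model to emit a
    <think> block before its final answer.
    """
    return (
        f"Question: {question}\n"
        f"<think>\n{reasoning}\n</think>\n"
        f"Answer: {answer}"
    )

def build_sft_dataset(triples):
    """Map a list of (question, reasoning, answer) triples into the
    {"text": ...} records that SFT trainers typically consume.
    """
    return [
        {"text": format_reasoning_example(q, r, a)}
        for q, r, a in triples
    ]
```

During cold-start SFT, these formatted strings become the targets; with LoRA, only small adapter matrices are updated while the 3B base weights stay frozen.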

I would love thoughts from actual researchers in academia or industry on whether this approach of trying everything first, so that I can truly see what I like and then dive deep into it, is the right one or not.