With all the hype surrounding ChatGPT (and now GPT-4), it really bothered me that I didn't have the faintest idea of how language models or transformers work. Fortunately, the Neural Networks: Zero to Hero lecture series, which helped me understand backpropagation in my previous post, also covers multiple language modeling techniques. I found that I spent too much time in my last post explaining things that were already covered much better in the lecture, so I'll try to keep this one shorter...