Qwen3, the latest LLM in the Qwen family, uses a unified architecture for thinking and non-thinking modes, so the same model handles both reasoning and direct responses. | DebuggerCafe
Phi-4 Mini and Phi-4 Multimodal are Microsoft's latest small language models for chat and multimodal instruction following. | DebuggerCafe
In this article, we summarize the Phi-3 technical report, including the architecture, the dataset curation strategy, benchmarks, and Phi-3 vision capabilities. | DebuggerCafe
Instruction tuning the OPT-125M model by training it on the Open Assistant Guanaco dataset using Hugging Face Transformers. | DebuggerCafe
Instruction tuning the GPT2 model on the Alpaca dataset using the Hugging Face Transformers library and the SFT Trainer pipeline. | DebuggerCafe