Similar to yesterday’s post on running Mistral’s 8x7B Mixture of Experts (MoE) model, I wanted to document the steps I took to run Mistral’s 7B-Instruct-v0.2 model on a Mac, for anyone else interested in playing around with it. Unlike yesterday’s post, though, this 7B Instruct model’s inference speed is about 20 tokens/second on my M2 …
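The post’s actual steps are cut off in this excerpt, but as a rough sketch of one common way to run this model locally on Apple Silicon — assuming llama.cpp and a community GGUF quantization of Mistral-7B-Instruct-v0.2, neither of which is confirmed as the post’s method — it might look like:

```shell
# Hypothetical sketch, not necessarily the post's approach.
# Build llama.cpp (uses Metal acceleration on Apple Silicon by default).
git clone https://github.com/ggerganov/llama.cpp
cd llama.cpp
make

# Download a quantized GGUF of Mistral-7B-Instruct-v0.2 from Hugging Face
# (the exact file name below is an assumption).
# e.g. mistral-7b-instruct-v0.2.Q4_K_M.gguf

# Run inference with Mistral's [INST] ... [/INST] instruction format.
./main -m mistral-7b-instruct-v0.2.Q4_K_M.gguf \
       -p "[INST] Write a haiku about the ocean. [/INST]" \
       -n 256
```

A 4-bit quantization like Q4_K_M keeps the model small enough to fit comfortably in the unified memory of most M-series Macs.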