LLMs and agents/&lt;somethingsomething&gt;-assistants have reached the point where commodity hardware can easily run one or even several models locally. I did some experimentation, and this is the setup I ended up with. This is on my personal computer, not for work, so your mileage may vary …