Playing with local LLMs

I have been somewhat interested in LLM performance for years, but it used to be that playing with them was quite painful (e.g. the conda ecosystem in general sucks, and a GPU used to be mandatory). Now, with ollama ( https://ollama.com/ ), they're quite trivial to benchmark across different devices without setting up a complex stack. So this morning I indulged. I have not yet gotten around to checking the numbers on a real GPU card, but here's what I found out at home (without firing up the gaming PC)...
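
For reference, here is a minimal sketch of what such a benchmark can look like against ollama's local HTTP API; the model name and prompt are just placeholders, and `ollama run <model> --verbose` prints similar timing stats directly:

```python
# Minimal sketch: measure generation speed via ollama's local HTTP API.
# Assumes ollama is running on the default port and the model is already pulled.
import json
import urllib.request

OLLAMA_URL = "http://localhost:11434/api/generate"
MODEL = "llama3"  # placeholder; substitute whatever model you have pulled

payload = json.dumps({
    "model": MODEL,
    "prompt": "Why is the sky blue?",
    "stream": False,
}).encode()

req = urllib.request.Request(OLLAMA_URL, data=payload,
                             headers={"Content-Type": "application/json"})
with urllib.request.urlopen(req) as resp:
    result = json.load(resp)

# The response includes eval_count (generated tokens) and eval_duration
# (nanoseconds), which gives a simple tokens-per-second figure.
tokens_per_s = result["eval_count"] / (result["eval_duration"] / 1e9)
print(f"{MODEL}: {tokens_per_s:.1f} tokens/s")
```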

25.4.2024 · 3 min · 634 words · Markus Stenberg