Found another AI "toy" to play with today. It is called Tabby and you'll find it on GitHub.
Tabby is indeed an assistant, and one that you can self-host. It works no matter where your Linux, macOS or Windows computer runs (on-prem, hybrid or cloud). The instructions to download and run the software are very easy to follow. It will download two (smaller) LLMs, from Qwen and StarCoder, if it doesn't find these on your system.
Currently I'm testing it on a computer with a pre-Ryzen AMD APU, one that AMD adapted to fit motherboards supporting 1st- through 3rd-gen Ryzen. That computer also has an old NVIDIA GeForce GTX 1650 card with (only) 4 GB of VRAM. And yet, both LLMs fit in there. The website lists which LLMs are supported and their requirements, including the required NVIDIA development software (CUDA). It might all sound complicated, but it really isn't.
Once you have it running, Tabby becomes a server on your computer. Access it by entering http://localhost:8080 in a browser on the computer that hosts Tabby, or use any other computer with a browser in your network and visit: http://<ip address of tabby>:8080
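The server also speaks plain HTTP, so you can script against it instead of (or next to) using the browser. Below is a minimal Python sketch of that idea; the port, the /v1/health and /v1/completions endpoints, the payload fields and the optional token header are my reading of the Tabby docs, so treat the details as assumptions that may differ in your version:

```python
# Minimal sketch: talk to a local Tabby server over HTTP.
# Assumes Tabby listens on port 8080 and exposes /v1/health (GET)
# and /v1/completions (POST); newer releases may also want an auth
# token created in the web UI.
import requests

TABBY = "http://localhost:8080"  # or http://<ip address of tabby>:8080
HEADERS = {
    # Uncomment if your install requires a token (hypothetical value):
    # "Authorization": "Bearer auth_xxxxxxxx",
}

# Quick check that the server is up and healthy.
health = requests.get(f"{TABBY}/v1/health", headers=HEADERS, timeout=10)
print("health:", health.status_code)

# Ask Tabby to complete a snippet of code: it continues the given prefix.
payload = {
    "language": "python",
    "segments": {"prefix": "def fib(n):\n    "},
}
resp = requests.post(f"{TABBY}/v1/completions",
                     json=payload, headers=HEADERS, timeout=30)
resp.raise_for_status()
for choice in resp.json().get("choices", []):
    print(choice.get("text", ""))
```

Nothing fancy, but handy if you ever want completions from your own scripts or tooling rather than the web UI.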
You will be asked to create an admin account the first time you access it. Tabby comes with a free community license (for up to 5 users). They also have subscription plans if that is more your thing.
My test machine could be considered crippled, yet Tabby performs really well. Even on my testing clunker it is almost as fast as ChatGPT, which amazed me to no end. Sure, the models are small, and I have had hardly any time to check how useful its answers truly are.
But the responsiveness and ease of working with the community version were a (very) pleasant surprise, so I thought I'd mention it here at DC.
Oh, and while it comes with its own web interface, that interface also contains links that explain how to incorporate Tabby into editors like VSCode, if that is more to your liking.