Custom LLM

Running your own LLM

https://twitter.com/shrihacker/status/1704276963085701221

The cheapest solution right now is probably Llama 13B fine-tuned (distilled) on previously collected ChatGPT outputs. On a single A100 this comes out to roughly a tenth of the GPT-3.5 API cost, 10-20x the API's speed, and nearly comparable quality.
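
Below is a minimal sketch of what that distillation fine-tune could look like, using Hugging Face transformers with a LoRA adapter so the 13B model fits on a single A100. The dataset file name (chatgpt_outputs.jsonl with prompt/response records), the checkpoint id, the prompt template, and all hyperparameters are assumptions for illustration, not something the tweet specifies.

```python
# Sketch: distill saved ChatGPT outputs into Llama 13B with a LoRA fine-tune.
# Assumptions: chatgpt_outputs.jsonl holds {"prompt": ..., "response": ...} records,
# and meta-llama/Llama-2-13b-hf is the base checkpoint (hypothetical choices).
from datasets import load_dataset
from peft import LoraConfig, get_peft_model
from transformers import (
    AutoModelForCausalLM,
    AutoTokenizer,
    DataCollatorForLanguageModeling,
    Trainer,
    TrainingArguments,
)

model_id = "meta-llama/Llama-2-13b-hf"

tokenizer = AutoTokenizer.from_pretrained(model_id)
tokenizer.pad_token = tokenizer.eos_token  # Llama ships without a pad token

model = AutoModelForCausalLM.from_pretrained(model_id, device_map="auto")

# LoRA keeps trainable parameters small enough for a single-A100 budget.
lora = LoraConfig(r=16, lora_alpha=32, lora_dropout=0.05,
                  target_modules=["q_proj", "v_proj"], task_type="CAUSAL_LM")
model = get_peft_model(model, lora)

# Each record pairs a prompt with the ChatGPT answer we want the model to imitate.
data = load_dataset("json", data_files="chatgpt_outputs.jsonl", split="train")

def to_features(example):
    text = (f"### Instruction:\n{example['prompt']}\n"
            f"### Response:\n{example['response']}")
    return tokenizer(text, truncation=True, max_length=2048)

tokenized = data.map(to_features, remove_columns=data.column_names)

trainer = Trainer(
    model=model,
    args=TrainingArguments(
        output_dir="llama13b-distilled",
        per_device_train_batch_size=1,
        gradient_accumulation_steps=16,
        num_train_epochs=3,
        learning_rate=2e-4,
        bf16=True,
        logging_steps=10,
    ),
    train_dataset=tokenized,
    data_collator=DataCollatorForLanguageModeling(tokenizer, mlm=False),
)
trainer.train()
model.save_pretrained("llama13b-distilled")
```

After training, the saved LoRA adapter can be loaded on top of the base 13B checkpoint for serving, which is where the cost and speed advantage over calling the GPT-3.5 API would come from.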