Running your own LLM
https://twitter.com/shrihacker/status/1704276963085701221
The cheapest option right now is probably LLaMA 13B fine-tuned on (i.e. distilled from) previously collected ChatGPT outputs. On an A100 this runs at easily a tenth of the GPT-3.5 cost, 10-20 times the speed of the API, and nearly comparable quality.
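
As a rough illustration, such a distillation run could look something like the LoRA sketch below. The tweet doesn't specify a recipe, so the model id, data file, and hyperparameters here are assumptions, not a definitive setup:

```python
# Minimal sketch: LoRA fine-tuning of a LLaMA-13B checkpoint on saved ChatGPT
# outputs. Assumes a JSONL file "chatgpt_outputs.jsonl" with {"prompt", "response"}
# pairs; model id and hyperparameters are illustrative only.
import torch
from datasets import load_dataset
from peft import LoraConfig, get_peft_model
from transformers import (
    AutoModelForCausalLM,
    AutoTokenizer,
    DataCollatorForLanguageModeling,
    Trainer,
    TrainingArguments,
)

MODEL_ID = "meta-llama/Llama-2-13b-hf"  # assumption: any 13B LLaMA checkpoint

tokenizer = AutoTokenizer.from_pretrained(MODEL_ID)
tokenizer.pad_token = tokenizer.eos_token

model = AutoModelForCausalLM.from_pretrained(MODEL_ID, torch_dtype=torch.bfloat16)

# Train small LoRA adapters instead of all 13B parameters so the job
# fits comfortably in a single A100's memory budget.
lora = LoraConfig(
    r=16, lora_alpha=32, lora_dropout=0.05,
    target_modules=["q_proj", "v_proj"], task_type="CAUSAL_LM",
)
model = get_peft_model(model, lora)

def to_text(example):
    # Concatenate the prompt and the ChatGPT response into one training string.
    return {"text": f"{example['prompt']}\n{example['response']}{tokenizer.eos_token}"}

def tokenize(example):
    return tokenizer(example["text"], truncation=True, max_length=1024)

data = (
    load_dataset("json", data_files="chatgpt_outputs.jsonl", split="train")
    .map(to_text)
    .map(tokenize, remove_columns=["prompt", "response", "text"])
)

trainer = Trainer(
    model=model,
    train_dataset=data,
    args=TrainingArguments(
        output_dir="llama13b-distilled",
        per_device_train_batch_size=4,
        gradient_accumulation_steps=8,
        num_train_epochs=3,
        learning_rate=2e-4,
        bf16=True,
        logging_steps=10,
    ),
    data_collator=DataCollatorForLanguageModeling(tokenizer, mlm=False),
)
trainer.train()
model.save_pretrained("llama13b-distilled")
```

After training, the saved adapter can be loaded on top of the base 13B weights for inference; the cost and speed numbers above are for serving that merged model locally rather than calling the API.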