large language models

Created: by Pradeep Gowda Updated: Oct 09, 2023 Tagged: llm · deep-learning · chatgpt

See also the generative-ai page, and smol-llm, transformer-math, LlamaIndex, RAG, local-llm, llm-embedding, AI SaaS, LLM Training, Reward Models.

Introductory Materials

Generative AI exists because of the transformer – A visual story from Financial Times; Sept 2023.

Large language models, explained with a minimum of math and jargon

Study Guides


  • Lil’Log: “Hi, this is Lilian. I’m documenting my learning notes in this blog. Other than writing a ML blog, I’m leading Applied Research at OpenAI on the side.”
  • Finbarr Timbers – e.g. Five years of GPT progress


  • Sparks of Artificial General Intelligence: Early experiments with GPT-4. (2023) PDF. Bubeck, Sébastien, Varun Chandrasekaran, Ronen Eldan, Johannes Gehrke, Eric Horvitz, Ece Kamar, Peter Lee, et al. “Sparks of Artificial General Intelligence: Early experiments with GPT-4,” 2023.
  • SeamlessM4T—Massively Multilingual & Multimodal Machine Translation | Meta AI Research. Barrault, Loïc, Yu-An Chung, Mariano Cora Meglioli, David Dale, Ning Dong, Paul-Ambroise Duquenne, Hady Elsahar, et al. “SeamlessM4T-Massively multilingual & multimodal machine translation,” 2023. https://arxiv.org/abs/2308.11596.
  • Brown, Tom B., Benjamin Mann, Nick Ryder, Melanie Subbiah, Jared Kaplan, Prafulla Dhariwal, Arvind Neelakantan, et al. “Language models are few-shot learners,” 2020. https://arxiv.org/abs/2005.14165.


An observation on Generalization - YouTube by Ilya Sutskever (OpenAI); Aug 14, 2023.

  • Supervised learning comes with a precise mathematical condition under which learning should succeed: low training error + more training data than “degrees of freedom” = low test error.
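One standard way to make the condition above concrete is the finite-hypothesis-class bound obtained from Hoeffding's inequality plus a union bound. The sketch below is an illustration under assumptions, not a transcription of the talk: the function name, the loss bounded in [0, 1], and counting "degrees of freedom" as parameters × bits per parameter are all choices made here.

```python
import math


def generalization_gap_bound(log_num_hypotheses: float, n_samples: int,
                             delta: float = 0.05) -> float:
    """Upper bound on |test error - training error| that holds for every
    hypothesis in a finite class with probability at least 1 - delta.

    Assumes i.i.d. samples and a loss bounded in [0, 1] (Hoeffding's
    inequality plus a union bound over the class).
    """
    return math.sqrt((log_num_hypotheses + math.log(2.0 / delta))
                     / (2.0 * n_samples))


# Assumption for illustration: a model with 1,000 parameters stored in
# 32 bits each can represent at most 2**(1000 * 32) distinct functions.
log_class_size = 1000 * 32 * math.log(2)

for n in (10_000, 1_000_000, 100_000_000):
    print(n, round(generalization_gap_bound(log_class_size, n), 4))
```

With fewer samples than bits of model, the bound exceeds 1 and says nothing. It only becomes informative once the data clearly outnumbers the degrees of freedom.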

Prompt Engineering

Open source models

Stuff you can run on your computer

smol-ai/developer: with 100k context windows on the way, it’s now feasible for every dev to have their own smol developer

How is LLaMa.cpp possible? How can we run llama.cpp on local machines when the expectation is that large models need expensive GPUs (e.g. A100) to run?
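Part of the answer is plain arithmetic on weight storage. The sketch below is a rough estimate under assumptions: the function name and the 7-billion-parameter example are illustrative, and it ignores the KV cache, activations, and the small per-block overhead that quantized formats add.

```python
def weight_memory_gb(n_params: float, bits_per_weight: float) -> float:
    """Memory needed to hold the weights alone, in gigabytes (10**9 bytes)."""
    return n_params * bits_per_weight / 8 / 1e9


# Illustrative figures: a 7-billion-parameter model at common precisions.
for bits in (16, 8, 4):
    print(f"{bits}-bit: {weight_memory_gb(7e9, bits):.1f} GB")
```

At 16 bits per weight the model needs about 14 GB, while 4-bit quantization brings it to roughly 3.5 GB, which fits in the RAM of an ordinary laptop.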

Introducing Code Llama, a state-of-the-art large language model for coding: Code Llama is a code-specialized version of Llama 2 that was created by further training Llama 2 on its code-specific datasets, sampling more data from that same dataset for longer. Essentially, Code Llama features enhanced coding capabilities, built on top of Llama 2. It can generate code, and natural language about code, from both code and natural language prompts (e.g., “Write me a function that outputs the Fibonacci sequence.”). It can also be used for code completion and debugging. It supports many of the most popular languages being used today, including Python, C++, Java, PHP, TypeScript (JavaScript), C#, and Bash.

Ask HN: Cheapest way to run local LLMs? | Hacker News

See also Perplexity Labs, which offers multiple models to try.

Perplexity labs models; Apr 2024

LLMs in your language

All languages are NOT created (tokenized) equal
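A rough way to see why this happens: byte-level tokenizers start from UTF-8 bytes, and many non-Latin scripts need more bytes per character before any merges are learned. The sketch below uses byte counts only as a proxy. It does not run a real tokenizer, so actual token counts will differ. The function name and sample words are chosen here for illustration.

```python
def utf8_bytes_per_char(text: str) -> float:
    """Average number of UTF-8 bytes per Unicode code point in text."""
    return len(text.encode("utf-8")) / len(text)


# Sample greetings, chosen for illustration only.
samples = {
    "English": "hello",
    "Hindi": "नमस्ते",
    "Japanese": "こんにちは",
}

for language, word in samples.items():
    print(language, len(word), "code points,",
          len(word.encode("utf-8")), "bytes")
```

The English word costs one byte per character, while the Hindi and Japanese words cost three, so the same short greeting starts out several times longer at the byte level.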

Small Language Models

  • Eldan, Ronen, and Yuanzhi Li. “TinyStories: How Small Can Language Models Be and Still Speak Coherent English?” 2023.

Using LLMs


LlamaIndex 🦙 0.8.13

Haystack: Open-source LLM framework to build production-ready applications.

  • Use the latest LLMs: hosted models by OpenAI or Cohere, open-source LLMs, or other pre-trained models
  • All tooling in one place: preprocessing, pipelines, agents & tools, prompts, evaluation and finetuning
  • Choose your favorite database: Elasticsearch, OpenSearch, Weaviate, Pinecone, Qdrant, Milvus and more
  • Scale to millions of documents: use Haystack’s proven retrieval architecture
  • Compare it to LangChainAI

GPT4All: a free-to-use, locally running, privacy-aware chatbot. No GPU or internet required.

AI Proxy

An AI proxy that lets you use a variety of providers (OpenAI, Anthropic, LLaMa2, Mistral, and others) behind a single interface, with caching and API key management.
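The idea of one interface over several providers with a cache in front can be sketched in a few lines. Everything named here is hypothetical: the `CachingProxy` class, the `(model, prompt) -> text` provider signature, and the stand-in `echo_provider` are illustrations, not the API of the project above. Key management is left out.

```python
import hashlib
import json
from typing import Callable, Dict

# Hypothetical provider signature: (model, prompt) -> completion text.
Provider = Callable[[str, str], str]


class CachingProxy:
    """Routes requests to named providers and caches identical requests."""

    def __init__(self, providers: Dict[str, Provider]) -> None:
        self._providers = providers
        self._cache: Dict[str, str] = {}

    def complete(self, provider: str, model: str, prompt: str) -> str:
        # Key on everything that determines the response.
        key = hashlib.sha256(
            json.dumps([provider, model, prompt]).encode("utf-8")
        ).hexdigest()
        if key not in self._cache:
            self._cache[key] = self._providers[provider](model, prompt)
        return self._cache[key]


# Stand-in provider for demonstration; a real one would call a vendor API.
calls = []


def echo_provider(model: str, prompt: str) -> str:
    calls.append(prompt)
    return f"[{model}] {prompt.upper()}"


proxy = CachingProxy({"echo": echo_provider})
first = proxy.complete("echo", "demo-model", "hello")
second = proxy.complete("echo", "demo-model", "hello")  # served from cache
```

The second call never reaches the provider, which is the cost and latency saving a caching proxy aims for. A real deployment would also need cache expiry and care with non-deterministic sampling settings.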

MLC LLM: Machine Learning Compilation for Large Language Models (MLC LLM) is a high-performance universal deployment solution that allows native deployment of any large language model with native APIs and compiler acceleration. The mission of this project is to enable everyone to develop, optimize and deploy AI models natively on everyone’s devices with ML compilation techniques.

Project Overview of MLC LLM

Multimodal Learning



OWASP | Top 10 for Large Language Models

Operational Issues


GGUF and GGML are file formats for storing models for inference, used mainly by llama.cpp and related tools to run language models locally. GGUF is the successor to GGML.
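To give a feel for the GGUF format, the sketch below decodes the fixed-size fields at the start of a file. The layout (4-byte magic `GGUF`, then a little-endian uint32 version, uint64 tensor count, and uint64 metadata key/value count) is my understanding of GGUF versions 2 and 3 and should be checked against the specification. The function name and the synthetic example values are made up for illustration.

```python
import struct

GGUF_MAGIC = b"GGUF"
# Little-endian: 4-byte magic, uint32 version, uint64 tensor count,
# uint64 metadata key/value count (layout assumed from GGUF v2/v3).
HEADER_FORMAT = "<4sIQQ"
HEADER_SIZE = struct.calcsize(HEADER_FORMAT)


def parse_gguf_header(data: bytes) -> dict:
    """Decode the fixed-size fields at the start of a GGUF file."""
    if len(data) < HEADER_SIZE:
        raise ValueError("not enough bytes for a GGUF header")
    magic, version, tensor_count, kv_count = struct.unpack(
        HEADER_FORMAT, data[:HEADER_SIZE]
    )
    if magic != GGUF_MAGIC:
        raise ValueError(f"unexpected magic {magic!r}")
    return {
        "version": version,
        "tensor_count": tensor_count,
        "metadata_kv_count": kv_count,
    }


# Synthetic header for demonstration; real use would read the first
# HEADER_SIZE bytes of a model file opened in binary mode.
example = struct.pack(HEADER_FORMAT, GGUF_MAGIC, 3, 291, 24)
print(parse_gguf_header(example))
```

The metadata key/value pairs and tensor descriptions follow this header in the file. Parsing them requires the type tables from the specification and is not attempted here.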

Benchmarking LLMs

LLM Benchmark Report for: NousResearch/Redmond-Puffin-13B





Multimodal Learning · Multi Agent Frameworks · AI Agent Framework