LLM

large language models

Created by Pradeep Gowda · Updated: Sep 10, 2024 · Tagged: llm · deep-learning · chatgpt

Introductory Materials

Generative AI exists because of the transformer – A visual story from Financial Times; Sept 2023.

Large language models, explained with a minimum of math and jargon

A High-level Overview of Large Language Models - Borealis AI

Books

Study Guides

- Mastering LLMs workshop. See Takeaways from Mastering LLMs Course from SwaroopCH.
- The Novice’s LLM Training Guide; copy
- Normcore LLM Reads by Vicki Boykis
- A Hackers’ Guide to Language Models - YouTube by jeremy-howard
- Transformer Math 101 | EleutherAI Blog
- Lil’Log; Lilian Weng’s blog

Blogs

Papers

Videos

An observation on Generalization - YouTube by Ilya Sutskever (OpenAI); Aug 14, 2023.

Prompt Engineering

Models

See Models Table – Dr Alan D. Thompson – Life Architect for a visual representation of models and a table of their various attributes.

Stuff you can run on your computer

smol-ai/developer: with 100k context windows on the way, it’s now feasible for every dev to have their own smol developer

How is LLaMa.cpp possible? How can we run llama.cpp on local machines when the expectation is that large models need expensive GPUs (e.g., an A100) to run?
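A large part of the answer is quantization: shrinking each weight from 16 bits down to roughly 4 bits makes the weights of a 7B or 13B model fit in ordinary laptop RAM. A minimal sketch of that memory arithmetic, using illustrative bit-widths (the ~4.5 bits/weight figure is an assumption approximating q4 formats with their block overhead):

```python
# Back-of-the-envelope memory estimate for running a quantized model locally.
# The bit-widths are illustrative assumptions, not measurements of llama.cpp.

def weight_memory_gib(n_params_billion: float, bits_per_weight: float) -> float:
    """Approximate RAM needed just to hold the model weights."""
    total_bytes = n_params_billion * 1e9 * bits_per_weight / 8
    return total_bytes / 2**30

for label, bits in [("fp16", 16), ("8-bit", 8), ("4-bit (q4-ish)", 4.5)]:
    print(f"7B  @ {label:14s}: {weight_memory_gib(7, bits):5.1f} GiB")
    print(f"13B @ {label:14s}: {weight_memory_gib(13, bits):5.1f} GiB")
```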

Introducing Code Llama, a state-of-the-art large language model for coding. Code Llama is a code-specialized version of Llama 2, created by further training Llama 2 on its code-specific datasets and sampling more data from that same dataset for longer. Essentially, Code Llama features enhanced coding capabilities built on top of Llama 2. It can generate code, and natural language about code, from both code and natural language prompts (e.g., “Write me a function that outputs the Fibonacci sequence.”). It can also be used for code completion and debugging. It supports many of the most popular languages in use today, including Python, C++, Java, PHP, TypeScript (JavaScript), C#, and Bash.
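A minimal sketch of prompting a Code Llama checkpoint through Hugging Face transformers; the model id is the instruct variant published on the Hub, and loading it this way assumes enough GPU or CPU memory for a 7B model:

```python
# Minimal sketch: prompting a Code Llama checkpoint via Hugging Face transformers.
from transformers import AutoModelForCausalLM, AutoTokenizer

model_id = "codellama/CodeLlama-7b-Instruct-hf"  # assumed checkpoint name on the Hub
tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(model_id, device_map="auto")

prompt = "Write me a function that outputs the Fibonacci sequence."
inputs = tokenizer(prompt, return_tensors="pt").to(model.device)
outputs = model.generate(**inputs, max_new_tokens=200)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))
```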

Ask HN: Cheapest way to run local LLMs? | Hacker News

See also Perplexity Labs, which hosts multiple models to try.

Perplexity Labs models; Apr 2024

LLMs in your language

All languages are NOT created (tokenized) equal
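A quick way to see the inequality is to count tokens for the same sentence in different languages. The sketch below uses OpenAI’s cl100k_base tokenizer via tiktoken; the sample sentences and their translations are approximate and only illustrative:

```python
# Rough comparison of token counts for the same sentence in different languages.
# Exact ratios depend on the tokenizer; the point is that non-English text
# usually costs noticeably more tokens (and therefore more money and context).
import tiktoken

enc = tiktoken.get_encoding("cl100k_base")
samples = {
    "English": "Large language models are changing how we write software.",
    "French":  "Les grands modèles de langage changent notre façon d'écrire des logiciels.",
    "Hindi":   "बड़े भाषा मॉडल सॉफ़्टवेयर लिखने के हमारे तरीके को बदल रहे हैं.",
}
for lang, text in samples.items():
    print(f"{lang:8s}: {len(enc.encode(text)):3d} tokens")
```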

Small Language Models

Building LLMs

Using LLMs

Fine-tuning has one huge advantage though: it is far more effective at guiding a model’s behavior than prompting, so you can often get away with a much smaller model. That gets you faster responses and lower inference costs. A fine-tuned Llama 7B model is 50x cheaper than GPT-3.5 on a per-token basis, and for many use cases can produce results that are as good or better! For example, classifying the 2M recipes at https://huggingface.co/datasets/corbt/all-recipes with GPT-4 would cost $23k. Even with GPT-3.5 it would cost over $1k. The model we fine-tuned performs similarly to GPT-4 and costs just $19 to run over the entire dataset. (A rough version of this arithmetic is sketched below.)

OpenBMB/ToolBench: an open platform for training, serving, and evaluating large language models for tool learning.
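As a sanity check on those cost numbers, here is the back-of-the-envelope arithmetic in Python. The per-1K-token prices and the 300-tokens-per-recipe figure are assumptions chosen to illustrate the scale, not quoted provider rates:

```python
# Illustrative cost comparison for classifying a large dataset with a hosted
# model vs. a small self-hosted fine-tuned model. All prices and token counts
# are assumptions for the sake of the arithmetic.

recipes = 2_000_000
tokens_per_recipe = 300          # assumed prompt + completion tokens per recipe
price_per_1k_tokens = {          # assumed $/1K tokens
    "GPT-4": 0.04,
    "GPT-3.5": 0.002,
    "fine-tuned 7B (self-hosted)": 0.00003,
}

total_tokens = recipes * tokens_per_recipe
for model, price in price_per_1k_tokens.items():
    print(f"{model:28s}: ${total_tokens / 1000 * price:>10,.0f}")
```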

Frameworks

LlamaIndex 🦙 0.8.13
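A minimal LlamaIndex sketch using the 0.8.x-era API: index a local folder of documents and query it. By default this calls OpenAI models, so OPENAI_API_KEY must be set; the "data" folder is a placeholder:

```python
# Minimal LlamaIndex sketch (0.8.x-era API): build a vector index over a
# directory of documents and ask a question against it.
from llama_index import SimpleDirectoryReader, VectorStoreIndex

documents = SimpleDirectoryReader("data").load_data()   # assumed local folder
index = VectorStoreIndex.from_documents(documents)
query_engine = index.as_query_engine()
print(query_engine.query("What do these documents say about fine-tuning?"))
```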

Haystack | Haystack: open-source LLM framework to build production-ready applications.

> Use the latest LLMs: hosted models by OpenAI or Cohere, open-source LLMs, or other pre-trained models
> All tooling in one place: preprocessing, pipelines, agents & tools, prompts, evaluation and finetuning
> Choose your favorite database: Elasticsearch, OpenSearch, Weaviate, Pinecone, Qdrant, Milvus and more
> Scale to millions of documents: use Haystack’s proven retrieval architecture

Compare it to LangChainAI.
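For comparison, a minimal Haystack sketch using the 1.x-era API: BM25 retrieval over an in-memory store plus an extractive reader. The sample document and the reader model name are placeholders:

```python
# Minimal Haystack sketch (1.x-era API): in-memory BM25 retrieval + extractive QA.
from haystack.document_stores import InMemoryDocumentStore
from haystack.nodes import BM25Retriever, FARMReader
from haystack.pipelines import ExtractiveQAPipeline

store = InMemoryDocumentStore(use_bm25=True)
store.write_documents(
    [{"content": "Haystack builds production-ready QA and LLM pipelines."}]
)

retriever = BM25Retriever(document_store=store)
reader = FARMReader(model_name_or_path="deepset/roberta-base-squad2")
pipe = ExtractiveQAPipeline(reader=reader, retriever=retriever)

result = pipe.run(query="What does Haystack build?",
                  params={"Retriever": {"top_k": 3}})
print(result["answers"][0].answer)
```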

GPT4All A free-to-use, locally running, privacy-aware chatbot. No GPU or internet required.
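GPT4All also ships a Python binding; the sketch below assumes one of the quantized model files listed in the GPT4All catalog and runs entirely on CPU:

```python
# Minimal GPT4All sketch: the Python binding downloads a quantized model
# and runs it locally, no GPU or API key required.
from gpt4all import GPT4All

model = GPT4All("mistral-7b-instruct-v0.1.Q4_0.gguf")  # assumed catalog model file
with model.chat_session():
    print(model.generate("Explain what a context window is.", max_tokens=200))
```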

AI Proxy

An AI proxy lets you use a variety of providers (OpenAI, Anthropic, Llama 2, Mistral, and others) behind a single interface, with caching and API key management.
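The usage pattern is typically an OpenAI-compatible endpoint: point the standard client at the proxy’s base URL and switch providers by changing the model string. The URL, key, and model name below are placeholders, not any particular proxy’s configuration:

```python
# Sketch of the proxy idea: one OpenAI-compatible client, many providers behind it.
from openai import OpenAI

client = OpenAI(
    base_url="http://localhost:4000/v1",  # hypothetical proxy endpoint
    api_key="proxy-managed-key",          # the proxy holds the real provider keys
)

resp = client.chat.completions.create(
    model="claude-3-haiku",  # the proxy maps this name to the right provider
    messages=[{"role": "user", "content": "One sentence on why an AI proxy helps."}],
)
print(resp.choices[0].message.content)
```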

MLC LLM: Machine Learning Compilation for Large Language Models (MLC LLM) is a high-performance universal deployment solution that allows native deployment of any large language model, with native APIs and compiler acceleration. The mission of the project is to enable everyone to develop, optimize, and deploy AI models natively on their own devices using ML compilation techniques.

Project Overview of MLC LLM

Multimodal Learning

todo

Security

OWASP | Top 10 for Large Language Models

Abliteration

Uncensor any LLM with abliteration

Operational Issues

GGUF

GGUF and GGML are file formats for storing model weights for inference, used especially by llama.cpp and related local-inference tools for language models; GGUF is the newer format that superseded GGML.
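For example, a GGUF file can be loaded locally with llama-cpp-python (the Python bindings for llama.cpp); the model path below is a placeholder for whatever GGUF file you have downloaded:

```python
# Loading a GGUF model locally with llama-cpp-python and running a completion.
from llama_cpp import Llama

llm = Llama(model_path="./models/llama-2-7b-chat.Q4_K_M.gguf",  # placeholder path
            n_ctx=2048)
out = llm("Q: What is GGUF? A:", max_tokens=128, stop=["Q:"])
print(out["choices"][0]["text"])
```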

Benchmarking LLMs

LLM Benchmark Report for: NousResearch/Redmond-Puffin-13B

Linkdump

2023

2024

References


See also: Multimodal Learning, Multi Agent Frameworks, AI Agent Framework, generative-ai, smol-llm, transformer-math, LlamaIndex, RAG, local-llm, llm-embedding, AI SaaS, LLM Training, Reward Models, AI Code Assistants, Building LLM Based Systems, LLMs in Data Management