Small LLMs aka SLMs

smoler the better

Created by Pradeep Gowda. Updated: Aug 04, 2024. Tagged: smol-llm · llm · slm.

LLMs that you can run on the desktop or a “regular(ish) PC”.

A look at Apple’s new Transformer-powered predictive text model

The model is used by AppleSpell, an internal macOS application that checks for spelling and grammar mistakes as you type.

The author found the predictive text model in /System/Library/LinguisticData/RequiredAssets_en.bundle/AssetData/en.lm/unilm.bundle. The bundle contains multiple Espresso model files that are used while typing (Espresso appears to be the internal name for the part of CoreML that runs inference on models).

There is a set of 15,000 tokens in unilm.bundle/sp.dat that pretty clearly looks like the vocabulary of a language model.

Read the rest of the blog post above to see how the tokenizer works and what the architecture looks like: a GPT-2-style model with about 34M parameters and a hidden size of 512, making it smaller than even the smallest GPT-2 model.
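Those numbers are easy to sanity-check with a back-of-the-envelope parameter count for a GPT-2-style decoder. This is my own estimate, not from the blog post; it assumes tied input/output embeddings, learned positional embeddings, a 4x MLP expansion, and ignores biases and LayerNorm:

```python
# Back-of-the-envelope parameter count for a GPT-2-style decoder.
# Assumptions (mine, not from the blog post): tied input/output embeddings,
# learned positional embeddings, 4x MLP expansion, biases/LayerNorm ignored.
def gpt2_style_params(vocab=15_000, d_model=512, n_layers=6, context=512):
    embeddings = vocab * d_model + context * d_model  # token + positional embeddings
    attention = 4 * d_model * d_model                 # Q, K, V and output projections
    mlp = 2 * (d_model * 4 * d_model)                 # MLP up- and down-projections
    return embeddings + n_layers * (attention + mlp)

for layers in (4, 6, 8):
    print(f"{layers} layers: ~{gpt2_style_params(n_layers=layers)/1e6:.0f}M parameters")
```

With a hidden size of 512 and a ~15,000-token vocabulary, somewhere around 8 transformer blocks lands close to the reported ~34M parameters.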


Orca 2: Teaching Small Language Models How to Reason - Microsoft Research; see

“M2 Max with 64GB RAM. It does ~50 tok/s on our q4-quantized 7B Mistral fine-tune, with speeds comparable to GPT-4.” via
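For reference, one minimal way to run that kind of q4-quantized model locally from Python is llama-cpp-python with a GGUF build. This is a sketch, not the setup from the quote; the GGUF filename is a placeholder:

```python
# Minimal local-inference sketch with llama-cpp-python.
# The GGUF filename is a placeholder; use whichever q4 build you have downloaded.
from llama_cpp import Llama

llm = Llama(
    model_path="mistral-7b-instruct.Q4_K_M.gguf",  # placeholder q4 GGUF file
    n_gpu_layers=-1,  # offload all layers to Metal on an M-series Mac
    n_ctx=4096,
)
out = llm("Q: What is a small language model?\nA:", max_tokens=128)
print(out["choices"][0]["text"])
```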


moondream

moondream is a computer-vision model that can answer real-world questions about images. It is tiny by today's standards, at only 1.6B parameters, which lets it run on a variety of devices, including mobile phones and edge devices.

moondream is Apache 2.0 licensed, so you can use it for commercial purposes.

Applications:

  1. Security
  2. Drones and robotics
  3. Retail and shopping
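A minimal usage sketch, following the vikhyatk/moondream2 model card at the time of writing. The encode_image/answer_question methods are custom code pulled in via trust_remote_code, so check the current card before relying on them:

```python
from transformers import AutoModelForCausalLM, AutoTokenizer
from PIL import Image

model_id = "vikhyatk/moondream2"  # the 1.6B moondream checkpoint on Hugging Face
model = AutoModelForCausalLM.from_pretrained(model_id, trust_remote_code=True)
tokenizer = AutoTokenizer.from_pretrained(model_id)

image = Image.open("photo.jpg")        # any local image
enc_image = model.encode_image(image)  # the image encoder runs once per image
print(model.answer_question(enc_image, "What is going on in this picture?", tokenizer))
```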

Prem 1B and Prem 1B chat


Florence - a Microsoft Collection; SOTA 200M & 800M parameter vision foundation models, MIT licensed! The 200M checkpoint beats Flamingo 80B (a 400x bigger model) by a huge margin. It performs captioning, object detection and segmentation, OCR, phrase grounding, and more, and leverages the FLD-5B dataset (5.4 billion annotations across 126 million images) with multi-task learning. Fine-tuned checkpoints beat the likes of PaLI and PaLI-X.

“Florence2 200M, Qwen2 500M, MSFT InstructLM 500M. With little fine-tuning they unlock so many creative and on-device use cases.” via
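A usage sketch along the lines of the Florence-2 model card: the task is selected with a special prompt token such as <CAPTION>, <OD> (object detection), or <OCR>, and the raw output is parsed by the processor. The model code is loaded with trust_remote_code, so treat this as an approximation of the current card rather than a guarantee:

```python
from transformers import AutoProcessor, AutoModelForCausalLM
from PIL import Image

model_id = "microsoft/Florence-2-base"
model = AutoModelForCausalLM.from_pretrained(model_id, trust_remote_code=True)
processor = AutoProcessor.from_pretrained(model_id, trust_remote_code=True)

image = Image.open("street.jpg")  # any local image
prompt = "<OD>"                   # task token: object detection

inputs = processor(text=prompt, images=image, return_tensors="pt")
generated_ids = model.generate(
    input_ids=inputs["input_ids"],
    pixel_values=inputs["pixel_values"],
    max_new_tokens=1024,
)
text = processor.batch_decode(generated_ids, skip_special_tokens=False)[0]
print(processor.post_process_generation(text, task=prompt,
                                         image_size=(image.width, image.height)))
```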


Fine-tune Llama-3-8B with Llama-3-405B synthetic data

A simple notebook for fine-tuning a small model (Llama-3-8B) to be an expert in a specific domain by letting a larger, more capable model (Llama-3-405B) teach it, i.e., by generating a synthetic dataset for that domain.
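The core loop is simple enough to sketch. This is a condensed illustration of the idea rather than the notebook itself: a large teacher model behind an OpenAI-compatible endpoint generates domain Q&A pairs, which are written to a JSONL file for supervised fine-tuning of the smaller model. The endpoint URL, model name, and topics below are placeholders:

```python
import json
from openai import OpenAI

# Placeholder endpoint and model name for the 405B "teacher".
client = OpenAI(base_url="http://localhost:8000/v1", api_key="not-needed")
TEACHER = "llama-3-405b-instruct"

topics = ["capital gains tax", "quarterly estimated payments"]  # example domain topics

with open("synthetic_sft.jsonl", "w") as f:
    for topic in topics:
        resp = client.chat.completions.create(
            model=TEACHER,
            messages=[
                {"role": "system", "content": "You write one concise Q&A training pair."},
                {"role": "user", "content": f"Write a question and expert answer about {topic}."},
            ],
        )
        f.write(json.dumps({"topic": topic, "text": resp.choices[0].message.content}) + "\n")

# The resulting JSONL then becomes the training set for fine-tuning Llama-3-8B
# (e.g. with TRL's SFTTrainer or a similar SFT harness).
```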

nisten/Biggie-SmoLlm-0.15B-Base · Hugging Face via

macOS desktop

Phone


aiOS™ by Hyperspace “Organizing the World’s AI Agents. Join the world’s largest peer-to-peer AI network and start earning points”