50/Saturday morning hikes are great

Monday, 2024-09-09 to Sunday, 2024-09-15

Created by Pradeep Gowda. Updated: Sep 10, 2024. Tagged: weekly.

Met old colleagues LM and DS for lunch and coffee this week. Always nice to meet old friends. Rivet Coffee bar and Roastery up in Westfield is nice. It was chock-full of people on a Friday. Nice day to sit out too, though we stayed indoors because all the outdoor spots were taken.

On Saturday, 9/14, went for a hike with the “Saturday Morning Walking Bunch” at MacGregor Park in Westfield, IN. We did 6 miles around the park. After that I took them to Rivet for coffee; the place was overflowing on a Saturday morning - a good sign for the coffee shop. The light roast Ethiopia coffee was once again delightful.

LLMs

aider lets you pair program with LLMs to edit code in your local git repository. Start a new project or work with an existing git repo. Aider works best with GPT-4o & Claude 3.5 Sonnet and can connect to different LLMs.

python -m pip install aider-chat

# Change directory into a git repo
cd ~/code/project

# Work with Claude 3.5 Sonnet on your repo
export ANTHROPIC_API_KEY=your-key-goes-here
aider

Purchased Sebastian Raschka’s Build a Large Language Model (From Scratch).

JAX and Equinox: What are they and why should I bother?

Interesting paper on LLMs hallucinating and how it might never be possible to get rid of hallucinations entirely: Banerjee, Sourav, Ayushi Agarwal, and Saloni Singla. “LLMs will always hallucinate, and we need to live with this,” 2024. https://arxiv.org/abs/2409.05746.

WordLlama: Things you can do with the token embeddings of an LLM. An NLP toolkit that handles tasks like fuzzy-deduplication, similarity and ranking with minimal inference-time dependencies and is optimized for CPU hardware. WordLlama is an NLP utility and word embedding model that recycles components from large language models (LLMs) to create efficient and compact word representations (in the style of GloVe, Word2Vec or FastText). WordLlama begins by extracting the token embedding codebook from a state-of-the-art LLM (e.g., LLama3 70B), and trains a small context-less model in a general purpose embedding framework. Miller, D. Lee. “WordLlama: Recycled token embeddings from large language models,” 2024. https://github.com/dleemiller/wordllama.
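To make the "context-less embedding" idea concrete, here is a toy sketch in plain Python. It is not the WordLlama API: the tiny `CODEBOOK`, its 3-dimensional vectors, and the helper names (`embed`, `similarity`, `rank`, `deduplicate`) are all made up for illustration. It only shows how averaging fixed per-token vectors is enough to do rough similarity, ranking and fuzzy deduplication.

```python
import math

# Hypothetical 3-dimensional "codebook": one fixed vector per token.
# Real token embeddings extracted from an LLM have far more dimensions.
CODEBOOK = {
    "coffee": [0.9, 0.1, 0.0],
    "espresso": [0.8, 0.2, 0.0],
    "roast": [0.7, 0.3, 0.1],
    "hike": [0.0, 0.9, 0.2],
    "trail": [0.1, 0.8, 0.3],
    "park": [0.1, 0.7, 0.4],
}


def embed(text):
    """Average the vectors of known tokens (context-less: word order is ignored)."""
    vectors = [CODEBOOK[tok] for tok in text.lower().split() if tok in CODEBOOK]
    if not vectors:
        return [0.0, 0.0, 0.0]
    return [sum(column) / len(vectors) for column in zip(*vectors)]


def cosine(a, b):
    """Cosine similarity; returns 0.0 if either vector is all zeros."""
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(x * x for x in b))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0
    return sum(x * y for x, y in zip(a, b)) / (norm_a * norm_b)


def similarity(text_a, text_b):
    return cosine(embed(text_a), embed(text_b))


def rank(query, docs):
    """Sort documents from most to least similar to the query."""
    return sorted(docs, key=lambda doc: similarity(query, doc), reverse=True)


def deduplicate(docs, threshold=0.99):
    """Keep a document only if it is not near-identical to one already kept."""
    kept = []
    for doc in docs:
        if all(similarity(doc, other) < threshold for other in kept):
            kept.append(doc)
    return kept


if __name__ == "__main__":
    print(rank("coffee", ["hike trail", "espresso roast", "park"]))
    print(deduplicate(["coffee roast", "roast coffee", "hike"]))
```

Because the vectors are averaged, "coffee roast" and "roast coffee" map to the same embedding, which is exactly the trade-off of a context-less model: cheap and CPU-friendly, but blind to word order.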

ICML 2024 Tutorial: Physics of Language Models - YouTube by Zeyuan Allen-Zhu; See additional material.

Programming

Signed up to Database Design and Implementation Book Club by Phil Eaton. The textbook under discussion is Database Design and Implementation by Edward Sciore. First class is on October 5, 2024. The textbook uses a pedagogical database named SimpleDB written in Java (See the same repo for code).

After hearing about Liveview Native, I once again tried Elixir. With Mise (which I first mentioned in 46), it is quite easy: mise use -g elixir@1.17.2-otp-27. However, Erlang needs to be installed separately with mise use -g erlang@27.

From around the web

The Modern Carousel - Daniela Spector Photography, via X/daniela0spector.

She scanned a bunch of negatives she bought on eBay and found photos from an India trip.