Contents
- New article (added 7/24/23)
- In theory, LLMs can encode algorithms of high complexity (added 6/25/23)
- In practice, out-of-the-box LLMs have significant limitations in complex reasoning algorithms (added 6/24/23)
- LLMs can learn specific algorithms, but they need to be taught (added 6/24/23)
- Basic “concept math” appears to be a good representation of how LLMs and other networks understand the world (added 6/23/23)
- For transformers, making sense of concepts and reasoning on those concepts appear to be two different things (added 6/23/23)
- An LLM watching another LLM is a good design primitive (added 6/4/23)
- Transformers truly learn meaning from form; they aren’t just stochastic parrots (added 5/22/23)
- Training an LLM on code and language is surprisingly synergistic (added 4/30/23)
- On average, we are all stupid (added 4/30/23)
- LLMs can be tricked into giving wrong answers incredibly easily, and when forced to think harder, they become even more wrong (added 5/14/23)
- An LLM flawlessly trained on adding 16-digit numbers can’t even generalize to 17-digit numbers (added 5/25/23)
- LLMs are fundamentally non-deterministic, so don’t count on determinism (added 6/26/23)
- In fine-tuning, every letter counts (added 5/23/23)
- LLMs are giant superpositions of personalities (added 4/30/23)
- The order in which an LLM sees data during training doesn’t matter for memorization (added 5/6/23)
- The frequency with which an LLM sees training data seems to matter for its performance on that data (added 5/6/23)
- On certain tasks, the typical LLM scaling trend (bigger is better) is reversed, and bigger is worse (added 5/7/23)
- The higher the model layer, the more complex the job of its neurons (added 5/22/23)