LLMs can find a needle in the haystack

GPT outperforms Claude.


Friday December 15, 2023


Is RAG necessary when you have incredible memory?


Check out this thread:

This is a powerful analysis. Sure, Anthropic will find a way to improve on or challenge the results. But the point is clear: these models can recall a hyper-specific 7-digit random number from a batch of 126,000 tokens, where a token is roughly 4 characters. GPT is the clear winner here, too.
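For context, the test itself is simple to sketch. This is a minimal toy version, not the analysis from the thread: bury a random 7-digit number in a long filler document and ask the model to retrieve it (the filler text, function names, and the ~4 chars/token approximation here are all my assumptions).

```python
import random

def make_haystack(n_tokens, needle):
    """Build a long filler document (assuming ~4 characters per token)
    and bury a 'needle' fact at a random position inside it."""
    filler = "The quick brown fox jumps over the lazy dog. "
    reps = (n_tokens * 4) // len(filler)  # approximate token budget
    chunks = [filler] * reps
    chunks.insert(random.randrange(len(chunks)), f" The magic number is {needle}. ")
    return "".join(chunks)

needle = str(random.randrange(10**6, 10**7))  # random 7-digit number
doc = make_haystack(126_000, needle)
prompt = doc + "\n\nWhat is the magic number?"
# In the real test you'd send `prompt` to the model and check whether
# its answer contains `needle`; here we just confirm the needle exists.
assert needle in doc
```

The thread's version varies both the haystack length and where in the document the needle sits, then scores the model's answer at each combination.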

Also, open source is getting incredibly good. This implies the future is open source.


RAG can be used to make retrieval more efficient. But if in-context retrieval is already this good, maybe RAG is only a short-term thing. Context lengths of 10M tokens…probably by next year, right?

At the start of the year we were at 4K tokens. Now we're at 126,000 tokens, roughly a 30x improvement. Another 30x puts us at 3.78M. So yeah, by next year you should be able to just load the entire RAG database into context. But…it's gonna be super expensive.
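The back-of-the-envelope math above works out like this (the 30x-per-year growth rate is the extrapolation, not a guarantee):

```python
# 4K -> 126K context in one year, then extrapolate another year at ~30x.
start, now = 4_000, 126_000
growth = now / start       # actual growth this year
next_year = now * 30       # assume the same ~30x rate holds
print(round(growth, 1), next_year)
```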

The point is: would GPT be this effective if it were using RAG over a database? Or is it more effective to load everything into context?
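The trade-off is easy to see in a toy sketch. Everything below is illustrative and assumed (the word-overlap scorer stands in for real embeddings): RAG sends the model only the top-k chunks it judges relevant, while the long-context approach sends everything and trusts the model to find the needle itself.

```python
def score(chunk, query):
    """Toy relevance score: count overlapping words.
    A real RAG system would use embedding similarity instead."""
    return len(set(query.lower().split()) & set(chunk.lower().split()))

def rag_context(chunks, query, k=1):
    """RAG-style: retrieve only the k most relevant chunks."""
    return sorted(chunks, key=lambda c: score(c, query), reverse=True)[:k]

def full_context(chunks, query):
    """Long-context-style: send everything, let the model retrieve."""
    return chunks

docs = [
    "Invoice 4412 was paid on March 3.",
    "The magic number is 8675309.",
    "Team offsite is scheduled for June.",
]
query = "what is the magic number"
print(rag_context(docs, query))        # just the relevant chunk
print(len(full_context(docs, query)))  # all 3 chunks, paid for in tokens
```

If retrieval inside the context window is near-perfect, the only thing RAG buys you is the token bill; if it isn't, RAG's pre-filtering is doing real work.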

