Notes on OpenAI’s Retrieval Process

Boris (tech engineer at OpenAI) on that retrieval slide:

There’s so much to be said about this, but the key is a good evaluation framework, and then doubling down on understanding what went wrong. Possibilities:

  1. Outdated, inaccurate, or contradictory information in the source documents
  2. Retrieval failure
  3. Answer synthesis failure
  4. Unclear query
  5. Bad evaluation by a human reviewer
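A minimal sketch of what a triage pass over those failure modes might look like. Everything here (`EvalCase`, `triage`, the string-containment checks) is hypothetical illustration, not OpenAI's actual framework; the idea is just that retrieval failure should be ruled out mechanically before blaming answer synthesis, while the source-data, unclear-query, and reviewer categories usually need human inspection:

```python
from dataclasses import dataclass
from enum import Enum, auto
from typing import List, Optional

class FailureMode(Enum):
    SOURCE_DATA = auto()    # outdated, inaccurate, or contradictory docs
    RETRIEVAL = auto()      # the answer exists but was never retrieved
    SYNTHESIS = auto()      # context was retrieved, but the answer misused it
    UNCLEAR_QUERY = auto()  # the question itself was ambiguous
    BAD_REVIEW = auto()     # the human grader was wrong

@dataclass
class EvalCase:
    query: str
    retrieved: List[str]  # chunks the retriever returned
    answer: str           # what the model said
    expected: str         # gold answer string

def triage(case: EvalCase) -> Optional[FailureMode]:
    """Mechanical first pass: check retrieval before blaming synthesis.
    Uses naive substring matching as a stand-in for a real grader."""
    hit = any(case.expected.lower() in chunk.lower() for chunk in case.retrieved)
    if not hit:
        return FailureMode.RETRIEVAL
    if case.expected.lower() not in case.answer.lower():
        return FailureMode.SYNTHESIS
    return None  # passed automatic checks; escalate the rest to a human
```

In practice the substring check would be replaced by a proper grader (exact-match, LLM judge, etc.), but the ordering of the checks is the point.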

It’s a definite art, but fun to engineer.

More on retrieval (Yi Ding):

A few thoughts on how OpenAI is implementing RAG in the new Assistants Retrieval tool, gathered before I was locked out.

  1. They’re splitting on newlines. You can tell because they forget to reinsert the newlines between the splits when giving you the reference (red squiggles).
  2. The reference is contiguous. I haven’t tried more complicated questions so can’t rule out that they would do some kind of subquestion decomposition, but the queries I tried all just gave a single reference.
  3. The number of chunks returned is variable. This indicates some kind of “small to big” strategy where they retrieve more context than they match. Perhaps something similar to our AutoMergingRetriever with some kind of similarity cutoff, but this is worth a bit more investigation.
  4. They have some trouble handling UTF-8.
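The observations above (newline splitting, contiguous references, variable chunk counts) can be sketched as a toy “small to big” retriever. This is hypothetical code illustrating the inferred behavior, not OpenAI's implementation; `split_on_newlines` and `merge_adjacent` are made-up names:

```python
from typing import List

def split_on_newlines(text: str) -> List[str]:
    """Naive newline-based chunking, as the observed references suggest."""
    return [line for line in text.split("\n") if line.strip()]

def merge_adjacent(hits: List[int], chunks: List[str], window: int = 1) -> str:
    """'Small to big': match small chunks, then return each hit expanded
    by a window of neighboring chunks, merged into one contiguous span.
    Rejoining with newlines is the step the tool appears to skip."""
    keep = sorted({
        j
        for i in hits
        for j in range(max(0, i - window), min(len(chunks), i + window + 1))
    })
    return "\n".join(chunks[j] for j in keep)
```

With a window of 1, a single matched chunk comes back as three contiguous lines, which would explain why the number of chunks in a reference varies while the reference itself stays contiguous.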

In general, this is still very bleeding edge, but I look forward to seeing how it evolves.

James R. Hull @jhull