Why I prefer to roll my own RAG:
Lastly, and this isn’t to do with cost: there is a LOT more to RAG than simple semantic comparisons. With the Assistants API, if you notice that the results aren’t satisfactory, your only option is to try to improve the document. With your own vector DB, by comparison, you have a massive number of tools & metrics to work with.
As cool as the Assistants API is, it doesn’t make sense to hand retrieval over to a black box and give up all the quality-of-life improvements you can get with tweaking (especially if you don’t have millions in VC to back up your experiments).
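To make the "tools & metrics" point concrete, here is a rough sketch of the kind of measurement you can only do when you own the retrieval step: score a small labelled set of queries with recall@k and mean reciprocal rank, then change k (or chunking, or the embedding model) and watch the numbers move. Everything in it is made up for illustration: the chunk IDs, the 3-dimensional toy vectors standing in for real embeddings, and the helper names. A real setup would pull vectors from an embedding model and a vector DB instead.

```python
import math

def cosine(a, b):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0

def top_k(query_vec, index, k):
    """Return the IDs of the k chunks most similar to the query."""
    ranked = sorted(index, key=lambda cid: cosine(query_vec, index[cid]), reverse=True)
    return ranked[:k]

def recall_at_k(retrieved, relevant):
    """Fraction of the relevant chunks that were actually retrieved."""
    return len(set(retrieved) & relevant) / len(relevant) if relevant else 0.0

def reciprocal_rank(retrieved, relevant):
    """1 / rank of the first relevant chunk, or 0.0 if none was retrieved."""
    for rank, cid in enumerate(retrieved, start=1):
        if cid in relevant:
            return 1.0 / rank
    return 0.0

# Hypothetical toy index: chunk ID -> embedding (hand-written, not from a model).
INDEX = {
    "pricing": [0.9, 0.1, 0.0],
    "refunds": [0.7, 0.3, 0.1],
    "api_keys": [0.0, 0.2, 0.9],
}

# Hypothetical labelled queries: (query embedding, set of relevant chunk IDs).
EVAL_SET = [
    ([1.0, 0.2, 0.0], {"pricing", "refunds"}),
    ([0.1, 0.1, 1.0], {"api_keys"}),
]

def evaluate(k):
    """Return (mean recall@k, mean reciprocal rank) over the eval set."""
    recalls, rranks = [], []
    for query_vec, relevant in EVAL_SET:
        hits = top_k(query_vec, INDEX, k)
        recalls.append(recall_at_k(hits, relevant))
        rranks.append(reciprocal_rank(hits, relevant))
    return sum(recalls) / len(recalls), sum(rranks) / len(rranks)

if __name__ == "__main__":
    for k in (1, 2):
        mean_recall, mrr = evaluate(k)
        print(f"k={k}: recall@k={mean_recall:.2f}, MRR={mrr:.2f}")
```

With a hosted black box you can't run this loop at all, because you never see which chunks were retrieved. With your own pipeline, a drop in recall@k tells you the retriever is the problem before you start rewriting documents.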