[Roadmap] RAG #1657

thinkall · 2024-02-13T02:41:34Z

Why RAG

Retrieval-Augmented Generation (RAG) is a technique that enhances the capabilities of LLMs by incorporating a retrieval mechanism into the generative process. This approach lets the model draw on relevant information from a pre-existing knowledge base, which can significantly improve the quality and accuracy of its generated responses. For agent chat, adding a RAG agent can therefore improve the performance and utility of your agent system.

RAG in AutoGen

AutoGen introduced RetrieveUserProxyAgent and RetrieveAssistantAgent for performing RetrieveChat in August 2023 and announced them in a blog post in October 2023. Given a set of documents, the Retrieval-augmented User Proxy first processes them automatically: it splits them into chunks and stores the chunks in a vector database. Then, for a given user input, it retrieves relevant chunks as context and sends them to the Retrieval-augmented Assistant, which uses an LLM to generate code or text to answer the question. The agents converse until they find a satisfactory answer.
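The workflow described above (split, store, retrieve, build a prompt) can be sketched in plain Python. This is only an illustration under loud assumptions: the bag-of-words "embedding", the in-memory list used as a store, and every function name here are hypothetical stand-ins, not AutoGen's implementation or API.

```python
from __future__ import annotations

import math
import re
from collections import Counter


def split_into_chunks(text: str, max_words: int = 40) -> list[str]:
    """Split text into chunks of at most `max_words` words."""
    words = text.split()
    return [" ".join(words[i:i + max_words]) for i in range(0, len(words), max_words)]


def embed(text: str) -> Counter:
    """Toy embedding (assumption): a bag-of-words term-count vector."""
    return Counter(re.findall(r"[a-z0-9]+", text.lower()))


def cosine(a: Counter, b: Counter) -> float:
    """Cosine similarity between two sparse term-count vectors."""
    dot = sum(a[t] * b[t] for t in a.keys() & b.keys())
    norm = math.sqrt(sum(v * v for v in a.values())) * math.sqrt(sum(v * v for v in b.values()))
    return dot / norm if norm else 0.0


def retrieve(query: str, store: list[tuple[str, Counter]], top_k: int = 2) -> list[str]:
    """Return the `top_k` stored chunks most similar to the query."""
    q = embed(query)
    ranked = sorted(store, key=lambda item: cosine(q, item[1]), reverse=True)
    return [chunk for chunk, _ in ranked[:top_k]]


documents = [
    "AutoGen agents converse with each other to solve tasks.",
    "A vector database stores embeddings of document chunks for similarity search.",
    "Bananas are rich in potassium.",
]

# "Ingest": chunk every document and keep (chunk, vector) pairs in memory.
store = [(chunk, embed(chunk)) for doc in documents for chunk in split_into_chunks(doc)]

# "Retrieve": pick the best chunk and place it in the prompt sent to the LLM.
question = "Where are document chunks stored?"
context = retrieve(question, store, top_k=1)
prompt = "Context:\n" + "\n".join(context) + "\n\nQuestion: " + question
```

In a real setup the toy `embed` and the list-based store would be replaced by an embedding model and a vector database; the control flow stays roughly the same.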

As both AutoGen and RAG are evolving very fast, many users are asking for support for customized vector databases, incremental document ingestion, customized retrieval/re-ranking algorithms, customized RAG patterns/workflows, etc. We've addressed some of the issues and feature requests: for example, we've added QdrantRetrieveUserProxyAgent for using Qdrant as the vector db, and we've integrated UNSTRUCTURED to support many unstructured document formats. However, there is much more to do.

Our Plan

To better support RAG in AutoGen, we plan to refactor the existing RetrieveChat agents. The goals include:

Primary goals

Support launching RAG with one agent instead of two
Support customizing vector databases with a parameter instead of extending agent class
Support RAG in AutoGen Studio
Support leveraging 3rd-party OSS tools
Make RAG a capability for any conversable agent
Support RAG as a tool like in OpenAI Assistant
Make vector db dependency optional
Keep the chat interface of the RAG agent the same as that of any other conversable agent
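One way to picture the goal of choosing a vector database with a parameter rather than by subclassing the agent is the sketch below. All names in it (`VectorDB`, `InMemoryKeywordDB`, `RagAgent`, `build_context`) are hypothetical and exist only for illustration; they are not the planned AutoGen interface.

```python
from __future__ import annotations

from typing import Protocol


class VectorDB(Protocol):
    """Hypothetical minimal interface a vector store would need to satisfy."""

    def add(self, doc_id: str, text: str) -> None: ...

    def query(self, text: str, top_k: int) -> list[str]: ...


class InMemoryKeywordDB:
    """Toy stand-in for a real vector db: ranks documents by shared words."""

    def __init__(self) -> None:
        self._docs: dict[str, str] = {}

    def add(self, doc_id: str, text: str) -> None:
        self._docs[doc_id] = text

    def query(self, text: str, top_k: int) -> list[str]:
        terms = set(text.lower().split())
        ranked = sorted(
            self._docs.values(),
            key=lambda doc: len(terms & set(doc.lower().split())),
            reverse=True,
        )
        return ranked[:top_k]


class RagAgent:
    """Hypothetical agent that receives its store as a constructor parameter."""

    def __init__(self, name: str, vector_db: VectorDB) -> None:
        self.name = name
        self.vector_db = vector_db

    def build_context(self, question: str, top_k: int = 1) -> str:
        """Join the top retrieved documents into a context string."""
        return "\n".join(self.vector_db.query(question, top_k))


db = InMemoryKeywordDB()
db.add("1", "pgvector adds vector search to PostgreSQL")
db.add("2", "Qdrant is a vector database")

# Swapping the backend means passing a different object, not a new agent class.
agent = RagAgent("rag", vector_db=db)
context = agent.build_context("what is qdrant")
```

The design point is that any object with matching `add` and `query` methods can be passed in, so supporting a new backend would not require a new agent subclass.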

Optional goals

Support async functions
Support benchmarks
Support evaluation

Tasks

[Bug]: Retrieve Agents not working with function calls #1469

bug rag
[Feature Request]: agent RAG #1387

enhancement rag
[Bug]: Function calling in groupchat does not work #1440

bug function/tool group chat
Fix issue 1440 by applying new function registration decorator #1661

function/tool rag
Refactor RAG agents with core functionalities #1726

4 of 4

rag
[Feature Request]: Support for different retrieval algorithms for RAG agents #1047

enhancement
https://github.com/microsoft/autogen/discussions/484
Automatically decide whether RAG is needed
Add a score threshold for retriever
Add a vectordb module #2263
Add html parser for RAG and some improvements #2271

rag
Add source to the answer for default prompt #2289
[Bug]: overlap parameter in the split_text_to_chunks not used. #1844

bug rag
ChromaDB required even if you use QdrantRetrieveUserProxyAgent and not Chroma #531

rag
How to add a new vector database? #725

rag
Autogen Retrieval User Proxy generate_init_message bug #859

rag
add RetrieveUserProxyAgent and RetrieveAssistantAgent from autogen studio #1723

enhancement rag studio
[Issue]: QdrantRetrieveUserProxyAgent is missing support for text-embedding-ada-002 embedding model #1282

enhancement rag
[Issue][RAG] Reuse an existing vector database for RetrieveUserProxyAgent #1261

rag
Support setting vector_db as a param #2313
Add documentation to user guide
Delete files from a collection?
Add better code parser
MongoDB + new abstraction of vectordb #2942

rag
Add RAG Agent to AutoGen Studio #2881

rag studio vectordb
Improve update context condition checking rule #2883

rag
Bugfix: PGVector/RAG - Calculate the Vector Size based on Model Dimensions #2865

rag vectordb
PGVector Support for Custom Connection Object #2566

rag vectordb
2447 fix pgvector tests and notebook #2455

bug rag vectordb
RetrieveUserProxyAgent, use context_max_tokens from retrieve_config if provided #2259

rag
Fix unstructured deps installation error #2248

rag
[Bug]: LlamaIndex checks failing #3046

bug

Knucklessg1 · 2024-03-01T03:59:33Z

Hello, I was wondering if there was support coming for pgvector, if not I would be happy to contribute.

julianakiseleva · 2024-03-01T17:18:54Z

@thinkall

thinkall · 2024-03-06T11:44:48Z

Hello, I was wondering if there was support coming for pgvector, if not I would be happy to contribute.

Hi @Knucklessg1 , contribution is welcome, thank you for your interest!

WaelKarkoub · 2024-03-19T23:03:20Z

Hi @thinkall, would this PR, #2046, help out with Automatically decide whether RAG is needed?

I was thinking if the agent adds a tag like <rag context="some context"> in the message, we can intercept that by one of the hooks or even a reply, and then perform some rag operations

thinkall · 2024-03-20T10:01:39Z

Hi @thinkall, would this PR, #2046, help out with Automatically decide whether RAG is needed?

I was thinking if the agent adds a tag like <rag context="some context"> in the message, we can intercept that by one of the hooks or even a reply, and then perform some rag operations

Thank you @WaelKarkoub, interesting idea! Would adding the <rag> tag mean RAG is already performed?

WaelKarkoub · 2024-03-20T10:33:56Z

@thinkall we could define what that tag means by adding attributes (e.g. <rag context="some context" task="search"> could mean it needs to look through some databases). I'm not fully familiar with how RAG works, but that tag system should be general enough for multiple use cases.
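The tag idea discussed above could look roughly like the following sketch. The function names and the `retrieve` callback are hypothetical, and the wiring into an agent hook or reply function is deliberately left out; this only shows detecting the tag, reading its attributes, and attaching retrieved text.

```python
import re

# Matches a tag such as <rag context="some context" task="search">.
# Assumption: attribute values are double-quoted and contain no ">" character.
RAG_TAG = re.compile(r"<rag\s+([^>]*)>")
ATTRIBUTE = re.compile(r'(\w+)="([^"]*)"')


def extract_rag_request(message: str):
    """Return the attributes of the first <rag ...> tag as a dict, or None."""
    match = RAG_TAG.search(message)
    if match is None:
        return None
    return dict(ATTRIBUTE.findall(match.group(1)))


def process_message(message: str, retrieve) -> str:
    """If the message carries a <rag> tag, strip it and append retrieved text."""
    request = extract_rag_request(message)
    if request is None:
        return message  # No tag, so no retrieval is performed.
    docs = retrieve(request.get("context", ""))
    cleaned = RAG_TAG.sub("", message).strip()
    return cleaned + "\n\nRetrieved context:\n" + "\n".join(docs)


message = 'Please summarize the release notes. <rag context="release notes" task="search">'
request = extract_rag_request(message)


def fake_retrieve(query: str):
    """Placeholder retriever (assumption) standing in for a real vector db query."""
    return ["doc about " + query]


result = process_message(message, fake_retrieve)
```

A function like `process_message` is the kind of callable that could be invoked from a message hook, so that retrieval runs only when the agent asked for it.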

ChristianWeyer · 2024-03-23T11:13:31Z

Great initiative @thinkall.
How much are you thinking of reusing what has already been built and proven in other frameworks, like LangChain?

thinkall · 2024-03-24T09:40:00Z

Great initiative @thinkall. How much are you thinking of reusing what has already been built and proven in other frameworks, like LangChain?

Thank you @ChristianWeyer , I hope to support as many established features in the OSS as possible. So we're carefully thinking of the new design of the rag feature in AutoGen. Would you like to share your thoughts?

ChristianWeyer · 2024-03-24T11:24:38Z

Great initiative @thinkall. How much are you thinking of reusing what has already been built and proven in other frameworks, like LangChain?

Thank you @ChristianWeyer , I hope to support as many established features in the OSS as possible. So we're carefully thinking of the new design of the rag feature in AutoGen. Would you like to share your thoughts?

One thing is that there are so many connector & retriever implementations in LangChain, that it would not make sense to reinvent the wheel and trying to keep up. Same goes for embedding support.

thinkall · 2024-03-25T08:54:36Z

Great initiative @thinkall. How much are you thinking of reusing what has already been built and proven in other frameworks, like LangChain?

Thank you @ChristianWeyer , I hope to support as many established features in the OSS as possible. So we're carefully thinking of the new design of the rag feature in AutoGen. Would you like to share your thoughts?

One thing is that there are so many connector & retriever implementations in LangChain, that it would not make sense to reinvent the wheel and trying to keep up. Same goes for embedding support.

Agree!

Would you like to have a quick chat on this? It would be great to hear more from you!

dsalas-crogl · 2024-03-25T18:07:23Z

@thinkall Will the upcoming RAG update still require using message_generator in groupchat scenarios? It's my understanding that currently, the RAG agent has to initiate chat and message_generator has to be used, which results in all initial prompt messages being sent through retrieve_docs in RetrieveUserProxyAgent.

ChristianWeyer · 2024-03-26T10:25:07Z

Great initiative @thinkall. How much are you thinking of reusing what has already been built and proven in other frameworks, like LangChain?

Thank you @ChristianWeyer , I hope to support as many established features in the OSS as possible. So we're carefully thinking of the new design of the rag feature in AutoGen. Would you like to share your thoughts?

One thing is that there are so many connector & retriever implementations in LangChain, that it would not make sense to reinvent the wheel and trying to keep up. Same goes for embedding support.

Agree!

Would you like to have a quick chat on this? It would be great to hear more from you!

Sure. I am cethewe in AG Discord.

thinkall · 2024-03-30T15:14:46Z

Hello, I was wondering if there was support coming for pgvector, if not I would be happy to contribute.

Hi @Knucklessg1 , contribution is welcome, thank you for your interest!

Hi @Knucklessg1 , are you in our Discord channel? Could we have a quick chat? Thanks.

thinkall · 2024-03-30T15:18:57Z

@thinkall Will the upcoming RAG update still require using message_generator in groupchat scenarios? It's my understanding that currently, the RAG agent has to initiate chat and message_generator has to be used, which results in all initial prompt messages being sent through retrieve_docs in RetrieveUserProxyAgent.

Hi @dsalas-crogl , I'd like to remove the usage of message_generator, would that benefit your use case? Thanks.

Are you in our Discord channel?

Knucklessg1 · 2024-03-30T15:24:23Z

Hello, I was wondering if there was support coming for pgvector, if not I would be happy to contribute.

Hi @Knucklessg1 , contribution is welcome, thank you for your interest!

Hi @Knucklessg1 , are you in our Discord channel? Could we have a quick chat? Thanks.

Yes absolutely. I reached out on Discord.

jamesliu · 2024-03-30T15:44:56Z

@thinkall any flow diagram regarding the rag?

thinkall · 2024-03-31T08:19:35Z

@thinkall any flow diagram regarding the rag?

Hi @jamesliu , there's one diagram here, you can find the workflow details in the Introduction section.

Josephrp · 2024-04-03T10:56:11Z

Interesting roadmap, and I'm very happy with chromadb. Now looking forward to an in-memory vector store too. If anyone is interested, it could be a good opportunity to collaborate and break down complex tasks.

I'll also consider creating and sharing an "advanced upsert" agent, which enriches the text chunks to improve retrieval performance.

raolak · 2024-04-08T08:34:16Z

Are RAG applications limited to document processing, or do they extend to code-related tasks as well? For instance:

Existing code on which we need to run fix? or
New codes (eg. new service) which goes through incremental development

Usecase:

In a system comprised of numerous microservices that are regularly updated with new features or fixes, how can an agent acquire knowledge about these changes? Is RAG a viable method for this?
For the development of new microservices within an existing system, how can knowledge be transferred to agents to enhance the design and implementation process using RAG?

thinkall · 2024-04-09T04:58:58Z

Are RAG applications limited to document processing, or do they extend to code-related tasks as well? For instance:

Existing code on which we need to run fix? or

New codes (eg. new service) which goes through incremental development

Usecase:

In a system comprised of numerous microservices that are regularly updated with new features or fixes, how can an agent acquire knowledge about these changes? Is RAG a viable method for this?

For the development of new microservices within an existing system, how can knowledge be transferred to agents to enhance the design and implementation process using RAG?

RAG can help if the documents are well organized. For pure code, currently, you can use 3rd party code chunk methods to help load the code into the vector dbs.

ekzhu · 2024-04-12T03:06:38Z

Let's also add a documentation task to the roadmap? We should have a rag category under https://microsoft.github.io/autogen/docs/topics

ChristianWeyer · 2024-04-12T07:22:31Z

Do we also have a task on the roadmap for using custom embeddings? This is a very vital and important requirement for successful RAG.
Another would be to use a re-ranking model optionally to improve RAG results. @thinkall

thinkall · 2024-04-12T13:19:17Z

Do we also have a task on the roadmap for using custom embeddings? This is a very vital and important requirement for successful RAG. Another would be to use a re-ranking model optionally to improve RAG results. @thinkall

Custom embeddings are already supported and will also be supported in the new version.

Re-ranking may also be supported, but we may not implement the algorithms ourselves; instead, we could support plugging in different re-ranking models.
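A plug-in style re-ranker, as suggested above, might be modeled as a plain callable that reorders retrieved candidates. The sketch below is an assumption about what such an extension point could look like; `Reranker`, `keyword_overlap_reranker`, and `retrieve_with_rerank` are hypothetical names, and the keyword-overlap scoring is a toy substitute for a real re-ranking model.

```python
from typing import Callable, List, Optional

# A re-ranker takes the query and the candidate texts and returns them reordered.
Reranker = Callable[[str, List[str]], List[str]]


def keyword_overlap_reranker(query: str, candidates: List[str]) -> List[str]:
    """Toy re-ranker: order candidates by how many query words they share."""
    terms = set(query.lower().split())
    return sorted(
        candidates,
        key=lambda text: len(terms & set(text.lower().split())),
        reverse=True,
    )


def retrieve_with_rerank(
    query: str,
    candidates: List[str],
    reranker: Optional[Reranker] = None,
    top_k: int = 2,
) -> List[str]:
    """Apply the optional re-ranker, then keep the first `top_k` candidates."""
    ranked = reranker(query, candidates) if reranker else list(candidates)
    return ranked[:top_k]


candidates = [
    "How to install AutoGen",
    "Re-ranking models improve retrieval quality",
    "Weather report for Monday",
]
query = "which models improve retrieval"

baseline = retrieve_with_rerank(query, candidates)
reranked = retrieve_with_rerank(query, candidates, reranker=keyword_overlap_reranker)
```

Because the re-ranker is just a callable, a cross-encoder or a hosted re-ranking service could be dropped in without touching the retrieval code.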

maximedupre · 2024-04-12T17:24:40Z

Are RAG applications limited to document processing, or do they extend to code-related tasks as well? For instance:

Existing code on which we need to run fix? or

New codes (eg. new service) which goes through incremental development

Usecase:

In a system comprised of numerous microservices that are regularly updated with new features or fixes, how can an agent acquire knowledge about these changes? Is RAG a viable method for this?

For the development of new microservices within an existing system, how can knowledge be transferred to agents to enhance the design and implementation process using RAG?

RAG can help if the documents are well organized. For pure code, currently, you can use 3rd party code chunk methods to help load the code into the vector dbs.

@thinkall Is it possible for you to provide an example of a 3rd party code chunk method? I'm very interested in extending the knowledge of an agent to my whole codebase :)

raolak · 2024-04-13T05:47:28Z

I’ve got some thoughts on how we could use agents to automate code generation.

+1 for Maxime: it would be really helpful to include a section on code chunking with examples to illustrate how it works.

Now, when it comes to generating code, agents shouldn't just spit out code; they should mimic the way engineers think and work in real life. Here’s what I mean:

- Start by clearly defining the requirements: think OKRs that cascade from key results down to epics and stories.
- Set up milestones and break down tasks among the agents involved.
- Have each agent carry out their tasks and check in regularly to ensure everything’s on track, with room for human oversight when needed.
- Provide visibility of OKRs, milestones, and tasks in one place. Make a planner central for agent and human collaboration and progress tracking.
- Keep the agent's execution isolated (in a separate process). Maybe a distributed workflow where each node can host one or more agents. The workflow is orchestrated through a central planner contributed to by agents and humans.

For the code workflow, we could see something like this:

- Set up a new GitHub project with a clean, well-structured setup and isolated code (for new projects).
- Handle resource creation, both on-premises and in the cloud.
- Back it all up with a CI/CD system tailored for both on-premises and cloud environments.
- Support incremental code commits through PRs with CI/CD.

These steps would be helpful both for rolling out new features or fixes to existing projects and for starting fresh ones.

I managed to get a mini reference implementation of a distributed key-value store (80%) using ChatGPT (GPT-4) and was able to build, test, and run the services locally (screenshot attached). I was experimenting with AutoGen to reproduce the steps that I followed and see if I can achieve a decent level of autonomy (I am sure it will take many iterations :) ). I am still learning and experimenting. I will share my findings as I make progress.

[image: Screenshot 2024-04-13 at 10.37.18 AM.png]

Thanks for all the great work and support.

Regards
lnr

cforce · 2024-04-13T09:58:39Z

@raolak

A solution for this has just been released.
Check out https://github.com/princeton-nlp/SWE-agent

thinkall · 2024-04-17T11:32:48Z

@thinkall Is it possible for you to provide an example of a 3rd party code chunk method? I'm very interested in extending the knowledge of an agent to my whole codebase :)

Hi @maximedupre , please check out an example of using 3rd party chunk method here: https://microsoft.github.io/autogen/blog/2023/10/18/RetrieveChat/#customizing-text-split-function
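As a rough illustration of what a code-aware chunk method could do, the sketch below splits Python source into one chunk per top-level function or class using only the standard library. It is not the method from the linked blog post; `split_python_source` is a hypothetical name, and how such a function is plugged into the retrieve agent should be checked against the section linked above.

```python
from __future__ import annotations

import ast


def split_python_source(text: str) -> list[str]:
    """Split Python source into one chunk per top-level function or class.

    Top-level statements that are not definitions (imports, constants) are
    skipped in this sketch; a fuller splitter might group them into a chunk.
    """
    tree = ast.parse(text)
    chunks = []
    for node in tree.body:
        if isinstance(node, (ast.FunctionDef, ast.AsyncFunctionDef, ast.ClassDef)):
            segment = ast.get_source_segment(text, node)
            if segment:
                chunks.append(segment)
    return chunks


sample = '''
import os

def add(a, b):
    return a + b

class Greeter:
    def hello(self):
        return "hi"
'''

chunks = split_python_source(sample)
```

Chunking along definition boundaries keeps each function or class intact, which tends to give the retriever more meaningful units than fixed-size character windows.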

thinkall self-assigned this Feb 13, 2024

thinkall added the rag retrieve-augmented generative agents label Feb 13, 2024

sonichi added the roadmap Issues related to roadmap of AutoGen label Feb 13, 2024

ekzhu mentioned this issue Feb 13, 2024

Fix various failure modes when ingesting websites #1496

Open

3 tasks

thinkall mentioned this issue Feb 20, 2024

add RetrieveUserProxyAgent and RetrieveAssistantAgent from autogen studio #1723

Open

rickyloynd-microsoft mentioned this issue Feb 22, 2024

New rag agent #1727

Closed

7 tasks

thinkall mentioned this issue Feb 24, 2024

[Bug]: RetrieveUserProxyAgent ignores customized_prompt #1743

Closed

thinkall mentioned this issue Mar 8, 2024

[Feature Request]: Add threshold to retrieval config #1830

Closed

jackgerrits assigned qingyun-wu Mar 18, 2024

jackgerrits added the in-progress Roadmap is actively being worked on label Mar 18, 2024

jackgerrits changed the title ~~Roadmap for RAG~~ [Roadmap] RAG Mar 18, 2024

thinkall mentioned this issue Apr 3, 2024

Add a vectordb module #2263

Merged

3 tasks

thinkall mentioned this issue Apr 7, 2024

Support setting vector_db as a param #2313

Merged

3 tasks

randombet self-assigned this Jul 2, 2024
