Getting Started and Documentation Clarification #74

Closed
JpEncausse opened this issue Jun 10, 2024 · 2 comments

JpEncausse commented Jun 10, 2024

Hello,
I'd like to use embedJs to handle the RAG part of a larger application, but I'm not sure if or how it will fit.

Workflow

  • I get a user query to my application
  • I handle the LLM part myself (based on Azure OpenAI GPT-4o)
  • I expose a tool that is the RAG application (among other tools)
  • The RAG tool converts the query into an embedding
  • It then runs cosine similarity against its data, which was previously chunked and embedded (see the sketch after this list)
  • It then returns the most promising chunks and sources.
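
For reference, the retrieval step described above boils down to a cosine comparison between the query embedding and each stored chunk embedding. A minimal sketch, illustrative only (not embedJs code; the chunk data and embeddings below are made up):

```ts
// Cosine similarity between two embedding vectors
function cosineSimilarity(a: number[], b: number[]): number {
  let dot = 0;
  let normA = 0;
  let normB = 0;
  for (let i = 0; i < a.length; i++) {
    dot += a[i] * b[i];
    normA += a[i] * a[i];
    normB += b[i] * b[i];
  }
  return dot / (Math.sqrt(normA) * Math.sqrt(normB));
}

// Hypothetical pre-computed chunk embeddings and a query embedding
const chunks = [
  { text: 'chunk A', source: 'doc1.md', embedding: [0.1, 0.3, 0.5] },
  { text: 'chunk B', source: 'doc2.md', embedding: [0.9, 0.1, 0.2] },
];
const queryEmbedding = [0.2, 0.3, 0.4];

// Return the most promising chunks and sources, best match first
const ranked = chunks
  .map((c) => ({ ...c, score: cosineSimilarity(queryEmbedding, c.embedding) }))
  .sort((a, b) => b.score - a.score);
console.log(ranked);
```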

Questions

  1. Can I use embedJs to handle only the RAG part, or is that a bad idea? I don't want it to manage the full LLM flow (query, history, ...), only to store and retrieve data from a database. Is it possible to get a simple example?
  2. Can I provide a config object or parameter when I instantiate the model instead of using ENV variables?
  3. Can I use the Azure OpenAI v3 Large model for embeddings?

The confusing part of the documentation is that the workflow of requests made to the LLM is not explained. Since each request costs something, it is important to understand the magic underneath.

@adhityan
Collaborator

Hello; this is possible today (with a caveat):

  1. You instantiate the embedJs library with the embedding model and vector database of your choice
  2. Use the built-in loaders to add the content you want, or write custom loaders for proprietary data that is not yet supported
  3. Call the getContext method on the RAGApplication instance - this method retrieves the relevant embeddings, filters them, and returns the results. Note: no LLM calls are made anywhere up to this point.

Caveat - today the library expects you to always instantiate an LLM as well. You don't need actual credentials; just use the HuggingFace model and give it any dummy credentials (or use any other LLM model). The credentials are irrelevant, as no LLM calls are made in the steps above. LLM calls only happen when you call the query method.
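
A minimal sketch of that flow, assuming the builder-style API from the embedJs README; the class names, import paths, and constructor options below are assumptions and may differ in your installed version:

```ts
import { RAGApplicationBuilder, TextLoader } from '@llm-tools/embedjs';
import { LanceDb } from '@llm-tools/embedjs/vectorDb/lance'; // assumed import path

const ragApplication = await new RAGApplicationBuilder()
  // 1. Pick the vector database (and embedding model) of your choice
  .setVectorDb(new LanceDb({ path: './vector-db' }))
  // Per the caveat above: an LLM still has to be configured here (e.g. a
  // HuggingFace model with dummy credentials); it is never called below.
  // 2. Add content via built-in or custom loaders
  .addLoader(new TextLoader({ text: 'Some proprietary content to index...' }))
  .build();

// 3. Retrieval only: returns the relevant chunks and their sources.
// No LLM call is made here; that only happens with query().
const context = await ragApplication.getContext('user query from your app');
console.log(context);
```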

That said - your use case is very interesting and frankly something the library should absolutely support. In the next release, I will remove the need to add an LLM, so the library can be used as a pure embedding store. After the next release, you won't need to concern yourself with the caveat above.

@adhityan
Collaborator

In the latest version, 0.0.88, you can turn off the initialization of any LLM by passing "NO_MODEL" as the parameter to the setModel method.
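
A minimal sketch of that usage; the vector database class and import paths are assumptions from the earlier example, only setModel('NO_MODEL') is the relevant part:

```ts
import { RAGApplicationBuilder } from '@llm-tools/embedjs';
import { LanceDb } from '@llm-tools/embedjs/vectorDb/lance'; // assumed import path

const ragApplication = await new RAGApplicationBuilder()
  .setModel('NO_MODEL') // skip LLM initialization entirely (0.0.88+)
  .setVectorDb(new LanceDb({ path: './vector-db' }))
  .build();

// getContext() works for pure retrieval without any LLM configured
const context = await ragApplication.getContext('user query');
console.log(context);
```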
