Skip to content

Simple, local-first, retrieval-augmented generation without tears (or API keys). Powered by MLX and AlphaMonarch.

License

Notifications You must be signed in to change notification settings

kgourgou/raggler

Repository files navigation

raggler

Point this at your files and enjoy some simple RAG (retrieval augmented generation). I mainly built this to quickly send questions to my obsidian vault, but it should work for any plain text files / markdown. Raggler will also try to chunk your PDFs.

See what this thing can do

The code currently uses MLX language models, so you will need Apple Silicon (M1, M1 Pro, etc.). However, it should be simple to change the code to use other language models.

Make a virtualenv (optional, but recommended), clone this repo, navigate to the cloned directory, and then run:

pip install .
python3 raggler.py 'Give me a geometry problem and then suggest a variation of it.' --files "tests/fake_files/" --ctx --rfr

It will take a while to download the LLM, but it will be cached for future use by the HF library.

Installing

Make a virtual environment first. Clone this repo, navigate to the cloned directory, then install the package in editable mode with:

pip install -e . 

or, if you have the just command runner installed,

just install

To get the dev requirements, run

pip install -r dev_requirements.txt

or, if you are impatient and just want to install everything,

just install-dev

Usage

Point-and-Rag in CLI

Raggler is mostly a "point at your files and rag" library.

If you have all of your files in the same directory, do something like this:

export RAGGLER_DIR=/path/to/your/files
python raggler.py 'A query for your files' --refresh_index 

You can also store RAGGLER_DIR in a local .env file within the project directory.

echo "RAGGLER_DIR=/path/to/your/files" > .env

The first time you run raggler.py, it will take a while to index your files. Your index will also be saved locally as a pickle file for fast retrieval under data/.

A few pointers:

  1. You don't need to refresh the index every time (except if your files have changed), but it also won't happen automatically when the files change.
  2. You can use the --show_context flag to see the context of the answer.

As a library

You can also use raggler as a library; see notebooks/.

I've tested it on a 16Gb M1 Pro with a few hundred files and python 3.11 and it works OK.

Acknowledgements

  • All chunking is possible thanks to LangChain.
  • Hugging Face for hosting language models and embedders.
  • The MLX team and community for allowing for fast inference on Apple Silicon.
  • Maxime Labonne for creating the AlphaMonarch model which handles the query-answering part of the pipeline.
  • Chat-with-MLX and MLX-RAG for the inspiration.

About

Simple, local-first, retrieval-augmented generation without tears (or API keys). Powered by MLX and AlphaMonarch.

Resources

License

Stars

Watchers

Forks