Implement a search system that allows employees to quickly find information within the company's knowledge base (in the form of PDFs you can upload).

kaylamarietorres/KnowledgeBaseSearch

Internal Knowledge Base Search

Goal:

Implement a search system that allows employees to quickly find information within a company’s knowledge base. This is useful when you have a large volume of local text data and want focused, targeted answers from it: for example, a collection of books, documents, and (potentially) videos that you want to query through a chatbot.

Features:

  • Advanced search capabilities with natural language queries.
  • Retrieval-Augmented Generation (RAG)
    • Question-answering functionality that grounds direct answers in the retrieved text to reduce hallucinations.
  • Filtering and sorting options to refine search results.
  • Highlighting exact portions of the document where the answer is found to improve user experience.
  • Summarizing documents
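
As an illustration of the highlighting feature listed above, here is a minimal sketch in plain Python. The function name `highlight` and the `**` marker are assumptions made for this example; the project may well mark answers differently (for instance with character offsets returned by the reader model).

```python
import re


def highlight(text: str, answer: str, marker: str = "**") -> str:
    """Wrap every case-insensitive occurrence of `answer` in `text` with `marker`."""
    if not answer:
        # Nothing to highlight; return the passage unchanged.
        return text
    # re.escape keeps characters such as "." from being treated as regex syntax.
    pattern = re.compile(re.escape(answer), re.IGNORECASE)
    # Re-insert the matched text itself so the original casing is preserved.
    return pattern.sub(lambda m: f"{marker}{m.group(0)}{marker}", text)


passage = "Employees accrue 1.5 vacation days per month."
print(highlight(passage, "1.5 vacation days"))
```

In a UI such as Streamlit, the marker could be swapped for whatever emphasis syntax the front end renders.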

Fullstack Architecture

[Image: fullstack architecture diagram]

Tools

Haystack by deepset

Haystack is an open-source framework for building production-ready LLM applications, RAG pipelines and state-of-the-art search systems that work intelligently over large document collections. It lets you quickly try out the latest AI models while being flexible and easy to use.

Some examples of what you can build include:

  • Advanced RAG on your own data source, powered by the latest retrieval and generation techniques.
  • Chatbots and agents powered by cutting-edge generative models like GPT-4, that can even call external functions and services.
  • Generative multi-modal question answering on a knowledge base containing mixed types of information: images, text, audio, and tables.
  • Information extraction from documents to populate your database or build a knowledge graph.
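
To make the RAG idea concrete, the sketch below shows the two steps such a pipeline performs: retrieve the most relevant passages, then build a prompt that restricts the model to them. It is deliberately framework-free and does not use Haystack's API. The helper names (`tokenize`, `retrieve`, `build_prompt`), the sample documents, and the word-overlap scoring are all assumptions chosen for illustration; a real pipeline would typically use BM25 or embedding retrieval.

```python
import re


def tokenize(text: str) -> set[str]:
    """Lowercase the text and return its set of alphanumeric words."""
    return set(re.findall(r"[a-z0-9]+", text.lower()))


def retrieve(query: str, documents: list[str], top_k: int = 2) -> list[str]:
    """Rank documents by how many words they share with the query."""
    query_terms = tokenize(query)
    scored = [(len(query_terms & tokenize(doc)), doc) for doc in documents]
    # Drop documents with no overlap, then sort best-first (stable for ties).
    scored = [pair for pair in scored if pair[0] > 0]
    scored.sort(key=lambda pair: pair[0], reverse=True)
    return [doc for _, doc in scored[:top_k]]


def build_prompt(query: str, context: list[str]) -> str:
    """Assemble a prompt that asks the model to answer from the context only."""
    context_block = "\n".join(f"- {passage}" for passage in context)
    return (
        "Answer the question using only the context below.\n"
        f"Context:\n{context_block}\n"
        f"Question: {query}\n"
    )


documents = [
    "Vacation requests go through the HR portal.",
    "The VPN requires two-factor authentication.",
    "Expense reports are due monthly.",
]
question = "How do I submit vacation requests?"
prompt = build_prompt(question, retrieve(question, documents, top_k=1))
print(prompt)
```

The resulting prompt would then be passed to a generative model; that call is omitted here because the model choice is not specified in this README.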

Elasticsearch

Elasticsearch is a distributed search and analytics engine designed for fast and efficient data retrieval and analysis. In this project we will be using it as a vector database to create, store, and search vector embeddings.
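
For readers new to vector search, the following sketch shows what "searching vector embeddings" amounts to: ranking stored vectors by cosine similarity to a query vector. It is a toy, in-memory stand-in and not Elasticsearch's API. The three-dimensional vectors and document IDs are invented for the example; real embeddings come from a model and have hundreds of dimensions.

```python
import math


def cosine_similarity(a: list[float], b: list[float]) -> float:
    """Return the cosine of the angle between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0 or norm_b == 0:
        # A zero vector has no direction; treat it as dissimilar to everything.
        return 0.0
    return dot / (norm_a * norm_b)


def search(query_vector: list[float], index: dict[str, list[float]], top_k: int = 1) -> list[str]:
    """Return the IDs of the `top_k` stored vectors most similar to the query."""
    ranked = sorted(
        index.items(),
        key=lambda item: cosine_similarity(query_vector, item[1]),
        reverse=True,
    )
    return [doc_id for doc_id, _ in ranked[:top_k]]


# Hypothetical document embeddings keyed by document ID.
index = {
    "vacation-policy": [0.9, 0.1, 0.0],
    "vpn-setup": [0.0, 0.2, 0.9],
    "expenses": [0.1, 0.9, 0.1],
}
print(search([0.8, 0.2, 0.0], index))
```

A vector database performs the same ranking but uses approximate nearest-neighbor indexes so that it scales beyond a linear scan.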

REST API

A REST API handles communication between the client and the server. It is simple and standardized, and this project secures it with OAuth.
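
As a rough sketch of what an OAuth-protected call could look like from the client side, the example below builds (but does not send) a request carrying a bearer token. The base URL, the `/search` path, the JSON body shape, and the token value are all hypothetical; this README does not define the actual endpoints or the OAuth flow used to obtain the token.

```python
import json
import urllib.request


def build_search_request(base_url: str, query: str, access_token: str) -> urllib.request.Request:
    """Build a POST request that carries an OAuth 2.0 bearer token."""
    body = json.dumps({"query": query}).encode("utf-8")
    return urllib.request.Request(
        url=f"{base_url}/search",  # hypothetical endpoint path
        data=body,
        method="POST",
        headers={
            # Bearer tokens are sent in the Authorization header.
            "Authorization": f"Bearer {access_token}",
            "Content-Type": "application/json",
        },
    )


# Placeholder values; a real token would come from the OAuth provider.
request = build_search_request("http://localhost:8000", "vacation policy", "example-token")
print(request.get_method(), request.full_url)
```

Sending the request would be a matter of passing it to `urllib.request.urlopen`, which is left out here so the example runs without a server.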

Hugging Face Transformers

TBD

Kibana

TBD

Installation

Haystack-Elasticsearch

Haystack-Elasticsearch Core Integration

TODO: add the prerequisites for installing Python and pip.

pip install --upgrade pip

pip install streamlit elasticsearch

pip install -U haystack-ai

pip install elasticsearch-haystack

pip install docker

docker compose up

Contributing

License
