The Multi-Modal Search Engine is a cutting-edge project that integrates OpenAI's CLIP model into a user-friendly web interface. With intuitive search functionality and seamless integration of text queries to retrieve relevant images, this project demonstrates the potential of multi-modal search systems.


Multimodal Search Engine

This project is a Multi-Modal Search Engine built on OpenAI's CLIP model, with a Flask API backend and an HTML/CSS frontend web application.

Introduction

This project provides a seamless web interface where users input text queries and the system retrieves the most relevant images for that textual description, using the CLIP architecture (read the paper).

Take a look

(Four screenshots of the web interface, captured 2024-04-10.)

Demo Video

Watch the YouTube video

  • This video demonstrates how to use our project's main feature.

How to use with your own images

  • Sample data of 130 images is included in the repository, or:
  • See the video, or:
  • Place your images in src/minidata
  • Run the notebook src/image-processor
  • Move the contents of src/image_embeddings to flaskapp/image_embeddings and the contents of src/minidata to flaskapp/static respectively (caution: transfer the files, not the directories themselves)
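The last step moves the files inside each folder while leaving the destination directories in place. A minimal Python sketch of that step (the helper name `move_contents` is ours; the paths come from the list above):

```python
import shutil
from pathlib import Path

def move_contents(src_dir: str, dst_dir: str) -> None:
    """Move every file inside src_dir into dst_dir, keeping both directories intact."""
    src, dst = Path(src_dir), Path(dst_dir)
    dst.mkdir(parents=True, exist_ok=True)
    for item in src.iterdir():
        # shutil.move transfers each file, not the directory itself.
        shutil.move(str(item), str(dst / item.name))

# Hypothetical usage matching the steps above:
# move_contents("src/image_embeddings", "flaskapp/image_embeddings")
# move_contents("src/minidata", "flaskapp/static")
```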

Features

  • Multi-Modal Search: Users can input textual descriptions of images to retrieve relevant images.
  • Intuitive Web Interface: The frontend is built using HTML/CSS to provide a user-friendly experience.
  • Scalable Backend: Flask API serves as the backend, handling requests and interacting with the CLIP model.
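Conceptually, the search ranks the precomputed CLIP image embeddings by cosine similarity against the query's text embedding. A minimal sketch of that ranking step, assuming the embeddings were already produced by CLIP's encoders (random vectors stand in for real CLIP outputs here):

```python
import numpy as np

def rank_images(text_emb: np.ndarray, image_embs: np.ndarray, top_k: int = 5):
    """Return indices of the top_k images most similar to the text embedding."""
    # Normalize both sides so the dot product equals cosine similarity.
    text_emb = text_emb / np.linalg.norm(text_emb)
    image_embs = image_embs / np.linalg.norm(image_embs, axis=1, keepdims=True)
    sims = image_embs @ text_emb
    return np.argsort(-sims)[:top_k]

# Stand-in data: 130 images with 512-dim embeddings (CLIP ViT-B/32's width).
rng = np.random.default_rng(0)
image_embs = rng.standard_normal((130, 512))
query_emb = rng.standard_normal(512)
top = rank_images(query_emb, image_embs, top_k=5)
```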

Clone the repository:

git clone https://github.com/ahmedembeddedx/Multi-Modal_Search_Engine.git

Usage

Start the backend server:

cd flaskapp/
flask run

Access the web application in your browser at http://127.0.0.1:5000/.
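The backend can be as small as a single route that ranks the stored embeddings and returns the best-matching image filenames. A hedged sketch (the `/search` route, JSON shape, and `create_app` factory are our assumptions for illustration, not the repository's actual code; in the real app the query text would first be encoded with CLIP's text encoder):

```python
import numpy as np
from flask import Flask, request, jsonify

def create_app(image_embs: np.ndarray, filenames: list[str]) -> Flask:
    """Build a tiny Flask app that ranks stored image embeddings for a query vector."""
    app = Flask(__name__)

    @app.route("/search", methods=["POST"])
    def search():
        # Assumption: the client sends a precomputed query embedding as JSON.
        q = np.asarray(request.get_json()["embedding"], dtype=float)
        q = q / np.linalg.norm(q)
        embs = image_embs / np.linalg.norm(image_embs, axis=1, keepdims=True)
        order = np.argsort(-(embs @ q))[:5]
        return jsonify([filenames[i] for i in order])

    return app
```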

Tech Stack

  • OpenAI for developing CLIP.
  • Flask for the backend framework.

Future Plans

  • Shift the app to ReactJS
  • Use ImageBind by Meta AI
  • More accurate model evaluation
  • Integrate audio & video functionality
