A very simple framework for state-of-the-art Natural Language Processing (NLP)
Updated Jul 2, 2024 - Python
Deep learning for natural language processing
Topic Modelling for Humans
The code powering searchthearxiv.com, a simple semantic search engine for more than 300,000 ML papers on arXiv.
Resume Matcher is an open source, free tool to improve your resume. It works by using language models to compare and rank resumes with job descriptions.
DSC 214 Topological Data Science Project
Multi-Relational Hyperbolic Word Embeddings from Natural Language Definitions
Naive RAG implementation using LangChain + OpenAI GPT 3.5 + Sentence_Transformer + FAISS
Improving Word Translation via Two-Stage Contrastive Learning (ACL 2022). Keywords: Bilingual Lexicon Induction, Word Translation, Cross-Lingual Word Embeddings.
A Fast, Adaptive, Stable, and Transferable Topic Model
Front end to Greek and Latin corpora: searching, browsing, concordances, texts, dictionaries, parsing
This repository provides a complete workflow for text processing using Hugging Face Transformers and NLTK. It includes modules for sentence normalization, spelling correction, word embedding generation, positional encoding computation, and English-to-French translation.
Toolkit to obtain and preprocess German text corpora, train models, and evaluate them with generated test sets. Built with Gensim and TensorFlow.
Code implementation for our DAS 2020 paper, "Fused Text Recogniser and Deep Embeddings Improve Word Recognition and Retrieval"
This repository contains download links to pretrained static word embeddings (word2vec, fastText) in Filipino.
Algorithmic solvers for popular NYT word puzzles
The repository contains notebooks for collecting and preprocessing a corpus of diary entries, and for experiments on models that predict the gender and age group of authors and the time period in which a text was written.
Developed a deep learning model in TensorFlow to automate the classification of financial documents. A bidirectional LSTM RNN categorizes the documents, and a user-friendly Streamlit application, deployed on the Hugging Face platform, supports accurate and efficient document management.
An approach exploring and assessing literature-based doc-2-doc recommendations using word2vec combined with doc2vec, and applying it to TREC and RELISH datasets