KeypartX is a graph-based approach to represent perception (text in general) by key parts of speech.

pengKiina/KeypartX

KeypartX

  • No more Topic Modeling
  • No Training Needed
  • No Machine Learning, Just Human-like Reading
  • Get Insights from Text, Big and Small

KeypartX is a graph-based approach to representing perception (text in general) by its key parts of speech. It targets the coherence problem that current topic modeling algorithms try to solve but fail to. Instead of returning a meaningless combination of words, as topic modeling does, KeypartX extracts topics from a text corpus syntactically, semantically and pragmatically.

Key Parts: Noun, Adjective, Verb and Emoji

KeypartX vs. topic modeling results for the following text:

“Thai food was great we loved it. Thiland also has beautiful beach resorts, we will come to Thailand again👍”

  • KeypartX Result

  • Topic Modeling Result

['food','thailand','resort','great','love', 'beautiful']
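To make the contrast concrete, here is a minimal sketch of the general idea. It is not KeypartX's implementation or API. It assumes the text has already been lemmatized and POS-tagged (the tags below are written by hand; a tagger such as spaCy would normally supply them) and links key parts that occur in the same sentence. The names `KEY_POS`, `SENTENCES` and `key_part_edges`, the same-sentence linking rule, and the choice to count proper nouns as nouns are all illustrative assumptions.

```python
from itertools import combinations

# Tags treated as key parts. Counting PROPN as a noun and giving emoji
# their own tag are assumptions made for this sketch.
KEY_POS = {"NOUN", "PROPN", "ADJ", "VERB", "EMOJI"}

# Hand-tagged (lemma, tag) pairs for the example text above.
SENTENCES = [
    [("thai", "ADJ"), ("food", "NOUN"), ("be", "AUX"), ("great", "ADJ"),
     ("we", "PRON"), ("love", "VERB"), ("it", "PRON")],
    [("thailand", "PROPN"), ("also", "ADV"), ("have", "VERB"),
     ("beautiful", "ADJ"), ("beach", "NOUN"), ("resort", "NOUN"),
     ("we", "PRON"), ("will", "AUX"), ("come", "VERB"), ("to", "ADP"),
     ("thailand", "PROPN"), ("again", "ADV"), ("👍", "EMOJI")],
]


def key_part_edges(sentences):
    """Return undirected edges between key parts sharing a sentence."""
    edges = set()
    for sent in sentences:
        # Keep only key parts, de-duplicated and sorted so each pair
        # is stored in one canonical order.
        parts = sorted({lemma for lemma, tag in sent if tag in KEY_POS})
        edges.update(combinations(parts, 2))
    return edges


edges = key_part_edges(SENTENCES)

# A bag of words keeps the same vocabulary but drops every relationship.
bag_of_words = sorted({lemma for sent in SENTENCES
                       for lemma, tag in sent if tag in KEY_POS})
```

In this sketch `("food", "great")` is an edge because both words occur in the first sentence, while `("food", "resort")` is not, since they never share a sentence. The flat word list cannot express that distinction.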

Installation

If you need coreference resolution (coreferee):
 pip install keypartx[coreferee_spacy]
 # pip install keypartx[crosslingual-coreference_spacy]  # an alternative coreference package
 python3 -m coreferee install en
 python -m spacy download en_core_web_lg

Otherwise:
 pip install spacy
 pip install keypartx
 python -m spacy download en_core_web_lg
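After installing, a quick way to confirm that everything is importable is to ask Python whether it can locate each package. This is a minimal sketch using only the standard library; it assumes the import names match the package names used in the commands above (`spacy`, `keypartx`, `en_core_web_lg`, `coreferee`), and `check_install` is a hypothetical helper, not part of KeypartX.

```python
from importlib.util import find_spec

# Import names assumed to match the packages installed above.
REQUIRED = ["spacy", "keypartx", "en_core_web_lg"]
OPTIONAL = ["coreferee"]  # only needed for coreference resolution


def check_install(names):
    """Map each top-level module name to True if Python can locate it."""
    return {name: find_spec(name) is not None for name in names}


if __name__ == "__main__":
    for name, found in check_install(REQUIRED + OPTIONAL).items():
        print(f"{name}: {'found' if found else 'missing'}")
```

A `missing` entry under the required names usually means the corresponding install command above has not been run in the active environment.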

Getting Started

For an in-depth overview of KeypartX's features, see the documentation, or follow along with one of the examples below:

Name | Link
KeypartX Quick Start | Open In Colab
KeypartX with Real Example | Open In Colab
KeypartX VS Topic Modelling | Open In Colab
KeypartX Network Comparison | Open In Colab

Visualization Examples

  • 1 NLP Target

Original sentence: """Thai food was great,delicousr and not expensive, we loved it. We visited 3 beach resorts, they are higly recommened... We had "Fire-Vodka" !!!"""

  • 2 Keyparts Wordclouds

The following wordclouds are generated from a real corpus of reviews written by visitors to Thailand.

  • 3 Community and Gray Perceptual Unit Networks

Citation

To cite the KeypartX paper, please use the following BibTeX reference:

@article{pengyang2022keypartx,
  title={KeypartX: Graph-based Perception (Text) Representation},
  author={Peng, Yang},
  journal={arXiv preprint arXiv:2209.11844},
  year={2022}
}