LexiExp -- Free open source sentiment lexicon expansion script

Please read the license agreement. LexiExp is licensed under ASL 2.0 and other lenient licenses, which allow its use for both academic and commercial purposes.

LexiExp is a tool for expanding an existing sentiment seed lexicon. It also estimates the polarity of the expanded lexicon using a statistical co-occurrence calculation. LexiExp is based on semantic similarity, following the JoBimText project.
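
The README does not spell out the co-occurrence calculation, so the following Python sketch only illustrates the general idea and should not be read as LexiExp's actual method. It assumes each seed word has already been expanded into a list of similar terms. The `expansions` mapping, the function name `estimate_polarity`, and the numeric +1/-1 labels are all hypothetical. A new term is scored by how often it co-occurs with positive versus negative seeds in those lists.

```python
from collections import defaultdict


def estimate_polarity(seed_polarity, expansions):
    """Score each new term in [-1, 1] from the seeds whose expansion lists contain it.

    seed_polarity: dict word -> +1 (positive) or -1 (negative)  (assumed encoding)
    expansions:    dict seed word -> list of similar terms      (hypothetical input)
    """
    votes = defaultdict(list)
    for seed, similar_terms in expansions.items():
        for term in set(similar_terms):      # count each seed at most once per term
            if term not in seed_polarity:    # leave the original seed labels untouched
                votes[term].append(seed_polarity[seed])
    # Mean of the votes: +1 if only positive seeds list the term, -1 if only negative ones do.
    return {term: sum(v) / len(v) for term, v in votes.items()}


seeds = {"good": 1, "bad": -1}
expanded = {
    "good": ["great", "nice", "decent"],
    "bad": ["awful", "decent"],
}
scores = estimate_polarity(seeds, expanded)
print(scores["great"], scores["awful"], scores["decent"])  # 1.0 -1.0 0.0
```

A term listed under seeds of both polarities, such as `decent` here, ends up near zero, which is one simple way to flag ambiguous expansions.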

How to Use LexiExp from Command Line

  1. Download: The source code is available under source code, or you can simply use the executable jar file.

  2. Command Line & Input Parameters:

    $ java -jar LexiExp0.0.1.jar -s <string> [-e <int>] [-db <string>] -o <string>

    -h help

    -db,--database Database/model name (DEFAULT: reviewsTrigram). The default model is for English lexicons. Available languages and their corresponding models:

    • English: reviewsTrigram, wikipediaTrigram, twitter2012Bigram, trigram
    • German: germanTrigram, twitterDETrigram
    • Dutch: dutchTrigram
    • French: frenchTrigram
    • Spanish: spanishTrigram
    • Bengali: bengaliBigram
    • Hindi: hindiBigram, hindiTrigram
    • Arabic: arabicTrigram
    • Turkish: turkishTrigram
    • Hebrew: hebrewTrigram
    • Russian: russianTrigram

    -e,--expansion Number of expansions (DEFAULT: 10)

    -o,--output Output file name (DEFAULT: out_expanded_lexicon.txt)

    -s,--seed Seed set input file of tab-separated word/polarity pairs: [w_1\tp_1] [w_2\tp_2] ... [w_m\tp_m] (DEFAULT: lexicon)

    To run the example, place the English lexicon sample file in the same directory as the jar and run the jar without any parameters:

    $ java -jar LexiExp0.0.1.jar
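
The seed file format and the options above can be put together in a short script. The following is a minimal Python sketch, not part of LexiExp. The helper names `write_seed_lexicon` and `build_command`, the sample words, the `positive`/`negative` labels, and the one-pair-per-line layout are assumptions for illustration, since the README does not say how polarity values are spelled or how pairs are separated. Only the flags and defaults listed above are used.

```python
from pathlib import Path

JAR = "LexiExp0.0.1.jar"  # jar name as given in the README


def write_seed_lexicon(path, pairs):
    """Write word<TAB>polarity pairs, one pair per line (layout is an assumption)."""
    lines = []
    for word, polarity in pairs:
        if "\t" in word or "\t" in polarity:
            raise ValueError("fields must not contain tab characters")
        lines.append(f"{word}\t{polarity}")
    Path(path).write_text("\n".join(lines) + "\n", encoding="utf-8")


def build_command(seed="lexicon", expansions=10, database="reviewsTrigram",
                  output="out_expanded_lexicon.txt"):
    """Return the argument list for the documented command line, using its defaults."""
    return ["java", "-jar", JAR,
            "-s", seed, "-e", str(expansions), "-db", database, "-o", output]


if __name__ == "__main__":
    # Hypothetical seed words and labels, for illustration only.
    write_seed_lexicon("lexicon", [("good", "positive"), ("bad", "negative")])
    print(" ".join(build_command(database="wikipediaTrigram", expansions=20)))
```

Passing the returned list to `subprocess.run` would launch the tool, provided Java and the jar file are available in the working directory.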

Resources:

  1. Kumar, A., Kohail, S., Kumar, A., Ekbal, A., Biemann, C. (2016): IIT-TUDA at SemEval-2016 Task 5: Beyond Sentiment Lexicon: Combining Domain Dependency and Distributional Semantics Features for Aspect Based Sentiment Analysis. In: Proceedings of the 10th International Workshop on Semantic Evaluation. San Diego, CA, USA. (Selected for the best of SemEval session.)
  2. Kumar, A., Kohail, S., Ekbal, A., Biemann, C. (2015): IIT-TUDA: System for Sentiment Analysis in Indian Languages using Lexical Acquisition. In: Third International Conference on Mining Intelligence and Knowledge Exploration (MIKE 2015). Hyderabad, India.