Cross-source-cross-domain-sentiment-analysis

This repository holds two pickled Python dictionaries containing labeled data for cross-source, cross-domain sentiment analysis. One file contains English texts and the other contains Italian texts.
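A minimal sketch of how one of the pickled dictionaries could be loaded and inspected with Python's standard `pickle` module. The helper names `load_pickled_dict` and `describe` are hypothetical, and the path in the usage line is a placeholder, since the file names are not stated here. The sketch assumes nothing about the dictionary's keys or value types; it only reports what it finds.

```python
import pickle
from pathlib import Path


def load_pickled_dict(path):
    """Load one pickled dictionary from disk (hypothetical helper)."""
    with Path(path).open("rb") as handle:
        data = pickle.load(handle)
    # The description above says each file is a dictionary; fail early otherwise.
    if not isinstance(data, dict):
        raise TypeError(f"expected a dict, got {type(data).__name__}")
    return data


def describe(data):
    """Return (key, value type name, length or None) for each top-level entry."""
    summary = []
    for key, value in data.items():
        size = len(value) if hasattr(value, "__len__") else None
        summary.append((key, type(value).__name__, size))
    return summary


# Usage (placeholder path, replace with the actual file):
# for row in describe(load_pickled_dict("path/to/dataset.pkl")):
#     print(row)
```

Unpickling can execute arbitrary code, so only load files from a source you trust. If the dictionaries hold objects from a third-party library, that library must be installed before loading.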

The Dataset_ENG is composed of:

Amazon: a sample of 75,000 English reviews of various Amazon products (such as electronic devices, kitchen objects, clothes and house accessories) collected from January to February 2018. Each review is accompanied by its date, its short title and the sentiment (expressed as a 5-star rating) given by the user who wrote the review. For privacy reasons, the user name is omitted.
Tripadvisor: a sample of 75,000 English reviews of hotels, restaurants and cities downloaded from Tripadvisor.com between January and February 2018. Each review is accompanied by its date, its short title and the sentiment (expressed as a 5-star rating) given by the user who wrote the review. For privacy reasons, the user name is omitted.
Facebook: 5,782 English Facebook posts. The posts come only from specific public pages that have a 5-star rating system. The reviews, sampled from January to February 2018, cover several topics, namely universities, events, famous people, locals, parties, shops and cities. Each item in the collection is accompanied by the sentiment (expressed as a 5-star rating) given by the user. For privacy reasons, the user name is omitted.

The Dataset_ITA is composed of:

Amazon: a sample of 75,000 Italian reviews of various Amazon products (such as electronic devices, kitchen objects, clothes and house accessories) collected from January to February 2018. Each review is accompanied by its date, its short title and the sentiment (expressed as a 5-star rating) given by the user who wrote the review. For privacy reasons, the user name is omitted.
Tripadvisor: a sample of 75,000 Italian reviews of hotels, restaurants and cities downloaded from Tripadvisor.com between January and February 2018. Each review is accompanied by its date, its short title and the sentiment (expressed as a 5-star rating) given by the user who wrote the review. For privacy reasons, the user name is omitted.
Facebook: 1,077 Italian Facebook posts. The posts come only from specific public pages that have a 5-star rating system. The reviews, sampled from January to February 2018, cover several topics, namely universities, events, famous people, locals, parties, shops and cities. Each item in the collection is accompanied by the sentiment (expressed as a 5-star rating) given by the user. For privacy reasons, the user name is omitted.
Twitter: a sample of 937 manually labeled Italian tweets. The sample was collected in April 2018 and covers Italian television shows and other, more general topics. Each tweet has a three-class sentiment label: negative, neutral or positive.
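
The Amazon, Tripadvisor and Facebook items carry 5-star ratings, while the tweets carry three-class labels, so comparing across sources requires putting both on one scale. The sketch below shows one possible mapping. The thresholds (1-2 stars negative, 3 neutral, 4-5 positive) and the function name `stars_to_label` are assumptions made for illustration, not a mapping stated in this repository.

```python
def stars_to_label(stars):
    """Map a 1-5 star rating to a three-class label.

    The thresholds are an illustrative assumption, not taken from the dataset.
    """
    if stars not in (1, 2, 3, 4, 5):
        raise ValueError(f"expected a rating from 1 to 5, got {stars!r}")
    if stars <= 2:
        return "negative"
    if stars == 3:
        return "neutral"
    return "positive"


# Usage on a few hypothetical ratings:
labels = [stars_to_label(s) for s in (1, 3, 5)]
```

Other cut-offs are equally defensible, for example treating 3 stars as negative, so the choice should be reported alongside any results.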

If you use these datasets, please cite:

Zola, P., Cortez, P., Ragno, C., & Brentari, E. (2019). Social Media Cross-Source and Cross-Domain Sentiment Classification. International Journal of Information Technology & Decision Making.

Thank you!
