SMS-Spam-Classification

In this project, I explore and compare text preprocessing and feature selection methods: word count, character count, bag of words, stop-word removal, stemming, and lemmatization. A logistic regression classifier is used to classify SMS messages as ham or spam. The dataset is a collection of 5,574 English text messages, each tagged as ham (legitimate) or spam. The original dataset can be found at https://archive.ics.uci.edu/ml/datasets/SMS+Spam+Collection.
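To make the simpler features concrete, here is a minimal sketch of word count, character count, stop-word removal, and bag of words using only the Python standard library. It is not the notebook's code: the `tokenize` and `extract_features` helpers, the tiny `STOP_WORDS` set, and the sample message are all assumptions made for illustration. Stemming and lemmatization are left out because they typically need an NLP library.

```python
import re
from collections import Counter

# Tiny illustrative stop-word list (an assumption; real lists are much longer).
STOP_WORDS = {"a", "an", "the", "to", "is", "you", "your", "for", "and", "of", "in"}


def tokenize(text):
    """Lowercase the text and split it into simple word tokens."""
    return re.findall(r"[a-z0-9']+", text.lower())


def extract_features(text, remove_stop_words=True):
    """Return word count, character count, and a bag-of-words Counter.

    Word and character counts are computed on the raw message;
    stop words are only dropped from the bag of words.
    """
    tokens = tokenize(text)
    if remove_stop_words:
        kept = [t for t in tokens if t not in STOP_WORDS]
    else:
        kept = tokens
    return {
        "word_count": len(tokens),
        "char_count": len(text),
        "bag_of_words": Counter(kept),
    }


# Hypothetical spam-like message, not taken from the dataset.
sample = "WINNER! You have won a free prize. Call now to claim your free prize"
features = extract_features(sample)
print(features["word_count"], features["char_count"])
print(features["bag_of_words"].most_common(3))
```

Features like these could then be stacked into a matrix, one row per message, and passed to a logistic regression classifier.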

README.md
sms code.ipynb
spam.csv
haojing9058/SMS-Spam-Classification