Parameter-Optimization-of-SVM

Multi-class SVM on Adult Census IncomeDataset

This project demonstrates the use of multi-class SVM on the Adult Census Income dataset from the UCI Machine Learning Repository. The dataset contains a total of 15 columns and 32561 rows.

The Adult Census Income dataset contains information about individuals, such as their age, education level, marital status, occupation, and whether or not they earn more than $50K per year.

Preprocessing

Before using the dataset for classification, the following preprocessing steps were applied:

The "?" values in the dataset were replaced with the mode of the respective column.
The categorical features were one-hot encoded.
The numerical features were standardized using the StandardScaler from scikit-learn.

Training and Testing

The dataset was divided into a 70-30 split for training and testing, respectively. The SVM model was optimized for 1000 iterations with random values of C and gamma between 0 and 1, and with different kernels. The optimization was performed on 10 different samples of the dataset, and the best accuracy and SVM parameters were recorded for each sample.

Results

The results of the SVM optimization for each sample are summarized in the table below:

Sample	Best Accuracy	Best C	Best Gamma	Best Kernel
1	0.779814	0.621529397317055	0.18370585815911455	rbf
2	0.800287	0.7293151589606629	0.6811439432806408	poly
3	0.797728	0.7288867203557419	0.46100263352088766	poly
4	0.782168	0.05309078943383161	0.1680117142843266	rbf
5	0.802743	0.25995296118639877	0.20122283502869864	poly
6	0.798649	0.44410793092714784	0.5681517380630898	poly
7	0.793428	0.36063178294224174	0.12582310778330696	poly
8	0.794964	0.8704578887749851	0.9407696885526249	poly
9	0.790562	0.6741497148676773	0.0390595787384963	rbf
10	0.794145	0.08561209156557315	0.10956788956020369	rbf

The sample with the highest accuracy was sample 5, with an accuracy of 0.802743.

Conclusion

Multi-class SVM can be a powerful tool for classification tasks, especially when applied to preprocessed datasets. By optimizing the SVM parameters, we can achieve high accuracy on the test set. However, care must be taken to avoid overfitting and to select appropriate values for C, gamma, and the kernel function.

Name		Name	Last commit message	Last commit date
Latest commit History 6 Commits
102017045_Parameter_Optimization_of_SVM.ipynb		102017045_Parameter_Optimization_of_SVM.ipynb
README.md		README.md

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Repository files navigation

Parameter-Optimization-of-SVM

Multi-class SVM on Adult Census IncomeDataset

Preprocessing

Training and Testing

Results

Conclusion

About

Releases

Packages

Languages

Arshpreet-Singh-1/Parameter-Optimization-of-SVM-

Folders and files

Latest commit

History

Repository files navigation

Parameter-Optimization-of-SVM

Multi-class SVM on Adult Census IncomeDataset

Preprocessing

Training and Testing

Results

Conclusion

About

Topics

Resources

Stars

Watchers

Forks

Releases

Packages 0

Languages

Packages