This repository has been archived by the owner on Aug 9, 2021. It is now read-only.

Question about encoding text/categorical features in the end-to-end machine learning project #139

Open
LelandYan opened this issue Mar 2, 2019 · 0 comments

Comments

@LelandYan

Why do sklearn's LabelEncoder() and pandas' factorize() give different results?

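from sklearn.preprocessing import LabelEncoder

# `housing` is the DataFrame loaded earlier in the project
encoder = LabelEncoder()
housing_cat = housing["ocean_proximity"]
# Encode the same column with scikit-learn and with pandas
housing_cat_encoded1 = encoder.fit_transform(housing_cat)
housing_cat_encoded2, housing_categories = housing_cat.factorize()
# Compare the first ten codes from each encoder
housing_cat_encoded1[:10]
housing_cat_encoded2[:10]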

Why are the values in housing_cat_encoded1 in the range 0-4, while the values in housing_cat_encoded2 are in the range 0-2?
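A minimal sketch of the likely cause, using a small made-up Series in place of housing["ocean_proximity"] (the toy values and their order are assumptions, not the real dataset). LabelEncoder numbers the categories in sorted order, whereas factorize() numbers them in order of first appearance, so the same rows can receive different integers even though both encodings are valid.

```python
import pandas as pd
from sklearn.preprocessing import LabelEncoder

# Hypothetical toy data standing in for housing["ocean_proximity"].
housing_cat = pd.Series(
    ["NEAR BAY", "<1H OCEAN", "INLAND", "NEAR BAY", "ISLAND", "NEAR OCEAN"]
)

# LabelEncoder sorts the unique labels, then numbers them in that order.
encoder = LabelEncoder()
codes_sklearn = encoder.fit_transform(housing_cat)

# factorize() numbers labels in order of first appearance.
codes_pandas, categories = housing_cat.factorize()

# Each code array is only meaningful together with its own lookup table.
decoded_sklearn = encoder.classes_[codes_sklearn]
decoded_pandas = categories[codes_pandas]

# sort=True makes factorize() use sorted order as well.
codes_sorted, categories_sorted = housing_cat.factorize(sort=True)

print(list(codes_sklearn))  # [3, 0, 1, 3, 2, 4]
print(list(codes_pandas))   # [0, 1, 2, 0, 3, 4]
```

Both encodings decode back to the original strings when paired with their own mapping (encoder.classes_ for scikit-learn, the returned categories for pandas). If matching integers are needed, passing sort=True to factorize() is one way to line the two up.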
