02.1 Feature importance voting and pre-assessment of features #4

Open
KLFTESPACE opened this issue Nov 20, 2023 · 5 comments

Comments

@KLFTESPACE

Sorry to trouble you, but this part of the code in file 02.1 does not run successfully. Following the solution in the other issues, I ran pip install git+https://github.com/kahramankostas/XuniVerse, which reported: Successfully installed contourpy-1.2.0 cycler-0.12.1 fonttools-4.44.3 joblib-1.3.2 kiwisolver-1.4.5 matplotlib-3.8.2 packaging-23.2 patsy-0.5.3 pillow-10.1.0 pyparsing-3.1.1 scikit-learn-1.3.2 scipy-1.11.4 statsmodels-0.14.0 threadpoolctl-3.2.0 xverse-1.0.5. However, it did not fix the problem.

I hope to hear from you soon. Thanks!
My pandas version is 2.1.3.

AttributeError Traceback (most recent call last)
Cell In[22], line 14
12 clf = VotingSelector()
13 print(X, y)
---> 14 clf.fit(X, y)
15 #Selected features
16 temp="./results/"+i[18:-4]+"FI.csv"

File D:\Python_env\Lib\site-packages\xverse\ensemble_voting.py:224, in VotingSelector.fit(self, X, y)
222 #start training on the data
223 temp_X = X[self.use_features]
--> 224 self.feature_importances_, self.feature_votes_ = self.train(temp_X, y)
226 return self

File D:\Python_env\Lib\site-packages\xverse\ensemble_voting.py:285, in VotingSelector.train(self, X, y)
283 #handle categorical values with either 'woe' or 'le'
284 if self.handle_category == 'woe':
--> 285 transformed_X, self.mapping, iv_df = self.woe_information_value(X, y) #woe transformed_X
286 elif self.handle_category == 'le':
287 transformed_X = X.copy(deep=True)

File D:\Python_env\Lib\site-packages\xverse\ensemble_voting.py:115, in VotingSelector.woe_information_value(self, X, y)
112 def woe_information_value(self, X, y):
114 clf = WOE()
--> 115 clf.fit(X, y)
117 return clf.transform(X), clf.woe_bins, clf.iv_df

File D:\Python_env\Lib\site-packages\xverse\transformer_woe.py:137, in WOE.fit(self, X, y)
132 if self.monotonic_binning:
133 self.mono_bin_clf = MonotonicBinning(feature_names=self.mono_feature_names,
134 max_bins=self.mono_max_bins, force_bins=self.mono_force_bins,
135 cardinality_cutoff=self.mono_cardinality_cutoff,
136 prefix=self.mono_prefix, custom_binning=self.mono_custom_binning)
--> 137 X = self.mono_bin_clf.fit_transform(X, y)
138 self.mono_custom_binning = self.mono_bin_clf.bins
140 #identify the variables to tranform and assign the bin mapping dictionary

File D:\Python_env\Lib\site-packages\sklearn\utils_set_output.py:157, in _wrap_method_output..wrapped(self, X, *args, **kwargs)
155 @wraps(f)
156 def wrapped(self, X, *args, **kwargs):
--> 157 data_to_wrap = f(self, X, *args, **kwargs)
158 if isinstance(data_to_wrap, tuple):
159 # only wrap the first output for cross decomposition
160 return_tuple = (
161 _wrap_data_with_container(method, data_to_wrap[0], X, self),
162 *data_to_wrap[1:],
163 )

File D:\Python_env\Lib\site-packages\xverse\transformer_binning.py:257, in MonotonicBinning.fit_transform(self, X, y)
256 def fit_transform(self, X, y):
--> 257 return self.fit(X, y).transform(X)

File D:\Python_env\Lib\site-packages\xverse\transformer_binning.py:122, in MonotonicBinning.fit(self, X, y)
118 raise ValueError("The input feature(s) should be numeric type. Some of the input features
119 has character values in it. Please use a encoder before performing monotonic operations.")
121 #apply the monotonic train function on dataset
--> 122 fit_X.apply(lambda x: self.train(x, y), axis=0)
123 return self

File D:\Python_env\Lib\site-packages\pandas\core\frame.py:10034, in DataFrame.apply(self, func, axis, raw, result_type, args, by_row, **kwargs)
10022 from pandas.core.apply import frame_apply
10024 op = frame_apply(
10025 self,
10026 func=func,
(...)
10032 kwargs=kwargs,
10033 )

10034 return op.apply().finalize(self, method="apply")

File D:\Python_env\Lib\site-packages\pandas\core\apply.py:837, in FrameApply.apply(self)
834 elif self.raw:
835 return self.apply_raw()
--> 837 return self.apply_standard()

File D:\Python_env\Lib\site-packages\pandas\core\apply.py:963, in FrameApply.apply_standard(self)
962 def apply_standard(self):
--> 963 results, res_index = self.apply_series_generator()
965 # wrap results
966 return self.wrap_results(results, res_index)

File D:\Python_env\Lib\site-packages\pandas\core\apply.py:979, in FrameApply.apply_series_generator(self)
976 with option_context("mode.chained_assignment", None):
977 for i, v in enumerate(series_gen):
978 # ignore SettingWithCopy here in case the user mutates
--> 979 results[i] = self.func(v, *self.args, **self.kwargs)
980 if isinstance(results[i], ABCSeries):
981 # If we have a view on v, we need to make a copy because
982 # series_generator will swap out the underlying data
983 results[i] = results[i].copy(deep=False)

File D:\Python_env\Lib\site-packages\xverse\transformer_binning.py:122, in MonotonicBinning.fit..(x)
118 raise ValueError("The input feature(s) should be numeric type. Some of the input features
119 has character values in it. Please use a encoder before performing monotonic operations.")
121 #apply the monotonic train function on dataset
--> 122 fit_X.apply(lambda x: self.train(x, y), axis=0)
123 return self

File D:\Python_env\Lib\site-packages\xverse\transformer_binning.py:170, in MonotonicBinning.train(self, X, y)
165 """
166 Execute this block when monotonic relationship is not identified by spearman technique.
167 We still want our code to produce bins.
168 """
169 if len(bins_X_grouped) == 1:
--> 170 bins = algos.quantile(X, np.linspace(0, 1, force_bins)) #creates a new binnning based on forced bins
171 if len(np.unique(bins)) == 2:
172 bins = np.insert(bins, 0, 1)

AttributeError: module 'pandas.core.algorithms' has no attribute 'quantile'
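
One way to confirm what the traceback reports is to check directly whether the installed pandas exposes `quantile` in `pandas.core.algorithms`, the attribute that line 170 of `transformer_binning.py` calls. The sketch below is only a diagnostic; the helper name `module_has_attr` is made up for illustration and is not part of pandas or xverse.

```python
import importlib


def module_has_attr(module_name, attr):
    """Return True or False if the module imports, or None if it cannot be imported."""
    try:
        module = importlib.import_module(module_name)
    except ImportError:
        # The package is not installed in this environment.
        return None
    return hasattr(module, attr)


if __name__ == "__main__":
    # The failing call is `algos.quantile(...)`; the error message names
    # `pandas.core.algorithms` as the module behind `algos`.
    print(
        "pandas.core.algorithms.quantile available:",
        module_has_attr("pandas.core.algorithms", "quantile"),
    )
```

If this prints `False` in the same environment that runs the notebook, the installed xverse code is calling a pandas function that this pandas version does not provide.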

@KLFTESPACE
Author

My Python version is 3.11.0b5.

@kahramankostas
Owner

Thank you very much for your question, and I'm sorry you're getting an error. I think this error is caused by updates to Xverse.

Please make sure that no earlier Xverse installation remains, and make sure you install the IoTDevID version (git+https://github.com/kahramankostas/XuniVerse) when you reinstall this repository.

I recommend removing Xverse from your computer completely and then reinstalling it.
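
To check which xverse actually ended up in the environment after reinstalling, a small sketch like the following may help. It relies on the standard-library `importlib.metadata` module and on the `direct_url.json` record that pip writes for installs from a URL (PEP 610). The helper name `install_origin` is hypothetical, and the version string alone may not be enough to tell the fork from another build, which is why the recorded URL is printed as well.

```python
import json
from importlib import metadata


def install_origin(dist_name):
    """Describe where an installed distribution came from.

    Returns None if the distribution is not installed. Otherwise returns a
    dict with its version and, if pip recorded one in direct_url.json,
    the URL it was installed from (None for ordinary index installs).
    """
    try:
        dist = metadata.distribution(dist_name)
    except metadata.PackageNotFoundError:
        return None
    raw = dist.read_text("direct_url.json")  # None when the file is absent
    url = json.loads(raw).get("url") if raw else None
    return {"version": dist.version, "url": url}


if __name__ == "__main__":
    # Run this with the same interpreter that the notebook kernel uses.
    print(install_origin("xverse"))
```

If the printed URL is missing or does not point at the XuniVerse repository, the environment is probably still using a different xverse installation.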

@KLFTESPACE
Author

> you install the IoTDevID version (git+https://github.com/kahramankostas/XuniVerse) when you reinstall this repository.

I created a fresh Python 3.9 virtual environment and installed the required packages. I then installed xverse using pip install git+https://github.com/kahramankostas/XuniVerse, but this time a different error was reported.

KeyError Traceback (most recent call last)
Cell In[27], line 14
12 clf = VotingSelector()
13 print(X, y)
---> 14 clf.fit(X, y)
15 #Selected features
16 temp="./results/"+i[18:-4]+"FI.csv"

File D:\projects\device_detect\IOT\lib\site-packages\xverse\ensemble_voting.py:224, in VotingSelector.fit(self, X, y)
222 #start training on the data
223 temp_X = X[self.use_features]
--> 224 self.feature_importances_, self.feature_votes_ = self.train(temp_X, y)
226 return self

File D:\projects\device_detect\IOT\lib\site-packages\xverse\ensemble_voting.py:285, in VotingSelector.train(self, X, y)
283 #handle categorical values with either 'woe' or 'le'
284 if self.handle_category == 'woe':
--> 285 transformed_X, self.mapping, iv_df = self.woe_information_value(X, y) #woe transformed_X
286 elif self.handle_category == 'le':
287 transformed_X = X.copy(deep=True)

File D:\projects\device_detect\IOT\lib\site-packages\xverse\ensemble_voting.py:117, in VotingSelector.woe_information_value(self, X, y)
114 clf = WOE()
115 clf.fit(X, y)
--> 117 return clf.transform(X), clf.woe_bins, clf.iv_df

File D:\projects\device_detect\IOT\lib\site-packages\sklearn\utils_set_output.py:157, in _wrap_method_output..wrapped(self, X, *args, **kwargs)
155 @wraps(f)
156 def wrapped(self, X, *args, **kwargs):
--> 157 data_to_wrap = f(self, X, *args, **kwargs)
158 if isinstance(data_to_wrap, tuple):
159 # only wrap the first output for cross decomposition
160 return_tuple = (
161 _wrap_data_with_container(method, data_to_wrap[0], X, self),
162 *data_to_wrap[1:],
163 )

File D:\projects\device_detect\IOT\lib\site-packages\xverse\transformer_woe.py:310, in WOE.transform(self, X, y)
306 if not self.woe_bins:
307 raise ValueError("woe_bins variable is not present.
308 Estimator has to be fitted to apply transformations.")
--> 310 outX[new_column_name] = tempX.replace(self.woe_bins[original_column_name])
312 #transformed dataframe
313 return outX

KeyError: 'IP_MF'
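
The KeyError is raised by the lookup `self.woe_bins[original_column_name]`, so one thing worth checking is which columns have no entry in the fitted bin mapping. The sketch below uses plain Python stand-ins so it runs on its own. The helper name `columns_without_bins`, the second column name `pck_size`, and the bin values are invented for illustration; in the notebook the inputs would be the feature columns of X and the `woe_bins` attribute of a fitted WOE object.

```python
def columns_without_bins(columns, woe_bins):
    """Return the column names that have no entry in the fitted bin mapping."""
    return [name for name in columns if name not in woe_bins]


# Toy stand-ins for X.columns and the fitted `woe_bins` dictionary.
# "pck_size" and its bin values are made-up example data.
columns = ["IP_MF", "pck_size"]
woe_bins = {"pck_size": {"bin_a": 0.12}}

print(columns_without_bins(columns, woe_bins))  # ['IP_MF']
```

Any column reported here would trigger the same KeyError during `transform`, which narrows the question to why that column was skipped when the bins were fitted.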

@kahramankostas
Owner

I tried it on Windows and Ubuntu. It works flawlessly on Windows but gives this error on Ubuntu, and I can't figure out why. I will update this answer if I find a solution. For now, you might consider running it on Windows or skipping the Xverse step.

@KLFTESPACE
Author

> I tried it on windows and ubuntu. it works flawlessly on windows but gives this error on ubuntu. i can't figure out why. i will update the answer if i find a solution. For now you might consider running it on window or skip the Xverse step.

Actually, this error occurs on Windows 11, and I don't know which step went wrong. Here are my steps:
1. pip install git+https://github.com/kahramankostas/XuniVerse, which reported: Successfully installed contourpy-1.2.0 cycler-0.12.1 fonttools-4.45.0 importlib-resources-6.1.1 joblib-1.3.2 kiwisolver-1.4.5 matplotlib-3.8.2 numpy-1.26.2 packaging-23.2 pandas-2.1.3 patsy-0.5.3 pillow-10.1.0 pyparsing-3.1.1 python-dateutil-2.8.2 pytz-2023.3.post1 scikit-learn-1.3.2 scipy-1.11.4 six-1.16.0 statsmodels-0.14.0 threadpoolctl-3.2.0 tzdata-2023.3 xverse-1.0.5 zipp-3.17.0
2. pip install seaborn
3. pip install -U scapy
I skipped the step "pip install graphviz" because I was getting errors when calling the ciz function, which prevented me from generating the graph.
Here are all the installed packages:
Package Version


anyio 4.0.0
argon2-cffi 23.1.0
argon2-cffi-bindings 21.2.0
arrow 1.3.0
asttokens 2.4.1
async-lru 2.0.4
attrs 23.1.0
Babel 2.13.1
beautifulsoup4 4.12.2
bleach 6.1.0
certifi 2023.11.17
cffi 1.16.0
charset-normalizer 3.3.2
colorama 0.4.6
comm 0.2.0
contourpy 1.2.0
cycler 0.12.1
debugpy 1.8.0
decorator 5.1.1
defusedxml 0.7.1
exceptiongroup 1.1.3
executing 2.0.1
fastjsonschema 2.19.0
fonttools 4.45.0
fqdn 1.5.1
graphviz 0.20.1
idna 3.4
importlib-metadata 6.8.0
importlib-resources 6.1.1
ipykernel 6.26.0
ipython 8.17.2
ipywidgets 8.1.1
isoduration 20.11.0
jedi 0.19.1
Jinja2 3.1.2
joblib 1.3.2
json5 0.9.14
jsonpointer 2.4
jsonschema 4.20.0
jsonschema-specifications 2023.11.1
jupyter 1.0.0
jupyter_client 8.6.0
jupyter-console 6.6.3
jupyter_core 5.5.0
jupyter-events 0.9.0
jupyter-lsp 2.2.0
jupyter_server 2.10.1
jupyter_server_terminals 0.4.4
jupyterlab 4.0.9
jupyterlab-pygments 0.2.2
jupyterlab_server 2.25.2
jupyterlab-widgets 3.0.9
kiwisolver 1.4.5
MarkupSafe 2.1.3
matplotlib 3.8.2
matplotlib-inline 0.1.6
mistune 3.0.2
nbclient 0.9.0
nbconvert 7.11.0
nbformat 5.9.2
nest-asyncio 1.5.8
notebook 7.0.6
notebook_shim 0.2.3
numpy 1.26.2
overrides 7.4.0
packaging 23.2
pandas 2.1.3
pandocfilters 1.5.0
parso 0.8.3
patsy 0.5.3
Pillow 10.1.0
pip 22.0.4
platformdirs 4.0.0
prometheus-client 0.19.0
prompt-toolkit 3.0.41
psutil 5.9.6
pure-eval 0.2.2
pycparser 2.21
Pygments 2.17.1
pyparsing 3.1.1
python-dateutil 2.8.2
python-json-logger 2.0.7
pytz 2023.3.post1
pywin32 306
pywinpty 2.0.12
PyYAML 6.0.1
pyzmq 25.1.1
qtconsole 5.5.1
QtPy 2.4.1
referencing 0.31.0
requests 2.31.0
rfc3339-validator 0.1.4
rfc3986-validator 0.1.1
rpds-py 0.13.1
scapy 2.5.0
scikit-learn 1.3.2
scipy 1.11.4
seaborn 0.13.0
Send2Trash 1.8.2
setuptools 58.1.0
six 1.16.0
sniffio 1.3.0
soupsieve 2.5
stack-data 0.6.3
statsmodels 0.14.0
terminado 0.18.0
threadpoolctl 3.2.0
tinycss2 1.2.1
tomli 2.0.1
tornado 6.3.3
traitlets 5.13.0
types-python-dateutil 2.8.19.14
typing_extensions 4.8.0
tzdata 2023.3
uri-template 1.3.0
urllib3 2.1.0
wcwidth 0.2.11
webcolors 1.13
webencodings 0.5.1
websocket-client 1.6.4
widgetsnbextension 4.0.9
xverse 1.0.5
zipp 3.17.0
