I found a solution: replace any value that cannot be converted to a number with NaN.
This is the solution:
import pandas as pd

def average_values(name_list):
    flag = 1
    for i in name_list:
        df = pd.read_csv(i)
        col = i[14:-4]  # Extract column name from file path
        # My change: any value that cannot be converted to a number becomes NaN
        df = df.apply(lambda x: pd.to_numeric(x, errors='coerce'))
        temp = pd.DataFrame(df.mean(), columns=[col])  # Assign column name to DataFrame
        if flag:
            std = temp
            flag = 0
        else:
            std[col] = temp[col]
    tt = std.T
    return tt
## isolated
name_list = find_the_way('./isolated/', '.csv')
iso = average_values(name_list)
iso = iso.drop(['Dataset', 'ML algorithm'], axis=1)  # this is my change
name_list = find_the_way('./crossval/', '.csv')
cv = average_values(name_list)
cv = cv.drop(['Dataset', 'ML_algorithm'], axis=1)  # this is my change
Is this correct? I ask because my graphs do not match the ones in the paper.
This is my result:
This is the paper's result:
It appears that the issue stems from a library update. Previously, the df.mean() command disregarded non-numeric columns, but it seems this functionality has been removed in the newer version of pandas.
Your solution appears to be appropriate in this context. I will proceed to update the code accordingly when the opportunity arises.
Regarding the differences in the graphs, if you've augmented the feature array containing the root features (as inferred from the close resemblance between all cascaded results and the final results), such variations are expected.
The significance of this step lies in assessing how the inclusion of additional features impacts the results obtained with the root features. It's essential to ascertain whether the success observed is sustainable in an isolated dataset.
If the success can be maintained in the isolated dataset, it suggests that the feature may indeed be beneficial. However, if not, it indicates that the success observed in the cross-validation step might be a result of information leakage.
In this context, if you increase the root features uncontrollably (such as moving features from the iden list to the feature list), you will throw away the possibility of making this useful comparison.
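The comparison described above can be sketched roughly as follows. This is only an illustration: the scores, the feature-set names, the `leakage_gap` helper, and the 0.10 threshold are all hypothetical and are not taken from the repository or the paper.

```python
def leakage_gap(cv_scores, isolated_scores):
    """Return the cross-validation score minus the isolated score per feature set."""
    return {
        name: cv_scores[name] - isolated_scores[name]
        for name in cv_scores
        if name in isolated_scores
    }


# Hypothetical scores for the root features alone and with one added feature.
cv_scores = {"root": 0.90, "root+feature_x": 0.99}
isolated_scores = {"root": 0.88, "root+feature_x": 0.70}

gaps = leakage_gap(cv_scores, isolated_scores)

# Assumed threshold: a large drop on the isolated dataset hints at leakage.
suspicious = [name for name, gap in gaps.items() if gap > 0.10]
print(suspicious)
```

If the root feature set itself keeps growing, there is no stable baseline row to compare the added features against, which is the point made above.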
As far as I understand, in earlier versions of pandas `df.mean()` behaved as if `numeric_only=True` by default, and I think the default has now been changed to `False`. Setting `numeric_only` explicitly in the code (I have already added this fix) should solve your problem.
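A minimal sketch of the two approaches discussed in this thread, using a made-up DataFrame (the column names and values are assumptions for illustration only, not the repository's data):

```python
import pandas as pd

# Hypothetical frame: one text column plus two numeric metric columns.
df = pd.DataFrame({
    "Dataset": ["a", "b", "c"],
    "F1": [0.8, 0.9, 1.0],
    "Acc": [0.7, 0.8, 0.9],
})

# Option 1: ask pandas to skip non-numeric columns explicitly.
means_numeric_only = df.mean(numeric_only=True)

# Option 2 (the approach from the question): coerce everything to numbers,
# so non-numeric values become NaN before averaging.
means_coerced = df.apply(pd.to_numeric, errors="coerce").mean()

print(means_numeric_only)
print(means_coerced)
```

With option 1 the text column is simply absent from the result. With option 2 it stays in the result as NaN, which is why the coercion approach needs the extra `drop` calls afterwards.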