
Achieved top 4% on the leaderboard by building an ML pipeline with custom feature engineering and an XGBoost model to process and analyze a large-scale educational gameplay dataset, reaching an F1 score of 0.68.


MoneyInMotionPerformancefromGamePlay

A Kaggle competition solution.

Feature Engineering:

The first section of the code handles feature engineering. A function called feature_engineer takes the training dataset as input and applies various grouping and aggregation operations to generate new features from the categorical and numerical variables. The output is a processed dataframe containing the engineered features.
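The repository's feature_engineer function is not reproduced here, but a minimal sketch of the grouping-and-aggregation pattern it describes might look like the following. The column names (session_id, level_group, event_name, name, elapsed_time, level) and the specific aggregations are assumptions for illustration, not the repository's actual code:

```python
import pandas as pd

# Assumed column lists; the real notebook would list its own.
CATS = ["event_name", "name"]     # categorical columns
NUMS = ["elapsed_time", "level"]  # numerical columns

def feature_engineer(train: pd.DataFrame) -> pd.DataFrame:
    """Aggregate raw event rows into one feature row per (session, level group)."""
    dfs = []
    for c in CATS:
        # Count distinct values of each categorical column per group.
        tmp = train.groupby(["session_id", "level_group"])[c].agg("nunique")
        tmp.name = tmp.name + "_nunique"
        dfs.append(tmp)
    for c in NUMS:
        # Mean and spread of each numerical column per group.
        for stat in ("mean", "std"):
            tmp = train.groupby(["session_id", "level_group"])[c].agg(stat)
            tmp.name = tmp.name + "_" + stat
            dfs.append(tmp)
    # Combine all engineered features into one dataframe, one row per group.
    df = pd.concat(dfs, axis=1).fillna(-1)
    return df.reset_index().set_index("session_id")
```

The -1 fill is a common convention for tree models, which can route missing-valued rows down a dedicated branch.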

Data Preparation and Model Training:

The next section covers data preparation and model training. It imports the necessary modules (sklearn.model_selection, xgboost, and sklearn.metrics), sets the number of cross-validation splits using the GroupKFold class, and initializes an empty dataframe (oof) to store out-of-fold predictions and a dictionary (models) to store the trained models. The code then loops over the folds produced by gkf.split(), which yields the train_index and test_index for each fold. Within each fold, it defines the parameters for an XGBoost classifier and iterates over the question numbers, filtering the training and validation data by question number and level group. The classifier is trained on the filtered data and evaluated on the validation set; the trained model is stored in the models dictionary, and the validation predictions are stored in the oof dataframe. The feature engineering section itself uses loops over the different features and data groups, generating binary features for specific events and summing event occurrences and elapsed time for each group.

Evaluation and Threshold Optimization:

The next section of the code covers evaluation and threshold optimization. It initializes a copy of the oof dataframe (true) to hold the true labels, then iterates over candidate threshold values in a while loop. For each threshold, it computes the F1 score between the predicted labels derived from the oof dataframe and the labels in the true dataframe, storing the score and threshold in separate lists (listA and listB) and tracking the best F1 score and its corresponding threshold. After the loop, the code computes the overall F1 score at the best threshold and prints it.
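The sweep described above can be sketched as follows; the search range, the step size, and the best_threshold function name are assumptions, and the listA/listB names mirror the write-up:

```python
import numpy as np
from sklearn.metrics import f1_score

def best_threshold(probs, true, lo=0.4, hi=0.8, step=0.01):
    """Sweep thresholds in [lo, hi]; return the best threshold and its F1."""
    listA, listB = [], []          # F1 scores and thresholds, as in the write-up
    best_t, best_f1 = lo, -1.0
    t = lo
    while t <= hi:                 # while-loop over candidate thresholds
        f1 = f1_score(true, (np.asarray(probs) > t).astype(int))
        listA.append(f1)
        listB.append(t)
        if f1 > best_f1:
            best_f1, best_t = f1, t
        t += step
    return best_t, best_f1
```

Tuning the decision threshold on out-of-fold predictions, rather than using the default 0.5, is what lets the model trade precision against recall to maximize F1.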

Testing and Prediction:

The final section of the code covers testing and prediction. It defines a dictionary (limits) that maps each level group to its lower and upper question numbers. The code then loops over the test data and sample_submission pairs; in each iteration, it applies the same feature engineering to the test data, reads the level group, and looks up that group's question-number limits in the limits dictionary. For each question number within the limits, it retrieves the trained model for that level group and question, predicts the probability of a correct answer, and sets the 'correct' column in the sample_submission dataframe by comparing the predicted probability against the best threshold from the evaluation phase. Finally, the code submits the updated sample_submission via the environment by calling env.predict().
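A hedged sketch of this submission loop. The limits values, the iterator-style env API (iter_test() yielding a test/sample_submission pair per session, env.predict() to submit), the submission-row id format ("<session>_q<question>"), and the run_inference and feature_fn names are all assumptions based on the description above:

```python
# Assumed mapping from level group to its (inclusive, exclusive) question range.
limits = {"0-4": (1, 4), "5-12": (4, 14), "13-22": (14, 19)}

def run_inference(env, models, feature_fn, best_threshold):
    """Fill in sample_submission['correct'] per question and submit each batch."""
    for test, sample_submission in env.iter_test():
        df = feature_fn(test)                 # same features as at training time
        grp = test["level_group"].values[0]   # one level group per batch
        lo, hi = limits[grp]
        for q in range(lo, hi):
            clf = models[q]                   # model trained for this question
            prob = clf.predict_proba(df)[:, 1]
            # Match submission rows for this question, e.g. "<session>_q3".
            mask = sample_submission["session_id"].str.endswith(f"_q{q}")
            # One session per batch, so prob[0] is its predicted probability.
            sample_submission.loc[mask, "correct"] = int(prob[0] > best_threshold)
        env.predict(sample_submission)
```

Binarizing with the threshold found during evaluation, rather than 0.5, carries the F1 optimization through to the final submission.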
