Fix issue 377 #394

Closed
wants to merge 2 commits into from

Conversation

idhamari
Contributor

New implementation to solve Issue 377: Enable definition of training and validation set for privacy-preserving machine learning.

There is still a bug, probably related to this line:

        DataSubset subsetTrain = inputHandle.getSubset();

It seems the returned value is null when the handle comes from the GUI.
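
While investigating, the missing subset can be made explicit instead of surfacing later as a `NullPointerException`. The following is only a minimal sketch: `Handle` and `DataSubset` here are hypothetical stand-ins defined inside the example, not the ARX classes, and only the method name `getSubset()` is taken from the line above.

```java
import java.util.Optional;

public class SubsetNullCheck {

    /** Hypothetical stand-in for a data subset; not the ARX class. */
    static final class DataSubset {
        final int size;
        DataSubset(int size) { this.size = size; }
    }

    /** Hypothetical stand-in for the input handle; only getSubset() comes from the snippet above. */
    interface Handle {
        DataSubset getSubset(); // may return null, as observed with the handle from the GUI
    }

    /** Wraps the possibly-null subset so that callers have to handle the missing case. */
    static Optional<DataSubset> findSubset(Handle handle) {
        return Optional.ofNullable(handle.getSubset());
    }

    public static void main(String[] args) {
        Handle withSubset = () -> new DataSubset(100);
        Handle withoutSubset = () -> null;

        // Report whether each handle actually provides a subset.
        System.out.println("with subset:    " + findSubset(withSubset).isPresent());
        System.out.println("without subset: " + findSubset(withoutSubset).isPresent());
    }
}
```

Whether the real handle should return null or an empty value in this situation is a design decision for the maintainers; the sketch only shows how a caller could avoid dereferencing null.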

@prasser
Collaborator

prasser commented Jun 19, 2022

Please take a look at https://github.com/arx-deidentifier/arx/tree/feature-training-test

As mentioned, I merged your other PR into this branch and revised it. feature-training-test is the currently leading implementation of the feature, so if you want to further investigate this issue, please base your efforts on this branch. Will close this.

@prasser prasser closed this Jun 19, 2022
@idhamari
Contributor Author

Thanks for your feedback. I just tested the mentioned branch and I don't see the option for selecting k-fold or subset evaluation. I will add the UI part based on your recent updates.

@prasser
Collaborator

prasser commented Jun 20, 2022

The setting is there. It is contained in the project properties dialog.

@idhamari
Contributor Author

idhamari commented Jun 20, 2022

OK, I found the option "Use test and training set:" in Edit/Settings/Utility analysis.
Execution seems to stop at this check:

    if (config.isUseTrainingTestSet() && !inputHandle.isSubsetAvailable()) {
        throw new IllegalArgumentException("Training and test set can only be used with a subset");
    }

The subset is reported as unavailable even though I selected random samples and then ran the anonymisation, i.e.:

          config.isUseTrainingTestSet()   : true
          inputHandle.isSubsetAvailable() : false

This error is similar to the one I got with the previous implementation!
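
The condition quoted above can be exercised in isolation to see which combinations of the two flags it rejects. This is a minimal sketch under stated assumptions: the two booleans stand in for `config.isUseTrainingTestSet()` and `inputHandle.isSubsetAvailable()`, and the class `TrainingTestGuard` and its helper `rejects` are hypothetical names introduced for illustration only.

```java
public class TrainingTestGuard {

    /** Mirrors the check quoted above, with plain booleans instead of ARX objects. */
    static void check(boolean useTrainingTestSet, boolean subsetAvailable) {
        if (useTrainingTestSet && !subsetAvailable) {
            throw new IllegalArgumentException("Training and test set can only be used with a subset");
        }
    }

    /** Returns true if the guard throws for the given combination of flags. */
    static boolean rejects(boolean useTrainingTestSet, boolean subsetAvailable) {
        try {
            check(useTrainingTestSet, subsetAvailable);
            return false;
        } catch (IllegalArgumentException e) {
            return true;
        }
    }

    public static void main(String[] args) {
        boolean[] values = {false, true};
        // Enumerate all four combinations and report which ones the guard rejects.
        for (boolean useTrainingTestSet : values) {
            for (boolean subsetAvailable : values) {
                System.out.println("useTrainingTestSet=" + useTrainingTestSet
                        + ", subsetAvailable=" + subsetAvailable
                        + " -> rejected=" + rejects(useTrainingTestSet, subsetAvailable));
            }
        }
    }
}
```

Only the combination reported above (training and test set enabled, no subset available) is rejected. The guard reflects the state of the handle it receives, so the exception says nothing about whether a sample was drawn earlier in the session.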

@prasser
Collaborator

prasser commented Jun 20, 2022

It's not an error but expected behavior. You need to click the button to show the overall dataset.

@idhamari
Contributor Author

idhamari commented Jun 20, 2022

Thanks, everything works now after clicking the "Toggle sample view" button. Maybe it should be activated automatically when "Use test and training set" is selected!
Screenshot from 2022-06-20 16-42-36
