Fix new plotting APIs with Triage's result schemas #713
base: post-postmodeling
Commits on Jan 31, 2019
-
Add feature_importance metric to SLR
This commit makes a small change to the catwalk component so that feature importances are calculated when the model object is a catwalk.estimators.ScaledLogisticRegression. Instead of calculating nothing, triage can now push feature importances computed as e to the power of the coefficients.
ivanhigueram committed Jan 31, 2019
Commit 870807a
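The "e to the power of the coefficients" step can be sketched as follows. This is only an illustration under stated assumptions, not catwalk's actual code: the helper name `exp_coefficient_importances`, the feature names, and the coefficient values are all hypothetical, and the coefficients are assumed to come from a logistic regression fit on scaled features.

```python
import math


def exp_coefficient_importances(feature_names, coefficients):
    """Map each feature name to e raised to its coefficient (hypothetical helper)."""
    if len(feature_names) != len(coefficients):
        raise ValueError("feature_names and coefficients must have the same length")
    return {
        name: math.exp(coefficient)
        for name, coefficient in zip(feature_names, coefficients)
    }


# Made-up feature names and coefficients, for illustration only.
importances = exp_coefficient_importances(
    ["age", "prior_events", "days_since_last"],
    [0.0, 0.7, -0.4],
)
```

Under this reading, a coefficient of zero gives an importance of exactly 1, positive coefficients give values above 1, and negative coefficients give values between 0 and 1.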
Commits on Mar 18, 2019
-
Introduce experiment_runs table, beef up experiments table
At long last, the experiment runs table. It contains a variety of metadata about the experiment run, such as installed libraries, git hash, and number of matrices and models built/skipped/errored. Similarly, the experiments table is augmented with data that doesn't change from run to run (e.g. number of time splits, as-of-times, total grid size).
A variety of methods on the Experiment act as 'entrypoints'. The first entrypoint you hit when running an experiment (e.g. generate_matrices or train_and_test_models) gets tagged on the experiment_runs row.
- Add Experiment runs table [Resolves #440] [Resolves #403] and run-invariant columns to Experiments table
- Add tracking module to wrap updates to the experiment_runs table
- Have experiment call tracking module to save initial information and retrieve a run_id to update with more data later, either itself or through components (e.g. MatrixBuilder, ModelTrainer) that do relevant work
- Have experiment save run-invariant information when first computed
Commit 33321a9
Commits on Mar 19, 2019
-
Commit aa94fe6
-
Commit 7aee1c3
Commits on Mar 20, 2019
-
Commit b84eb35
-
Commit bd89738
Commits on Mar 21, 2019
-
Commit 61835c8
Commits on Mar 22, 2019
-
Commit 6bfb4e1
Commits on Apr 1, 2019
-
Commit acc2e66
-
Commit 2beb28e
Commits on Apr 3, 2019
-
Merge pull request #665 from dssg/nanounanue-audition-doc-patch
Fixed Audition's docs
Commit 7707b08
-
Merge pull request #637 from dssg/runs_table
Introduce experiment_runs table, beef up experiments table
Commit da24b61
Commits on Apr 8, 2019
-
Merge pull request #587 from dssg/slr_importances
Add feature_importance metric to SLR [solves #509]
Commit 3fb59a3
Commits on Apr 9, 2019
-
* Mostly removing non-ASCII characters from the license file. Adding explicit lineterminator on csv.writer
Commit 01f357e
-
Commit 169252e
Commits on Apr 10, 2019
-
Commit a4b4700
Commits on Apr 17, 2019
-
Commit 3f64c3f
-
Commit 47ff0a8
-
Scheduled monthly dependency update for April (#664)
* Update black from 18.9b0 to 19.3b0
* Update alembic from 1.0.7 to 1.0.8
* Update sqlalchemy from 1.2.18 to 1.3.1
* Update scikit-learn from 0.20.2 to 0.20.3
* Update pandas from 0.24.1 to 0.24.2
* Update boto3 from 1.9.105 to 1.9.125
* Update sqlparse from 0.2.4 to 0.3.0
* Update csvkit from 1.0.3 to 1.0.4
* Update fakeredis from 1.0.2 to 1.0.3
* Update hypothesis from 4.7.17 to 4.14.2
* Update tox from 3.7.0 to 3.8.4
* Fix SQLAlchemy warnings that are now errors
Commit 8e14077
Commits on Apr 18, 2019
-
Individual md files for dirty duck. Added markdown modules. Modified …
…requirements.txt
Commit 422bf82
-
Merge branch 'dirtyduck-integration' of github.com:dssg/triage into d…
…irtyduck-integration
Commit 8261162
Commits on Apr 19, 2019
-
Commit 5c3569c
-
Commit 375954e
-
Commit 64ec319
-
Commit 540f997
-
Dirty duck (the whole enchilada) (#670)
* Dirty duck (the whole enchilada)
* Improve mkdocs.yml to fit dirty duck markdown version
* Added function to create dirty duck md files to manage.py
* Updated link at menu bar
* Individual md files for dirty duck. Added markdown modules. Modified requirements.txt
* Added some suggested modifications
* Material design
Commit ec78a2a
-
Commit 0e07f8d
-
Commit 0c46b21
-
Commit 32ec0dd
-
Commit e19d86b
-
Commit 9afc197
-
Remove redundant imputation flag columns [Resolves #544]
The content of the imputation flag columns across all functions for a given timespan will be the same. This commit removes the redundant columns, and names the imputation flag column without any function name (e.g. 'events_entity_id_1y_outcome_imp' instead of 'events_entity_id_1y_outcome_avg_imp').
- Change the Imputer class interface:
  - Add column_imputation_base to constructor
  - Change imputation_flag_sql to imputation_flag_select_and_alias so the caller can keep track of the aliases without doing SQL parsing
- Change the Aggregation/SpacetimeAggregation to:
  - Create reverse column name -> Aggregate lookup (with some refactoring so it can build this without duplicating a bunch of existing logic)
  - When creating the imputation SQL, query the lookup to create the column_imputation_base
- Modify experiment algorithm doc to describe imputation flag behavior
Commit c9c5182
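The deduplication described in this commit message can be illustrated with a small sketch. It assumes a lookup from each aggregate column name to a function-free base name. The helper name `imputation_flag_aliases` and the `_sum` and `6m` columns are hypothetical, and the real Imputer and Aggregation interfaces are not reproduced here.

```python
def imputation_flag_aliases(column_to_base):
    """Return one '<base>_imp' alias per distinct base name, in first-seen order."""
    aliases = []
    for base in column_to_base.values():
        alias = f"{base}_imp"
        if alias not in aliases:
            aliases.append(alias)
    return aliases


# Hypothetical reverse lookup: aggregate column name -> function-free base name.
lookup = {
    "events_entity_id_1y_outcome_avg": "events_entity_id_1y_outcome",
    "events_entity_id_1y_outcome_sum": "events_entity_id_1y_outcome",
    "events_entity_id_6m_outcome_avg": "events_entity_id_6m_outcome",
}
flags = imputation_flag_aliases(lookup)
```

Both 1y columns collapse into a single 'events_entity_id_1y_outcome_imp' flag, which is the naming the message gives as its example.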
Commits on Apr 21, 2019
-
Removed prepare_dirtyduck from manage.py - that command is not valid …
…anymore after splitting the files. Fixed some broken links
Commit e7aea52
-
Merge pull request #675 from dssg/fixing-little-dirtyduck
Fixed broken links
Commit 067cd59
Commits on Apr 24, 2019
-
Commit 42aa9f1
Commits on Apr 25, 2019
-
Fix Travis deploy [Resolves #493] (#677)
Resolves #493
- Fixes the travis pypi deploy
- Adds a travis github pages deploy and a manage command to run the local documentation site
Commit f82c880
-
Commit f677953
-
Merge pull request #676 from dssg/redundant_imp_flags
Remove redundant imputation flag columns [Resolves #544]
Commit 4cebba4
Commits on May 2, 2019
-
Add compute best/worst/stochastic for each evaluation [Resolves #292] (…
…#674)
* Add compute best/worst/stochastic for each evaluation [Resolves #292]
Instead of just computing evaluation metrics based on one sort seed, computes best/worst/stochastic (many sort seeds).
- Add schema migration to create new columns to store the different versions of each metric, plus number of trials and standard deviation. The old 'value' is copied to stochastic_value and then dropped.
- Modify sorting and thresholding routines to use numpy.arrays instead of converting to Python lists
- Update sort_predictions_and_labels to implement best and worst sort in addition to the random one
- Update catwalk.ModelEvaluator to generate evaluations for the best/worst sorting for each metric, and do 30 trials for the metrics which have sufficiently different best/worst. To enable this, the evaluation is refactored to decouple the flattening of the metric definitions from the actual evaluation computation.
- Update Audition and Postmodeling to look at the 'stochastic value' by default
- Remove sort_seed from scoring example config as it is no longer used
* Changes from review
Commit 7435238
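The best/worst/stochastic idea can be sketched in plain Python as follows. This is an illustration only, not the catwalk implementation (which the message says uses numpy arrays): the function names, the example scores and labels, and the use of precision at k are assumptions made here.

```python
import random
import statistics


def sort_predictions(scores, labels, tiebreaker, seed=None):
    """Sort (score, label) pairs by descending score with a chosen tiebreak."""
    pairs = list(zip(scores, labels))
    if tiebreaker == "best":
        # Among tied scores, positive labels come first.
        return sorted(pairs, key=lambda pair: (-pair[0], -pair[1]))
    if tiebreaker == "worst":
        # Among tied scores, negative labels come first.
        return sorted(pairs, key=lambda pair: (-pair[0], pair[1]))
    if tiebreaker == "random":
        # Shuffle first; the stable sort then keeps tied pairs in shuffled order.
        random.Random(seed).shuffle(pairs)
        return sorted(pairs, key=lambda pair: -pair[0])
    raise ValueError(f"unknown tiebreaker: {tiebreaker}")


def precision_at_k(sorted_pairs, k):
    """Fraction of positive labels among the top k pairs."""
    return sum(label for _, label in sorted_pairs[:k]) / k


# Made-up data with a three-way tie at 0.5.
scores = [0.9, 0.5, 0.5, 0.5, 0.1]
labels = [1, 1, 0, 0, 1]

best = precision_at_k(sort_predictions(scores, labels, "best"), k=2)
worst = precision_at_k(sort_predictions(scores, labels, "worst"), k=2)
trials = [
    precision_at_k(sort_predictions(scores, labels, "random", seed=trial), k=2)
    for trial in range(30)  # 30 trials, as in the commit message
]
stochastic = statistics.mean(trials)
spread = statistics.pstdev(trials)
```

Because the tie straddles the cutoff at k=2, the best and worst values differ, which is the situation in which the message says multiple random trials are run.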
Commits on May 3, 2019
-
Commit 0fd2260
-
Insert Ranks for Predictions [Resolves #357] (#671)
* Insert Ranks for Predictions [Resolves #357]
Adds ranking to the predictions tables. A few flavors of ranking are added.
rank_abs (already existing column) - Absolute rank, starting at 1, without ties. Ties are broken based on either a random draw or a user-supplied fallback clause in the predictions table (e.g. label_value)
rank_pct (already existing column) - Percentile rank, *without ties*. Based on the rank_abs tiebreaking.
rank_abs_without_ties - Absolute rank, starting at 1, with ties and skipping (e.g. if two entities are tied for 3, there will be no 4)
The tiebreaking for rank_abs (that cascades to rank_pct) is either done randomly using a random seed that is based on the model's seed, or through user input at the new "prediction->rank_tiebreaker_order_by" config value.
What is the model's seed, you ask? It's a new construct that we store in the models table under 'random_seed'. For each model training task, we generate a value between -1000000000 and 1000000000. This value is set as the Python seed right before training of an individual model, so behavior is the same in single-threaded or multiprocess training contexts. How is this generated? The experiment requires that one is passed in the config, so this becomes part of the experiment config that is saved.
To help make space in the predictions table, and to remove unnecessary precision that would make tiebreaking kind of irrelevant, the scores in the predictions tables are turned into DECIMAL(6, 5).
To keep track of how tiebreaking was done, there is a new prediction_metadata table that holds this metadata, whether user configuration or the Triage-supplied default.
Implementation-wise, this is done via an update statement after predictions are initially inserted with NULL ranks to prevent memory from ballooning.
Commit 1dc8a4a
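The ranking flavors described above can be sketched in memory as follows. Triage does this with an SQL update statement, so this is only an illustration: the function name, the seed value, and the choice of rank_abs divided by the row count as the percentile rank are assumptions, and the third list simply implements the "ties with skipping" behavior the message describes.

```python
import random


def rank_predictions(scores, seed):
    """Return (rank_abs, rank_pct, rank_with_ties) lists aligned with scores."""
    n = len(scores)
    rng = random.Random(seed)
    # One random draw per row, used only to order rows that share a score.
    tiebreak = [rng.random() for _ in scores]
    order = sorted(range(n), key=lambda i: (-scores[i], tiebreak[i]))

    rank_abs = [0] * n
    for position, index in enumerate(order, start=1):
        rank_abs[index] = position

    # Assumed definition: percentile rank as a fraction of the row count.
    rank_pct = [rank / n for rank in rank_abs]

    # Ties share a rank and the following ranks are skipped.
    rank_with_ties = [
        1 + sum(1 for other in scores if other > score) for score in scores
    ]
    return rank_abs, rank_pct, rank_with_ties


# Made-up scores with a tie at 0.5 and an arbitrary seed.
rank_abs, rank_pct, rank_with_ties = rank_predictions([0.9, 0.5, 0.5, 0.1], seed=12345)
```

Seeding the tiebreak makes the ordering of tied rows reproducible for a given seed, which is the role the message assigns to the model's random seed.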
Commits on May 6, 2019
-
Commit a01e1fe
-
Merge pull request #685 from dssg/logging_typos
Fix logging typos that only show up when splits are empty
Commit 6a0de22
-
Support Python 3.7 [Resolves #683] (#684)
* Remove hdf5 support and pytables
* Fix YAML formatting
* Test on both 3.6 and 3.7
* Change travis to xenial to support Python3.7
* Make tox more generic python 3
Commit c2a7728
-
Commit 9575398
-
Commit 2d0e53b
-
Commit 11ee044
Commits on May 7, 2019
-
ensure S3Store does not attempt to write too-large chunks to S3 (…
…5+ GiB)
Underlying library ``s3fs`` automatically writes objects to S3 in "chunks" or "parts" -- *i.e.* via multipart upload -- in line with S3's *minimum* limit for multipart of 5 MiB. This should, in general, avoid S3's *maximum* limit per (part) upload of 5 GiB.
**However**, ``s3fs`` assumes that no *single* ``write()`` might exceed the maximum, and as such fails to chunk out such too-large upload requests prompted by singular writes of 5+ GiB.
This can and should be resolved in ``s3fs``. But first, it can, should be and is resolved here in ``S3Store``.
resolves #530
Commit be4f431
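The general workaround, splitting one oversized write into several smaller ones, might look like the sketch below. It is not the S3Store code: `write_in_chunks` is a hypothetical helper, and an in-memory buffer with a tiny chunk size stands in for an S3 file object and the 5 GiB limit.

```python
import io


def write_in_chunks(fileobj, data, max_chunk_size):
    """Write data to fileobj in pieces no larger than max_chunk_size bytes.

    Returns the number of write() calls made.
    """
    if max_chunk_size <= 0:
        raise ValueError("max_chunk_size must be positive")
    view = memoryview(data)  # slicing a memoryview avoids copying the payload
    writes = 0
    for start in range(0, len(view), max_chunk_size):
        fileobj.write(view[start:start + max_chunk_size])
        writes += 1
    return writes


# Stand-in demonstration: 10 bytes written with a 4-byte limit.
buffer = io.BytesIO()
write_count = write_in_chunks(buffer, b"abcdefghij", max_chunk_size=4)
```

The caller keeps a single logical write, while the underlying file object never sees a request above the limit.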
Commits on May 8, 2019
-
Merge pull request #687 from dssg/jsl/s3store-5gb
write 5+ GiB (matrices) to S3Store
Commit 4f8992b
-
Commit fb1207e
-
Commit cc5f66c
Commits on May 9, 2019
-
Commit 6cb43f9
Commits on May 16, 2019
-
Don't auto-upgrade db for new Experiments [Resolves #695] (#698)
* Don't auto-upgrade db for new Experiments [Resolves #695]
To avoid the problem of time-consuming database upgrades happening when we don't want them, the Experiment now:
1. Checks to see if the results_schema_versions table exists at all. If it doesn't exist, upgrade. This is because the results schema should be clean in this case, and new users won't have to always run a new thing when they first try Triage.
2. If it does exist, and the version number doesn't match what the code's current HEAD revision is, throw an error. The error message is customized to whether the database revision is a known revision to the code (easy case, just upgrade if you have time) or not (you probably upgraded on a different branch and need to go check out that branch to downgrade).
Commit 345a3a3
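The two-step check above can be summarized as a small decision function. This is a sketch of the described logic only: the function name, its parameters, the return values, and the error wording are hypothetical, and the real Experiment reads this information from the database and alembic.

```python
def check_results_schema(versions_table_exists, db_revision, head_revision, known_revisions):
    """Return 'upgrade' or 'ok', or raise if the database revision is not HEAD."""
    if not versions_table_exists:
        # No version table: treat the results schema as clean and upgrade it.
        return "upgrade"
    if db_revision == head_revision:
        return "ok"
    if db_revision in known_revisions:
        raise RuntimeError(
            f"Results schema is at known revision {db_revision}, not HEAD "
            f"{head_revision}; upgrade the database when you have time."
        )
    raise RuntimeError(
        f"Results schema revision {db_revision} is unknown to this code; "
        "check out the branch that created it and downgrade from there."
    )


# Made-up revision identifiers for illustration.
fresh_database = check_results_schema(False, None, "rev_c", ["rev_a", "rev_b", "rev_c"])
current_database = check_results_schema(True, "rev_c", "rev_c", ["rev_a", "rev_b", "rev_c"])
```

Raising instead of upgrading silently leaves the decision about a potentially slow migration with the user.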
Commits on May 20, 2019
-
Add more user database management options to CLI [Resolves #697] (#699)
* Add more user database management options to CLI [Resolves #697]
In recent weeks/months, more operations on the results schema have proven to be things that are useful to 'users' (people who use the 'triage' command), not just 'developers' (people who use the 'manage' command). These include: stamping to a specific revision, downgrading, upgrading to a specific revision, and even just viewing the revision history. Here we allow the `triage db` command to interface with alembic to do these things.
Furthermore, the old 'stamp' logic in `triage db` isn't terribly useful now that we have been on alembic for a while, and pinning it to experiment config versions wasn't very useful. Using the standard alembic revisions for stamping I think makes more sense, but I copied the dictionary from before into the help text for 'stamp' because it could still be helpful.
- Modify old `triage db stamp` logic to use standard alembic revisions
- Enable `triage db upgrade` to take a revision (but default to HEAD)
- Add `triage db downgrade` that takes a revision
- Add `triage db history` to show revisions
Commit b29b3d0
-
Bias Part 1: Protected groups generator (#680)
Adds a bias_audit_config section to triage experiment config that supports:
- Users can specify the protected groups logic using a pre-computed table (from_obj_table) or a query (from_obj_query) that must contain entity_id, date and the attribute columns to generate the groups for bias audit using aequitas.
- Users must specify knowledge_date_column, entity_id_column and a list of attribute_columns, otherwise we would not be able to create the table without knowing which columns it has.
- The bias_audit_config is optional. If it is set, there is a protected_groups_table generator that is basically a replication of the labels generator.
- The protected groups table created is named protected_groups_{experiment_hash} and is the result of a left join of the cohort table with the from_obj specified by the user.
Commit e97aab1
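As a rough illustration of the rules listed above, a config check could look like the sketch below. The key names come from the commit message. Everything else is an assumption: the helper names, the rule that exactly one of from_obj_table and from_obj_query is given, and the example table and column names. Triage's own validation may differ.

```python
REQUIRED_KEYS = ("knowledge_date_column", "entity_id_column", "attribute_columns")


def validate_bias_audit_config(config):
    """Return a list of problems found in a bias_audit_config dict (empty if none)."""
    problems = []
    sources = [key for key in ("from_obj_table", "from_obj_query") if config.get(key)]
    if len(sources) != 1:
        # Assumption: a table or a query is given, but not both.
        problems.append("specify exactly one of from_obj_table or from_obj_query")
    for key in REQUIRED_KEYS:
        if not config.get(key):
            problems.append(f"missing required key: {key}")
    return problems


def protected_groups_table_name(experiment_hash):
    """Build the table name pattern given in the commit message."""
    return f"protected_groups_{experiment_hash}"


# Hypothetical config values for illustration.
example_config = {
    "from_obj_table": "semantic.demographics",
    "knowledge_date_column": "as_of_date",
    "entity_id_column": "entity_id",
    "attribute_columns": ["race", "sex"],
}
problems = validate_bias_audit_config(example_config)
```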
Commits on May 21, 2019
-
Add README.md to example/config/, explaining experiment.yaml, audition.yaml, postmodeling_config.yaml and postmodeling_crosstabs.yaml Remove feature.yaml and change documentation of feature-testing since cli.py just takes an experiment config.
Commit 5342254
Commits on May 24, 2019
-
Commit f763558
-
Merge pull request #701 from dssg/kit_caps_test
The pull request changes the functionality of the string_is_tablesafe validation primitive to only allow lowercase letters (or numbers, underscores) in strings it checks, and adds tests for feature aggregation prefixes and subset names, both of which will be used for table names. As described in #632, uppercase letters in these experiment config values end up getting lowercased on table creation but are referenced using their uppercase forms (with quotes) at various places in the code, causing postgres to return a "table does not exist" error. This PR also removes a redundant/conflicting dev.txt requirement of different versions of black, keeping the newer version.
Commit 3638f49
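A check with the behavior described here (lowercase letters, digits, and underscores only) could be sketched as below. This is not the source of string_is_tablesafe: the function name `is_tablesafe`, the regular expression, and the rejection of empty strings are assumptions for illustration.

```python
import re

# Lowercase ASCII letters, digits, and underscores only; at least one character.
TABLESAFE_PATTERN = re.compile(r"[a-z0-9_]+")


def is_tablesafe(value):
    """Return True if value contains only lowercase letters, digits, or underscores."""
    return bool(TABLESAFE_PATTERN.fullmatch(value))
```

For example, `is_tablesafe("events_1y")` is true, while `is_tablesafe("Events")` is false, because the uppercase form would later be quoted and fail to match the lowercased table name postgres created.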
Commits on May 30, 2019
-
Incorporates an Aequitas bias audit into Triage. The bias audit is optional and is controlled with experiment configuration. This is run during evaluation time and on each model. One dirtyduck config (inspections_dt) is updated with a sample bias audit config. To enable this, some requirements are updated so that Triage and Aequitas can coexist together more peacefully.
Commit e47f07a
-
Commit 91150d8
-
Scheduled monthly dependency update for May (#679)
* Pin ipython to latest version 7.5.0
* Pin ipython to latest version 7.5.0
* Pin jupyter to latest version 1.0.0
* Pin jupyter to latest version 1.0.0
* Pin sphinx to latest version 2.0.1
* Pin sphinx_rtd_theme to latest version 0.4.3
* Pin coverage to latest version 4.5.3
* Pin flake8 to latest version 3.7.7
* Pin mkdocs to latest version 1.0.4
* Pin tox to latest version 3.9.0
* Pin tox-pyenv to latest version 1.1.0
* Pin nose to latest version 1.3.7
* Pin mock to latest version 2.0.0
* Pin colorama to latest version 0.4.1
* Pin httpie to latest version 1.0.2
* Pin psycopg2-binary to latest version 2.8.2
* Update black from 18.9b0 to 19.3b0
* Pin mkdocs-material to latest version 4.2.0
* Update alembic from 1.0.8 to 1.0.10
* Update sqlalchemy from 1.3.1 to 1.3.3
* Update psycopg2-binary from 2.7.7 to 2.8.2
* Update boto3 from 1.9.125 to 1.9.139
* Update s3fs from 0.2.0 to 0.2.1
* Update ohio from 0.1.2 to 0.4.0
* Update moto from 1.3.7 to 1.3.8
* Update hypothesis from 4.14.2 to 4.18.3
* Update tox from 3.8.4 to 3.9.0
Commit be6e974
Commits on Jun 5, 2019
-
Commit b5e8407
-
Commit 18368c1
Commits on Aug 7, 2019
-
Commit 91ebcbd
Commits on Oct 7, 2019
-
Commit 258306b
-
Commit 1ce00fc
-
Commit 9299831
Commits on Oct 9, 2019
-
Commit 40b4b73
-
Commit d157ad6
Commits on Oct 10, 2019
-
Commit ba5b6c2
-
Commit 6c0da70