Initial talk outline
John Hawkins authored and John Hawkins committed Jul 27, 2023
1 parent fb4e4a7 commit d456b21
Showing 1 changed file with 8 additions and 0 deletions.
8 changes: 8 additions & 0 deletions docs/talk/talk.txt
@@ -0,0 +1,8 @@
Projit: An Open Source Tool for Data Science Project Management

Data science projects often require multiple rounds of experimentation, which demand careful discipline to prevent problems like over-fitting or p-hacking. They also require management of data, models and results, but the management process needs to be flexible enough to enable rapid and agile experimentation. In this talk we will introduce 'projit,' an open source utility for managing data projects through loose coupling of components in a Git-style command line interface.

Projit allows you to register data sets and access them in your experiments through the projit package. You then programmatically register each experiment, along with an arbitrary list of results and hyper-parameters for each experimental run. All of this information is stored in a central meta-data repository, which can be queried with the command line utility and committed to a git repository so that it preserves the meta-data history of your project.
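
To make the workflow above concrete, here is a minimal sketch of the idea in Python: a small JSON-backed store where data sets are registered by name and each experimental run is recorded with its results and hyper-parameters. This is an illustration only and is not projit's actual API. The class MetaStore, its methods, the file name meta.json and the example values are all hypothetical names chosen for this sketch.

```python
import json
import tempfile
from pathlib import Path


class MetaStore:
    """Hypothetical metadata store for illustration; NOT projit's real API."""

    def __init__(self, path):
        self.path = Path(path)
        if self.path.exists():
            self.data = json.loads(self.path.read_text())
        else:
            self.data = {"datasets": {}, "experiments": []}

    def _save(self):
        # Sorted, indented JSON keeps diffs readable when committed to git.
        self.path.write_text(json.dumps(self.data, indent=2, sort_keys=True))

    def register_dataset(self, name, location):
        self.data["datasets"][name] = location
        self._save()

    def get_dataset(self, name):
        return self.data["datasets"][name]

    def add_experiment(self, name, results, hyperparams):
        self.data["experiments"].append(
            {"name": name, "results": results, "hyperparams": hyperparams}
        )
        self._save()


def demo(store_path):
    store = MetaStore(store_path)
    store.register_dataset("train", "data/train.csv")
    # Experiment code looks the data up by name, not by a hard-coded path.
    train_path = store.get_dataset("train")
    store.add_experiment("baseline", {"rmse": 0.42}, {"max_depth": 3})
    # Reloading from disk shows that the record persists across runs.
    return train_path, MetaStore(store_path).data["experiments"]


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as tmp:
        print(demo(Path(tmp) / "meta.json"))
```

Because the store in this sketch is a single text file, committing it alongside the code gives a versioned history of every registered data set and experiment, which is the loose coupling the abstract describes.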


