Fixing references in JOSS draft

john-hawkins · Sep 29, 2024 · 000db06 · 000db06
1 parent 3377c28
commit 000db06
Showing 1 changed file with 21 additions and 21 deletions.
diff --git a/docs/paper/joss.md b/docs/paper/joss.md
@@ -41,25 +41,25 @@ and processes.
 
 Software approaches to managing scientific data, processes and meta-data are 
 typically either built as front-ends for specific 
-scientific domains [@Howe2008;Pettit:2010] (leveraging known analytical practices in
-the given domain) or they are designed to faciliate interoperability between different 
+scientific domains [@Howe2008;@Pettit:2010] 
+or they are designed to facilitate interoperability between different 
 technology stacks [@Subramanian2013]. Machine learning focused frameworks tend to 
 focus on solving problems of model training and deployment for specific 
-technologies\cite[@Alberti:2018;MolnerDomenech:2020], and hence have limited generality.
+technologies [@Alberti:2018;@MolnerDomenech:2020], and hence have limited generality.
 
 `Projit` is a Python package for managing data science project meta-data
-inside a simple local JSON store. It also provides a CLI tool for
+inside a simple local JSON store. It provides a CLI tool for
 interrogating this data so that the current state of a project can easily
 be assessed and understood. The API for `projit` was
 designed so that it can be included in arbitrary Python scripts to
 locate datasets, register experiments and store results along
 with hyper-parameters. 
 
-The `projit` datastore is light-weight enough that it can easily be stored
-with code inside a source code repository. Meaning that future users can
-interrogate the experiment history of the project. This is useful for both
+The `projit` datastore is light-weight so it can be saved
+with code inside a source code repository, allowing future users to
+interrogate the experiment history of the project. This is useful for
 project continuation, auditing/repeatability and opening the possibility
-of scripted meta-data analysis. The package has been
+of scripted meta-data analysis. The `projit` package has been
 used in a number of scientific publications to manage the results of 
 machine learning experiments into systematic reviews for biomedical
 projects [@Hawkins+Tivey:2024] and the analysis of text features derived 
@@ -80,12 +80,12 @@ generate standardised result sets for comparison.
 To facilitate loose coupling between stages of the project the `projit` utility
 imposes a simple schema for components of a data science project. These consist
 of:
-- Datasets
-- Experiments
-- Results
+* Datasets
+* Experiments
+* Results
 
 All of these entities can be added, removed or modified using either the CLI tool
-or the Python package within scripts. The relation of these components is depicted
+or the Python package within scripts. These entities in a project are depicted
 in \autoref{fig:projit}.
 
 ![Projit Application Entities.\label{fig:projit}](images/Projit_decoupled_process.drawio.png)
@@ -109,29 +109,29 @@ from anywhere inside the project without tracking the location of the root direc
 Secondly, we develop a sub-command structure that allows the `projit` CLI to be
 a versatile tool with something close to a natural language interface.
 For example, the primary command `list` can be applied to any of the `projit` 
-entities, as shown in the code listing below:
+entities, as shown in the command below:
 
 ```
-projit list datasets
-projit list experiments
-projit list results
+> projit list datasets
 ```
 
 The same principle applies to the remove and add commands, which naturally require
 additional parameters to specify what is being added or removed. The design goal 
-of the CLI is to make project intuitive without imposing arbitrary constraints.
+of the CLI is to make projit intuitive without imposing arbitrary constraints.
 
 # Research Applications
 
 The fundamental research application of `projit` is in managing the project lifecycle
-and efficiency of development. Results to all experiments can be tracked and 
-interrogated to easily produce tables of data. 
-An additional level of application comes with a focus
+and efficiency of development. Paths to datasets are retrieved from meta-data, not
+hard coded. Experiments are named, with execution times tracked. The results of
+all experiments can be tracked over each iteration, along with hyper-parameters, and
+interrogated to easily produce tables of data and analysis.
+An additional application comes with a focus
 on open science, allowing other teams to review and audit experiment history, 
 then easily repeat or extend experiments. 
 Finally, there is a research application in meta-analysis.
 Projects in which the projit meta-data are stored along with open source code can 
-be interrogated to look at the performance of certain techniques or algorithms across
+be analysed to look at the performance of certain techniques or algorithms across
 multiple projects.  
 
 # Acknowledgements