remove need for dea-env module, move some pip packages to conda env
Ariana Barzinpour committed Nov 22, 2023
1 parent 0c31644 commit 4f28d76
Showing 7 changed files with 32 additions and 70 deletions.
4 changes: 4 additions & 0 deletions docker/env.yaml
@@ -55,6 +55,7 @@ dependencies:
- distributed
- docutils
- ephem
- eodatasets3
- fiona
- Flask
- Flask-Babel
@@ -92,6 +93,9 @@ dependencies:
- nodejs
- numexpr
- numpy
- odc-algo
- odc-dscache
- odc-io
- ordered-set
- packaging
- pandas
4 changes: 0 additions & 4 deletions docker/requirements.txt
@@ -9,7 +9,6 @@ jupyter-nbextensions-configurator

# ODC/DEA: these are installed in builder stage
otps
eodatasets3

# Dale's s2cloudmask
# https://github.com/daleroberts/s2cloudmask
@@ -19,10 +18,7 @@ opencv-python-headless
opencv-contrib-python-headless

datacube[performance,s3]
odc-algo
odc-cloud[ASYNC]
odc-dscache
odc-io
odc-stac
odc-stats[ows]
odc-ui
49 changes: 16 additions & 33 deletions nci_environment/README.md
@@ -2,25 +2,22 @@

These scripts are used to update and deploy DEA modules on NCI.

There are two modules, with date-based version numbers:
The *dea* module, with a date-based version number, contains third party dependencies
of all of the DEA code, installed via a `conda` environment. This `conda` environment is the same
as the one for dea-sandbox.

1. The Python Environment module *dea-env*

* This contains third party dependencies of all of the DEA code, installed via
a `conda` environment.

2. A *dea* module, which depends on the _environment module_:
Additionally, the environment includes:

* [Open Data Cube Core](https://github.com/opendatacube/datacube-core/)

* [EO Datasets](https://github.com/GeoscienceAustralia/eo-datasets/)

* [Digital Earth AU](https://github.com/GeoscienceAustralia/digitalearthau/)

* [Data Cube Stats](https://github.com/GeoscienceAustralia/datacube-stats/)

* [Fractional Cover](https://github.com/GeoscienceAustralia/fc/)

* [Water Observation From Space](https://github.com/GeoscienceAustralia/wofs)

* Creates user accounts in the Production Database the first time it is
loaded by a user.

@@ -34,14 +31,11 @@ There are two modules, with date-based version numbers:

This will load the latest version of `dea/<build_date>` module.

It will also load `dea-env/<build_date>` which contains all of the software
dependencies for using DEA.

## Notes

Loading these module might conflict with other python modules you have loaded.
Loading this module might conflict with other python modules you have loaded.

The `dea-env` module will prevent conflicts with locally installed python packages by
The `dea` module will prevent conflicts with locally installed python packages by
changing `PYTHONUSERBASE` for each release;

pip install --user <package_name>
@@ -54,33 +48,23 @@ It includes a config file, which it specifies by setting the

# Maintainer Instructions

Only run these scripts from Raijin. We've seen filesystem sync issues when
Only run these scripts from Gadi. We've seen filesystem sync issues when
run from VDI.

module load python3/3.8.5
module load python3/3.10.4
pip3 install --user pyyaml jinja2

## Building a new _Environment Module_

It requires python 3.8+ and pyyaml. Run the following on raijin at the NCI:

$ module use /g/data/v10/public/modules/modulefiles/
$ module load python3/3.8.5
$ ./build_environment_module.py dea-env/modulespec.yaml

This will build a new environment module for today.

The module version number is the current date in format YYYYMMDD, as it is a snapshot
of all of our pip/conda dependencies on that date.

## Building a new _DEA Module_

A DEA module will specify one exact environment module.
It requires python 3.10+ and pyyaml. Run the following on gadi at the NCI:

$ module use /g/data/v10/public/modules/modulefiles/
$ module load python3/3.8.5
$ module load python3/3.10.4
$ ./build_environment_module.py dea/modulespec.yaml

The module version number is the current date in format YYYYMMDD, as it is a snapshot
of all of our pip/conda dependencies on that date.
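
As a minimal sketch of the date-based versioning described above: the helper name `module_version` is hypothetical and is not taken from `build_environment_module.py`, which fills in `module_version` through its own `date()` call. The only assumption carried over from the README is the `YYYYMMDD` format.

```python
from datetime import date


def module_version(today=None):
    """Return a YYYYMMDD version string (hypothetical helper, for illustration)."""
    # Fall back to the current date when no explicit date is supplied.
    today = today or date.today()
    return today.strftime("%Y%m%d")


# A fixed date keeps the example reproducible.
print(module_version(date(2023, 11, 22)))
```

Because the string is zero-padded and ordered year, month, day, sorting version directories as plain text also sorts them chronologically.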

## Updating the Default Version

Once a module has been tested and approved, it can be made the default.
@@ -108,14 +92,13 @@ Eg. For `dea` this is: `/g/data/v10/public/modules/modulefiles/dea/.version`
## Setup

Copy the 3 lines below and modify the VERSION value
to the dea and dea-env module version you would
to the dea module version you would
like the tests to be run on. Paste them in a brand
new shell session/terminal

VERSION="20230710"
module use /g/data/v10/public/modules/modulefiles
module load dea/$VERSION
module load dea-env/$VERSION

## Execution
On gadi, run the tests as follows:
27 changes: 2 additions & 25 deletions nci_environment/build_environment_module.py
@@ -9,21 +9,13 @@
- (opt) Conda environment to create
- (opt) Pip style requirements.txt to install to a directory
It requires python 3.9+ and pyyaml.
It requires python 3.10+ and pyyaml.
Use a qsub interactive copyq job on raijin with sufficient memory to run the following commands at the NCI:
New DEA-Env Module
$ module use /g/data/v10/public/modules/modulefiles/
$ module load python3/3.9.2
$ module load python3/3.10.4
# if pyyaml is not installed in gadi
$ pip install PyYAML --user
$ # Building a new Environment Module:
$ python3 build_environment_module.py dea-env/modulespec.yaml
New DEA Module
$ module use /g/data/v10/public/modules/modulefiles/
$ module load python3/3.9.2
$ # Building a new DEA Module
$ python3 build_environment_module.py dea/modulespec.yaml
@@ -361,20 +353,6 @@ def run_final_commands_on_module(commands, module_path):
run_command(cmd)


# def include_stable_module_dep_versions(config):
# """
# Include stable module dependency versions

# :param config: Dictionary of configuration variables
# :return: None
# """
# stable_module_deps = config.get("stable_module_deps", [])
# for dep in stable_module_deps:
# default_version = find_default_version(dep)
# dep = dep.replace("-", "_")
# config["variables"][f"fixed_{dep}"] = default_version


def main(config_path):
"""
Build new environment module
@@ -394,7 +372,6 @@ def main(config_path):
if "module_version" not in variables:
variables["module_version"] = date()
include_templated_vars(config)
# include_stable_module_dep_versions(config)

pre_check(config)
prep(config_path)
8 changes: 6 additions & 2 deletions nci_environment/dea/modulefile.template
@@ -39,6 +39,10 @@ prepend-path PYTHONPATH ${module_path}/share/qgis/python
remove-path PATH ${module_path}/bin
prepend-path PATH ${module_path}/bin

# Remove duplicate entries for python path and prepend again
remove-path PYTHONPATH ${python_path}
prepend-path PYTHONPATH ${python_path}

# To avoid user packages conflicting with Environment Module packages, point the PYTHONUSERBASE and PATH
# variables to point to a directory based on the Environment Module version which is loaded so that extra
# packages must be re-installed when a new dea module is released
@@ -55,8 +59,8 @@ if {[module-info mode load] && [is-loaded $$name/$$version]} {
}

# Remove duplicate entries for HOME dir and prepend at the top
remove-path PYTHONPATH ~/.dea-sandbox/${module_name}/${module_version}/local/lib/python3.9/site-packages
prepend-path PYTHONPATH ~/.dea-sandbox/${module_name}/${module_version}/local/lib/python3.9/site-packages
remove-path PYTHONPATH ~/.dea-sandbox/${module_name}/${module_version}/local/lib/python3.10/site-packages
prepend-path PYTHONPATH ~/.dea-sandbox/${module_name}/${module_version}/local/lib/python3.10/site-packages


#############################################################
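
The `remove-path` followed by `prepend-path` pairs in this template move an entry to the front of a path variable without leaving a duplicate behind. The sketch below only models that effect on a colon-separated string; it is not how Environment Modules implements these commands, and the function names and example paths are hypothetical.

```python
def remove_path(value, entry, sep=":"):
    """Drop every occurrence of `entry` (and empty segments) from a path string."""
    return sep.join(p for p in value.split(sep) if p and p != entry)


def prepend_path(value, entry, sep=":"):
    """Put `entry` at the front of a path string."""
    return entry if not value else f"{entry}{sep}{value}"


def move_to_front(value, entry):
    """Model of remove-path followed by prepend-path for a single entry."""
    return prepend_path(remove_path(value, entry), entry)


# Illustrative paths only; they are not the real module locations.
pythonpath = "/a/site-packages:/opt/dea/site-packages:/b/site-packages"
pythonpath = move_to_front(pythonpath, "/opt/dea/site-packages")
print(pythonpath)
```

Applying `move_to_front` a second time leaves the value unchanged, which is why reloading a module built this way should not accumulate repeated `PYTHONPATH` entries.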
6 changes: 4 additions & 2 deletions nci_environment/dea/modulespec.yaml
@@ -5,17 +5,19 @@ variables:
conda_path: "/g/data/v10/private/mambaforge/bin/mamba"
dbhost: dea-db.nci.org.au
dbport: 6432
python_version: 3.9
python_version: '3.10'

templated_variables:
module_path: "{modules_dir}/{module_name}/{module_version}"
python_path: "{modules_dir}/{module_name}/{module_version}/lib/python{python_version}/site-packages/"
dea_module: "{module_name}/{module_version}"
pip_path: "{modules_dir}/{module_name}/{module_version}/bin/pip3"

install_conda_packages: ../../docker/env.yaml

install_pip_packages:
pip_cmd: "module load python3/3.9.2; pip install --no-warn-script-location --prefix {module_path} --requirement requirements.txt; pip install --no-warn-script-location --prefix {module_path} --requirement requirements-private.txt"
# need to specify which python/pip to use otherwise it defaults to the incorrect one
pip_cmd: "{pip_path} install --no-warn-script-location --prefix {module_path} --requirement requirements.txt; {pip_path} install --no-warn-script-location --prefix {module_path} --requirement requirements-private.txt"

copy_files:
- src: ../../docker/env.yaml
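
To illustrate how the templated variables in this spec could resolve, here is a minimal sketch. It is not the repository's `include_templated_vars()`; the function `expand_templates`, the `modules_dir` value and the `module_version` value are assumptions made for the example, and `pip_cmd` is shortened and placed alongside the templated variables purely for convenience.

```python
def expand_templates(variables, templated_variables):
    """Format each template against the variables known so far (hypothetical helper)."""
    expanded = dict(variables)
    # Templates are resolved in order, so later ones may use earlier results.
    for name, template in templated_variables.items():
        expanded[name] = template.format(**expanded)
    return expanded


# Assumed values: modules_dir and module_version are not shown in the diff.
variables = {
    "modules_dir": "/g/data/v10/public/modules",
    "module_name": "dea",
    "module_version": "20231122",
    "python_version": "3.10",  # a string, matching the quoted YAML value
}
templated = {
    "module_path": "{modules_dir}/{module_name}/{module_version}",
    "python_path": "{modules_dir}/{module_name}/{module_version}"
                   "/lib/python{python_version}/site-packages/",
    "pip_path": "{modules_dir}/{module_name}/{module_version}/bin/pip3",
    # Shortened form of the pip_cmd entry, kept here only for illustration.
    "pip_cmd": "{pip_path} install --prefix {module_path} --requirement requirements.txt",
}

config = expand_templates(variables, templated)
print(config["python_path"])
print(config["pip_cmd"])

# An unquoted 3.10 would be read as a float, which drops the trailing zero.
print(str(float("3.10")))
```

The last line shows why the new value is quoted as `'3.10'`: a YAML float would turn into `3.1`, and `python_path` would then point at a `python3.1` directory. Calling pip through `{pip_path}` likewise pins the install to the module's own interpreter rather than whichever `pip` is first on `PATH`.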
4 changes: 0 additions & 4 deletions nci_environment/dea/requirements.txt
@@ -1,8 +1,4 @@
datacube
eodatasets3
odc-algo
odc-apps-dc-tools
odc-dscache
odc-geom
odc-io
odc-ui
