Update worker main() function docstrings #121

Open
douglatornell opened this issue Nov 18, 2022 · 0 comments
douglatornell commented Nov 18, 2022

Example of what we want:

"""For command-line usage see:

:command:`python -m nowcast.workers.make_averaged_dataset --help`
"""
  • Several workers are missing main() docstrings.
  • Most that have docstrings begin with an uninformative "Set up and run the worker." line.
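
A minimal sketch of a worker main() function that follows the docstring pattern above. The NowcastWorker class here is a stand-in stub written for this illustration, not the real framework class, and the success/failure helpers and the worker function body are hypothetical. Returning the worker follows the "Return worker so that modernized unit tests for main() work" note in the commits referenced on this issue.

```python
class NowcastWorker:
    """Stand-in stub for the framework's worker class (assumption for illustration)."""

    def __init__(self, name, description):
        self.name = name
        self.description = description
        self.ran = False

    def run(self, worker_func, success, failure):
        # The real class handles the command-line interface and messaging;
        # this stub only records that run() was called.
        self.ran = True


def success(parsed_args):
    """Hypothetical success message type function."""
    return "success"


def failure(parsed_args):
    """Hypothetical failure message type function."""
    return "failure"


def make_averaged_dataset(parsed_args, config, *args):
    """Hypothetical worker function; does nothing in this sketch."""
    return {}


def main():
    """For command-line usage see:

    :command:`python -m nowcast.workers.make_averaged_dataset --help`
    """
    worker = NowcastWorker("make_averaged_dataset", description=__doc__)
    worker.run(make_averaged_dataset, success, failure)
    # Returning the worker lets a unit test inspect how it was set up.
    return worker
```

With this shape, a unit test can call main() and make assertions about the returned worker, and the docstring points readers at the --help output instead of restating it.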
@douglatornell douglatornell added documentation Improvements or additions to documentation Workers labels Nov 18, 2022
@douglatornell douglatornell added this to the v22.1 milestone Nov 18, 2022
@douglatornell douglatornell self-assigned this Nov 18, 2022
@douglatornell douglatornell modified the milestones: v22.1, v23.1 Jan 10, 2023
@douglatornell douglatornell modified the milestones: v23.1, v23.2 Sep 28, 2023
@douglatornell douglatornell modified the milestones: v23.2, v24.1 Jan 8, 2024
douglatornell added a commit that referenced this issue Feb 3, 2024
Update docstring re: issue #121.

Return worker so that modernized unit tests for main() work; re issue #81.
douglatornell added a commit that referenced this issue Feb 29, 2024
… v3 (#234)

* Modernize test_get_onc_ctd

Replace unittest.mock.patch decorator with pytest.fixture for mock worker.

Add unit tests for production YAML config file elements related to worker;
re: issue #117.

Replace unittest.mock.patch decorator with pytest caplog fixture for tests of
logging; re: issue #82.

* Add unit test for YAML config file elements

Added unit test for production YAML configuration file ctd data observations
elements used in after_*() functions.

* Improve get_onc_ctd.main() function

Update docstring re: issue #121.

Return worker so that modernized unit tests for main() work; re issue #81.

* Update get_onc_ctd worker to use ONC API v3

Modified the get_onc_ctd worker to use ONC API v3 'getByLocation' method. The
new method uses a location code instead of a station name. Also, these changes
include adding a 'dateTo' parameter to limit the data to a specific time range
for the day. Additionally, small changes to variable and parameter names were
made to match the new method requirements.

* Remove test skipping due to resolved issue #174

The pytest skip marker for "_resample_nav_coord()" function in
"tests/workers/test_get_onc_ferry.py" file has been removed. This has happened
following the successful resolution of issue number #174, making it unnecessary
to skip these specific unit tests anymore.

Fixes issue #174

* Add unit tests for YAML config file elements

Add unit tests for production YAML config file elements related to worker;
re: issue #117.

* Update YAML config with Tsawwassen - Duke Point ferry route

The 'nowcast.yaml' configuration file has been updated to include the
Tsawwassen - Duke Point ferry route details including devices, sensors,
and other relevant data.

* Update dataset ID for TWDP-ferry

The dataset ID for TWDP-ferry was updated in the 'nowcast.yaml' configuration
file and in the corresponding test. The new ID matches the details for the
Tsawwassen - Duke Point ferry route.

* Update get_onc_ferry worker to use ONC API v3

Modified the get_onc_ferry worker to use ONC API v3 'getByLocation' method. The
new method uses a location code instead of a station name. Also, these changes
include adding a 'dateTo' parameter to limit the data to a specific time range
for the day. Server-side averaging into 1-second bins is used to ensure that an
entire day's observations from each sensor can be obtained by a single API
request; there is a limit of 100,000 "rows" per sensor. Additionally, small
changes to variable and parameter names were made to match the new method
requirements.

* Update ONC_data_product_url attr in ferry datasets

Updated the ONC_data_product_url attribute in the datasets produced by the
get_onc_ferry worker to reflect changes in the data source's domain name and API
query parameter names. Code changes include updating the ONC data API domain
name, adding the 'locationCode' query parameter, and changing the
'deviceCategory' query parameter name to 'deviceCategoryCode'. All of those
changes are for compatibility with the ONC data API v3.
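
A rough sketch of how a request URL with the query parameter names mentioned above might be assembled. The base URL, the 'token' and 'dateFrom' parameters, and the example location and device category codes are assumptions for illustration; they are not taken from the commit and should be checked against the ONC API documentation.

```python
from urllib.parse import urlencode

# Assumed endpoint for illustration only; not taken from the commit.
API_URL = "https://data.oceannetworks.ca/api/scalardata"


def build_query_url(token, location_code, device_category_code, date_from, date_to):
    """Build a 'getByLocation' request URL using the v3-style parameter names."""
    params = {
        "method": "getByLocation",
        "token": token,  # assumed parameter name
        "locationCode": location_code,
        "deviceCategoryCode": device_category_code,  # was 'deviceCategory'
        "dateFrom": date_from,  # assumed parameter name
        "dateTo": date_to,
    }
    return f"{API_URL}?{urlencode(params)}"


# Placeholder values; the codes below are hypothetical.
url = build_query_url(
    token="my-token",
    location_code="TWDP",
    device_category_code="TSG",
    date_from="2024-02-28T00:00:00.000Z",
    date_to="2024-02-29T00:00:00.000Z",
)
```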

* Correct representation of relative humidity attribute

Changed the naming representation of the relative humidity attribute in the
'get_onc_ferry' worker. The attribute was initially named "REL_HUMIDITY" and it
was renamed to "rel_humidity", matching the lowercase format of the other
attributes for consistency of naming convention.

* Update QA/QC filter criteria in get_onc_ferry worker

Adjust the QA/QC filter settings in the 'get_onc_ferry' worker to now include
data where the 'qaqcFlag' attribute is less than or equal to 1 or greater than
or equal to 7. This update is necessary because of the change to server-side
averaging in the request.
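
The filter criterion described above can be sketched in plain Python as follows. The function names and the list-based data layout are hypothetical; the worker itself may apply the same condition to arrays instead.

```python
def passes_qaqc(qaqc_flag):
    """Keep samples whose flag is <= 1 or >= 7, as described in the commit."""
    return qaqc_flag <= 1 or qaqc_flag >= 7


def filter_samples(values, qaqc_flags):
    """Return only the values whose QA/QC flag passes the criterion."""
    return [value for value, flag in zip(values, qaqc_flags) if passes_qaqc(flag)]


# Hypothetical sample values and flags
kept = filter_samples([10.0, 11.0, 12.0, 13.0, 14.0], [0, 1, 3, 4, 7])
```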

* Improve empty data array handling

ONC API v3 has more/different ways of returning empty responses to data requests
that we have to handle. It's not as clean as I would like, but it is good
enough for the purpose.
douglatornell added a commit that referenced this issue Mar 29, 2024
Removed non-informative "Set up and run the worker." line at the beginning.

re: issue #121
douglatornell added a commit that referenced this issue Mar 30, 2024
* Change NowcastWorker mock to pytest fixture

Test suite maintenance.

re: issue #81

* Update make_plots worker main() function docstring

Removed non-informative "Set up and run the worker." line at the beginning.

re: issue #121

* Change logging mocks to pytest caplog fixture

Replace unittest.mock.patch decorator with pytest caplog fixture for tests of
logging.

Test suite maintenance re: issue #82.

* Add unit tests for YAML config file elements

Add unit tests for production YAML config file elements related to worker;
re: issue #117.

* Remove unused ferry_data_dir configuration retrieval

The ferry_data_dir configuration was retrieved but not used in the make_plots.py
worker script. This change removes the unneeded line to tidy up the code and
avoid potential confusion in the future.

* Update to V21-11 dataset URLs in config & tests

The URLs for '3d tracer fields' and '3d biology fields' have been updated in the
test_make_plots.py and nowcast.yaml files.

* Rename physics dataset keys to '3d physics fields'

This commit changes the '3d tracer fields' key in the nowcast.yaml configuration
file, the make_plots.py worker and test_make_plots.py tests to
'3d physics fields'. The new key better reflects the relevant data source
URLs.

* Update zooplankton field var names in make_plots

Update zooplankton field variable names for the time series plot function in the
make_plots worker. Specifically, change "mesozooplankton" and "microzooplankton"
to "z1_zooplankton" and "z2_zooplankton". This update ensures consistency with
the V21-11 model output variable names.

* Replace Mesodinium rubrum w/ Diatoms in time series plots

In the 'make_plots' worker, the field variable 'mesodinium' was changed to
'diatoms' for the time series plots. The 'diatoms_flagellates_timeseries'
dictionary key is adjusted accordingly. This is necessary due to the removal of
the Mesodinium rubrum variable from the V21-11 model calculations and output.

* Add z1 & z2 zooplankton to color dict in website_theme

Two new types of zooplankton, 'z1_zooplankton' and 'z2_zooplankton', have been
added to the color dictionary of nowcast/figures/website_theme.py file. This
change would allow the correct color to be displayed for these new types in the
corresponding plots.
douglatornell added a commit that referenced this issue May 17, 2024
Removed non-informative "Set up and run the worker." line at the beginning.

re: issue #121
douglatornell added a commit that referenced this issue May 19, 2024
* Update make_plots worker main() function docstring

Removed non-informative "Set up and run the worker." line at the beginning.

re: issue #121

* Refactor to use textwrap for config file writing

This commit introduces the use of `textwrap.dedent` to maintain the
configuration file's content in the tests for the surface current tiles worker
module. Using `textwrap.dedent` makes the inline configuration text more
manageable and readable by removing the initial indentations.
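
A small sketch of the `textwrap.dedent` pattern described above, as it might appear in a test helper that writes a configuration file. The helper name and the YAML content are hypothetical and are not taken from the actual test module.

```python
import tempfile
import textwrap
from pathlib import Path


def write_config(path):
    """Write an indented inline config to path with the indentation removed."""
    config_text = textwrap.dedent(
        """\
        # hypothetical config content for illustration
        figures:
          surface current tiles:
            storage path: /tmp/tiles
        """
    )
    path.write_text(config_text)
    return config_text


with tempfile.TemporaryDirectory() as tmp_dir:
    config_file = Path(tmp_dir) / "nowcast.yaml"
    written = write_config(config_file)
    read_back = config_file.read_text()
```

The inline text can stay indented to match the surrounding code, while the file on disk starts at column zero and keeps the relative indentation that YAML depends on.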

* Change NowcastWorker mock to pytest fixture

Test suite maintenance.

re: issue #81

* Change logging mocks to pytest caplog fixture

Replace unittest.mock.patch decorator with pytest caplog fixture for tests of
logging.

Test suite maintenance re: issue #82.

* Replace `PyPDF2` package with `pypdf`

Updated the project dependencies and scripts to use the `pypdf` package instead
of `PyPDF2`. Also, adapted the code in `make_surface_current_tiles.py` to use
`pypdf`'s `PdfWriter` instead of `PyPDF2`'s `PdfFileMerger`.

This resolves the Jun-2023 security alert re: CVE-2023-36464 infinite loop
vulnerability and deprecation of PyPDF2.
douglatornell added a commit that referenced this issue Jun 6, 2024
Removed non-informative "Set up and run the worker." line at the beginning.

re: issue #121
douglatornell added a commit that referenced this issue Jun 6, 2024
* Update make_live_ocean_file workers main() function docstrings

Removed non-informative "Set up and run the worker." line at the beginning.

re: issue #121

* Update make_live_ocean_files re: single filepath

`salishsea_tools.LiveOcean_BCs.create_LiveOcean_TS_BCs()` returns a single
filepath, not a list of filepaths. As a result of this change, the related
logging message and checklist assignment have been updated and corresponding
test cases have been adjusted as well.
douglatornell added a commit that referenced this issue Jun 7, 2024
Removed non-informative "Set up and run the worker." line at the beginning.

re: issue #121
douglatornell added a commit that referenced this issue Jun 13, 2024
* Add h5netcdf pkg dependency

Hoping to improve the reliability of the make_ww3_current_file and
make_ww3_wind_file workers by changing them to use h5netcdf for dataset reads.

The change results in version updates for multiple dependencies in
requirements.txt, and inclusion of the h5netcdf package across multiple
environment files and the pyproject.toml file.

* Update make_ww3_*_file workers main() function docstrings

Removed non-informative "Set up and run the worker." line at the beginning.

re: issue #121

* Update wind file generation re: unneeded variables

Refactored the WWatch3 wind file generation to remove unnecessary variables and
improved related tests. The code now drops more unneeded variables to reduce
the memory load. The corresponding tests are also enhanced to accurately
represent these changes.

* Switch to h5netcdf engine for xarray dataset reads

The h5netcdf engine has been set as the engine for opening datasets in
'make_ww3_wind_file.py'. The intent is to avoid the netcdf4 package
thread-safety issues. The corresponding test cases in
'test_make_ww3_wind_file.py' have also been updated to reflect this change.

* Increase wind time_counter chunk size from 1 to 24

The chunk size for the time_counter variable in the make_ww3_wind_file worker
and corresponding tests has been increased. This change is anticipated to
improve efficiency by processing data in larger batches.

* Use dask processes for wwatch3 wind file generation

The wind file creation process for the WWatch3 model has been updated to use
processes rather than the default threads for the dask scheduler. Processes have
been found to be more reliable for dask operations on the types of workloads we
use in SalishSeaCast.

* Use netcdf4 to save dataset in make_ww3_wind_file

Explicitly use netcdf4 as the engine for dataset writing. This avoids
incompatibilities in the resulting file that arise if it is written using
h5netcdf.

* Update make_ww3_current_file re: unneeded variables

Refactored the WWatch3 current file generation to remove unnecessary variables
and improved related tests. The code now drops more unneeded variables to reduce
the memory load. The corresponding tests are also enhanced to accurately
represent these changes.

* Switch to h5netcdf engine for xarray dataset reads

The h5netcdf engine has been set as the engine for opening datasets in
'make_ww3_current_file.py'. The intent is to avoid the netcdf4 package
thread-safety issues. The corresponding test cases in
'test_make_ww3_current_file.py' have also been updated to reflect this change.

* Decrease current time_counter chunk size to 1

The chunk size for the time_counter variable in the make_ww3_current_file
worker has been decreased from 3 to 1. Testing showed that the smaller chunk size
resulted in slightly faster processing.

* Use dask processes for wwatch3 currents file generation

The currents file creation process for the WWatch3 model has been updated to use
processes rather than the default threads for the dask scheduler. Processes have
been found to be more reliable for dask operations on the types of workloads we
use in SalishSeaCast.

* Use netcdf4 to save dataset in make_ww3_current_file

Explicitly use netcdf4 as the engine for dataset writing. This avoids
incompatibilities in the resulting file that arise if it is written using
h5netcdf.
@douglatornell douglatornell modified the milestones: v24.1, v24.2 Jun 13, 2024
douglatornell added a commit that referenced this issue Jul 16, 2024
Remove non-informative "Set up and run the worker." line at the beginning.

re: issue #121
douglatornell added a commit that referenced this issue Jul 16, 2024
* Add get_vfpa_hadcp after 06 weather collection

Collect the previous day's observations from the VFPA HADCP located at the 2nd
Narrows railway bridge early in the morning after the 06Z weather forecast
products have been downloaded. This restores daily collection of the HADCP obs
that were inadvertently stopped when we stopped running the VHFR FVCOM model in
Mar-2023.

* Update get_vfpa_hadcp main() function docstring

Remove non-informative "Set up and run the worker." line at the beginning.

re: issue #121

* Change NowcastWorker mock to pytest fixture

Test suite maintenance.

re: issue #81

* Change logging mocks to pytest caplog fixture

Replace unittest.mock.patch decorator with pytest caplog fixture for tests of
logging.

Test suite maintenance re: issue #82.

* Modernize unit tests for get_vfpa_hadcp worker

Modified the unit tests in test_get_vfpa_hadcp.py to use pytest fixtures and
monkeypatching for more accurate and isolated tests. This includes new mocks
for make_hour_dataset and write_netcdf, as well as changes in logging and
managing file paths. Particular attention was given to ensuring accurate
capture of log messages.