This will address the issue raised in #8. The problem there is that netCDF really doesn't like "fancy indexing" (see e.g. here), where an attempt is made to access many separate small areas in the file (e.g. a list of separated points). This is even more of a problem when the file is served by a remote THREDDS server, as reported in the issue. If `max_bytes` is set large enough in the call to `get_value_at_coords`, the server times out attempting the fancy indexing. The only alternative is to set a small maximum request size that fetches a few points per request, which is also quite slow.

This PR changes this by indexing the dataset with contiguous slices when `max_bytes` allows, which is processed much, much faster. I tested with the notebook in `examples/2_geophys_netcdf_grid_utils_demo.ipynb` (this uses the same dataset referenced in #8) and was able to retrieve 466 points at 10 km spacing in less than 2 seconds with a fast connection to NCI and `max_bytes=50000000` (50 MB), compared to a minimum of 6.9 seconds with the current implementation using the minimum request size of `max_bytes=1`.

The changes aren't quite ready yet, because the computation of the slice indices assumes that the list of points is "sorted" in such a way that the rectangle bounded by the `i`th point and the `j`th point is entirely contained in the rectangle bounded by the `i`th point and the `j+1`th point. This will probably hold for lists of points that lie along an almost straight line, but not otherwise.
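
To illustrate the idea of replacing fancy indexing with contiguous slices, here is a minimal sketch. It is not the code in this PR: the function name `read_points_by_slices` is hypothetical, a NumPy array stands in for the netCDF variable, and the points are assumed to be given as integer row/column indices. Each chunk of consecutive points is fetched with one rectangular slice whose size stays within `max_bytes`, and the point values are then picked out of the in-memory block. Because it tracks a running minimum and maximum for the bounding box, this sketch does not rely on the nested-rectangle ordering described above, although badly ordered points will still produce many small requests.

```python
import numpy as np


def read_points_by_slices(var, rows, cols, max_bytes):
    """Return var[rows[k], cols[k]] for every k, using only contiguous slices.

    `var` is any 2D array-like with a `dtype` that supports basic slicing
    (a NumPy array here; a netCDF variable is the assumed real use case).
    """
    rows = np.asarray(rows)
    cols = np.asarray(cols)
    itemsize = np.dtype(var.dtype).itemsize
    out = np.empty(len(rows), dtype=var.dtype)

    start = 0
    while start < len(rows):
        # Bounding box of the current chunk, initially a single point.
        r0 = r1 = rows[start]
        c0 = c1 = cols[start]
        end = start + 1
        # Grow the chunk while its bounding box still fits in max_bytes.
        while end < len(rows):
            nr0, nr1 = min(r0, rows[end]), max(r1, rows[end])
            nc0, nc1 = min(c0, cols[end]), max(c1, cols[end])
            if (nr1 - nr0 + 1) * (nc1 - nc0 + 1) * itemsize > max_bytes:
                break
            r0, r1, c0, c1 = nr0, nr1, nc0, nc1
            end += 1
        # One contiguous request for the whole rectangle.
        block = np.asarray(var[r0:r1 + 1, c0:c1 + 1])
        # Fancy indexing is cheap once the block is in memory.
        out[start:end] = block[rows[start:end] - r0, cols[start:end] - c0]
        start = end
    return out


# Small demonstration on an in-memory grid standing in for a remote dataset.
grid = np.arange(100 * 100, dtype=np.float32).reshape(100, 100)
line_rows = np.arange(5, 55)      # points along a roughly straight line
line_cols = np.arange(10, 60)
values = read_points_by_slices(grid, line_rows, line_cols, max_bytes=50_000)
```

With a large `max_bytes` the whole line is covered by a single rectangular read; with `max_bytes=1` the loop degrades to one request per point, which mirrors the two extremes compared in the timings above.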