Cannot access compound dataset which contains array of enum #39

Open
tmick0 opened this issue Nov 10, 2017 · 2 comments
@tmick0

tmick0 commented Nov 10, 2017

I am trying to access a dataset which contains an enum array via h5serv, but h5pyd throws the following exception:

  File "$HOME/project/venv/lib/python2.7/site-packages/h5pyd-0.2.6-py2.7.egg/h5pyd/_hl/group.py", line 335, in __getitem__
    tgt = getObjByUuid(link_json['collection'], link_json['id'])
  File "$HOME/project/venv/lib/python2.7/site-packages/h5pyd-0.2.6-py2.7.egg/h5pyd/_hl/group.py", line 311, in getObjByUuid
    tgt = Dataset(DatasetID(self, dataset_json))
  File "$HOME/project/venv/lib/python2.7/site-packages/h5pyd-0.2.6-py2.7.egg/h5pyd/_hl/dataset.py", line 416, in __init__
    self._dtype = createDataType(self.id.type_json)
  File "$HOME/project/venv/lib/python2.7/site-packages/h5pyd-0.2.6-py2.7.egg/h5pyd/_hl/h5type.py", line 725, in createDataType
    dt = createDataType(field['type'])  # recursive call
  File "$HOME/project/venv/lib/python2.7/site-packages/h5pyd-0.2.6-py2.7.egg/h5pyd/_hl/h5type.py", line 732, in createDataType
    dtRet = createBaseDataType(typeItem)  # create non-compound dt
  File "$HOME/project/venv/lib/python2.7/site-packages/h5pyd-0.2.6-py2.7.egg/h5pyd/_hl/h5type.py", line 638, in createBaseDataType
    raise TypeError("Array Type base type must be integer, float, or string")
TypeError: Array Type base type must be integer, float, or string

We can create a minimal dataset to reproduce the error using h5py as follows:

import h5py
import numpy as np

f = h5py.File('test.h5', 'w')
enum_type = h5py.special_dtype(enum=('i', {"FOO": 0, "BAR": 1, "BAZ": 2}))
comp_type = np.dtype([('my_enum_array', enum_type, 10), ('my_int', 'i'), ('my_string', np.str_, 32)])
dataset = f.create_dataset("test", (4,), comp_type)
f.close()

We then put it in h5serv's data directory and try to access it:

import h5pyd
f = h5pyd.File("test.hdfgroup.org", endpoint="http://127.0.0.1:5000")
print(f['test'])

This yields the above exception. Note that we are able to access the dataset as expected using regular h5py.

Applying the following patch to h5pyd prevents the exception and returns a dataset; however, it doesn't seem to give the correct behavior (the enum array seems to be treated as an int array):

diff --git a/h5pyd/_hl/h5type.py b/h5pyd/_hl/h5type.py
index 4ce6cb4..10ce562 100644
--- a/h5pyd/_hl/h5type.py
+++ b/h5pyd/_hl/h5type.py
@@ -637 +637 @@ def createBaseDataType(typeItem):
-            if arrayBaseType["class"] not in ('H5T_INTEGER', 'H5T_FLOAT', 'H5T_STRING'):
+            if arrayBaseType["class"] not in ('H5T_INTEGER', 'H5T_FLOAT', 'H5T_STRING', 'H5T_ENUM'):

I'm not sure how to properly proceed in working around this. Thanks in advance for your advice.
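For illustration only, here is a minimal sketch of what "correct behavior" might look like for an array-of-enum field: the name-to-value mapping is attached to the base dtype as metadata, and the array shape is wrapped around it. The function `array_of_enum_dtype`, the key names in `type_json`, and the fixed `"i4"` base are all assumptions made for this example; this is not h5pyd's actual code, and the type description is not confirmed h5serv output.

```python
import numpy as np

def array_of_enum_dtype(type_json):
    """Build a NumPy subarray dtype from a hypothetical array-of-enum description."""
    base_json = type_json["base"]
    if base_json["class"] != "H5T_ENUM":
        raise TypeError("expected an enum base type")
    # Attach the name -> value mapping as dtype metadata under the "enum" key.
    # The 32-bit integer base is an assumption for this sketch.
    enum_dt = np.dtype("i4", metadata={"enum": dict(base_json["mapping"])})
    # Wrap the enum dtype in a fixed-shape subarray dtype.
    return np.dtype((enum_dt, tuple(type_json["dims"])))

# Hypothetical type description; key names are assumed for illustration.
type_json = {
    "class": "H5T_ARRAY",
    "dims": [10],
    "base": {"class": "H5T_ENUM", "mapping": {"FOO": 0, "BAR": 1, "BAZ": 2}},
}

dt = array_of_enum_dtype(type_json)
print(dt.shape)
print(dt.base.metadata)
```

With a dtype built this way, the mapping stays reachable through `dt.base`, whereas simply admitting `H5T_ENUM` in the class check (as in the patch above) would presumably leave a plain integer array.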

@jreadey
Member

jreadey commented Nov 15, 2017

Hi, it looks like the test coverage for enum types is pretty thin - we'll want to beef this up.

I'm a bit confused by what I see when using plain h5py with your HDF5 file.
If I do this:

f = h5py.File("test.h5", 'r')
dset = f['test']
print(dset.dtype)
dt = dset.dtype["my_enum_array"]
print("enum dt: {}".format(dt))
print(h5py.check_dtype(enum=dt))

I'm getting "None" for the last output line. Is this what you see?

@tmick0
Author

tmick0 commented Nov 15, 2017

Yes, it seems that the metadata is lost if we access it that way. However, if I write f['test']['my_enum_array'].dtype.metadata (or equivalently, h5py.check_dtype(enum=f['test']['my_enum_array'].dtype)), the enum dictionary is retrieved as expected. This is pretty confusing behavior indeed.
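One possible explanation, sketched below with plain NumPy: a field declared with a shape gets a subarray dtype, and that wrapper carries no metadata of its own, while its `.base` still does. The sketch assumes the enum mapping is stored in dtype metadata under the key `"enum"`; the helper `enum_of` is hypothetical and not part of h5py or h5pyd.

```python
import numpy as np

# Stand-in for the enum dtype: an int32 whose metadata holds the mapping
# (assumption: this mirrors how the enum mapping is attached to the dtype).
enum_type = np.dtype("i4", metadata={"enum": {"FOO": 0, "BAR": 1, "BAZ": 2}})
comp_type = np.dtype([("my_enum_array", enum_type, 10), ("my_int", "i")])

# The field dtype is a subarray dtype that wraps the enum dtype.
field_dt = comp_type["my_enum_array"]
print(field_dt.metadata)       # metadata of the subarray wrapper
print(field_dt.base.metadata)  # metadata of the wrapped element dtype

def enum_of(dt):
    """Hypothetical helper: look through a subarray dtype for an enum mapping."""
    md = dt.base.metadata  # .base is the dtype itself when there is no subarray
    if md is not None and "enum" in md:
        return dict(md["enum"])
    return None

print(enum_of(field_dt))
```

If this is what is happening, passing `dt.base` instead of `dt` to `h5py.check_dtype(enum=...)` in the snippet above may recover the mapping, though I have not verified that against the file.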
