HTTP Error 301 #5360

Open
yepw opened this issue Apr 10, 2024 · 2 comments
Labels
bug Something isn't working

Comments

yepw commented Apr 10, 2024

Short description
Loading the dataset "berkeley_autolab_ur5" fails with an HTTP 301 error. I am not running on Colab; I am trying to download the dataset locally. I tried to disable GCS following the comments here.

Environment information

  • Operating System: Ubuntu 20.04

  • Python version: 3.11.8

  • tensorflow-datasets/tfds-nightly version: 4.9.4+nightly

  • tensorflow/tf-nightly version: 2.17.0-dev20240409

  • Does the issue still exist with the latest tfds-nightly package (pip install --upgrade tfds-nightly)? Yes

Reproduction instructions

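import os
import tensorflow_datasets as tfds

# Attempted workaround: disable GCS access and skip the GCE metadata check.
tfds.core.utils.gcs_utils._is_gcs_disabled = True
os.environ['NO_GCE_CHECK'] = 'true'

# Bind the result to a new name so the tfds module is not shadowed.
ds = tfds.load('berkeley_autolab_ur5')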

Link to logs

2024-04-10 12:13:41.581284: I tensorflow/core/util/port.cc:153] oneDNN custom operations are on. You may see slightly different numerical results due to floating-point round-off errors from different computation orders. To turn them off, set the environment variable `TF_ENABLE_ONEDNN_OPTS=0`.
2024-04-10 12:13:41.581593: I external/local_xla/xla/tsl/cuda/cudart_stub.cc:32] Could not find cuda drivers on your machine, GPU will not be used.
2024-04-10 12:13:41.583676: I external/local_xla/xla/tsl/cuda/cudart_stub.cc:32] Could not find cuda drivers on your machine, GPU will not be used.
2024-04-10 12:13:41.610749: I tensorflow/core/platform/cpu_feature_guard.cc:210] This TensorFlow binary is optimized to use available CPU instructions in performance-critical operations.
To enable the following instructions: AVX2 AVX512F AVX512_VNNI FMA, in other operations, rebuild TensorFlow with the appropriate compiler flags.
2024-04-10 12:13:42.168805: W tensorflow/compiler/tf2tensorrt/utils/py_utils.cc:38] TF-TRT Warning: Could not find TensorRT
2024-04-10 12:13:42.461787: I external/local_tsl/tsl/platform/cloud/google_auth_provider.cc:180] Attempting an empty bearer token since no token was retrieved from files, and GCE metadata check was skipped.
2024-04-10 12:13:44.161517: I external/local_tsl/tsl/platform/cloud/google_auth_provider.cc:180] Attempting an empty bearer token since no token was retrieved from files, and GCE metadata check was skipped.
2024-04-10 12:13:46.965011: I external/local_tsl/tsl/platform/cloud/google_auth_provider.cc:180] Attempting an empty bearer token since no token was retrieved from files, and GCE metadata check was skipped.
2024-04-10 12:13:51.745913: I external/local_tsl/tsl/platform/cloud/google_auth_provider.cc:180] Attempting an empty bearer token since no token was retrieved from files, and GCE metadata check was skipped.
2024-04-10 12:13:59.867438: I external/local_tsl/tsl/platform/cloud/google_auth_provider.cc:180] Attempting an empty bearer token since no token was retrieved from files, and GCE metadata check was skipped.
2024-04-10 12:14:16.302598: I external/local_tsl/tsl/platform/cloud/google_auth_provider.cc:180] Attempting an empty bearer token since no token was retrieved from files, and GCE metadata check was skipped.
2024-04-10 12:14:49.168077: I external/local_tsl/tsl/platform/cloud/google_auth_provider.cc:180] Attempting an empty bearer token since no token was retrieved from files, and GCE metadata check was skipped.
2024-04-10 12:15:22.071999: I external/local_tsl/tsl/platform/cloud/google_auth_provider.cc:180] Attempting an empty bearer token since no token was retrieved from files, and GCE metadata check was skipped.
2024-04-10 12:15:54.555505: I external/local_tsl/tsl/platform/cloud/google_auth_provider.cc:180] Attempting an empty bearer token since no token was retrieved from files, and GCE metadata check was skipped.
2024-04-10 12:16:26.999251: I external/local_tsl/tsl/platform/cloud/google_auth_provider.cc:180] Attempting an empty bearer token since no token was retrieved from files, and GCE metadata check was skipped.
2024-04-10 12:16:59.968858: I external/local_tsl/tsl/platform/cloud/google_auth_provider.cc:180] Attempting an empty bearer token since no token was retrieved from files, and GCE metadata check was skipped.
Traceback (most recent call last):
  File "/home/yeping/anaconda3/envs/tf-n/lib/python3.11/site-packages/tensorflow_datasets/core/utils/py_utils.py", line 436, in try_reraise
    yield
  File "/home/yeping/anaconda3/envs/tf-n/lib/python3.11/site-packages/tensorflow_datasets/core/load.py", line 222, in builder
    return cls(**builder_kwargs)  # pytype: disable=not-instantiable
           ^^^^^^^^^^^^^^^^^^^^^
  File "/home/yeping/anaconda3/envs/tf-n/lib/python3.11/site-packages/tensorflow_datasets/core/logging/__init__.py", line 288, in decorator
    return function(*args, **kwargs)
           ^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/home/yeping/anaconda3/envs/tf-n/lib/python3.11/site-packages/tensorflow_datasets/core/dataset_builder.py", line 1370, in __init__
    super().__init__(**kwargs)
  File "/home/yeping/anaconda3/envs/tf-n/lib/python3.11/site-packages/tensorflow_datasets/core/logging/__init__.py", line 288, in decorator
    return function(*args, **kwargs)
           ^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/home/yeping/anaconda3/envs/tf-n/lib/python3.11/site-packages/tensorflow_datasets/core/dataset_builder.py", line 287, in __init__
    self.info.initialize_from_bucket()
    ^^^^^^^^^
  File "/home/yeping/anaconda3/envs/tf-n/lib/python3.11/site-packages/tensorflow_datasets/core/logging/__init__.py", line 168, in __call__
    return function(*args, **kwargs)
           ^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/home/yeping/anaconda3/envs/tf-n/lib/python3.11/site-packages/tensorflow_datasets/core/dataset_builder.py", line 482, in info
    info = self._info()
           ^^^^^^^^^^^^
  File "/home/yeping/anaconda3/envs/tf-n/lib/python3.11/site-packages/tensorflow_datasets/robotics/dataset_importer_builder.py", line 82, in _info
    features = self.get_ds_builder().info.features
               ^^^^^^^^^^^^^^^^^^^^^
  File "/home/yeping/anaconda3/envs/tf-n/lib/python3.11/site-packages/tensorflow_datasets/robotics/dataset_importer_builder.py", line 149, in get_ds_builder
    ds_builder = tfds.builder_from_directory(ds_location)
                 ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/home/yeping/anaconda3/envs/tf-n/lib/python3.11/site-packages/tensorflow_datasets/core/read_only_builder.py", line 150, in builder_from_directory
    return ReadOnlyBuilder(builder_dir=builder_dir)
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/home/yeping/anaconda3/envs/tf-n/lib/python3.11/site-packages/tensorflow_datasets/core/logging/__init__.py", line 288, in decorator
    return function(*args, **kwargs)
           ^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/home/yeping/anaconda3/envs/tf-n/lib/python3.11/site-packages/tensorflow_datasets/core/read_only_builder.py", line 66, in __init__
    info_proto = dataset_info.read_proto_from_builder_dir(builder_dir)
                 ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/home/yeping/anaconda3/envs/tf-n/lib/python3.11/site-packages/tensorflow_datasets/core/dataset_info.py", line 1059, in read_proto_from_builder_dir
    return read_from_json(info_path)
           ^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/home/yeping/anaconda3/envs/tf-n/lib/python3.11/site-packages/tensorflow_datasets/core/dataset_info.py", line 1035, in read_from_json
    json_str = epath.Path(path).read_text()
               ^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/home/yeping/anaconda3/envs/tf-n/lib/python3.11/site-packages/etils/epath/abstract_path.py", line 157, in read_text
    return f.read()
           ^^^^^^^^
  File "/home/yeping/anaconda3/envs/tf-n/lib/python3.11/site-packages/tensorflow/python/lib/io/file_io.py", line 118, in read
    length = self.size() - self.tell()
             ^^^^^^^^^^^
  File "/home/yeping/anaconda3/envs/tf-n/lib/python3.11/site-packages/tensorflow/python/lib/io/file_io.py", line 96, in size
    return stat(self.__name).length
           ^^^^^^^^^^^^^^^^^
  File "/home/yeping/anaconda3/envs/tf-n/lib/python3.11/site-packages/tensorflow/python/lib/io/file_io.py", line 908, in stat
    return stat_v2(filename)
           ^^^^^^^^^^^^^^^^^
  File "/home/yeping/anaconda3/envs/tf-n/lib/python3.11/site-packages/tensorflow/python/lib/io/file_io.py", line 924, in stat_v2
    return _pywrap_file_io.Stat(compat.path_to_str(path))
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
tensorflow.python.framework.errors_impl.AbortedError: All 10 retry attempts failed. The last failure: Error executing an HTTP request: HTTP response code 301 with body '<HTML><HEAD><meta http-equiv="content-type" content="text/html;charset=utf-8">
<TITLE>301 Moved</TITLE></HEAD><BODY>
<H1>301 Moved</H1>
The document has moved
<A HREF="https://www.googleapis.com/storage/v1/b/gresearch/o/robotics%2Fberkeley_autolab_ur5%2F0.1.0%2Fdataset_info.json?fields=size%2Cgeneration%2Cupdated">here</A>.
</BODY></HTML>
'
	 when reading metadata of gs://gresearch/robotics/berkeley_autolab_ur5/0.1.0/dataset_info.json

The above exception was the direct cause of the following exception:

Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
  File "/home/yeping/anaconda3/envs/tf-n/lib/python3.11/site-packages/tensorflow_datasets/core/logging/__init__.py", line 168, in __call__
    return function(*args, **kwargs)
           ^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/home/yeping/anaconda3/envs/tf-n/lib/python3.11/site-packages/tensorflow_datasets/core/load.py", line 643, in load
    dbuilder = _fetch_builder(
               ^^^^^^^^^^^^^^^
  File "/home/yeping/anaconda3/envs/tf-n/lib/python3.11/site-packages/tensorflow_datasets/core/load.py", line 498, in _fetch_builder
    return builder(name, data_dir=data_dir, try_gcs=try_gcs, **builder_kwargs)
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/home/yeping/anaconda3/envs/tf-n/lib/python3.11/contextlib.py", line 81, in inner
    return func(*args, **kwds)
           ^^^^^^^^^^^^^^^^^^^
  File "/home/yeping/anaconda3/envs/tf-n/lib/python3.11/site-packages/tensorflow_datasets/core/logging/__init__.py", line 168, in __call__
    return function(*args, **kwargs)
           ^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/home/yeping/anaconda3/envs/tf-n/lib/python3.11/site-packages/tensorflow_datasets/core/load.py", line 219, in builder
    with py_utils.try_reraise(
  File "/home/yeping/anaconda3/envs/tf-n/lib/python3.11/contextlib.py", line 158, in __exit__
    self.gen.throw(typ, value, traceback)
  File "/home/yeping/anaconda3/envs/tf-n/lib/python3.11/site-packages/tensorflow_datasets/core/utils/py_utils.py", line 438, in try_reraise
    reraise(e, *args, **kwargs)
  File "/home/yeping/anaconda3/envs/tf-n/lib/python3.11/site-packages/tensorflow_datasets/core/utils/py_utils.py", line 405, in reraise
    raise exception from e
RuntimeError: AbortedError: Failed to construct dataset "berkeley_autolab_ur5", builder_kwargs "{'data_dir': None}": All 10 retry attempts failed. The last failure: Error executing an HTTP request: HTTP response code 301 with body '<HTML><HEAD><meta http-equiv="content-type" content="text/html;charset=utf-8">
<TITLE>301 Moved</TITLE></HEAD><BODY>
<H1>301 Moved</H1>
The document has moved
<A HREF="https://www.googleapis.com/storage/v1/b/gresearch/o/robotics%2Fberkeley_autolab_ur5%2F0.1.0%2Fdataset_info.json?fields=size%2Cgeneration%2Cupdated">here</A>.
</BODY></HTML>
'
	 when reading metadata of gs://gresearch/robotics/berkeley_autolab_ur5/0.1.0/dataset_info.json

Expected behavior
It starts downloading the dataset.

yepw added the bug label Apr 10, 2024
marcenacp (Collaborator) commented

It seems you cannot disable GCS for this dataset as it downloads all files from the buckets (source). Can you download the dataset from GCS and build it using Beam?
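One possible way to follow this suggestion is sketched below. It is not the Beam route mentioned above: it assumes the bucket directory already holds a prepared TFDS dataset (the traceback shows TFDS calling `tfds.builder_from_directory` on it) and that the directory has been copied locally beforehand, for example with `gsutil -m cp -r`. The helper names `local_builder_dir` and `load_local_copy`, the local directory layout, and the `train` split name are assumptions made for illustration.

```python
from pathlib import Path

GCS_PREFIX = "gs://"


def local_builder_dir(gcs_dir: str, local_root: str) -> Path:
    """Map gs://bucket/a/b to <local_root>/bucket/a/b (hypothetical layout)."""
    if not gcs_dir.startswith(GCS_PREFIX):
        raise ValueError(f"not a GCS path: {gcs_dir!r}")
    parts = gcs_dir[len(GCS_PREFIX):].strip("/").split("/")
    return Path(local_root).joinpath(*parts)


def load_local_copy(builder_dir: Path, split: str = "train"):
    """Read an already-prepared dataset from a local directory.

    Assumes the directory was copied from the bucket first, e.g.
      gsutil -m cp -r gs://gresearch/robotics/berkeley_autolab_ur5/0.1.0 <parent dir>
    and that it contains dataset_info.json plus the data shards.
    """
    # Imported lazily so the path helper works without TensorFlow installed.
    import tensorflow_datasets as tfds

    builder = tfds.builder_from_directory(str(builder_dir))
    return builder.as_dataset(split=split)
```

With the files in place, `load_local_copy(local_builder_dir("gs://gresearch/robotics/berkeley_autolab_ur5/0.1.0", "/data"))` would read from `/data/gresearch/robotics/berkeley_autolab_ur5/0.1.0` without touching GCS at load time.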

lgeiger (Collaborator) commented Jun 14, 2024

I am running into the same issue with the latest TF nightly, but this seems unrelated to TFDS. I opened an upstream TF issue for this: tensorflow/tensorflow#69789.

@marcenacp Could you have a look at the upstream issue and forward it to the relevant TF team?
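To check the object independently of both TF and TFDS, the sketch below rebuilds the URL that the 301 response in the logs points to and probes it with the Python standard library only. The names `gcs_metadata_url` and `probe` are hypothetical helpers, and the URL shape is taken from the redirect target in the log above rather than from any TFDS code.

```python
from urllib.error import HTTPError
from urllib.parse import quote, urlencode
from urllib.request import Request, urlopen

API_ROOT = "https://www.googleapis.com/storage/v1"


def gcs_metadata_url(gs_path: str, fields: str = "size,generation,updated") -> str:
    """Build the object-metadata URL for a gs:// path.

    The shape mirrors the redirect target shown in the 301 response body.
    """
    if not gs_path.startswith("gs://"):
        raise ValueError(f"not a GCS path: {gs_path!r}")
    bucket, _, obj = gs_path[len("gs://"):].partition("/")
    # The object name is a single path segment, so '/' must be percent-encoded.
    return f"{API_ROOT}/b/{bucket}/o/{quote(obj, safe='')}?{urlencode({'fields': fields})}"


def probe(url: str, timeout: float = 10.0) -> int:
    """Return the final HTTP status of an anonymous GET (performs network I/O).

    urlopen follows redirects itself, so this reports the status of the
    final target rather than any intermediate 301.
    """
    try:
        with urlopen(Request(url), timeout=timeout) as resp:
            return resp.status
    except HTTPError as err:
        return err.code
```

Calling `probe(gcs_metadata_url("gs://gresearch/robotics/berkeley_autolab_ur5/0.1.0/dataset_info.json"))` from the affected machine gives a quick indication of whether the object is reachable anonymously outside TensorFlow's GCS client.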
