Exception raised in creation task: The actor died because of an error raised in its creation task, ray::RolloutWorker.__init__() #46322

Open
chatsmile opened this issue Jun 28, 2024 · 0 comments
Labels
bug Something that is supposed to be working; but isn't triage Needs triage (eg: priority, bug/not-bug, and owning component)

chatsmile commented Jun 28, 2024

What happened + What you expected to happen

A simple example fails with the following error:
Exception raised in creation task: The actor died because of an error raised in its creation task, ray::RolloutWorker.__init__()

Failure # 1 (occurred at 2024-06-28_21-09-46)
The actor died because of an error raised in its creation task, ray::PPO.__init__() (pid=227821, ip=192.168.0.110, actor_id=baeb6937fabac97b336e66f101000000, repr=PPO)
  File "/home/yujintong/anaconda3/envs/torch/lib/python3.8/site-packages/ray/rllib/evaluation/worker_set.py", line 229, in _setup
    self.add_workers(
  File "/home/yujintong/anaconda3/envs/torch/lib/python3.8/site-packages/ray/rllib/evaluation/worker_set.py", line 682, in add_workers
    raise result.get()
  File "/home/yujintong/anaconda3/envs/torch/lib/python3.8/site-packages/ray/rllib/utils/actor_manager.py", line 497, in _fetch_result
    result = ray.get(r)
ray.exceptions.RayActorError: The actor died because of an error raised in its creation task, ray::RolloutWorker.__init__() (pid=227899, ip=192.168.0.110, actor_id=2800cc7cdc700cfb962c6dba01000000, repr=<ray.rllib.evaluation.rollout_worker.RolloutWorker object at 0x7f198c773b80>)
  File "/home/yujintong/anaconda3/envs/torch/lib/python3.8/site-packages/ray/rllib/evaluation/rollout_worker.py", line 407, in __init__
    self.env = env_creator(copy.deepcopy(self.env_context))
  File "/home/yujintong/anaconda3/envs/torch/lib/python3.8/site-packages/ray/rllib/env/utils/__init__.py", line 125, in _gym_env_creator
    env = env_descriptor(env_context)
TypeError: __init__() takes 1 positional argument but 2 were given

During handling of the above exception, another exception occurred:

ray::PPO.__init__() (pid=227821, ip=192.168.0.110, actor_id=baeb6937fabac97b336e66f101000000, repr=PPO)
  File "/home/yujintong/anaconda3/envs/torch/lib/python3.8/site-packages/ray/rllib/algorithms/algorithm.py", line 533, in __init__
    super().__init__(
  File "/home/yujintong/anaconda3/envs/torch/lib/python3.8/site-packages/ray/tune/trainable/trainable.py", line 161, in __init__
    self.setup(copy.deepcopy(self.config))
  File "/home/yujintong/anaconda3/envs/torch/lib/python3.8/site-packages/ray/rllib/algorithms/algorithm.py", line 631, in setup
    self.workers = WorkerSet(
  File "/home/yujintong/anaconda3/envs/torch/lib/python3.8/site-packages/ray/rllib/evaluation/worker_set.py", line 181, in __init__
    raise e.args[0].args[2]
TypeError: __init__() takes 1 positional argument but 2 were given
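
The last RLlib frame in the traceback, `env = env_descriptor(env_context)`, suggests that the environment class is called with the env context as a single positional argument. The source of `ASEnv` is not included in this report, so the following is only a plain-Python sketch of that call pattern, assuming the constructor accepts nothing besides `self`. `NoConfigEnv` and `make_env` are hypothetical stand-ins, not RLlib code.

```python
class NoConfigEnv:
    """Hypothetical stand-in for an env whose constructor takes no config."""

    def __init__(self):
        self.ready = True


def make_env(env_descriptor, env_context):
    # Mirrors the call shape seen in the traceback: the class is invoked
    # with the env context as one positional argument.
    return env_descriptor(env_context)


def reproduce():
    """Return the message of the TypeError raised by the mismatched call."""
    try:
        make_env(NoConfigEnv, {"worker_index": 1})
    except TypeError as exc:
        return str(exc)
    return None


if __name__ == "__main__":
    print(reproduce())
```

If `ASEnv.__init__` has this shape, the same `TypeError` would be expected from the worker's creation task; the exact wording of the message varies slightly between Python versions.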

Versions / Dependencies

Ray 2.10.0
Python 3.8

Code below

from __future__ import absolute_import
from __future__ import division
from __future__ import print_function

import argparse
import gymnasium as gym

from Task_env import ASEnv
import ray
from ray.tune import register_env
from ray.rllib.models import ModelCatalog
from ray.rllib.algorithms.ppo import PPOConfig
from ray.rllib.env.multi_agent_env import ENV_STATE
from ray import air
from ray import tune
from ray import train
from ray.rllib.algorithms.algorithm import Algorithm

if __name__ == "__main__":
    parser = argparse.ArgumentParser()
    parser.add_argument("--num-iters", type=int, default=1000)
    parser.add_argument("--num-workers", type=int, default=16)
    args = parser.parse_args()

    ray.init()

    def policy_mapping_fn(agent_id):
        return agent_id

    config = PPOConfig()
    config.environment(env=ASEnv)
    config.training(train_batch_size=8000)
    config.resources(num_gpus=0)
    config.rollouts(num_rollout_workers=args.num_workers, sample_async=False)
    config.multi_agent(
        policies={
            agent: (None, gym.spaces.Box(-1.0, 1.0, (18,)),
                    gym.spaces.MultiDiscrete([3, 3]), {
                        "gamma": 0.99,
                        "model": {
                            "use_lstm": True
                        }
                    }) for agent in ["Agent{}".format(i) for i in range(3)]  # + ["main"]
        },
        policy_mapping_fn=policy_mapping_fn
    )

    # Training
    tune.Tuner(
        "PPO",
        run_config=air.RunConfig(
            checkpoint_config=train.CheckpointConfig(checkpoint_frequency=1, num_to_keep=5,
                                                     checkpoint_score_attribute='episode_reward_mean'),
            name="IPPO",
            stop={
                "timesteps_total": 40000,
            }
        ),
        param_space=config.to_dict(),
    ).fit()
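
Two common ways to make a constructor compatible with a single-argument call are sketched below in plain Python: accept an optional config argument, or wrap the class in a creator function that ignores the argument. `ConfigAwareEnv`, `LegacyEnv`, and `legacy_env_creator` are hypothetical names for illustration; whether either change resolves this issue depends on `ASEnv` in `Task_env`, which is not shown here.

```python
class ConfigAwareEnv:
    """Hypothetical env whose constructor accepts an optional config mapping."""

    def __init__(self, config=None):
        # Copy so later changes by the caller do not leak into the env.
        self.config = dict(config or {})


class LegacyEnv:
    """Hypothetical env whose constructor takes no arguments."""

    def __init__(self):
        self.config = {}


def legacy_env_creator(env_context):
    # The context is accepted to match the single-argument call shape
    # and then deliberately ignored.
    return LegacyEnv()


if __name__ == "__main__":
    print(ConfigAwareEnv({"n_agents": 3}).config)
    print(type(legacy_env_creator({"n_agents": 3})).__name__)
```

In RLlib, a creator function like this is typically registered under a name with `register_env` (already imported in the script above) and that name is then passed to `config.environment(env=...)`. This has not been verified against Ray 2.10.0 in this report.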

Issue Severity

High: It blocks me from completing my task.

chatsmile added the bug and triage labels on Jun 28, 2024