MDPDataBunch (and Dataset, Item, and ItemList) #1

Open · 2 of 6 tasks
josiahls opened this issue Aug 12, 2019 · 1 comment
Labels: Discussion (Generally something we should talk / debate about)


josiahls commented Aug 12, 2019

Discussion
I will be moving a lot of the README text regarding DataBunches here, to be more constructive/interactive, once the basic requirements of the repository are met. Current goals of the data pipeline are:

  • Increase num_workers beyond 0. Presently, the dataset class crashes during parallel data loading, almost certainly because the workers share a single environment... Will this ever be possible, especially for agents like DQNs?
  • With parallel computing in mind, will major changes be required if we try implementing HAC or A3C?
  • Is there a way to make this code more pythonic? The current code seems rigid. What would happen if we wanted to add a new Item such as a SemiMDPSlice? What if we added agents that use Options?
  • The dataset class forces purely sequential access (see the sketch after this list). Perhaps investigate ways to make this cleaner for different samplers? We need to consider how the DataLoader class treats objects with __getitem__.
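
To make the sequential-access problem concrete, here is a minimal sketch (hypothetical names, not the repo's actual classes) of a dataset backed by a single live env. Because __getitem__ has to step the environment, the index is effectively meaningless, a shuffling sampler buys nothing, and num_workers > 0 would require each worker to own its own env copy:

```python
# Minimal sketch: a live-env dataset forces sequential access because items
# only exist in the order the environment produces them.
import gym
from torch.utils.data import Dataset


class SequentialMDPDataset(Dataset):
    """Wraps a single live gym env; item order is dictated by the env."""

    def __init__(self, env_name='CartPole-v1', max_steps=1000):
        self.env = gym.make(env_name)
        self.max_steps = max_steps
        self.state = self.env.reset()

    def __len__(self):
        return self.max_steps

    def __getitem__(self, idx):
        # `idx` is effectively ignored: we can only return the *next*
        # transition, so a shuffling sampler reorders nothing, and a
        # second worker process would need its own env instance.
        action = self.env.action_space.sample()  # stand-in for an agent policy
        next_state, reward, done, _ = self.env.step(action)
        item = (self.state, action, reward, next_state, done)
        self.state = self.env.reset() if done else next_state
        return item
```

Handing this to a DataLoader with num_workers > 0 would also pickle the env into each worker process, which is presumably the crash we are seeing.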

Most Important

  • Memory management: I was not expecting this to be such an immediate issue, but memory management in MDP datasets is horrific: it grows by 100-200 MB every 20 steps in the DQN notebook for gym_maze. Moving to options for reducing the size of the datasets. The plan is to "null out" the heavy fields of unimportant episodes (most likely the state- and image-based fields) to reduce memory while keeping reward information, so that certain episodes of interest remain available for the interpreter to work with (a sketch of this follows the proposal list below). Maybe in the future we can try a hard-drive caching scheme... though that may be a bad idea.

Proposing:

  • keep the top k episodes at full fidelity

  • keep the best/worst quartiles of episodes

  • keep the k best and k worst

  • keep the k worst

  • None; only load into memory (always keep the first)

  • all / small
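
A rough sketch of the "null out" idea, assuming a simple list-of-dicts episode store (the field names here are hypothetical, not the repo's actual structures): heavy state/image fields are dropped for everything outside the k best episodes, while reward information is always retained:

```python
# Hypothetical sketch of pruning episodes to cut memory: keep full data only
# for the k best episodes, null the heavy fields everywhere else.
def prune_episodes(episodes, k=5):
    """episodes: list of dicts with 'states', 'images', 'rewards' keys."""
    # Rank episodes by total reward; the top k stay at full fidelity.
    ranked = sorted(episodes, key=lambda e: sum(e['rewards']), reverse=True)
    keep = {id(e) for e in ranked[:k]}
    for ep in episodes:
        if id(ep) not in keep:
            ep['states'] = None   # drop heavy per-step state arrays
            ep['images'] = None   # drop rendered frames
            # 'rewards' (and other light metadata) are kept so the
            # interpreter can still analyze every episode.
    return episodes
```

The other retention strategies above fall out of the same helper by swapping the ranking/selection rule (e.g. worst k, best and worst k, quartiles).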

  • How are we going to delineate between an epoch, a step, and a batch? At present, a single iteration through an episode is an epoch, while a single step and a batch are treated as the same thing: a single frame in the environment. How do we plan to separate these? (One possible mapping is sketched below.)
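
One possible delineation, sketched with hypothetical names (iter_episode, fit_batch are placeholders, not existing APIs): treat an epoch as a fixed number of episodes, a step as a single env transition, and a batch as a random draw from a replay buffer, which decouples batch size from frames:

```python
# Sketch only: epoch = N episodes, step = one env transition,
# batch = random sample from a replay buffer (batch size != 1 frame).
import random


def run_epoch(env_dataset, learner, episodes_per_epoch=10, batch_size=32):
    buffer = []
    for _ in range(episodes_per_epoch):                # epoch = N episodes
        for transition in env_dataset.iter_episode():  # step = one frame
            buffer.append(transition)
            if len(buffer) >= batch_size:
                batch = random.sample(buffer, batch_size)  # batch != step
                learner.fit_batch(batch)
```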

josiahls added the Discussion (Generally something we should talk / debate about) label Aug 12, 2019
josiahls changed the title from "MDPDataBunch (and corresponding Dataset, Item, and ItemList)" to "MDPDataBunch (and Dataset, Item, and ItemList)" Aug 12, 2019
josiahls (Owner, Author) commented:

  • Additional idea related to the issue of a single-env training model: some models might allow for multiple envs running at the same time, which might make a higher worker count make more sense (?). The main issue is that these envs would be running in different processes (a rough sketch below).
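
A minimal sketch of the multi-env idea, loosely modeled on the common SubprocVecEnv pattern (the names here are ours, not the repo's): one env per subprocess, stepped over pipes, so each worker owns its own environment instead of sharing one:

```python
# Sketch: one gym env per subprocess, communicating over pipes, so that
# parallel data collection never shares a single env across processes.
import gym
from multiprocessing import Pipe, Process


def env_worker(conn, env_name):
    env = gym.make(env_name)
    state = env.reset()
    conn.send(state)                # initial observation
    while True:
        action = conn.recv()
        if action is None:          # shutdown signal
            break
        state, reward, done, _ = env.step(action)
        if done:
            state = env.reset()
        conn.send((state, reward, done))
    env.close()


class ParallelEnvs:
    def __init__(self, env_name='CartPole-v1', n=4):
        self.conns, self.procs = [], []
        for _ in range(n):
            parent, child = Pipe()
            p = Process(target=env_worker, args=(child, env_name), daemon=True)
            p.start()
            parent.recv()           # consume the initial observation
            self.conns.append(parent)
            self.procs.append(p)

    def step(self, actions):
        # Send one action per env, then gather (state, reward, done) tuples.
        for conn, action in zip(self.conns, actions):
            conn.send(action)
        return [conn.recv() for conn in self.conns]

    def close(self):
        for conn in self.conns:
            conn.send(None)
        for p in self.procs:
            p.join()
```

This would fit A3C-style agents naturally; for a DQN, the n results per step would just feed a shared replay buffer.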
