GangaLHCb: lbexec support in GaudiExec for Run 3 applications #2042

Open
ryuwd opened this issue Jul 25, 2022 · 4 comments

@ryuwd
Contributor

ryuwd commented Jul 25, 2022

gaudirun.py support has been removed from recent nightlies of DaVinci (as announced in the last LHCb week), meaning that submitting DV jobs with Ganga will not work in the next Run 3 release of DV.

It is possible to run lbexec jobs with a hack (here is an example), but it would be good eventually for ganga to support lbexec natively.

I have been thinking about how best to implement this, but got stuck when thinking about data.py.

cc @chrisburr

@egede
Member

egede commented Jul 25, 2022

Yes, we discussed this already with @chrisburr. Is there any documentation that you can point to now about how lbexec works?

@ryuwd
Contributor Author

ryuwd commented Jul 26, 2022

> Yes, we discussed this already with @chrisburr. Is there any documentation that you can point to now about how lbexec works?

Nothing in a neat Sphinx site yet, AFAIK, but I suppose this covers it: https://lhcb-davinci.docs.cern.ch/tutorials/running.html

To support lbexec, the GaudiExec application would probably need these parameters:

  • (maybe unnecessary) use_lbexec (bool): a flag telling GaudiExec not to use gaudirun.py
  • a new one, entrypoint (str): the 'function' to run, e.g. MyOptionsFile:alg_config
  • options (list[str] or str) can be kept, but would accept YAML files rather than .py files, to be passed to lbexec. Alternatively, it could even accept a dict and generate and upload the YAML file for the user. This would be really useful for passing things like conddb and dddb tags without having to generate and write a file by hand.
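A minimal sketch of how those parameters could fit together, assuming lbexec is invoked as `lbexec <entrypoint> <options file>`. `LbexecSettings`, `write_options` and `build_command` are hypothetical names for illustration and are not part of Ganga. To stay within the standard library, a dict is written out as JSON, which is also valid YAML 1.2; this assumes the YAML parser used by lbexec accepts that syntax.

```python
import json
from dataclasses import dataclass, field
from pathlib import Path
from typing import Union


@dataclass
class LbexecSettings:
    """Hypothetical container for the proposed parameters (not a Ganga class)."""
    entrypoint: str                 # e.g. "MyOptionsFile:alg_config"
    options: Union[str, dict]       # path to a YAML file, or a dict to serialise
    extra_args: list = field(default_factory=list)


def write_options(options, workdir):
    """Return the path of the options file, generating it if a dict was given."""
    if isinstance(options, dict):
        path = Path(workdir) / "options.yaml"
        # JSON is a subset of YAML 1.2, so no third-party YAML library is needed
        path.write_text(json.dumps(options, indent=2) + "\n")
        return str(path)
    return options


def build_command(settings, workdir="."):
    """Assemble the argument list: lbexec <entrypoint> <options file> [extra args]."""
    options_file = write_options(settings.options, workdir)
    return ["lbexec", settings.entrypoint, options_file, *settings.extra_args]
```

With a dict such as `{"conddb_tag": "...", "dddb_tag": "..."}` (key names are illustrative), the generated options.yaml would then be added to the input sandbox alongside the user's options module.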

Options files (e.g. MyOptionsFile.py) containing the function specified in the entrypoint to configure the job would need to be added to the input sandbox. Seems like an easy place for end-users to slip up.
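One way to catch that slip early would be a check at submission time. The sketch below uses a hypothetical helper, `missing_entrypoint_module`, and assumes that a top-level module name such as MyOptionsFile maps to MyOptionsFile.py, while a dotted module path refers to a package already installed in the release and needs no sandbox file.

```python
from pathlib import PurePath


def missing_entrypoint_module(entrypoint, sandbox_files):
    """Return the file name that should be in the sandbox but is not, else None."""
    module, sep, function = entrypoint.partition(":")
    if not (module and sep and function):
        raise ValueError(f"expected 'module:function', got {entrypoint!r}")
    if "." in module:
        # Assumption: dotted paths come from an installed package, nothing to ship
        return None
    expected = module + ".py"
    # Compare on base names only, since sandbox entries may be full paths
    names = {PurePath(f).name for f in sandbox_files}
    return None if expected in names else expected
```

A non-None return value could be turned into a clear error message before the job is submitted.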

Where I got a bit stuck is how Ganga configures input files. At the moment, a file called data.py is generated from some Gaudi options code templates in LHCbDataset, dumped into job._splitterdata, and then added to the input sandbox. I'm not sure how that could best be adapted to generate YAML that works with lbexec. I suppose PFNs could be determined in advance, but that seems like it would restrict the configured subjob to running at a single site, and I don't know whether input_files in the lbexec YAML can deal with LFNs and XML catalogs either. I guess in this situation you and @chrisburr know best.
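For illustration only, a YAML-side equivalent of data.py might look like the sketch below. It assumes, without confirmation in this thread, that lbexec resolves `LFN:`-prefixed entries in input_files through an XML catalog. The helper name `input_options_from_lfns`, the `xml_file_catalog` key and the catalog file name are all assumptions.

```python
def input_options_from_lfns(lfns, catalog="pool_xml_catalog.xml"):
    """Build the input part of an options dict from a list of LFNs (hypothetical)."""
    # Prefix bare paths with "LFN:" and leave already-prefixed entries alone
    files = [lfn if lfn.startswith("LFN:") else "LFN:" + lfn for lfn in lfns]
    # Key names are assumptions about what the lbexec options YAML accepts
    return {"input_files": files, "xml_file_catalog": catalog}
```

If that assumption holds, the resulting dict could be merged into the per-subjob options without tying the subjob to one site.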

(I take an interest in this issue because I have recently been doing some Run 3 studies with Ganga, and for the reasons mentioned earlier.)

@chrisburr
Contributor

I'm away this week but I'll respond soon with a suggestion of how Ganga can support this.

@egede
Copy link
Member

egede commented May 25, 2023

@chrisburr Ping
