New NMT option - choose ClearML queue #151
This would also allow us to run jobs on A100 GPUs on QA.
@Enkidu93 - we may need to update how getting the queue works. One option is to add another parameter to the endpoint to specify the queue to preserve backwards compatibility. What do you think?
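To illustrate the backwards-compatible option suggested above, here is a minimal sketch. It is not Serval's actual code: the function name, the response fields, the queue names, and the depth values are all hypothetical. The only point is that an optional queue parameter with a default leaves existing callers unaffected.

```python
from typing import Optional

# Hypothetical data: pending-job counts keyed by ClearML queue name.
QUEUE_DEPTHS = {"default": 3, "a100": 0}
DEFAULT_QUEUE = "default"  # assumed name, not taken from Serval


def get_queue_depth(engine_type: str, queue: Optional[str] = None) -> dict:
    """Return the depth of a queue; callers that omit `queue` get the old behaviour."""
    name = queue if queue is not None else DEFAULT_QUEUE
    if name not in QUEUE_DEPTHS:
        raise KeyError(f"unknown queue: {name}")
    return {"engineType": engine_type, "queue": name, "size": QUEUE_DEPTHS[name]}
```

An existing caller would keep calling `get_queue_depth("nmt")`, while a new caller could pass `queue="a100"`.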
@pmachapman - is SF using the /translation/engines/queues endpoint right now? If not, then we can change it without needing to deprecate it first.
@johnml1135 No, we are not using that endpoint. We get the queueDepth from the
I initially thought this would be very straightforward, but it'll take a bit more effort if we want to keep the
Just wanted to record a couple ideas: 1) This option will be part of
If the requirement is to be able to specify the priority for the build, then that is probably what we should do. We could add a
@johnml1135 In this morning's stand-up, we were discussing this issue and thought it would be beneficial if you could record here what the purpose is of being able to configure the queue (since you're the one who opened it). Where is this requirement coming from?
The requirement is not yet realized, but it would be if idx or another customer wanted dedicated GPUs for their jobs. Another requirement would be to choose a different setup, either one for a quick-turnaround test or a dual-GPU queue. As all of these needs are for the future and not urgent, I am fine pushing this off until it is needed. Also, I do not believe that prioritizing Serval jobs is a requirement.
So @johnml1135 @ddaspit, should I 'finish' this since I already have the logic and just add it as a build option OR should I commit what I have to a branch and revisit it when we have a firmer requirement?
Before proceeding further, I would like to clarify the requirements for this issue. We should meet to discuss.
So we can run non-urgent jobs on the production instance of Serval. This would be a standard option, possibly clearml_queue.
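As a rough sketch of how a clearml_queue build option might be resolved, assuming a hypothetical allow-list of queue names and a hypothetical default (neither comes from Serval or ClearML):

```python
# Hypothetical queue names; a real deployment would define its own.
ALLOWED_QUEUES = {"production", "idle"}
DEFAULT_QUEUE = "production"


def resolve_queue(build_options: dict) -> str:
    """Pick the queue for a build, falling back to the default when unset."""
    queue = build_options.get("clearml_queue", DEFAULT_QUEUE)
    if queue not in ALLOWED_QUEUES:
        raise ValueError(f"queue not permitted: {queue}")
    return queue
```

Validating against an allow-list is one possible design choice. It keeps a typo in the option from silently sending a job to a queue that does not exist.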