Commit

Document SGE monitor
wlandau-lilly committed Jan 8, 2024
1 parent 410d746 commit d18240b
Showing 4 changed files with 53 additions and 2 deletions.
4 changes: 3 additions & 1 deletion R/crew_monitor_sge.R
@@ -120,7 +120,9 @@ crew_class_monitor_sge <- R6::R6Class(
#' @param jobs Character vector of job names or job IDs to terminate.
#' Ignored if `all` is set to `TRUE`.
#' @param all Logical of length 1, whether to terminate all the jobs
-#' under your user name. If `TRUE`, the `jobs` argument is ignored.
+#' under your user name. This terminates ALL your SGE jobs,
+#' regardless of whether `crew.cluster` launched them,
+#' so use with caution!
terminate = function(jobs = NULL, all = FALSE) {
# Cannot be tested with automated tests.
# Tested in tests/sge/monitor.R.
22 changes: 22 additions & 0 deletions README.Rmd
@@ -73,6 +73,28 @@ Remember to terminate the controller when you are done.
controller$terminate()
```

# Monitoring

To manage resource usage, you may choose to list and manually terminate cluster jobs using `crew_monitor_sge()` and other supported monitors. Example for SGE:

```{r}
monitor <- crew_monitor_sge()
job_list <- monitor$jobs()
job_list
#> # A tibble: 2 × 9
#> job_number prio name owner state start_time queue_name jclass_name slots
#> <chr> <chr> <chr> <chr> <chr> <chr> <chr> <lgl> <chr>
#> 1 131853812 0.05000 crew-m… USER… r 2024-01-0… all.norma… NA 1
#> 2 131853813 0.05000 crew-m… USER… r 2024-01-0… all.norma… NA 1
monitor$terminate(jobs = job_list$job_number)
#> USER has registered the job 131853812 for deletion
#> USER has registered the job 131853813 for deletion
monitor$jobs()
#> data frame with 0 columns and 0 rows
```

`monitor$terminate(all = TRUE)` terminates all your SGE jobs, regardless of whether `crew.cluster` created them.
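
Because `all = TRUE` is so broad, a narrower alternative is to filter the job list first and pass only the matching job IDs to `terminate()`. The following is a minimal sketch, not part of the package: it uses a made-up stand-in for the output of `monitor$jobs()` so it runs without a cluster, and it assumes that `crew.cluster` jobs can be recognized by a `crew-` name prefix, as the example output above suggests. The job names and the third job are hypothetical.

```r
# Hypothetical stand-in for the data frame returned by monitor$jobs().
# Job names and the third (non-crew) job are invented for illustration.
job_list <- data.frame(
  job_number = c("131853812", "131853813", "131853900"),
  name = c("crew-example-a", "crew-example-b", "my-other-analysis"),
  state = c("r", "r", "r"),
  stringsAsFactors = FALSE
)

# Assumption: crew.cluster workers have job names that start with "crew-".
is_crew <- startsWith(job_list$name, "crew-")
crew_jobs <- job_list$job_number[is_crew]
crew_jobs

# On a real cluster, pass only these job IDs to the monitor:
# monitor$terminate(jobs = crew_jobs)
```

If other tools on your cluster also use the `crew-` prefix, inspect `job_list` by eye before terminating anything.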

# Tips

* `crew.cluster` submits jobs over the local network using system calls to the resource manager (e.g. SGE or SLURM). Please invoke `crew.cluster` on a node of the cluster, either a login node (head node) or a compute node.
25 changes: 25 additions & 0 deletions README.md
@@ -76,6 +76,31 @@ Remember to terminate the controller when you are done.
controller$terminate()
```

# Monitoring

To manage resource usage, you may choose to list and manually terminate
cluster jobs using `crew_monitor_sge()` and other supported monitors.
Example for SGE:

``` r
monitor <- crew_monitor_sge()
job_list <- monitor$jobs()
job_list
#> # A tibble: 2 × 9
#> job_number prio name owner state start_time queue_name jclass_name slots
#> <chr> <chr> <chr> <chr> <chr> <chr> <chr> <lgl> <chr>
#> 1 131853812 0.05000 crew-m… USER… r 2024-01-0… all.norma… NA 1
#> 2 131853813 0.05000 crew-m… USER… r 2024-01-0… all.norma… NA 1
monitor$terminate(jobs = job_list$job_number)
#> USER has registered the job 131853812 for deletion
#> USER has registered the job 131853813 for deletion
monitor$jobs()
#> data frame with 0 columns and 0 rows
```

`monitor$terminate(all = TRUE)` terminates all your SGE jobs, regardless
of whether `crew.cluster` created them.

# Tips

- `crew.cluster` submits jobs over the local network using system calls
4 changes: 3 additions & 1 deletion man/crew_class_monitor_sge.Rd

Some generated files are not rendered by default.
