Using specific colors per activity name in trace_explorer #38

Open
dkoekkoek opened this issue Nov 8, 2021 · 1 comment


@dkoekkoek

Hi,

For my project I would like to give each unique activity discovered with the trace_explorer a specific assigned color.
The event log consists of a column with activities and sub-activities, in order to create different levels of traces. Ideally, I want the sub-activities to be a color variant of the main activity.

For example, one main activity, "Place order", is blue. Its sub-activities describe the specific order ("Schedule appointment", "Request assistance", etc.), and I would like those sub-activities in other shades of blue.

Is there a way to assign a color to a particular activity name, and a color range to sub-activities?

Thank you in advance!

@gertjanssenswillen
Member

Hi

Sorry for the delay, hopefully the answer is still helpful.

The trace explorer plot is a ggplot object, so you can add your own scale to it as an extra layer. In this way, you can use the scale_fill_manual() function of ggplot to manually set the colors.

For example:

patients %>%
	trace_explorer(n_traces = 7) +
	scale_fill_manual(values = c("Check-out" = "blue",
	                             "Blood test" = "red",
	                             "Discuss Results" = "yellow",
	                             "X-Ray" = "orange",
	                             "Registration" = "green",
	                             "Triage and Assessment" = "purple",
	                             "MRI SCAN" = "brown"))

Of course, this requires that you enumerate all activities with their specific colors. I don't think there exists an out-of-the-box scale_fill function to apply a hierarchy like this. Nevertheless, the creation of the values argument vector can be somewhat automated if you have a large number of activities.

E.g. you can start from the palettes in RColorBrewer. A vector of x colors from a palette y can be created with RColorBrewer::brewer.pal(n = x, name = y). In this way, you can work as follows:

Create a table with the activities, grouped on the "superactivity", which is going to be your "main activity". I have created one here with mutate:

patients %>%
	mutate(superactivity = ifelse(handling %in% c("Registration","Check-out", "Discuss Results"), "cat 1","cat 2")) %>%
	group_by(superactivity) %>%
	activities()

You can then decide on a specific scale for each main activity. (Depending on the number of main activities, this might need some automation.) For simplicity, I am just going with Reds for cat 1 and Blues for cat 2.

	mutate(fill_scale = ifelse(superactivity == "cat 1","Reds","Blues")) %>%
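With more than two main activities, nested ifelse() calls get unwieldy. One possible way to automate this step is a named lookup vector that maps each main activity to a palette name. The sketch below uses a small hand-made data frame instead of the real activity table; palette_lookup and activity_table are hypothetical names introduced only for illustration.

```r
# Hypothetical lookup: one ColorBrewer palette name per main activity
palette_lookup <- c("cat 1" = "Reds", "cat 2" = "Blues")

# Stand-in for the grouped activity table from the previous step
activity_table <- data.frame(
  superactivity = c("cat 1", "cat 1", "cat 2"),
  handling = c("Registration", "Check-out", "X-Ray"),
  stringsAsFactors = FALSE
)

# Index the lookup by name; superactivity must be character here,
# because a factor would be used as integer positions instead of names
activity_table$fill_scale <- unname(palette_lookup[activity_table$superactivity])

# Fail early if a main activity has no palette assigned
stopifnot(!anyNA(activity_table$fill_scale))
print(activity_table)
```

Inside the pipeline, the same idea would presumably read mutate(fill_scale = palette_lookup[superactivity]).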

Next, create some helper variables: the number of colors needed within each group, as well as a numeric id (note that the data.frame is still grouped on the superactivity).

	mutate(n_colors = n(), color_id = 1:n()) %>%
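To see what this grouped mutate computes, here is a minimal base R sketch on a stand-in vector of group labels (the vector is made up for illustration, not taken from the patients log). Within each group, n_colors is the group size and color_id counts from 1 upward.

```r
# Stand-in group labels: three activities in "cat 1", two in "cat 2"
superactivity <- c("cat 1", "cat 1", "cat 1", "cat 2", "cat 2")
idx <- seq_along(superactivity)

# Group size, repeated for every row of the group (like n() on grouped data)
n_colors <- ave(idx, superactivity, FUN = length)

# Running index within each group (like 1:n() on grouped data)
color_id <- ave(idx, superactivity, FUN = seq_along)

print(data.frame(superactivity, n_colors, color_id))
```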

Then, with some purrr magic we can create the color for each activity, by iterating over the RColorBrewer::brewer.pal function with the n_colors value as n, the fill_scale as name, and the color_id as an index into the resulting vector.

mutate(color = pmap_chr(list(fill_scale, n_colors, color_id), ~RColorBrewer::brewer.pal(n = ..2, name = ..1)[..3]))

Let's store the resulting data frame as "colors".

The full code:

library(bupaR)       # activities() and the event log methods
library(eventdataR)  # the patients example log
library(dplyr)       # mutate(), group_by(), n()
library(purrr)       # pmap_chr()

patients %>%
	mutate(superactivity = ifelse(handling %in% c("Registration","Check-out", "Discuss Results"), "cat 1","cat 2")) %>%
	group_by(superactivity) %>%
	activities() %>%
	mutate(fill_scale = ifelse(superactivity == "cat 1","Reds","Blues")) %>%
	mutate(n_colors = n(), color_id = 1:n()) %>%
	mutate(color = pmap_chr(list(fill_scale, n_colors, color_id), ~RColorBrewer::brewer.pal(n = ..2, name = ..1)[..3])) -> colors

The output looks like this:

  superactivity handling              absolute_frequency relative_frequency fill_scale n_colors color_id color  
  <chr>         <fct>                              <int>              <dbl> <chr>         <int>    <int> <chr>  
1 cat 1         Registration                         500              0.336 Reds              3        1 #FEE0D2
2 cat 1         Discuss Results                      495              0.333 Reds              3        2 #FC9272
3 cat 1         Check-out                            492              0.331 Reds              3        3 #DE2D26
4 cat 2         Triage and Assessment                500              0.405 Blues             4        1 #EFF3FF
5 cat 2         X-Ray                                261              0.212 Blues             4        2 #BDD7E7
6 cat 2         Blood test                           237              0.192 Blues             4        3 #6BAED6
7 cat 2         MRI SCAN                             236              0.191 Blues             4        4 #2171B5

Based on this, we can create the scale vector we need for the ggplot function as follows (handling here is the activity classifier of the example log; use the actual name of the activity classifier in your case).

color_scale <- colors$color
names(color_scale) <- colors$handling

You can then pass this vector to scale_fill_manual:

patients %>%
	trace_explorer(n_traces = 7) +
	scale_fill_manual(values = color_scale)

Result:

(Image: the trace explorer plot with the custom fill colors.)
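Since a named values vector is matched by activity label, it may be worth checking that every label has a color before plotting; labels without a match are typically drawn in the scale's NA color. A minimal sketch, using three of the rows from the output above and a hypothetical activity_labels vector standing in for the labels in your own log:

```r
# Three rows copied from the colors table shown earlier
colors <- data.frame(
  handling = c("Registration", "Discuss Results", "Check-out"),
  color = c("#FEE0D2", "#FC9272", "#DE2D26"),
  stringsAsFactors = FALSE
)

# Same named vector as before, built in one step with setNames()
color_scale <- setNames(colors$color, as.character(colors$handling))

# Hypothetical stand-in for the activity labels present in the event log
activity_labels <- c("Registration", "Check-out", "X-Ray")

# Labels that would not get a color from the manual scale
missing_colors <- setdiff(activity_labels, names(color_scale))
if (length(missing_colors) > 0) {
  message("No color defined for: ", paste(missing_colors, collapse = ", "))
}
```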

Of course, it will need some tweaking to find readable and nice colors, but hopefully this is something to start from.
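As one possible direction for that tweaking: brewer.pal() returns at least 3 colors and the sequential palettes stop at 9, so very small or very large groups may need something else. An alternative is to interpolate shades of a chosen base color with colorRampPalette() from base R. The helper shades_of() below is a hypothetical name, not part of bupaR or ggplot2, and the sub-activity names are taken from the question purely as an illustration.

```r
# Hypothetical helper: n shades of one base color, running from a
# light tint up to the base color itself
shades_of <- function(base_color, n) {
  # The second step of a 5-step white-to-base ramp serves as the lightest shade
  light <- grDevices::colorRampPalette(c("white", base_color))(5)[2]
  grDevices::colorRampPalette(c(light, base_color))(n)
}

# Sub-activities of "Place order", each getting its own shade of blue
sub_activities <- c("Schedule appointment", "Request assistance")
blue_shades <- setNames(shades_of("blue", length(sub_activities)), sub_activities)
print(blue_shades)
```

The resulting named vector has the same shape as color_scale, so vectors for several main activities could be combined with c() before being passed to scale_fill_manual.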

@gertjanssenswillen gertjanssenswillen transferred this issue from bupaverse/bupaR Jan 20, 2022