highlight_entities function may cause elements to be omitted and element misalignment #33

Open
BioLaoXu opened this issue Jul 3, 2024 · 1 comment


BioLaoXu commented Jul 3, 2024

Dear ggkegg team, thank you for ggkegg, which provides a more flexible KEGG pathway visualization scheme. I found some minor problems while using it, though they may be caused by my own misunderstanding.

  • The highlight_entities function may omit elements. After looking at the contents of the XML, I found that some gene names end with '...'; the highlight_entities and highlight_set_nodes functions do not remove these characters.

image

library(ggkegg)
library(ggraph)
library(dplyr)
hsn=c("K01568","D18Wsu181e")
g <- pathway("mmu00010",group_rect_nudge=0)
g <- g |> mutate(gene1=highlight_set_nodes(hsn,sep=" ",how="any",name="name"),
                 gene2=highlight_set_nodes(hsn,sep=", ",how="any",name="graphics_name"))
gg=ggraph(g, layout="manual", x=x, y=y)
gg+
  overlay_raw_map()+
  geom_node_rect(fill="red",aes(filter=gene1), color="black")+
  geom_node_text(aes(
    label=graphics_name %>% strsplit(",") %>% sapply("[", 1) %>% strsplit("\\.") %>% sapply("[", 1),
    filter=gene1),size=2)+
  geom_node_rect(fill="red",aes(filter=gene2), color="black")+
  geom_node_text(aes(
    label=graphics_name %>% strsplit(",") %>% sapply("[", 1) %>% strsplit("\\.") %>% sapply("[", 1),
    filter=gene2),size=2)+
  theme_void()

#ggkegg::highlight_entities(pathway = "mmu00010",set = hsn,how = "any",name ="graphics_name" ) ## does not work

image

In this example, the D18Wsu181e gene will be omitted.
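The matching problem described above can be sketched in base R without ggkegg. This is only an illustration of the idea (strip a trailing '...' from each symbol before comparing against the query set), not the package's implementation. The helper names `clean_symbols` and `match_any` are hypothetical, and the `graphics_name` values are toy data in the style of the KGML labels.

```r
# Toy graphics_name values; illustrative only, not taken from mmu00010.
graphics_name <- c("Gck, Hk1, Hk2...", "Aldh7a1, D18Wsu181e...", "Pgk1")

# Hypothetical helper: split one label on the separator, trim whitespace,
# and drop a trailing "..." from each symbol.
clean_symbols <- function(x, sep = ",") {
  parts <- trimws(strsplit(x, sep, fixed = TRUE)[[1]])
  sub("\\.\\.\\.$", "", parts)
}

# Hypothetical helper: TRUE for each node where any cleaned symbol
# is contained in the query set.
match_any <- function(labels, set, sep = ",") {
  vapply(labels, function(x) any(clean_symbols(x, sep) %in% set),
         logical(1), USE.NAMES = FALSE)
}

hsn <- c("K01568", "D18Wsu181e")
hits <- match_any(graphics_name, hsn)
print(hits)
```

Without the `sub()` step, the second toy node would compare "D18Wsu181e..." against "D18Wsu181e" and fail to match, which is the kind of omission reported here.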

  • Element misalignment. I want to add more personalized text; this is my solution:
library(tidyr)          # separate_rows()
library(BiocFileCache)  # getBFCOption()
highlight_entities_dt=data.frame(showtext=c("P52792","Q9DBF1","P15327","P52792","Q9DBF1"),
                                 lfc=c(-2,2,4,6,8),
                                 # symbol=c("1","2","2","Akr1a1","K00138")
                                 symbol=c("P52792","Q9DBF1","Bpgm","Gck","Aldh7a1")
                                 )

pid="mmu00010"
showText="showtext"
colorBy="lfc"

g <- pathway(pid,group_rect_nudge=0,directory = getBFCOption("CACHE"))
gg=ggraph(g, layout="manual", x=x, y=y)
gt=gg$data

tmp=gt%>%mutate(idx = row_number())
gt1=tmp%>%separate_rows(.,graphics_name, convert = TRUE,sep = ", ")%>%
  mutate(symbol=gsub(" ","",graphics_name)%>%gsub("\\.\\.\\.$","",.))
gt2=tmp%>%separate_rows(.,name, convert = TRUE,sep = " ")%>%
  mutate(symbol=gsub(" ","",name)%>%gsub("\\.\\.\\.$","",.))
gt1=left_join(gt1,highlight_entities_dt,by="symbol")%>%
  mutate(sn=get(showText)%>%lapply(.,function(x){
    if(x%in%highlight_entities_dt[[showText]]){
      return(x)
    }
    return(NA)
  })%>%unlist())%>%
  .[with(., order(idx,sn,symbol,decreasing = F)),]
gt2=left_join(gt2,highlight_entities_dt,by="symbol")%>%
  mutate(sn=get(showText)%>%lapply(.,function(x){
    if(x%in%highlight_entities_dt[[showText]]){
      return(x)
    }
    return(NA)
  })%>%unlist())%>%
  .[with(., order(idx,sn,symbol,decreasing = F)),]
gt12=rbind(gt1,gt2)%>% 
  distinct(.,idx,.keep_all = T)
gt12$name=gt$name
gt12$graphics_name=gt$graphics_name
gt=gt12

g=pathway(pid,group_rect_nudge=0,directory = getBFCOption("CACHE")) %>% 
  mutate(sn = gt[[showText]],!!colorBy:=gt[[colorBy]])
### method1,more tidy graph
ggraph(g, layout = "manual",x=x,y=y)+
  overlay_raw_map(high_res=F, transparent_color="#FFFFFF")+
  geom_node_rect(aes(filter=!is.na(sn),fill=.data[[colorBy]]), color="black")+
  scale_fill_gradientn(name=colorBy,colours = scales::alpha(c("#3288BD","#D53E4F"),alpha = .99),
                        space = "lab",
                        breaks=ceiling(seq(min(gt[[colorBy]],na.rm = T),max(gt[[colorBy]],na.rm = T),
                                           (max(gt[[colorBy]],na.rm = T)-min(gt[[colorBy]],na.rm = T))/4)),
                        guide = guide_colorbar(order = 3))+
  geom_node_text(aes(filter=!is.na(sn),label=sn), size=2)+
  theme_void()

### method2
ggraph(g, layout =gt,x=x,y=y)+
  overlay_raw_map(high_res=F, transparent_color="#FFFFFF")+
  geom_node_rect(aes(filter=!is.na(sn),fill=.data[[colorBy]]), color="black")+
  scale_fill_gradientn(name=colorBy,colours = scales::alpha(c("#3288BD","#D53E4F"),alpha = .99),
                        space = "lab",
                        breaks=ceiling(seq(min(gt[[colorBy]],na.rm = T),max(gt[[colorBy]],na.rm = T),
                                           (max(gt[[colorBy]],na.rm = T)-min(gt[[colorBy]],na.rm = T))/4)),
                        guide = guide_colorbar(order = 3))+
  geom_node_text(aes(filter=!is.na(sn),label=sn), size=2)+
  theme_void()

image

Apart from the misalignment of the elements, the results were as expected.
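For comparison, the label lookup in the solution above can be written so that the node table is never split, reordered, or deduplicated. The sketch below uses base R and a toy node table. It is not a diagnosis of the misalignment, only a way to rule out row order as a factor. The `nodes` data frame and the `first_hit` vector are hypothetical; the `lookup` rows reuse two entries from `highlight_entities_dt`.

```r
# Toy node table standing in for the pathway node data; illustrative only.
nodes <- data.frame(
  graphics_name = c("Gck, Hk1...", "Bpgm", "Pgk1", NA),
  stringsAsFactors = FALSE
)

# Two rows taken from highlight_entities_dt in the solution above.
lookup <- data.frame(
  symbol   = c("Gck", "Bpgm"),
  showtext = c("P52792", "P15327"),
  lfc      = c(6, 4),
  stringsAsFactors = FALSE
)

# For each node, find the lookup row of the first symbol that matches,
# after trimming whitespace and removing a trailing "...".
first_hit <- vapply(nodes$graphics_name, function(x) {
  if (is.na(x)) return(NA_integer_)
  syms <- sub("\\.\\.\\.$", "", trimws(strsplit(x, ",", fixed = TRUE)[[1]]))
  idx <- match(syms, lookup$symbol)
  idx <- idx[!is.na(idx)]
  if (length(idx) == 0) NA_integer_ else idx[1]
}, integer(1), USE.NAMES = FALSE)

# Indexing by first_hit yields one value per node, in the original node order.
nodes$sn  <- lookup$showtext[first_hit]
nodes$lfc <- lookup$lfc[first_hit]
print(nodes)
```

Because `sn` and `lfc` are built as vectors with one entry per node, they could be passed to `mutate()` on the graph directly, without the intermediate `rbind()` and `distinct()` steps.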

> sessionInfo()
R version 4.2.1 (2022-06-23)
Platform: x86_64-pc-linux-gnu (64-bit)
Running under: Ubuntu 20.04.4 LTS

Matrix products: default
BLAS:   /usr/lib/x86_64-linux-gnu/openblas-pthread/libblas.so.3
LAPACK: /usr/lib/x86_64-linux-gnu/openblas-pthread/liblapack.so.3

locale:
 [1] LC_CTYPE=en_US.UTF-8       LC_NUMERIC=C               LC_TIME=en_US.UTF-8        LC_COLLATE=en_US.UTF-8    
 [5] LC_MONETARY=en_US.UTF-8    LC_MESSAGES=en_US.UTF-8    LC_PAPER=en_US.UTF-8       LC_NAME=C                 
 [9] LC_ADDRESS=C               LC_TELEPHONE=C             LC_MEASUREMENT=en_US.UTF-8 LC_IDENTIFICATION=C       

attached base packages:
[1] stats     graphics  grDevices utils     datasets  methods   base     

other attached packages:
 [1] BiocFileCache_2.4.0 dbplyr_2.2.1        tidyr_1.3.0         ggkegg_1.3.1        XML_3.99-0.14       ggraph_2.1.0       
 [7] ggplot2_3.4.2       igraph_2.0.3        dplyr_1.1.2         tidygraph_1.2.2    

I look forward to hearing from you. Thanks.

Owner

noriakis commented Jul 4, 2024

Thank you very much for raising this important point. For point 1, I have now implemented the remove_dot option in highlight_entities and highlight_set_nodes, which should handle gene names followed by dots (available on the devel and main branches). I will go through point 2 next. Thanks again for using the package.
