LegNet: solving the sequence-to-expression problem with SOTA convolutional networks

Dmitry Penzar, Daria Nogina et al., LegNet: a best-in-class deep learning model for short DNA regulatory regions, Bioinformatics, 2023; doi: 10.1093/bioinformatics/btad457

[Paper] [Preprint]

Here we present a convolutional network for predicting gene expression and sequence variant effects from data obtained by massively parallel reporter assays (MPRAs).

Our approach secured 1st place in the DREAM 2022 challenge on predicting gene expression from millions of promoter sequences. To achieve top performance, we drew inspiration from EfficientNetV2, a recent state-of-the-art architecture for image analysis, and reformulated the initial sequence-to-expression regression problem as a soft-classification task. Within the DREAM challenge, our model outperformed both attention-based transformers and recurrent neural networks.
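To make the soft-classification idea concrete, here is a minimal, illustrative sketch in plain Python. It is not the code from this repository. The number of bins (`NUM_BINS = 18`), the two-neighbor interpolation used to build the soft label, and all function names are assumptions chosen for illustration. The general idea is that a real-valued expression target is turned into a probability distribution over bins, the network is trained against that distribution, and a scalar prediction is recovered as the expected bin index.

```python
import math

NUM_BINS = 18  # assumption: integer expression bins 0..17


def soft_label(value, num_bins=NUM_BINS):
    """Spread a real-valued target over its two neighboring integer bins.

    Hypothetical scheme: linear interpolation between adjacent bins.
    """
    value = min(max(value, 0.0), num_bins - 1)  # clamp into the bin range
    lower = int(math.floor(value))
    upper = min(lower + 1, num_bins - 1)
    frac = value - lower
    probs = [0.0] * num_bins
    probs[lower] += 1.0 - frac
    probs[upper] += frac
    return probs


def softmax(logits):
    """Numerically stable softmax over a list of logits."""
    m = max(logits)
    exps = [math.exp(x - m) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]


def soft_cross_entropy(logits, target_probs):
    """Cross-entropy between predicted logits and a soft target distribution."""
    probs = softmax(logits)
    return -sum(t * math.log(p) for t, p in zip(target_probs, probs) if t > 0)


def expected_value(probs):
    """Turn a distribution over bins back into a scalar prediction."""
    return sum(i * p for i, p in enumerate(probs))


target = soft_label(11.25)              # mass split between bins 11 and 12
loss = soft_cross_entropy([0.0] * NUM_BINS, target)
prediction = expected_value(target)     # recovers the original scalar
```

In a real training loop, the logits would come from the network's final layer and the loss would be averaged over a batch. The sketch only shows how a regression target and a class distribution can be converted into each other.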

Furthermore, we demonstrate how LegNet can be used in diffusion generative modeling as a step toward the rational design of gene regulatory sequences.

This repository provides several resources: