rare_words_DNA

Synopsis

Objective of the project: provide tools to detect and quantify exceptional subsequences of nucleotides (abnormally rare or abnormally frequent) in a DNA sequence that may be very long. This is of interest to biologists because exceptional DNA motifs sometimes serve a biological purpose (see Chi motifs and restriction sites).

Code

The code offers two ways to find such motifs, both based on Markov approximations. The first, Counter_simple, models the DNA as a first-order Markov chain. The second, Counter_m, models the DNA as a Markov chain of order k-2, where k is the length of the motifs studied.
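To make the order k-2 idea concrete, here is a minimal sketch of the standard plug-in estimate of a word's expected count under such a model: the product of the counts of its two (k-1)-letter sub-words divided by the count of its central (k-2)-letter sub-word. The function names `count_words` and `expected_count_max_order` are hypothetical and chosen for illustration; this is not necessarily how Counter_m is implemented.

```python
from collections import Counter


def count_words(seq, length):
    """Count overlapping occurrences of every word of the given length."""
    return Counter(seq[i:i + length] for i in range(len(seq) - length + 1))


def expected_count_max_order(seq, word):
    """Plug-in estimate of the expected count of `word` under a Markov
    model of order k-2, where k = len(word). Requires k >= 3."""
    k = len(word)
    if k < 3:
        raise ValueError("word must have length >= 3")
    n_k1 = count_words(seq, k - 1)  # counts of (k-1)-letter words
    n_k2 = count_words(seq, k - 2)  # counts of (k-2)-letter words
    denom = n_k2[word[1:-1]]        # count of the central sub-word
    if denom == 0:
        return 0.0
    # N(w1..w_{k-1}) * N(w2..w_k) / N(w2..w_{k-1})
    return n_k1[word[:-1]] * n_k1[word[1:]] / denom


# Toy sequence, used only for illustration.
seq = "ACGTACGTACGA"
observed = count_words(seq, 3)["ACG"]
expected = expected_count_max_order(seq, "ACG")
print(observed, expected)
```

A word whose observed count is far above or below this expected value is a candidate exceptional motif. Turning the gap into a significance score requires a variance or distributional approximation, which this sketch does not cover.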

To use either of these classes, remember to:

Initialize the length of the motifs you want to study
Specify the DNA sequence you are working on (learn method)
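The two steps above can be sketched with a toy stand-in class. `ToyMotifCounter`, its constructor argument `k` and its `rarest` helper are assumptions made for illustration. They are not the signatures of Counter_simple or Counter_m, which should be checked in counter_simple.py and counter_m.py. The toy class only ranks raw counts and applies no Markov model.

```python
from collections import Counter


class ToyMotifCounter:
    """Hypothetical stand-in, NOT the repository's Counter_simple or Counter_m."""

    def __init__(self, k):
        # Step 1: fix the length of the motifs to study.
        self.k = k
        self.counts = None

    def learn(self, sequence):
        # Step 2: tell the counter which DNA sequence to work on.
        self.counts = Counter(
            sequence[i:i + self.k] for i in range(len(sequence) - self.k + 1)
        )
        return self

    def rarest(self, n=3):
        # Observed words with the smallest counts (ties broken alphabetically).
        if self.counts is None:
            raise RuntimeError("call learn() before querying counts")
        return sorted(self.counts.items(), key=lambda kv: (kv[1], kv[0]))[:n]


counter = ToyMotifCounter(k=2)
counter.learn("AACAAGAA")
print(counter.rarest(2))
```

Querying before `learn` has been called raises an error, which mirrors the reminder above that both steps are needed before any result is meaningful.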

References

[1] S. Robin, F. Rodolphe, S. Schbath, ADN, mots et modèles, 2003.

[2] J.F. Delmas, B. Jourdain, Modèles aléatoires, Mathématiques et Applications 57, 2007.

[3] G. Nuel, Significance Score of Motifs in Biological Sequences, in Bioinformatics: Trends and Methodologies, Intech, 2011; 978-53.

PierreBoyeau/rare_words_DNA
