miRBase

ENCODE Project at UCSC
17 April 2014 - New Motif Displays for Transcription Factor ChIP-seq Track and New Genome Segmentations from ENCODE
The latest Transcription Factor ChIP-seq track has been enhanced with the display of Factorbook motifs. Within a cluster, a green highlight indicates the highest-scoring site of a Factorbook-identified canonical motif for the corresponding factor. Along with the ability to suppress motif highlights and cell abbreviations, the track configuration page now also enables the filtering of factors. The newly added Genome Segmentations from ENCODE tracks display multivariate genome segmentation performed on six human cell types (GM12878, K562, H1-hESC, HeLa-S3, HepG2, and HUVEC), integrating ChIP-seq data for eight chromatin marks, RNA Polymerase II, the CTCF transcription factor and input data.
12 Sept 2013 - New UDR ENCODE Download Method Available
25 July 2013 - BLUEPRINT Epigenome Data Hub and Quick Reference PDF Now Available

Practical Strategies for Discovering Regulatory DNA Sequence Motifs
Citation: MacIsaac KD, Fraenkel E (2006) Practical Strategies for Discovering Regulatory DNA Sequence Motifs. PLoS Comput Biol 2(4): e36. doi:10.1371/journal.pcbi.0020036
Published: April 28, 2006
Copyright: © 2006 MacIsaac and Fraenkel. This is an open-access article distributed under the terms of the Creative Commons Attribution License, which permits unrestricted use, distribution, and reproduction in any medium, provided the original author and source are credited.
Funding: The authors acknowledge support from the Whitaker Foundation to EF.
Competing interests: The authors have declared that no competing interests exist.
Abbreviations: A, adenine; C, cytosine; ChIP, chromatin immunoprecipitation; EM, expectation maximization; G, guanine; ROC, receiver operating characteristic; ROC-AUC, area under the receiver operating characteristic curve; T, thymine
Many functionally important regions of the genome can be recognized by searching for sequence patterns, or “motifs.”

Functional Classification Using Phylogenomic Inference
Citation: Brown D, Sjölander K (2006) Functional Classification Using Phylogenomic Inference. PLoS Comput Biol 2(6): e77. doi:10.1371/journal.pcbi.0020077
Editor: Fran Lewitter, Whitehead Institute, United States of America
Published: June 30, 2006
Copyright: © 2006 Brown and Sjölander.
Funding: The authors received no specific funding for this article.
Competing interests: The authors have declared that no competing interests exist.
Abbreviations: HMM, hidden Markov model; MSA, multiple sequence alignment; SCOP, Structural Classification of Proteins
Phylogenomic inference of protein (or gene) function attempts to address the question, “What function does this protein perform?”
Figure 1. In the example shown above, a phylogenetic tree has been constructed for a set of G protein–coupled receptors. doi:10.1371/journal.pcbi.0020077.g001
In practice, phylogenomic inference of gene function is not often used.
Gene duplication. Domain shuffling. Evolutionary distance.

Modularity and Dynamics of Cellular Networks
Citation: Qi Y, Ge H (2006) Modularity and Dynamics of Cellular Networks. PLoS Comput Biol 2(12): e174. doi:10.1371/journal.pcbi.0020174
Editor: Fran Lewitter, Whitehead Institute, United States of America
Published: December 29, 2006
Copyright: © 2006 Qi and Ge.
Funding: HG is supported by the Whitehead Institute.
Competing interests: The authors have declared that no competing interests exist.
Abbreviations: PPI, protein–protein interaction; TF, transcription factor
Understanding how the phenotypes and behaviors of cells are controlled is one of the major challenges in biological research. The responses of cells to genetic perturbations or environmental cues are controlled by complex networks, including interconnected signaling pathways and cascades of transcriptional programs.
Figure 1. Recent high-throughput technologies have produced massive amounts of gene expression, macromolecular interaction, or other type of “omic” data. doi:10.1371/journal.pcbi.0020174.g001

Automated Querying of Genome Databases
Citation: Schattner P (2007) Automated Querying of Genome Databases. PLoS Comput Biol 3(1): e1. doi:10.1371/journal.pcbi.0030001
Editor: Fran Lewitter, Whitehead Institute, United States of America
Published: January 26, 2007
Copyright: © 2007 Peter Schattner.
Funding: The authors received no specific funding for this article.
Competing interests: The author has declared that no competing interests exist.
Abbreviations: ADAR, adenosine deaminase enzymes that convert specific adenosines in RNA to inosine; API, application programmer interface; dbSNP, Single Nucleotide Polymorphism Database; EST, expressed sequence tag; MGD, Mouse Genome Database; NCBI, National Center for Biotechnology Information; SGD, Saccharomyces Genome Database; SQL, Structured Query Language; UCSC, University of California Santa Cruz
Introduction
The number of molecular biology databases continues to explode.
Interactive Batch Database–Querying
Figure 1. doi:10.1371/journal.pcbi.0030001.g001
An Example

Machine Learning and Its Applications to Biology
Citation: Tarca AL, Carey VJ, Chen X-w, Romero R, Drăghici S (2007) Machine Learning and Its Applications to Biology. PLoS Comput Biol 3(6): e116. doi:10.1371/journal.pcbi.0030116
Editor: Fran Lewitter, Whitehead Institute, United States of America
Published: June 29, 2007
This is an open-access article distributed under the terms of the Creative Commons Public Domain declaration which stipulates that, once placed in the public domain, this work may be freely reproduced, distributed, transmitted, modified, built upon, or otherwise used by anyone for any lawful purpose.
Funding: The authors received no specific funding for this article.
Competing interests: The authors have declared that no competing interests exist.
Abbreviations: k-NN, k-nearest neighbor; PAM, partitioning around medoids; PC, principal component; PCA, principal component analysis; SV, support vector; SVM, support vector machine; x, vector; x, scalar; X, matrix; X, feature space.
Introduction
Supervised Learning

DAS registration - server

FactorBook

A Primer on Learning in Bayesian Networks for Computational Biology
Citation: Needham CJ, Bradford JR, Bulpitt AJ, Westhead DR (2007) A Primer on Learning in Bayesian Networks for Computational Biology. PLoS Comput Biol 3(8): e129. doi:10.1371/journal.pcbi.0030129
Editor: Fran Lewitter, Whitehead Institute, United States of America
Published: August 31, 2007
Copyright: © 2007 Needham et al.
Funding: The authors would like to thank the Biotechnology and Biological Sciences Research Council for funding on grant BBSB16585 during which this article was written.
Competing interests: The authors have declared that no competing interests exist.
Abbreviations: BN, Bayesian network; BIC, Bayesian information criterion; CPD, conditional probability distribution; CPT, conditional probability table; DAG, directed acyclic graph; DBN, dynamic Bayesian network; EM, expectation–maximisation; HMM, hidden Markov model; JPD, joint probability distribution; MAP, maximum a posteriori; MCMC, Markov chain Monte Carlo; ML, maximum likelihood
Introduction
Bayesian Networks

Advanced Genomic Data Mining
Citation: Fernández-Suárez XM, Birney E (2008) Advanced Genomic Data Mining. PLoS Comput Biol 4(9): e1000121. doi:10.1371/journal.pcbi.1000121
Editor: Fran Lewitter, Whitehead Institute, United States of America
Published: September 26, 2008
Copyright: © 2008 Fernández-Suárez, Birney.
Funding: The Ensembl project receives primary funding from the Wellcome Trust.
Competing interests: The authors have declared that no competing interests exist.
Introduction
As data banks increase their size, one of the current challenges in bioinformatics is to be able to query them in a sensible way. Data mining is vital to bioinformatics as it allows users to go beyond simple browsing of genome browsers, such as Ensembl [1],[2] or the UCSC Genome Browser [3], to address questions; for example, the biological meaning of the results obtained with a microarray platform, or how to identify a short motif upstream of a gene, amongst others.
Figure 1. doi:10.1371/journal.pcbi.1000121.g001

miRBase: integrating microRNA annotation and deep-sequencing data.
Kozomara A, Griffiths-Jones S.
NAR 2011 39(Database Issue):D152-D157

miRBase: tools for microRNA genomics.
Griffiths-Jones S, Saini HK, van Dongen S, Enright AJ.
NAR 2008 36(Database Issue):D154-D158

miRBase: microRNA sequences, targets and gene nomenclature.
Griffiths-Jones S, Grocock RJ, van Dongen S, Bateman A, Enright AJ.
NAR 2006 34(Database Issue):D140-D144

The microRNA Registry.
Griffiths-Jones S.
NAR 2004 32(Database Issue):D109-D111

by paul.adrian Sep 14

miRBase is one of the central public databases for general information on miRNAs. In version 18, miRBase contains 1,527 distinct human pre-miRNAs with 1,921 distinct mature miRNA entries. It provides details on current miRNA nomenclature, sequences, genomic locations, precursor forms, and literature references.
Potential target genes of miRNAs are not included in miRBase, but other databases have been set up to provide that information.

by paul.adrian Sep 12
