
Bioinformatics


Life Sciences & Mathematics & Physical Sciences. RNAseqViewer: Visualization tool for RNA-Seq data. Rnomics/bioinfo (by Fabrice Leclerc. TIGAR: transcript isoform abundance estimation method with gapped alignment of RNA-Seq data by variational Bayesian inference. EBSeq: an empirical Bayes hierarchical model for inference in RNA-seq experiments. GAT: a simulation framework for testing the association of genomic intervals.


GAT: a simulation framework for testing the association of genomic intervals

Received March 21, 2013. Revision received June 1, 2013. Accepted June 7, 2013. Motivation: A common question in genomic analysis is whether two sets of genomic intervals overlap significantly. Summary: We present Genomic Association Test (GAT), a tool for estimating the significance of overlap between multiple sets of genomic intervals. Availability: GAT’s source code, documentation and tutorials are available at.
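The question posed in the GAT summary above (does one set of intervals overlap another more than expected by chance?) can be illustrated with a small randomization test. The following is a minimal sketch under stated assumptions, not GAT's implementation or API: the function names `overlap_bp` and `simulate_pvalue`, the single linear workspace, and uniform random placement of segments are all hypothetical simplifications.

```python
import random

def overlap_bp(a, b):
    """Total base pairs shared between two lists of half-open (start, end) intervals.
    Simplification: intervals inside one list that overlap each other are counted twice."""
    return sum(max(0, min(e1, e2) - max(s1, s2)) for s1, e1 in a for s2, e2 in b)

def simulate_pvalue(segments, annotations, workspace_len, n_sim=1000, seed=0):
    """Empirical p-value for the observed overlap, obtained by re-placing each
    segment uniformly at random inside a workspace of length workspace_len."""
    rng = random.Random(seed)
    observed = overlap_bp(segments, annotations)
    hits = 0
    for _ in range(n_sim):
        shuffled = []
        for start, end in segments:
            length = end - start
            new_start = rng.randrange(0, workspace_len - length + 1)
            shuffled.append((new_start, new_start + length))
        if overlap_bp(shuffled, annotations) >= observed:
            hits += 1
    # Add-one correction keeps the estimate strictly above zero.
    return observed, (hits + 1) / (n_sim + 1)

segments = [(100, 200), (500, 600)]
annotations = [(150, 250), (550, 650)]
observed, p_value = simulate_pvalue(segments, annotations, workspace_len=100_000)
print(observed, p_value)
```

The add-one correction means the smallest reportable p-value is 1 / (n_sim + 1), so more simulations are needed to resolve stronger associations.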


Joint network and node selection for pathway-based genomic data analysis

Received February 9, 2013. Revision received June 5, 2013. Accepted June 6, 2013. Motivation: By capturing various biochemical interactions, biological pathways provide insight into underlying biological processes. Results: In this article, we propose a novel sparse Bayesian model for joint network and node selection.


viRome: an R package for the visualization and analysis of viral small RNA sequence datasets

Received December 20, 2012. Revision received May 17, 2013. Accepted May 21, 2013. Summary: RNA interference (RNAi) is known to play an important part in defence against viruses in a range of species. Availability and implementation: viRome is released under the BSD license as a package for R available for both Windows and Linux. Additional information and a tutorial are available on the ARK-Genomics website.

MiRTCat: a comprehensive map of human and mouse microRNA target sites including non-canonical nucleation bulges.

Joint analysis of expression profiles from multiple cancers improves the identification of microRNA–gene interactions.

Data-based filtering for replicated high-throughput transcriptome sequencing experiments

Received November 20, 2012. Revision received May 24, 2013. Accepted June 17, 2013. Motivation: RNA sequencing is now widely performed to study differential expression among experimental conditions. Results: We propose a data-driven method based on the Jaccard similarity index to calculate a filtering threshold for replicated RNA sequencing data. Availability: The proposed filtering method is implemented in the R package HTSFilter available on Bioconductor.
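To make the idea of a Jaccard-based filtering threshold concrete, here is a minimal sketch. It is not the HTSFilter implementation: the selection rule (pick the candidate threshold that maximizes the mean pairwise Jaccard index between replicates), the function names, and the toy counts are all assumptions made for illustration.

```python
from itertools import combinations

def jaccard(a, b):
    """Jaccard index |a & b| / |a | b| of two sets.
    Returns 0.0 when both sets are empty so that filtering out every gene is never rewarded."""
    union = a | b
    return len(a & b) / len(union) if union else 0.0

def mean_pairwise_jaccard(replicates, threshold):
    """Average Jaccard index over all replicate pairs, where each replicate is
    reduced to the set of gene indices whose count exceeds threshold.
    Requires at least two replicates."""
    expressed = [{g for g, count in enumerate(rep) if count > threshold}
                 for rep in replicates]
    pairs = list(combinations(expressed, 2))
    return sum(jaccard(a, b) for a, b in pairs) / len(pairs)

def choose_threshold(replicates, candidates):
    """Candidate threshold giving the most similar sets of expressed genes
    across replicates; ties go to the earliest candidate."""
    return max(candidates, key=lambda s: mean_pairwise_jaccard(replicates, s))

# Toy counts: rows are replicates of one condition, columns are genes.
replicates = [
    [0, 1, 50, 80, 2],
    [1, 0, 60, 70, 0],
    [0, 2, 55, 90, 1],
]
print(choose_threshold(replicates, [0, 1, 5, 100]))
```

In this toy data the low counts flip between zero and non-zero across replicates, so a threshold above them yields identical expressed-gene sets, while an overly high threshold empties every set and scores zero.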

Mechanistic insights into mutually exclusive splicing in dynamin 1. Quikr: a method for rapid reconstruction of bacterial communities via compressive sensing. Linc2GO: a human LincRNA function annotation resource based on ceRNA hypothesis. Differential gene expression analysis using coexpression and RNA-Seq data.

Inference of alternative splicing from RNA-Seq data with probabilistic splice graphs

Colin N. Dewey, E-mail: cdewey@biostat.wisc.edu Received September 23, 2011. Revision received July 3, 2013. Accepted July 4, 2013. Motivation: Alternative splicing and other processes that allow for different transcripts to be derived from the same gene are significant forces in the eukaryotic cell. Results: We present RNA-Seq models and associated inference algorithms based on the concept of probabilistic splice graphs, which alleviate these issues. Availability: Software implementing our methods is available at.

MIG: Multi-Image Genome Viewer. A Sensitive and Accurate protein domain cLassification Tool (SALT) for short reads.

nhmmer: DNA homology search with profile HMMs

Travis J. Wheeler, E-mail: travis@traviswheeler.com Received June 17, 2013. Revision received June 17, 2013. Accepted July 5, 2013. Summary: Sequence database searches are an essential part of molecular biology, providing information about the function and evolutionary history of proteins, RNA molecules, and DNA sequence elements. Availability: nhmmer is a part of the new HMMER3.1 release. Contact: wheelert@janelia.hhmi.org, eddys@janelia.hhmi.org.



GAT: a simulation framework for testing the association of genomic intervals

Andreas Heger, E-mail: andreas.heger@dpag.ox.ac.uk Received March 21, 2013. Revision received June 1, 2013. Accepted June 7, 2013. Motivation: A common question in genomic analysis is whether two sets of genomic intervals overlap significantly. Summary: We present GAT, a tool for estimating the significance of overlap between multiple sets of genomic intervals. Availability: GAT's source code, documentation and tutorials are available at.


Updating RNA-Seq analyses after re-annotation

Received April 11, 2013. Revision received April 11, 2013. Accepted April 21, 2013.

iFUSE: integrated fusion gene explorer.

Genome-wide identification and predictive modeling of tissue-specific alternative polyadenylation

Motivation: Pre-mRNA cleavage and polyadenylation are essential steps for 3′-end maturation and subsequent stability and degradation of mRNAs.

This process is highly controlled by cis-regulatory elements surrounding the cleavage/polyadenylation sites (polyA sites), which are frequently constrained by sequence content and position. More than 50% of human transcripts have multiple functional polyA sites, and the specific use of alternative polyA sites (APA) results in isoforms with variable 3′-untranslated regions, thus potentially affecting gene regulation. Elucidating the regulatory mechanisms underlying differential polyA preferences in multiple cell types has been hindered both by the lack of suitable data on the precise location of cleavage sites and by the lack of appropriate tests for determining APAs with significant differences across multiple libraries.

Integrating sequence, expression and interaction data to determine condition-specific miRNA regulation

Motivation: MicroRNAs (miRNAs) are small non-coding RNAs that regulate gene expression post-transcriptionally. MiRNAs were shown to play an important role in development and disease, and accurately determining the networks regulated by these miRNAs in a specific condition is of great interest.

Early work on miRNA target prediction has focused on using static sequence information.

The RNA Newton polytope and learnability of energy parameters

Motivation: Computational RNA structure prediction is a mature, important problem that has received a new wave of attention with the discovery of regulatory non-coding RNAs and the advent of high-throughput transcriptome sequencing. Despite nearly two score years of research on RNA secondary structure and RNA–RNA interaction prediction, the accuracy of the state-of-the-art algorithms is still far from satisfactory. So far, researchers have proposed increasingly complex energy models and improved parameter estimation methods, experimental and/or computational, in anticipation of endowing their methods with enough power to solve the problem. Disappointingly, the outcome has been only modest improvements that fall short of expectations.

Even recent massively featured machine learning approaches were not able to break the barrier. Why is that? Results: We demonstrated the application of our theory to a simple energy model consisting of a weighted count of A-U, C-G and G-U base pairs.

Poly(A) motif prediction using spectral latent features from human DNA sequences

Motivation: Polyadenylation is the addition of a poly(A) tail to an RNA molecule. Identifying DNA sequence motifs that signal the addition of poly(A) tails is essential to improved genome annotation and better understanding of the regulatory mechanisms and stability of mRNA. Existing poly(A) motif predictors demonstrate that information extracted from the surrounding nucleotide sequences of candidate poly(A) motifs can differentiate true motifs from the false ones to a great extent.

A variety of sophisticated features has been explored, including sequential, structural, statistical, thermodynamic and evolutionary properties. Results: We propose a novel machine-learning method for poly(A) motif prediction by marrying generative learning (hidden Markov models) and discriminative learning (support vector machines).

A weighted sampling algorithm for the design of RNA sequences with targeted secondary structure and nucleotide distribution

Motivations: The design of RNA sequences folding into predefined secondary structures is a milestone for many synthetic biology and gene therapy studies. Most of the current software uses similar local search strategies (i.e. a random seed is progressively adapted to acquire the desired folding properties) and, more importantly, does not allow the user to control explicitly the nucleotide distribution, such as the GC-content, in their sequences.

However, the latter is an important criterion for large-scale applications as it could presumably be used to design sequences with better transcription rates and/or structural plasticity.

GeneScissors: a comprehensive approach to detecting and correcting spurious transcriptome inference owing to RNA-seq reads misalignment. IDBA-tran: a more robust de novo de Bruijn graph assembler for transcriptomes with uneven expression levels. Identifying proteins controlling key disease signaling pathways. A hidden Markov model to identify combinatorial epigenetic regulation patterns for estrogen receptor α target genes. AffyRNADegradation: control and correction of RNA quality effects in GeneChip expression data.