Biostar

A simple question on RNA-Seq terminology. How to generate RNA multiple sequence alignment based on the secondary structure that is already known of one or several sequences? Precursor miRNA and a mature miRNA. miRNA-seq datasets. About RNA-Seq data from cell lines. RNA-seq with reference genome but no gene annotations. How to get the ITGB1 mRNA seq in Deuterostomia from database. RNA seq on unannotated genome.

9 months ago, United States. "Un-annotated" is a somewhat ambiguous term in this context.

RNA seq on unannotated genome

If you mean to say that there are no known genes in your organism, then you will have to do de novo transcriptome assembly with your RNA Seq data to describe the transcriptome, and then you can go back and quantify your transcriptome from the RNA Seq data. You could do your assembly with or without the genome reference (i.e. reference guided vs. completely de novo). It's a lot of work either way, and there are caveats to each approach.

However, it could also be the case that you have FASTA sequences describing some genes in your organism, but these sequences are not annotated to the genome assembly.

Illumina stranded RNA-Seq background problem. 'Potential RNA-DNA binding site in Alu Repeat', or 'Why should I allow multiple mapping positions when analyzing ChIPseq data?'

9 months ago, Cambridge, MA. Congratulations on your interesting finding!

'Potential RNA-DNA binding site in Alu Repeat', or 'Why should I allow multiple mapping positions when analyzing ChIPseq data?'

I will admit to only having limited knowledge in this area. I have some concern with allowing reads that map to multiple regions. Let's say we had two regions, A and B, and that both have identical sequence. One solution is to attribute the duplicate (multi-mapping) reads to A and B in a ratio consistent with the unique reads mapped to each.

Which aligner is best suited for CLIP-seq data? Counting mapped reads of an exon in RNA-Seq analysis. Course/Workshop: Informatics for RNA-sequence Analysis (June, 2013)

15 months ago, St Louis, MO, USA. We are offering a new course in the Canadian Bioinformatics Workshop (CBW) series this year on Informatics for RNA-seq Analysis.
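
Returning to the multi-mapping answer above: the proportional rule it describes (split the ambiguous reads between A and B according to each region's unique read count) could be sketched roughly as follows. The function name, the read counts, and the even-split fallback when there are no unique reads are all assumptions for illustration, not part of the original answer or of any particular aligner.

```python
def allocate_multireads(unique_counts, n_multi):
    """Split n_multi ambiguous reads across regions in proportion
    to the number of uniquely mapped reads each region has."""
    total_unique = sum(unique_counts.values())
    if total_unique == 0:
        # Assumption: with no unique evidence at all, split evenly.
        share = n_multi / len(unique_counts)
        return {region: share for region in unique_counts}
    return {
        region: n_multi * count / total_unique
        for region, count in unique_counts.items()
    }


# Hypothetical counts: 30 unique reads near A, 10 near B, 20 reads mapping to both.
allocation = allocate_multireads({"A": 30, "B": 10}, 20)
print(allocation)
```

The fractional counts this produces are fine for coverage-style summaries, but count-based statistical tools generally expect integers, so how the result is used downstream is a separate decision.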

Course/Workshop: Informatics for RNA-sequence Analysis (June, 2013)

Date: June 3 - June 4, 2013. Location: Toronto, Canada.

What does the term "steady state mRNA" mean? How do you justify your RNA-Seq expression threshold (FPKM/RPKM)? Tools to calculate conservation of non-coding RNAs. Error Message: RNA-Seq Alignment. Obtaining expression values (FPKM) from RNA-Seq Data [Cufflinks/CuffDiff]

Hello!

Obtaining expression values (FPKM) from RNA-Seq Data [Cufflinks/CuffDiff]

I have a couple of questions about obtaining expression values from an RNA-Seq dataset. I would like to get the FPKM (which replaced RPKM) values for all the genes from an RNA expression dataset (RNA-Seq). I analysed it using TopHat and Cufflinks.
1) Can I just take the values from the genes.fpkm_tracking file obtained after running cuffdiff (which has values for WT and the condition being tested)? It has the same values as the gene_exp.diff file. Example set:
2) Can I also take the values from the genes.fpkm_tracking file obtained after running cufflinks (though it lacks gene names)?
3) What should the cutoff for the raw FPKM value be to say it's significant, without taking the condition into account? Also, can the FPKM value be used directly as the expression value, or is there any other factor to be taken into account as well?
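
For context on what the FPKM numbers in these files mean, the usual definition is fragments per kilobase of transcript per million mapped fragments. The sketch below applies that definition to made-up numbers; the function name and inputs are hypothetical. Cufflinks estimates abundances with its own statistical model, so this shows the unit only, not how the tool computes its output.

```python
def fpkm(fragments, length_bp, total_mapped_fragments):
    """Fragments per kilobase of transcript per million mapped fragments."""
    # 1e9 = 1e3 (bp per kilobase) * 1e6 (fragments per million)
    return fragments * 1e9 / (length_bp * total_mapped_fragments)


# Hypothetical gene: 500 fragments, 2,000 bp long, 20 million mapped fragments in total.
value = fpkm(fragments=500, length_bp=2000, total_mapped_fragments=20_000_000)
print(value)
```

Because the value is already normalized for transcript length and sequencing depth, it can be compared across genes and samples more directly than raw counts can.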

Thanks a lot for your time.

RNA seq transcript aligns on the wrong strand. Filtering variants in RNA-seq data. How to get UCSC locations for mRNA and non-coding. Bacterial RNA-seq. Does the new tophat2 - bowtie2 pipeline really map 100% of the RNA-Seq reads? Is this real? Which aligner is most suited for viral RNA-seq data? Aligning RNA-Seq to Repetitive LINE-1 elements.

15 months ago. Hello, I would like to check whether L1 repetitive elements are modulated between my treatment and control via RNA-Seq.

aligning RNA-Seq to Repetitive LINE-1 elements

I have read several papers that have done so, but their methods are not clear enough for me as a biologist to reproduce. I have analyzed my data using the Tuxedo suite and have analyzed the "unique" genes.

Generating internal exon based on known mRNA isoform.

16 months ago, Canada. This is kind of a coding strategy question. A given gene has two isoforms with 3 exons. Isoform_A is exon1-exon2-exon3, while IsoformB is exon1-exon3. Thus, exon2 here is what I want to filter out as the internal exon. Now I have downloaded all the exon data from the UCSC genome browser UCSC genes track (selected from primary and related fields). The input looks something like this:

#isoform_name  chr   strand  ex_start   ex_end     gene_name
isoformA       chr1  +       10,30,     15,35      geneM
isoformB       chr1  +       10,20,30,  15,25,35   geneM
isoformC       chr1  +       40,50,     45,55      geneM

Thus the exon [20-25] is called the internal exon.
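
One possible way to approach this, as a minimal sketch rather than a definitive answer: group isoforms by gene, then report any exon that is neither first nor last in its own isoform and is missing from another isoform of the same gene whose span covers it. That definition of "internal exon", the function names, and the whitespace-delimited parsing are assumptions made for illustration; the sample rows are the ones from the question.

```python
from collections import defaultdict

SAMPLE = """\
#isoform_name chr strand ex_start ex_end gene_name
isoformA chr1 + 10,30, 15,35 geneM
isoformB chr1 + 10,20,30, 15,25,35 geneM
isoformC chr1 + 40,50, 45,55 geneM
"""


def parse_isoforms(text):
    """Return {(gene, chrom, strand): [(isoform_name, [(start, end), ...]), ...]}."""
    genes = defaultdict(list)
    for line in text.splitlines():
        if not line.strip() or line.startswith("#"):
            continue
        name, chrom, strand, starts, ends, gene = line.split()
        # Trailing commas leave empty strings after split(","), so skip them.
        start_list = [int(s) for s in starts.split(",") if s]
        end_list = [int(e) for e in ends.split(",") if e]
        genes[(gene, chrom, strand)].append((name, sorted(zip(start_list, end_list))))
    return genes


def skipped_internal_exons(genes):
    """Find non-terminal exons that another isoform of the same gene skips."""
    found = set()
    for (gene, chrom, strand), isoforms in genes.items():
        for name, exons in isoforms:
            for exon in exons[1:-1]:  # drop the first and last exon
                for other_name, other_exons in isoforms:
                    if other_name == name or exon in other_exons:
                        continue
                    # The other isoform must span the exon to count as skipping it.
                    if other_exons[0][0] <= exon[0] and exon[1] <= other_exons[-1][1]:
                        found.add((gene, chrom, strand, exon[0], exon[1]))
    return sorted(found)


internal = skipped_internal_exons(parse_isoforms(SAMPLE))
print(internal)  # [('geneM', 'chr1', '+', 20, 25)]
```

Exons are matched here by exact coordinates, so alternative 5' or 3' splice sites that shift an exon boundary would need a looser overlap test.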

Appending gene IDs on RNA-editing sites.

15 months ago, Seattle, WA, USA. You might try the BEDOPS suite conversion script vcf2bed to convert your VCF file to BED format (our script tries to preserve as much information as possible; see the comments in the script to see how the various columns are mapped), along with the Noble lab's gtf2bed script to convert GTF to BED.

appending gene IDs on RNA-editing sites

Finally, use the BEDOPS application bedmap to map the genes in the converted GTF file to the sites in the converted VCF file.

What does a zero expression level mean in the ENCODE RNA-Seq data? Where can I find RNA-seq data related to Alzheimer's disease? Survey: RNA-Seq analysis for Differential Gene/Transcript Expression.

16 months ago. Hi all, I've finally put together the results of the survey!

Survey: RNA-Seq analysis for Differential Gene/Transcript Expression

First of all, thanks to everyone who participated. The response has been great, with 93 people completing the survey as of today. The respondents have been a varied bunch, including all levels of academia (pre-docs, grad students, post-docs and PIs), core bioinformaticians and bioinformatics managers, as well as many from industry. The majority of respondents appear to be based in the US and Europe, with others in China, Korea and Australia. I provide below my own summary of the survey's findings, and I have a document which contains all the results, including all unedited comments. As with any survey, we should probably be aware of potential biases (e.g. skews caused by people who are really annoyed with a particular tool!).

Now for the summary.

Selecting probes when RNA degradation is obvious. How does a single-stranded RNA bind to a double-stranded DNA to form a “triplex structure”? The XS:A in Tophat output for strand specific pair-end RNA-Seq data. RNA-Seq p-values. Calculating RPKM for RNA-seq data including several samples each condition.

First, you need to ask the people who submitted the samples if they are true biological replicates, or technical replicates.

Calculating RPKM for RNA-seq data including several samples each condition

Technical replicates are good for knowing how much variability your library prep and instrument add. A technical replicate would be like taking the liver from one mouse, cutting it into 4 pieces, and treating them like 4 different samples. Any variation between the samples should be an artifact of the library prep and sequencing procedure. So if the exact same sample prepped multiple ways leads to big swings in RPKM, then you know that your prep is lousing things up, and you are going to have very little precision in your estimate of what the "real" expression was in the one sample. In general, Illumina instruments do a good job with technical replicates: your samples should be very, very similar to each other, and I think if that's the case, combining them might be okay.
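
As a rough illustration of that check, one could compute RPKM per technical replicate, look at the spread, and then pool the raw counts if the replicates agree. This is only a sketch: the read counts and gene length below are made up, the coefficient of variation is just one possible measure of spread, and what counts as "very similar" is left to the analyst.

```python
from statistics import mean, pstdev


def rpkm(reads, length_bp, total_mapped_reads):
    """Reads per kilobase of transcript per million mapped reads."""
    return reads * 1e9 / (length_bp * total_mapped_reads)


# Hypothetical technical replicates of one sample:
# (reads on the gene, total mapped reads in that library)
replicates = [
    (480, 19_000_000),
    (510, 20_500_000),
    (495, 20_000_000),
    (520, 21_000_000),
]
gene_length_bp = 2000

# RPKM per replicate, and the coefficient of variation across replicates.
per_replicate = [rpkm(r, gene_length_bp, t) for r, t in replicates]
cv = pstdev(per_replicate) / mean(per_replicate)

# Pooling: sum the raw counts and the library sizes, then compute RPKM once.
pooled = rpkm(
    sum(r for r, _ in replicates),
    gene_length_bp,
    sum(t for _, t in replicates),
)

print(per_replicate, cv, pooled)
```

Pooling the raw counts before normalizing weights each library by its depth, which is not the same as averaging the four RPKM values.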

For instance, let's say you check one control animal, and one condition animal.