
Assembly


Publications | ALLPATHS-LG.

MIRA. MIRA 4 - Whole Genome Shotgun and EST Sequence Assembler for Sanger, 454 and Solexa / Illumina

MIRA is the Swiss army knife of sequence assembly that I've used and developed during the past 16 years to get the assembly jobs I work on done efficiently, and especially accurately. That is, without me actually putting too much manual work into it. Over time, other labs and sequencing providers have found MIRA useful for assembly of extremely 'unfriendly' projects containing lots of repetitive sequences. As always, your mileage may vary.

Hybrid de-novo assemblies with Sanger, 454, Illumina, Ion Torrent and PacBio

MIRA 4 is able to perform true hybrid de-novo assemblies using reads gathered through Sanger, 454, Solexa, IonTorrent or PacBio sequencing technologies.

1) get your DNA sequenced at ~100x polymerase read coverage with PacBio
2) get the very same DNA sequenced at ~30-100x coverage with Illumina
3) get the error corrected sequences from the HGAP pipeline and assemble them with MIRA.

Software for next gen sequence.

Trans-ABySS. Analyze ABySS multi-k-assembled shotgun transcriptome data.

Project Description

Trans-ABySS is a software pipeline for analyzing ABySS-assembled contigs from shotgun transcriptome data. The pipeline accepts assemblies that were generated across a wide range of k values in order to address variable transcript expression levels. It first filters and merges the multi-k assemblies, generating a much smaller set of nonredundant contigs.

It contains scripts that map assembled contigs to known transcripts, currently supporting the Blat contig-to-genome aligner. It identifies novel splicing events like exon-skipping, novel exons, retained introns, novel introns, and alternative splicing. Its scripts can also identify candidate gene-fusions, single-nucleotide variants, insertions, deletions, and inversions.
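To make the filter-and-merge idea mentioned above more concrete, here is a minimal sketch of one way to reduce contigs from several k values to a nonredundant set. It is not Trans-ABySS's actual algorithm: it only drops contigs that are exact substrings (on either strand) of a longer kept contig, and the function names `reverse_complement` and `merge_nonredundant` are hypothetical.

```python
def reverse_complement(seq):
    """Return the reverse complement of an upper-case DNA string."""
    table = str.maketrans("ACGTN", "TGCAN")
    return seq.translate(table)[::-1]


def merge_nonredundant(assemblies):
    """Pool contigs from several assemblies and drop contained ones.

    assemblies: dict mapping a k value to a list of contig strings
    (an assumed input layout, chosen for illustration only).
    Returns the kept contigs, longest first.
    """
    # Pool all contigs, remove exact duplicates, and sort longest first
    # so that any containing contig is seen before the contigs it contains.
    pooled = sorted(
        {c.upper() for contigs in assemblies.values() for c in contigs},
        key=lambda s: (-len(s), s),
    )
    kept = []
    for contig in pooled:
        rc = reverse_complement(contig)
        # Skip a contig already contained in a kept contig on either strand.
        if any(contig in longer or rc in longer for longer in kept):
            continue
        kept.append(contig)
    return kept


if __name__ == "__main__":
    example = {
        25: ["ACGTTGCA", "GTTG"],
        31: ["ACGTTGCAAT", "CCCGGGAAA", "ATTGC"],
    }
    print(merge_nonredundant(example))
```

The quadratic containment check is only reasonable for small inputs; a real pipeline would rely on an aligner or an index rather than substring search.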

The pipeline can be used with other assembly versions, and with other species, once genome and transcript annotation files are available.

Denovo. Oases.

What is Oases?

Oases is a de novo transcriptome assembler designed to produce transcripts from short read sequencing technologies, such as Illumina, SOLiD, or 454 in the absence of any genomic assembly. It was developed by Marcel Schulz (MPI for Molecular Genomics) and Daniel Zerbino (previously at the European Bioinformatics Institute (EMBL-EBI), now at UC Santa Cruz). Oases loads a preliminary assembly produced by Velvet, and clusters the contigs into small groups, called loci. It then exploits the paired-end read and long read information, when available, to construct transcript isoforms.

Can I see the source code?

The Oases source code is freely available under the GPL agreement.

What do I need to run it?

Oases was designed with a 64-bit Linux-compatible environment in mind, with gcc.

It should compile and run on a 32-bit machine (albeit with a few secondary compiler warnings), but you might find memory to be a limiting factor.

Where can I get Oases?

What do the confidence scores mean?

BGI-SOAP.

Hi all,

I'm working on a project looking at identifying SNPs from transcriptome data in a fish species (multiple individuals sequenced in a pool). While I wait for an account on a high performance computer with sufficient RAM, I was wondering if anyone could tell me if this is a feasible workflow using the SOAP suite:

1. de novo assembly with SOAPdenovo
2. join contig sequences together into one 'reference'
3. run SOAPalign to make an alignment to the new 'reference'
4. run SOAPsnp to identify SNPs (uses the output of SOAPalign)
5. import to MapView to view alignment and SNPs

I guess the unconventional thing about this is that our 'reference' would simply be the joined contigs, as a basic requirement of SOAPalign and MapView seems to be a single reference, rather than thousands of individual contigs (contig size not particularly important, depth + minor allele frequency at SNPs most important).

Can anyone comment on this?

Regards,
Matt.
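For step 2 of the workflow in the post, a minimal sketch of joining contigs into a single 'reference' might look like the following. The run of N characters between contigs and the offset table are assumptions added here, not part of the original post: the spacer is meant to discourage reads from aligning across artificial junctions, and the offsets let a SNP position in the joined sequence be traced back to its contig. Whether the SOAP tools handle long N runs well is not confirmed here. The names `join_contigs` and `locate` are hypothetical.

```python
def join_contigs(contigs, spacer_len=100):
    """Concatenate contigs into one sequence separated by runs of N.

    contigs: list of (name, sequence) tuples.
    spacer_len: number of N characters between contigs (assumed value).
    Returns (joined_sequence, offsets), where offsets is a list of
    (name, start, end) with 0-based, half-open coordinates.
    """
    spacer = "N" * spacer_len
    parts = []
    offsets = []
    pos = 0
    for i, (name, seq) in enumerate(contigs):
        if i > 0:
            # Insert the spacer before every contig except the first.
            parts.append(spacer)
            pos += spacer_len
        offsets.append((name, pos, pos + len(seq)))
        parts.append(seq)
        pos += len(seq)
    return "".join(parts), offsets


def locate(offsets, position):
    """Map a 0-based position in the joined sequence back to its contig.

    Returns (contig_name, position_within_contig), or None if the
    position falls inside a spacer or outside the sequence.
    """
    for name, start, end in offsets:
        if start <= position < end:
            return name, position - start
    return None


if __name__ == "__main__":
    joined, table = join_contigs([("c1", "ACGT"), ("c2", "GG")], spacer_len=3)
    print(joined)
    print(locate(table, 8))
```

Keeping the offset table alongside the joined sequence means per-contig depth and allele frequencies can still be reported after SNP calling on the single 'reference'.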