Review. Biochemical properties and base excision repair complex formation of apurinic/apyrimidinic endonuclease from Pyrococcus furiosus -- Kiyonari et al. 37 (19): 6439 -- Nucleic Acids Research. Abstract | Intergenic regions of Borrelia plasmids contain phylogenetically conserved RNA secondary structure motifs. Repeat sequences of intergenic nucleotide sequences of Borrelia plasmids were analyzed for secondary structure motifs using the Zuker m-fold program [28,29]. In addition, the RNAz program was used to confirm thermodynamically stable and evolutionarily conserved RNA secondary structures [30].
Intergenic sequences from plasmids lp60 and lp28 of B. afzelii Pko were completely scanned manually for repeat sequences and RNA motifs. In addition, selected regions that contain relatively large intergenic regions from B. burgdorferi B31 and Borrelia garinii PB plasmids were scanned. Most regions did not yield conserved stem-loop structures; however, five intergenic nucleotide sequences were found to display evolutionarily conserved stem-loop structures (Table 1). Table 1. Sequence #1 A 60 nt intergenic sequence (Sequence #1, Table 1) was found in nine plasmids from B. afzelii Pko and B. burgdorferi B31. Figure 1.
Figure 2. Additional file 1. Sequence #2. Role of PCNA-dependent stimulation of 3'-phosphodiesterase and 3'-5' exonuclease activities of human Ape2 in repair of oxidative DNA damage -- Burkovics et al. 37 (13): 4247 -- Nucleic Acids Research. + Author Affiliations *To whom correspondence should be addressed. Tel/Fax: +36 62 599666; Email: haracska@brc.hu Received February 17, 2009. Revision received April 21, 2009. Accepted April 22, 2009. Human Ape2 protein has 3′ phosphodiesterase activity for processing 3′-damaged DNA termini, 3′–5′ exonuclease activity that supports removal of mismatched nucleotides from the 3′-end of DNA, and a somewhat weak AP-endonuclease activity. However, very little is known about the role of Ape2 in DNA repair processes. © 2009 The Author(s) This is an Open Access article distributed under the terms of the Creative Commons Attribution Non-Commercial License, which permits unrestricted non-commercial use, distribution, and reproduction in any medium, provided the original work is properly cited.
Abstract | Characterization of taxonomically-restricted genes in a phylum-restricted cell type. Shuttle Vector System for Methanococcus maripaludis with Improved Transformation Efficiency -- Walters et al. 77 (7): 2549 -- Applied and Environmental Microbiology. A genome-wide view of mutation rate co-variation using multivariate analyses.
A greedy, graph-based algorithm for the alignment of multiple homologous gene lists. Dalliance: interactive genome viewing on the web. + Author Affiliations * To whom correspondence should be addressed. Received September 28, 2010. Revision received December 22, 2010. Accepted January 9, 2011. Summary: Dalliance is a new genome viewer which offers a high level of interactivity while running within a web browser. All data is fetched using the established distributed annotation system (DAS) protocol, making it easy to customize the browser and add extra data. Availability and Implementation: Dalliance runs entirely within your web browser, and relies on existing DAS server infrastructure.
Contact: thomas@biodalliance.org. Bambino: a variant detector and alignment viewer for next-generation sequencing data in the SAM/BAM format. Diverse Borrelia burgdorferi Strains in a Bird-Tick Cryptic Cycle -- Hamer et al. 77 (6): Novel Virulence Gene and Clustered Regularly Interspaced Short Palindromic Repeat (CRISPR) Multilocus Sequence Typing Scheme for Subtyping of the Ma. Functional and Structural Microbial Diversity in Organic and Conventional Viticulture: Organic Farming Benefits Natural Biocontrol Agents -- Schmid et al. 77 (6): 2188 -- Applied and Environmental Microbiology. Abstract | BIGSdb: Scalable analysis of bacterial genome variation at the population level. Abstract | Comparative genomics of the bacterial genus Listeria: Genome evolution is characterized by limited gene acquisition and limited gene loss.
Abstract | The non-clonality of drug resistance in Beijing-genotype isolates of Mycobacterium tuberculosis from the Western Cape of South Africa. Abstract | Reticular Alignment: A progressive corner-cutting method for multiple sequence alignment. In this section, we describe the algorithms and theorems which are the theoretical background of the Reticular Alignment algorithm.
The Waterman-Byers algorithm and x-networks Let A and B be two sequences over an alphabet $\Sigma$, of lengths n and m, respectively. Let $A_i$ denote the prefix of A of length i, and let $A^i$ denote the suffix of A starting at position i + 1. In this way, $A_i \circ A^i = A$, where $\circ$ denotes concatenation. Let $a_i$ denote the character of A in position i. Let $s : \Sigma \times \Sigma \to \mathbb{R}$ be a similarity function; $g_o$ will denote the gap opening penalty and $g_e$ the gap extension penalty. The score of any alignment, and thus every concept based on alignment scores, depends on the choice of similarity function, gap opening penalty and gap extension penalty. The Waterman-Byers algorithm [24] produces all alignments whose score is no less than the score of the optimal alignment minus some constant value. • alignments ending in two aligned (matched) characters. Theorem 1. Abstract | DeltaProt: a software toolbox for comparative genomics.
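The near-optimal idea in the Reticular Alignment excerpt above (keep everything that scores within some constant x of the optimum) can be illustrated with a forward and a backward dynamic-programming pass. The following is only a minimal sketch under simplifying assumptions: it uses a single linear gap penalty instead of the gap opening and gap extension penalties of the excerpt, the scoring values are arbitrary, and the function name `near_optimal_cells` is hypothetical. It marks the cells of the alignment matrix that lie on at least one global alignment scoring no less than the optimum minus x; it does not enumerate the alignments themselves.

```python
def near_optimal_cells(a, b, x, match=1, mismatch=-1, gap=-2):
    """Return (optimal score, set of DP cells on some alignment scoring >= optimal - x).

    Assumption: linear gap penalty `gap` per gapped position (not affine).
    Cell (i, j) means "a[:i] has been aligned with b[:j]".
    """
    n, m = len(a), len(b)

    def sim(p, q):
        return match if p == q else mismatch

    # fwd[i][j]: best score for aligning the prefixes a[:i] and b[:j]
    fwd = [[0] * (m + 1) for _ in range(n + 1)]
    for i in range(1, n + 1):
        fwd[i][0] = i * gap
    for j in range(1, m + 1):
        fwd[0][j] = j * gap
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            fwd[i][j] = max(fwd[i - 1][j - 1] + sim(a[i - 1], b[j - 1]),
                            fwd[i - 1][j] + gap,
                            fwd[i][j - 1] + gap)

    # bwd[i][j]: best score for aligning the suffixes a[i:] and b[j:]
    bwd = [[0] * (m + 1) for _ in range(n + 1)]
    for i in range(n - 1, -1, -1):
        bwd[i][m] = (n - i) * gap
    for j in range(m - 1, -1, -1):
        bwd[n][j] = (m - j) * gap
    for i in range(n - 1, -1, -1):
        for j in range(m - 1, -1, -1):
            bwd[i][j] = max(bwd[i + 1][j + 1] + sim(a[i], b[j]),
                            bwd[i + 1][j] + gap,
                            bwd[i][j + 1] + gap)

    best = fwd[n][m]
    # With linear gaps, fwd + bwd is the best score of any alignment through (i, j).
    cells = {(i, j)
             for i in range(n + 1)
             for j in range(m + 1)
             if fwd[i][j] + bwd[i][j] >= best - x}
    return best, cells
```

With x = 0 only cells on optimal alignments remain; raising x widens the band of cells, which is the trade-off between search space and alignment quality that corner-cutting methods exploit.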
This section outlines the algorithms and performance of the different modules of DeltaProt by considering several illustrative examples of results. Throughout the rest of this paper, we will demonstrate the use of DeltaProt by applying it to a dataset of 65 membrane proteins from six microbial genomes [4]. Users only need to define their sequence data via simple and compact input files. Advanced users can 'tweak' many configuration settings in order to fine-tune for the different data sets. In DeltaProt we present statistical methods and trend-tests which are useful when the protein sequences in the alignments can be divided into two or more groups based on known phenotypic traits of the host organism, such as preference of optimal growth temperature (mesophile, intermediate, psychrophile) or environmental metagenomic samples.
The phenotypic group is assumed to be an ordinal stochastic variable in the statistical model. The toolbox consists of a set of statistical routines. Table 1. Abstract | iGTP: A software package for large-scale gene tree parsimony analysis. Abstract | webPRANK: a phylogeny-aware multiple sequence aligner with interactive alignment browser.
The webPRANK server (Figure 1) supports the alignment of DNA, protein and codon sequences, input in FASTA format [8], using evolutionary substitution models [9-11]. It can translate, align as protein and back-translate protein-coding DNA sequences. In addition, webPRANK includes built-in support for two structure models [6], FAST/SLOW and FAST/SLOW/CODON, designed for aligning genomic DNA sequences with sites evolving with different substitution dynamics and differences in the patterns of alignment gaps. webPRANK accepts a user-defined phylogeny (Newick format) to guide its progressive alignment procedure, or can compute one from the unaligned input sequences.
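The translate, align as protein, and back-translate step mentioned above can be sketched as follows. This is an illustration of the general back-translation idea, not webPRANK's or PRANK's actual code: the function name `back_translate` is hypothetical, and the sketch assumes the coding sequence is in frame, has no trailing stop codon, and has exactly one codon per aligned residue.

```python
def back_translate(aligned_protein, cds, gap="-"):
    """Map an aligned protein row back onto its coding DNA sequence.

    Each residue is replaced by the codon that encoded it, and each
    alignment gap by three gap characters, so codons stay intact.
    """
    if len(cds) % 3 != 0:
        raise ValueError("coding sequence length is not a multiple of 3")
    codons = [cds[i:i + 3] for i in range(0, len(cds), 3)]
    residues = sum(1 for ch in aligned_protein if ch != gap)
    if residues != len(codons):
        raise ValueError("number of residues and codons differ")

    out = []
    k = 0  # index of the next unused codon
    for ch in aligned_protein:
        if ch == gap:
            out.append(gap * 3)
        else:
            out.append(codons[k])
            k += 1
    return "".join(out)
```

For example, the aligned row `M-K` with coding sequence `ATGAAA` becomes `ATG---AAA`, so gaps in the DNA alignment never split a codon.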
For each alignment task, the full combination of parameters, and the structure model if used, are provided in the output so that the analyses can easily be repeated or recreated with the stand-alone PRANK program. The size of alignment tasks is limited to 4 GB of memory and 24 hours of run time. Additional file 1. FragGeneScan: predicting genes in short and error-prone reads — Nucleic Acids Res. Genome Analysis of Deep-Sea Thermophilic Phage D6E -- Wang and Zhang 76 (23): 7861 -- Applied and Environmental Microbiology.
Abstract | Units of plasticity in bacterial genomes: new insight from the comparative genomics of two bacteria interacting with invertebrates, Photorhabdus and Xenorhabdus. Abstract | Analysis of intra-genomic GC content homogeneity within prokaryotes. A computational genomics pipeline for prokaryotic sequencing projects -- Kislyuk et al. 26 (15): 1819 -- Bioinformatics. + Author Affiliations * To whom correspondence should be addressed. Received January 25, 2010. Revision received May 21, 2010. Accepted May 25, 2010. Motivation: New sequencing technologies have accelerated research on prokaryotic genomes and have made genome sequencing operations outside major genome sequencing centers routine. However, no off-the-shelf solution exists for the combined assembly, gene prediction, genome annotation and data presentation necessary to interpret sequencing data.
The resulting requirement to invest significant resources into custom informatics support for genome sequencing projects remains a major impediment to the accessibility of high-throughput sequence data. Results: We present a self-contained, automated high-throughput open source genome sequencing and computational genomics pipeline suitable for prokaryotic sequencing projects. Contact: king.jordan@biology.gatech.edu Supplementary information: Supplementary data are available at Bioinformatics online. Cassis: detection of genomic rearrangement breakpoints -- Baudet et al. 26 (15): 1897 -- Bioinformatics. + Author Affiliations * To whom correspondence should be addressed.
Received March 3, 2010. Revision received May 10, 2010. Accepted June 3, 2010. Summary: Genomes undergo large structural changes that alter their organization. The chromosomal regions affected by these rearrangements are called breakpoints, while those which have not been rearranged are called synteny blocks. Availability: Perl and R scripts are freely available for download at Contact: Marie-France.Sagot@inria.fr Supplementary information: Supplementary data are available at Bioinformatics online. Abstract | A versatile palindromic amphipathic repeat coding sequence horizontally distributed among diverse bacterial and eucaryotic microbes.
Dispersal of a tandemly arrayed repeat protein sequence in the genomes of unicellular microbes from diverse phylogenetic clades and ecological niches To establish an operational motif we first constructed an HMM based on a training dataset of ORFs from genomes of Mollicutes that contained a previously reported 25-residue amino acid sequence pattern [15], then refined the HMM using iterations of data sets expanded from successive searches of the non-redundant protein sequence database [24]. The HMM included sequences from diverse organisms representing very different phylogenetic histories, genomic sizes and G+C contents. We interrogated a recent version of this database (nr; October 30, 2009; 9,967,556 sequences) with the HMM to inventory the current set of unique ORFs bearing the motif. ORFs retrieved (Additional file 1, sheet 1) were organized using NCBI taxonomic classifiers [24]. Additional file 1. Additional file 2.
Figure 1. Additional file 3. Abstract | Comparative genome analysis of a large Dutch Legionella pneumophila strain collection identifies five markers highly correlated with clinical strains. Abstract | Graph-based clustering and characterization of repetitive sequences in next-generation sequencing data. Abstract | MetaPIGA v2.0: maximum likelihood large phylogeny estimation using the metapopulation genetic algorithm and other stochastic heuristics. ML framework Trees are estimated in MetaPIGA-2.0 with the Maximum Likelihood criterion (ML) using any of 5 nucleotide substitution models ([2] and refs therein, [30]): Jukes Cantor (JC), Kimura's 2 parameters (K2P), Hasegawa-Kishino-Yano 1985 (HKY85), Tamura-Nei 1993 (TN93), and General Time Reversible (GTR). Analyses can be performed with rate heterogeneity among sites using a proportion of invariant sites (Pinv) [33] and/or a discrete Gamma distribution of rates (γ-distr) [31,32].
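For the simplest of the substitution models listed above, Jukes-Cantor (JC), the transition probabilities have a closed form, which makes it a convenient worked example. The sketch below uses the standard textbook JC formulas and is not taken from MetaPIGA; the function names are hypothetical, and `d` is assumed to be the branch length in expected substitutions per site.

```python
import math


def jc69_transition_probs(d):
    """Return (P_same, P_diff) under Jukes-Cantor for branch length d.

    P_same: probability a site shows the same nucleotide after distance d.
    P_diff: probability it shows one specific different nucleotide.
    """
    e = math.exp(-4.0 * d / 3.0)
    return 0.25 + 0.75 * e, 0.25 - 0.25 * e


def jc69_distance(p):
    """Jukes-Cantor corrected distance from the observed proportion p of differing sites."""
    if not 0.0 <= p < 0.75:
        raise ValueError("p must be in [0, 0.75) for the JC correction")
    return -0.75 * math.log(1.0 - 4.0 * p / 3.0)
```

The richer models in the list (K2P, HKY85, TN93, GTR) relax JC's assumption of equal base frequencies and equal rates, at the cost of extra parameters that must be set or estimated.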
All parameters of the model (transition/transversion ratio or components of the rate matrix, the shape parameter of the γ-distr, and Pinv) can be set by the user or estimated from a Neighbor Joining (NJ) tree [36]. The same parameters, plus branch lengths and among-partition relative rates, can undergo intra-step optimization periodically during the search and/or at the end of the search. Tools shared among heuristics smaller distances, where NTax is the number of taxa (sequences) in the dataset. Gb-2010-11-6-306.pdf (application/pdf object)
Abstract | Accessing the SEED genome databases via Web services API: tools for programmers. Before the services are described, a couple of formalities about the underlying SEED database are introduced. These are provided to orient new users of the database. Internal Identifiers The SEED family of databases and services has their own internal identifiers, called FIG identifiers (FIDs), in the format fig|xxxxx.i.type.yyyy.
In this representation, the fig| denotes that it is a FIG internal identifier, the xxxxx is usually the NCBI taxon ID of the genome, the .i is the increment of the genome (advanced when major changes are performed), the type is the feature type, and the yyyy is the number of the feature on the genome. Thus, "fig|243277.1.peg.4400" refers to the 4400th protein encoding gene in the 1st increment of the genome with taxonomy ID 243277 (Vibrio cholerae O1 biovar eltor str. External Identifiers In addition to these internal identifiers, the SEED database maintains mapping to other commonly used identifiers wherever possible.
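The FIG identifier layout described above is regular enough to split mechanically. The following sketch parses it with a regular expression; the pattern, the `FigId` type and the `parse_fig_id` function are assumptions built only from the format given in the text (fig|xxxxx.i.type.yyyy), not part of the SEED API, and real identifiers may include cases this pattern does not cover.

```python
import re
from typing import NamedTuple


class FigId(NamedTuple):
    taxon_id: int        # usually the NCBI taxon ID of the genome
    increment: int       # genome increment (version)
    feature_type: str    # e.g. "peg" for a protein encoding gene
    feature_number: int  # number of the feature on the genome


# Assumed pattern: fig|<digits>.<digits>.<letters>.<digits>
_FIG_RE = re.compile(r"^fig\|(\d+)\.(\d+)\.([A-Za-z_]+)\.(\d+)$")


def parse_fig_id(fid):
    """Split a FIG identifier string into its four components."""
    match = _FIG_RE.match(fid)
    if match is None:
        raise ValueError(f"not a FIG identifier: {fid!r}")
    taxon, increment, ftype, number = match.groups()
    return FigId(int(taxon), int(increment), ftype, int(number))
```

Applied to the example from the text, `parse_fig_id("fig|243277.1.peg.4400")` yields taxon 243277, increment 1, feature type `peg` and feature number 4400.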
Accessing the SEED via Web services. Abstract | Scaffold filling, contig fusion and comparative gene order inference. There are two different aspects of the comparison of a completely assembled genome G1 with a genome in scaffold form G2. One is scaffold filling, which predicts where in G2 to locate potential genes that have not been identified in the sequence but are present in G1. The second is contig fusion, which suggests how to piece G2 contigs together to form chromosomes. In Figure 1, only scaffold filling is necessary for scenario (d) and only contig fusion is required for scenario (b). Scenario (c) requires both. We have shown how to handle the contig fusion problem in previous publications on papaya [2] and on Drosophila [3], and this will be reviewed in a separate section below. In the present paper we design and analyze an efficient exact algorithm for scaffold filling that simultaneously carries out contig fusion.
We use this algorithm to analyze real and simulated data. Filling in scaffolds Statement of the combinatorial optimization problem is at a minimum rearrangement distance from G1. De novo assembly using low-coverage short read sequence data from the rice pathogen Pseudomonas syringae pv. oryzae -- assembly edena Genome Research. + Author Affiliations These authors contributed equally to this work. Abstract We developed a novel approach for de novo genome assembly using only sequence data from high-throughput short read sequencing technologies.
By combining data generated from 454 Life Sciences (Roche) and Illumina (formerly known as Solexa sequencing) sequencing platforms, we reliably assembled genomes into large scaffolds at a fraction of the traditional cost and without use of a reference sequence. We applied this method to two isolates of the phytopathogenic bacterium Pseudomonas syringae. Sequencing and reassembly of the well-studied tomato and Arabidopsis pathogen, PtoDC3000, facilitated development and testing of our method. New technologies have rapidly reduced the time and cost of whole-genome sequencing. We focus on P. syringae because it is a plant pathogen that infects crop plants worldwide and is related to the human pathogen, P. aeruginosa. Results Table 1. Figure 1. Table 2. Figure 2. Figure 3. Pathogens: Genes and Genomes » Tips for de novo bacterial genome assembly. I have found Velvet to be an excellent option for short-read de novo assembly of bacterial genomes.
We are routinely seeing Velvet assemblies of Illumina data producing similar statistics to Newbler assemblies of 454 data. It seems that Illumina runs with 75-base pair reads and paired-end libraries can be competitive with Newbler assemblies from 454 fragment libraries. Amazing that just a year or two ago the dogma was that short-read technologies weren't useful for de novo assembly. However, getting the best out of Velvet is a bit trickier than using Newbler. Roche’s informatics is really slick and running a Newbler assembly is pretty much “fire and forget”. Firstly, de novo assembly with short reads gives much more effective results if you aggressively pre-filter the reads passed to the assembler (whether it is Velvet or something else).
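To make the pre-filtering idea concrete, here is a rough sketch of one possible read filter. It is not the quality_breakdown.py script mentioned in the post, and the criteria are my own assumptions for illustration: reads are dropped if they are short, contain an N, or have a low mean Phred score. The thresholds are arbitrary starting points, the function names are hypothetical, and the quality encoding is assumed to be Phred+33 (older Illumina pipelines used an offset of 64, hence the `offset` parameter).

```python
def mean_phred(quality, offset=33):
    """Mean Phred score of one FASTQ quality string (Phred+33 by default)."""
    return sum(ord(ch) - offset for ch in quality) / len(quality)


def filter_fastq(lines, min_mean_q=20.0, min_len=30, offset=33):
    """Yield (header, sequence, quality) for reads that pass the filters.

    `lines` is any iterable of FASTQ lines in strict 4-line records.
    Reads are dropped if they are shorter than `min_len`, contain an N,
    or have a mean quality below `min_mean_q`.
    """
    it = iter(lines)
    for header in it:
        seq = next(it, "").strip()
        next(it, "")                      # the '+' separator line
        qual = next(it, "").strip()
        if not seq or len(qual) != len(seq):
            continue                      # truncated or malformed record
        if len(seq) < min_len or "N" in seq.upper():
            continue
        if mean_phred(qual, offset) < min_mean_q:
            continue
        yield header.strip(), seq, qual
```

In practice you would pass an open file handle as `lines`, write the surviving records back out in FASTQ format, and tune the thresholds by trial and error against the resulting assembly statistics.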
Filtering reads is a question of trial and error, but I have found the following techniques are all useful. Run quality_breakdown.py on your FASTQ file. Abstract | Prediction of CpG-island function: CpG clustering vs. sliding-window methods. The ways in which sliding-window approaches (SWA) and CpGcluster detect CpG islands (CGIs) are conceptually different. While SWA detect regions above the thresholds of G + C, O/E, min CpG and length, CpGcluster predicts statistically significant clusters of CpGs as CGIs.
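The threshold quantities used by sliding-window approaches can be written down in a few lines. The sketch below computes the G + C fraction and the observed/expected (O/E) CpG ratio of a sequence, then reports the start positions of windows that pass both thresholds. It is an illustration only, not the Takai/Jones or CpGcluster code: the function names are hypothetical, the default thresholds (200 bp window, G + C of at least 0.5, O/E of at least 0.6) are assumed values in the spirit of the commonly cited criteria, and the minimum-CpG and island-merging steps of real tools are omitted.

```python
def cpg_stats(seq):
    """Return (G+C fraction, observed/expected CpG ratio) for a DNA string.

    O/E = (#CpG * length) / (#C * #G); it is reported as 0.0 when the
    sequence contains no C or no G.
    """
    seq = seq.upper()
    n = len(seq)
    if n == 0:
        return 0.0, 0.0
    c = seq.count("C")
    g = seq.count("G")
    cpg = seq.count("CG")  # "CG" cannot overlap itself, so count() is exact
    gc_fraction = (c + g) / n
    oe = (cpg * n) / (c * g) if c and g else 0.0
    return gc_fraction, oe


def sliding_window_hits(seq, window=200, step=1, min_gc=0.5, min_oe=0.6):
    """Start positions of windows whose G+C fraction and O/E pass the thresholds."""
    hits = []
    for start in range(0, len(seq) - window + 1, step):
        gc_fraction, oe = cpg_stats(seq[start:start + window])
        if gc_fraction >= min_gc and oe >= min_oe:
            hits.append(start)
    return hits
```

Because every reported window must clear the same cut-offs, the property distributions of islands found this way pile up near the thresholds, which is the bias discussed in the text.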
As a first consequence, the statistical properties of the predicted islands are different as well (Figure 1); e.g. in SWA approaches the distributions of important CGI properties like %G + C and O/E ratio are heavily biased towards the user thresholds. Figure 1. Comparison of the distributions of the island length for both the CpGcluster and Takai/Jones algorithm (top left); the observed to expected ratios of CpG frequencies (top right); the island GC-content (bottom left); and the island CpG density (bottom right).
It can be seen that, for all four of these properties, the SWA distributions are heavily biased towards their respective thresholds. However, CpGcluster distributions do not show this artifact. CpG islands in the promoter region Table 1. Comparative genomics of the FtsK-HerA superfamily of pumping ATPases: implications for the origins of chromosome segregation, cell division and viral capsid packaging -- Iyer et al. 32 (17): 5260 -- Nucleic Acids Research. Abstract | Small variable segments constitute a major type of diversity of bacterial genomes at the species level.
Crystal structure of the antitoxin-toxin protein c... [J Mol Biol. 2009] - PubMed result. Biochemical properties and base excision repair co... [Nucleic Acids Res. 2009] - PubMed result. Property of cold inducible DEAD-box RNA helicase i... [Biochem Biophys Res Commun. 2009] - PubMed result. The C-terminal portion of an archaeal toxin, aRelE... [Biosci Biotechnol Biochem. 2009] - PubMed result. The glycine-rich motif of Pyrococcus abyssi DNA po... [J Mol Biol. 2010] - PubMed result.
Thermococcus kodakarensis genetics: TK1827-encoded... [Appl Environ Microbiol. 2010] - PubMed result. 'Junk' DNA gets credit for making us who we are - life - 19 March 2010. FtsK translocation on DNA stops at XerCD-dif -- Graham et al. 38 (1): 72 -- Nucleic Acids Research. Abstract | State of the art: refinement of multiple sequence alignments.