The Workflow
The workflow demonstrates how a chemist could create a virtual library of amides from a set of acids and amines. Molecular properties are then calculated for the enumerated products, and the products are filtered using the Lipinski "rule of 5". To demonstrate the interoperability of the different community contributions, the workflow uses nodes from the RDKit, CDK, and Indigo integrations.
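As a rough illustration of the filtering step only, the sketch below applies the usual Lipinski thresholds (molecular weight at most 500, logP at most 5, at most 5 hydrogen-bond donors, at most 10 hydrogen-bond acceptors) to precomputed properties. It is not the workflow's own implementation: the Properties record, the function names, and the max_violations parameter are hypothetical, and in the workflow the property values would come from the toolkit nodes rather than being typed in by hand. Whether the workflow tolerates one violation is not stated, so that is left as a parameter.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Properties:
    """Precomputed properties of one enumerated product (hypothetical record)."""
    name: str
    mol_weight: float   # molecular weight in g/mol
    logp: float         # octanol-water partition coefficient
    h_donors: int       # hydrogen-bond donor count
    h_acceptors: int    # hydrogen-bond acceptor count


def lipinski_violations(p: Properties) -> int:
    """Count how many of the four rule-of-5 thresholds are exceeded."""
    return sum([
        p.mol_weight > 500,
        p.logp > 5,
        p.h_donors > 5,
        p.h_acceptors > 10,
    ])


def filter_library(products, max_violations=1):
    """Keep products with at most max_violations violations.

    max_violations=1 is an assumption (a common reading of the rule);
    pass 0 for a strict filter.
    """
    return [p for p in products if lipinski_violations(p) <= max_violations]


if __name__ == "__main__":
    # Made-up property values, for illustration only.
    library = [
        Properties("product_1", 310.4, 2.1, 1, 3),
        Properties("product_2", 655.8, 6.7, 2, 5),
    ]
    for kept in filter_library(library):
        print(kept.name)
```

Counting violations separately from filtering keeps the threshold logic in one place, so a stricter or looser policy only changes the max_violations argument.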
To create a new QSAR model in OCHEM the user must prepare the training and (optional) validation sets, configure the preprocessing of molecules (standardization and 3D optimization), choose and configure the molecular descriptors and the machine learning method, select the validation protocol (N-fold cross-validation or bagging) and, when the model has been calculated, review the predictive statistics and save or discard the model. The following sections describe each of these steps in detail.
Training and validation sets, machine learning method and validation
Training and validation datasets. One of the most important steps in model development is the preparation of input data, i.e., a training set that contains experimentally measured values of the predicted property.
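To make the N-fold cross-validation protocol mentioned above concrete, here is a minimal sketch of how a training set can be split into folds. It is a generic illustration, not OCHEM's implementation: the function name k_fold_splits and its parameters are hypothetical, and the shuffling with a fixed seed is an assumption made for reproducibility.

```python
import random


def k_fold_splits(n_samples, n_folds, seed=0):
    """Yield (train_indices, validation_indices) pairs for N-fold cross-validation.

    Every sample index appears in exactly one validation fold.
    """
    if not 2 <= n_folds <= n_samples:
        raise ValueError("n_folds must be between 2 and n_samples")

    # Shuffle the indices once so that folds do not follow the input order.
    indices = list(range(n_samples))
    random.Random(seed).shuffle(indices)

    # Deal the shuffled indices round-robin into n_folds groups.
    folds = [indices[i::n_folds] for i in range(n_folds)]

    for i, validation in enumerate(folds):
        # The training part is every fold except the held-out one.
        train = [idx for j, fold in enumerate(folds) if j != i for idx in fold]
        yield train, validation


if __name__ == "__main__":
    for train, validation in k_fold_splits(n_samples=10, n_folds=5):
        print(len(train), "training /", len(validation), "validation")
```

A model would be trained on each train part and evaluated on the matching validation part, and the predictive statistics would then be pooled over all folds.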

ACD/ChemSketch 12 Freeware. ACD continues to release updates to its ChemSketch freeware. ChemSketch is an all-purpose chemical drawing and graphics program. Draw with templates or free-hand.
The NCI/CADD group is a research unit within the Chemical Biology Laboratory at the National Cancer Institute. Read more about the CADD Group's Chemoinformatics Tools and User Services.
This file contains the structures downloaded from the PubChem FTP site that have at least one assay result associated with them that was obtained in the context of the NIH Common Fund (previously: NIH Roadmap) Molecular Libraries Probe Production Centers Network (previously: Molecular Libraries Screening Center Network), part of the Common Fund's Molecular Libraries and Imaging program. It is organized by unique chemical structures ("Compounds" in PubChem parlance), i.e., assay results for possibly multiple different samples ("Substances" in PubChem parlance) have been combined into the one record representing the unique chemical structure. Placeholder assays (assays containing a single record only) have been filtered out. Explanation of the property data fields in the SD file (note: properties present in the original PubChem files have been copied unchanged; for the explanation of those properties we point directly to the appropriate PubChem document):
ASN.1 Format Summary
An International Standards Organization (ISO) data representation format used to achieve interoperability between platforms. For data specifications and conversion tools, see NCBI Data Specification below.
BLAST Microbial Genomes
Performs a BLAST search for similar sequences from selected complete eukaryotic and prokaryotic genomes.
BLAST RefSeqGene
This is the new version of the ChemMine Database that supports chemical genomics research at the Institute for Integrative Genome Biology at UC Riverside. The general-purpose cheminformatics toolbox that this service provided in the past is now available on a separate site: ChemMine Tools. The ChemMine Database itself is a compound mining portal that facilitates drug and agrochemical discovery and chemical genomics screens. This web service is divided into two major functional components: the Compound Database and the Screening Database. A detailed tutorial for using ChemMine's online services is available on the ReadMe page.
ChemSpider is a free chemical structure database providing fast text and structure search access to over 30 million structures from hundreds of data sources. Search by chemical names: systematic names, synonyms, trade names, and database identifiers.
PowerMV: A software environment for statistical analysis, molecular viewing, descriptor generation, and similarity search.
Jack Liu, Jun Feng, Atina Brooks and Stan Young
National Institute of Statistical Sciences
Basic Functions:
• Supports MDL SDF format.
• Displays molecules in multiple columns.
• Displays properties contained in the SD file in a table.
• Anti-alias technology for best picture quality.
• Table of molecule pictures and properties can be exported to Excel (Office XP and above) to generate personalized reports.
• Calculates three types of binary atom pair descriptors and continuous weighted Burden numbers.
Avogadro is an advanced molecule editor and visualizer designed for cross-platform use in computational chemistry, molecular modeling, bioinformatics, materials science, and related areas. It offers flexible, high-quality rendering and a powerful plugin architecture.
Cross-Platform: Molecular builder/editor for Windows, Linux, and Mac OS X.
Free, Open Source: Easy to install and all source code is available under the GNU GPL.