The SMILES specification was developed by David Weininger in the late 1980s. It has since been modified and extended by others, most notably by Daylight Chemical Information Systems Inc. Other 'linear' notations include the Wiswesser Line Notation (WLN), ROSDAL and SLN (Tripos Inc).
In terms of a graph-based computational procedure, SMILES is a string obtained by printing the symbol nodes encountered in a depth-first tree-traversal of a chemical graph. The chemical graph is first trimmed to remove hydrogen atoms, and cycles are broken to turn it into a spanning tree. Where cycles have been broken, numeric suffix labels are included to indicate the connected nodes. Parentheses are used to indicate points of branching on the tree.
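The traversal described above can be sketched in a few lines of Python. This is a minimal illustration, not a complete SMILES writer: the input format (a list of element symbols plus a list of index pairs) and the function name `to_smiles` are assumptions made for the example, every bond is treated as single, the graph is assumed to be connected, and ring-closure labels are limited to single digits.

```python
def to_smiles(symbols, bonds, root=0):
    """Write a SMILES-like string for a connected, hydrogen-suppressed graph.

    symbols: list of element symbols, one per atom (hypothetical input format)
    bonds:   list of (i, j) index pairs; every bond is treated as single
    """
    adj = {i: [] for i in range(len(symbols))}
    for i, j in bonds:
        adj[i].append(j)
        adj[j].append(i)

    parent = {root: None}                  # spanning-tree parent of each visited atom
    children = {i: [] for i in adj}        # spanning-tree children, in visit order
    ring_digits = {i: [] for i in adj}     # ring-closure labels attached to each atom
    closed = set()                         # bonds that were broken to open a cycle

    def visit(a):
        # Pass 1: depth-first search that builds the spanning tree.
        for b in adj[a]:
            if b == parent[a]:
                continue
            if b not in parent:
                parent[b] = a
                children[a].append(b)
                visit(b)
            elif frozenset((a, b)) not in closed:
                # Bond to an already visited atom: break the cycle here and
                # label both ends with the same digit.
                closed.add(frozenset((a, b)))
                digit = str(len(closed))
                ring_digits[a].append(digit)
                ring_digits[b].append(digit)

    visit(root)

    def emit(a):
        # Pass 2: print the tree; all children but the last go in parentheses.
        text = symbols[a] + "".join(ring_digits[a])
        kids = children[a]
        for k in kids[:-1]:
            text += "(" + emit(k) + ")"
        if kids:
            text += emit(kids[-1])
        return text

    return emit(root)


# Isobutane: a central carbon (atom 1) bonded to three others.
print(to_smiles(["C", "C", "C", "C"], [(0, 1), (1, 2), (1, 3)]))   # CC(C)C
# Cyclohexane: the bond between atoms 5 and 0 is broken and labelled 1.
ring = [(0, 1), (1, 2), (2, 3), (3, 4), (4, 5), (5, 0)]
print(to_smiles(list("CCCCCC"), ring))                              # C1CCCCC1
```

Starting the same traversal from a different atom, for instance `to_smiles(["C", "C", "O"], [(0, 1), (1, 2)], root=2)`, gives `OCC` rather than `CCO` for ethanol.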
SMARTS is a modification of SMILES that allows, in addition to the SMILES elements, the specification of wildcard atoms and bonds. This is used in specifying search structures and is widely used in chemical database search applications. This practice has led to a common misconception that chemical substructure search is achieved computationally by matching SMILES/SMARTS strings, when in fact it is achieved by the computationally more intensive search for subgraph isomorphism in the graphs reconstructed from the SMILES representations.
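The difference between string matching and graph matching can be illustrated with a small backtracking search. The sketch below is an assumption-laden toy, not the algorithm of any particular toolkit: molecules are given directly as graphs (element symbols plus index pairs) rather than parsed from SMILES, bond types are ignored, `*` stands for a wildcard atom, and the function name `has_substructure` is invented for the example.

```python
def has_substructure(q_symbols, q_bonds, t_symbols, t_bonds):
    """Return True if the query graph can be embedded in the target graph.

    Plain backtracking search for a subgraph match; '*' in the query
    matches any target atom. Bond orders are ignored in this sketch.
    """
    def adjacency(n, bonds):
        adj = [set() for _ in range(n)]
        for i, j in bonds:
            adj[i].add(j)
            adj[j].add(i)
        return adj

    q_adj = adjacency(len(q_symbols), q_bonds)
    t_adj = adjacency(len(t_symbols), t_bonds)
    mapping = {}                            # query atom -> target atom

    def extend(q):
        if q == len(q_symbols):             # every query atom has been placed
            return True
        for t in range(len(t_symbols)):
            if t in mapping.values():
                continue                    # each target atom is used at most once
            if q_symbols[q] != "*" and q_symbols[q] != t_symbols[t]:
                continue                    # element mismatch
            # Every query bond to an already placed atom must exist in the target.
            if all(mapping[n] in t_adj[t] for n in q_adj[q] if n in mapping):
                mapping[q] = t
                if extend(q + 1):
                    return True
                del mapping[q]              # backtrack
        return False

    return extend(0)


# Ethanol written from the oxygen end ("OCC") and a carbon-oxygen query ("CO").
ethanol = (["O", "C", "C"], [(0, 1), (1, 2)])
query = (["C", "O"], [(0, 1)])

print("CO" in "OCC")                        # False: the text does not match
print(has_substructure(*query, *ethanol))   # True: the graphs do
```

The text comparison fails only because the target happened to be written starting from oxygen, whereas the graph search is unaffected by how either string was generated.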
Since SMILES is generated by tree-traversal, the string can vary depending on the root node chosen as well as the order in which nodes are encountered. A unique or 'canonical' form of the SMILES representation can be generated by applying rules to preprocess the tree before tree-traversal, so that the root and the visiting order no longer depend on how the structure was entered. Common applications of unique SMILES are exact matching of two structures and ensuring uniqueness among molecules in a database.
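The idea of a unique string can be demonstrated with a deliberately naive stand-in for real canonicalisation. Instead of ranking atoms by rule before the traversal, the sketch below tries every possible atom numbering, writes a string for each, and keeps the shortest one (ties broken alphabetically). This brute-force choice is an assumption made for illustration and is only practical for very small molecules. The input format, the function names, and the restriction to single bonds, connected graphs and fewer than ten ring closures are likewise assumptions.

```python
from itertools import permutations


def write_smiles(symbols, bonds):
    """Depth-first SMILES-like writer rooted at atom 0 (single bonds only)."""
    n = len(symbols)
    adj = [[] for _ in range(n)]
    for i, j in bonds:
        adj[i].append(j)
        adj[j].append(i)
    for nbrs in adj:
        nbrs.sort()                         # visit neighbours in numbering order

    parent = {0: None}
    children = [[] for _ in range(n)]
    digits = [[] for _ in range(n)]
    closed = set()

    def visit(a):
        for b in adj[a]:
            if b == parent[a]:
                continue
            if b not in parent:
                parent[b] = a
                children[a].append(b)
                visit(b)
            elif frozenset((a, b)) not in closed:
                closed.add(frozenset((a, b)))      # cycle broken at this bond
                digits[a].append(str(len(closed)))
                digits[b].append(str(len(closed)))

    visit(0)

    def emit(a):
        text = symbols[a] + "".join(digits[a])
        for k in children[a][:-1]:
            text += "(" + emit(k) + ")"
        if children[a]:
            text += emit(children[a][-1])
        return text

    return emit(0)


def canonical_smiles(symbols, bonds):
    """Brute force: smallest string over all atom renumberings."""
    candidates = []
    for perm in permutations(range(len(symbols))):
        new_symbols = [None] * len(symbols)
        for old, new in enumerate(perm):
            new_symbols[new] = symbols[old]
        new_bonds = [(perm[i], perm[j]) for i, j in bonds]
        candidates.append(write_smiles(new_symbols, new_bonds))
    return min(candidates, key=lambda s: (len(s), s))


# Ethanol entered with two different atom numberings.
a = canonical_smiles(["C", "C", "O"], [(0, 1), (1, 2)])
b = canonical_smiles(["O", "C", "C"], [(0, 1), (1, 2)])
# Dimethyl ether has the same atoms but different connectivity.
c = canonical_smiles(["C", "O", "C"], [(0, 1), (1, 2)])
print(a, b, c)   # CCO CCO COC
```

Because the result depends only on the graph and not on the numbering, two records can be compared for identity with a plain string comparison, which is how unique SMILES serves as a database key.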
Important enhancements to SMILES include extensions to store information on stereochemistry.