Molecular drug targets and structure based drug design: A holistic approach.

Access to the complete human genome sequence as well as to the complete sequences of pathogenic organisms provides information that can result in an avalanche of therapeutic targets. Structure-based design is one of the first techniques to be used in drug design. Structure-based design refers specifically to finding and complementing the 3D structure (binding and/or active site) of a target molecule such as a receptor protein. The aim of this review is to give an outline of studies in the field of structure-based drug design that have helped in the discovery of new drugs. The emphasis will be on comparative/homology modeling.

trials. The structure-based design methods used to optimize these leads into drugs are now often applied much earlier in the drug discovery process. Protein structure is used in target identification and selection (the assessment of the 'druggability' or tractability of a target), in the identification of hits by virtual screening and in the screening of fragments. Additionally, the key role of structural biology during lead optimization to engineer increased affinity and selectivity into leads remains as important as ever.

Description: Common drug targets
The introduction of genomics, proteomics and metabolomics has paved the way for a biology-driven process, leading to a plethora of drug targets. The list of potential drug targets encoded in a genome includes the most natural choices: virulence genes and species-specific genes. Other options include targeting RNA, enzymes of intermediary metabolism, systems for DNA replication or repair, the translation apparatus, and membrane proteins (Figure 1).

Species-specific genes as drug targets
Comparative analysis of the complete genome sequences of bacterial pathogens available in the public databases offers the first insights into drug discovery approaches of the near future. [11] An interesting approach to the prediction of potential drug targets, designated differential genome display, has been proposed by Bork and co-workers. [12] This approach relies on the fact that the genomes of parasitic microorganisms are generally much smaller and code for fewer proteins than the genomes of free-living organisms. The genes that are present in the genome of a parasitic bacterium, but absent in a closely related genome of free

Nucleic acid as drug targets
Nucleic acids are the repository of genetic information. DNA itself has been shown to be the receptor for many drugs used in cancer and other diseases. These work through a variety of mechanisms, including chemical modification and cross-linking of DNA (cisplatin) or cleavage of the DNA (bleomycin). Many act either by intercalation of a polyaromatic ring system into the double-stranded helix (actinomycin D, ethidium) or by binding to the major and minor grooves of DNA (e.g., netropsin) (Figure 2). [13] DNA has also been explored as a target for chemotherapy, with efforts to design sequence-specific reagents for gene therapy.

RNA as drug target
Recent advances in the determination of RNA structure and function have led to new opportunities that will have a significant impact on the pharmaceutical industry. RNA, which, among other functions, serves as a messenger between DNA and proteins, was thought to be an entirely flexible molecule without significant structural complexity. However, recent studies have revealed a surprising intricacy in RNA structure. This observation opens opportunities for the pharmaceutical industry to target RNA with small molecules. Perhaps more importantly, drugs that bind to RNA might produce effects that cannot be achieved by drugs that bind to proteins. [14] Proof of principle has already been provided by the success of several classes of drugs obtained from natural sources that bind to RNA or RNA-protein complexes.

Membranes as drug targets
Membranes are significant structural elements, both in defining the boundaries of a cell and in providing interior compartments within the cell associated with particular functions. Cell membranes themselves can also act as targets for molecular recognition. An understanding of the structural and dynamic functions of membranes (e.g., plasma membranes and intracellular membranes) may contribute to a more rational design of drug molecules with improved permeation characteristics or specific membrane effects. Many general anesthetics are believed to work through their physical effects when dissolved in membranes. Several classes of antibiotics like gramicidin A, antifungals like alamethicin, and toxins such as melittin, found in bee venom, have direct effects on planar lipid bilayers, forming transmembrane pores.

Proteins as drug targets
Proteins continue to attract significant attention from the pharmaceutical and biotechnology industries as a valuable source of potential drug targets. [15] Proteins provide the critical link between genes and disease, and as such are the key to understanding basic biological processes, including disease pathology, diagnosis, and treatment. Researchers have discovered many potential therapeutic targets, and there are currently more than 700 products in various phases of development. However, translating the study of proteins into validated drug targets poses substantial challenges.
Genome sequences instruct cells on how and when to make proteins. The proteins in turn are the active players in the cell: they form the machinery of cells, allow cells to communicate, and can control the growth or death of an organism. Because of their role in cells, most drug targets are proteins. Drugs work by binding specifically to a protein. Extensive knowledge about the function of a protein can guide pharmaceutical chemists in the selection of targets. Studying the complex domain of 200,000-300,000 distinct and interacting proteins poses substantial challenges.
Most target proteins for drug development participate in key regulatory steps in the human body or in an infectious organism. As such, they tend to be present in only a few copies and often only within specific cells. Their isolation and purification by traditional preparative biochemical means, in the quantities required for routine assays, has been a formidable challenge. This situation has been radically changed by the ability to clone and express proteins. Thus many key target proteins are now becoming available in sufficient amounts to make them amenable not only to biological assays but also to NMR studies in solution and to crystallization for X-ray analysis.
The number of protein structures solved using X-ray crystallography or NMR has begun to rise sharply, and more than 40,000 three-dimensional protein structures have been deposited in the Protein Data Bank [16] to date (December 2006). Various classes of proteins can be categorized as potential drug targets.
Small molecules such as drugs, insecticides or herbicides usually exert their effects by binding to protein targets. In the past, many of these molecules were found empirically with little or no knowledge of the mechanism of action involved. In many cases, the targets that are modified by these substances were identified in retrospect. Interestingly, the majority of drugs currently in use modulate either enzymes or receptors, most of them G-protein-coupled receptors.
a. Enzymes - The macromolecules responsible for the catalysis of biochemical reactions are an obvious target when a disease state is associated with the production of a biologically active species. Enzymes are a classic target for therapeutic intervention, and numerous well-studied examples exist.
b. Receptor proteins - G-protein-coupled receptors (GPCRs) are a superfamily of seven-transmembrane-spanning proteins that are activated by a wide range of extracellular ligands and are expressed in virtually all tissues. Signaling through these receptors regulates a wide variety of physiological processes such as neurotransmission, chemotaxis, inflammation, cell proliferation, cardiac and smooth muscle contraction, as well as visual and chemosensory perception. In view of their widespread distribution and importance in health and disease, it is not surprising that GPCRs are the most successful class of target proteins for drug discovery research. [17] The sequencing of the human genome has led to the prediction of as many as 1000 GPCRs, of which 400 are nonchemosensory receptors and can therefore be considered potential drug targets. [18] It has been estimated that up to 50% of all marketed drugs directly target this family of receptors [19], some of which are listed in Table 2.
The goal in developing drugs against the targets listed above is often to modulate the function of the human protein, while the goal in developing drugs against pathogenic organisms is total inhibition, leading to the death of the pathogen. Antimicrobial drug targets should be essential to the pathogen, have a unique function in the pathogen, be present only in the pathogen, and be able to be inhibited by a small molecule.
The target should be essential, in that it is a part of a crucial cycle in the cell, and its elimination should lead to the pathogen's death. The target should be unique: no other pathway should be able to supplement the function of the target and overcome the presence of the inhibitor. If the macromolecule satisfies all the outlined criteria to be a drug target but functions in healthy human cells as well as in a pathogen, specificity can often be engineered into the inhibitor by exploiting structural or biochemical differences between the pathogenic and human forms. Finally, the target molecule should be capable of inhibition by binding of a small molecule. Enzymes are often excellent drug targets because compounds can be designed to fit within the active site pocket.
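The selection criteria above can be read as a simple checklist. The following is a minimal sketch, assuming each criterion has already been reduced to a yes/no annotation; the class, field names and example targets are hypothetical and chosen only for illustration.

```python
from dataclasses import dataclass


@dataclass
class CandidateTarget:
    """Hypothetical yes/no annotations for one candidate antimicrobial target."""
    name: str
    essential: bool                # its elimination leads to the pathogen's death
    bypass_pathway_exists: bool    # another pathway can supplement its function
    present_in_human: bool         # also functions in healthy human cells
    exploitable_differences: bool  # pathogen and human forms differ usefully
    small_molecule_bindable: bool  # can be inhibited by binding a small molecule


def failed_criteria(target):
    """Return the list of criteria from the text that the target fails."""
    failures = []
    if not target.essential:
        failures.append("not essential")
    if target.bypass_pathway_exists:
        failures.append("not unique")
    # A human counterpart is tolerated only if specificity can be engineered.
    if target.present_in_human and not target.exploitable_differences:
        failures.append("no basis for specificity")
    if not target.small_molecule_bindable:
        failures.append("not inhibitable by a small molecule")
    return failures


# Hypothetical examples, not real annotations.
enzyme_a = CandidateTarget("enzyme_a", True, False, True, True, True)
enzyme_b = CandidateTarget("enzyme_b", True, True, True, False, True)

for t in (enzyme_a, enzyme_b):
    print(t.name, failed_criteria(t) or "passes all criteria")
```

In practice each of these flags is itself the outcome of experimental or bioinformatic work, so the checklist only organizes the decision; it does not replace the underlying evidence.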

Structure based drug design
Drug discovery referred to as 'rational' did not take flight until the first structures of the targets were solved. In 1897, Ehrlich proposed the side chain theory, in which specific groups on cells combine with the toxin. Ehrlich termed these side chains receptors. Structure-based drug design of protein ligands has emerged as a new tool in medicinal chemistry. [20] The process of structure-based drug design [21] is an iterative one, as shown in Figure 3, and often proceeds through multiple cycles before an optimized lead goes into clinical trials.
The first cycle includes the cloning, purification and structure determination of the target protein or nucleic acid by one of three principal methods: X-ray crystallography, NMR or comparative modeling. Using computer algorithms, compounds or fragments of compounds from a database are positioned into a selected region of the structure.
These compounds are scored and ranked based on their steric and electrostatic interactions with the target site, and the best compounds are tested further with biochemical assays. In the second cycle, structure determination of the target in complex with a promising lead from the first cycle, one with at least micromolar inhibition in vitro, reveals sites on the compound that can be optimized to increase potency. Additional cycles include synthesis of the optimized lead, structure determination of the new target:lead complex, and further optimization of the lead compound.
After several cycles of the drug design process, the optimized compounds usually show marked improvement in binding, and often, specificity for the target.
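The scoring and ranking step of the first cycle can be illustrated with a toy example. The sketch below is not the scoring function of any particular docking program: it assumes a single Lennard-Jones 12-6 term for steric interactions and a Coulomb term for electrostatics, with uniform, made-up parameters (`epsilon`, `sigma`, `dielectric`) and hypothetical one-atom "compounds".

```python
import math

COULOMB_K = 332.06  # kcal*Angstrom/(mol*e^2); converts q1*q2/r to kcal/mol


def pair_energy(r, q1, q2, epsilon=0.1, sigma=3.5, dielectric=4.0):
    """Steric (Lennard-Jones 12-6) plus electrostatic (Coulomb) energy of one atom pair.

    epsilon, sigma and dielectric are illustrative values, not fitted parameters.
    """
    r = max(r, 0.5)  # clamp to avoid division by zero for overlapping atoms
    sr6 = (sigma / r) ** 6
    steric = 4.0 * epsilon * (sr6 * sr6 - sr6)
    electrostatic = COULOMB_K * q1 * q2 / (dielectric * r)
    return steric + electrostatic


def score_pose(ligand_atoms, site_atoms):
    """Sum pair energies over all ligand-site atom pairs; atoms are (x, y, z, charge)."""
    total = 0.0
    for lx, ly, lz, lq in ligand_atoms:
        for sx, sy, sz, sq in site_atoms:
            r = math.dist((lx, ly, lz), (sx, sy, sz))
            total += pair_energy(r, lq, sq)
    return total


def rank_compounds(poses, site_atoms):
    """Return (name, score) pairs sorted from most to least favorable (lowest score first)."""
    scored = [(name, score_pose(atoms, site_atoms)) for name, atoms in poses.items()]
    return sorted(scored, key=lambda item: item[1])


# Hypothetical binding site: one negatively charged atom at the origin.
site = [(0.0, 0.0, 0.0, -1.0)]
poses = {
    "cmpd_clash": [(2.0, 0.0, 0.0, 1.0)],          # right charge, but too close
    "cmpd_repelled": [(4.0, 0.0, 0.0, -1.0)],      # good distance, wrong charge
    "cmpd_complementary": [(4.0, 0.0, 0.0, 1.0)],  # good distance, right charge
}

ranking = rank_compounds(poses, site)
for name, score in ranking:
    print(f"{name}: {score:.2f} kcal/mol")
```

Real scoring functions use atom-type-specific parameters and usually add terms for desolvation, hydrogen bonding and ligand flexibility; the point here is only that steric and electrostatic complementarity together determine the rank order.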

Evaluating a structure for structure based drug design
Once a target has been identified, it is necessary to obtain accurate structural information. There are three primary methods of structure determination that are useful for drug design: X-ray crystallography, NMR, and homology modeling.
High-resolution crystal structures are the most commonly desired source of structural information for drug design, particularly for proteins that range in size from a few amino acids to 998 kD. [22] Another advantage of crystallography is that ordered water molecules are visible in the experimental data and are often useful in drug design. A crystal structure should be evaluated for the resolution of the diffracted amplitudes (often simply called resolution); reliability, or R factors; coordinate error; temperature factors; and chemical correctness. Typically, crystal structures determined with data extending to better than 2.5 Å are acceptable for drug design purposes, since they have a high data-to-parameter ratio and the placement of residues in the electron density map is unambiguous. The R factor and R free reported for a model measure the agreement between the model and the experimental data. The R free value should be below 28% and ideally below 25%, and the R factor should be well below 25% in order to use the structure in drug design. If the only structure available for a particular target does not meet the resolution or R factor
Structures determined by nuclear magnetic resonance, using a concentrated protein or nucleic acid in solution, are also valuable sources for drug design. [23] Since the target is in solution, it is sometimes possible to interpret the dynamics of the target from the data.
If no experimentally determined structure is available, a homology model can be used for drug design. [24,25] To evaluate a homology model, SWISS-MODEL outputs a confidence factor per residue that reflects the amount of structural information used to create that portion of the model.
Using the structural information obtained through the above techniques, the structure is then prepared for drug design programs.
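The numerical criteria quoted above for crystal structures can be collected into a small screening function. This is a minimal sketch: the thresholds come from the text, but "well below 25%" for the R factor is implemented simply as "below 25%", which is an assumption, and the class, field names and example entries are hypothetical.

```python
from dataclasses import dataclass

MAX_RESOLUTION_A = 2.5  # Angstrom; smaller numbers mean higher resolution
MAX_R_FREE = 0.28       # acceptable limit from the text
IDEAL_R_FREE = 0.25     # preferred limit from the text
MAX_R_FACTOR = 0.25     # text says "well below"; strict cutoff is an assumption


@dataclass
class CrystalStructure:
    """Hypothetical record of the quality statistics reported with a structure."""
    label: str
    resolution_A: float
    r_factor: float
    r_free: float


def quality_problems(s):
    """Return the reasons a structure fails the criteria in the text (empty if none)."""
    problems = []
    if s.resolution_A >= MAX_RESOLUTION_A:
        problems.append("resolution not better than 2.5 A")
    if s.r_free >= MAX_R_FREE:
        problems.append("R free not below 28%")
    if s.r_factor >= MAX_R_FACTOR:
        problems.append("R factor not below 25%")
    return problems


def is_ideal(s):
    """True if the structure passes every check and R free is also below 25%."""
    return not quality_problems(s) and s.r_free < IDEAL_R_FREE


# Made-up example values, not real deposited structures.
good = CrystalStructure("example_good", resolution_A=1.8, r_factor=0.19, r_free=0.23)
marginal = CrystalStructure("example_marginal", resolution_A=2.4, r_factor=0.22, r_free=0.27)
poor = CrystalStructure("example_poor", resolution_A=3.1, r_factor=0.27, r_free=0.31)

for s in (good, marginal, poor):
    print(s.label, quality_problems(s) or "acceptable", "ideal" if is_ideal(s) else "")
```

Coordinate error, temperature factors and chemical correctness are not covered by this check and still need to be inspected separately.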

Present state of the art: Computer-aided drug design
Given the vast size of organic chemical space [26], drug discovery cannot be reduced to a simple "synthesize and test" drudgery. There is an urgent need to identify and/or design drug-like molecules [27] from the vast expanse of what could be synthesized. In silico methods have the potential to reduce both the time and the cost of developing suggestions for drug-like or lead-like molecules. Computational tools have the advantage of delivering new lead candidates more quickly and at lower cost. Drug discovery in the 21st century is expected to be different in at least two distinct ways: the development of individualized medicine based on genomic information, and the extensive use of in silico simulations to facilitate target identification, structure prediction and lead/drug discovery. Expectations of computational methods for reliable and expeditious protocols for suggesting potential leads are continuously increasing. Several conceptual and methodological concerns remain before automation of drug design in silico can be contemplated.
Computational methods are needed to exploit the structural information to understand specific molecular recognition events and to elucidate the function of the target macromolecule (Figure 4). This information should ultimately lead to the design of small molecule ligands for the target, which will block/activate its normal function and thereby act as improved drugs.
As structural genomics, bioinformatics, and computational power continue to advance rapidly, further successes in structure-based drug design are likely to follow. Each year, new targets are being identified, structures of those targets are being determined at an amazing rate, and the capability to capture a quantitative picture of the interactions between macromolecules and ligands is accelerating.

Success of computer-assisted molecular design
The greatest success of computer-aided structure-based drug design to date is the set of HIV-1 protease inhibitors that have been approved by the United States Food and Drug Administration and have reached the market. [28] Many successful computer-assisted molecular design efforts have involved the use of QSAR to improve the activity of lead compounds. An example of such a success story is the SAR work carried out on the antibacterial agent Norfloxacin
Table 2: Some currently marketed drugs that target GPCRs

Utility of Homology Models in the Drug Discovery Process
Advances in bioinformatics and protein modeling algorithms, in addition to the enormous increase in experimental protein structure information, have aided in the generation of databases that comprise homology models of a significant portion of known genomic protein sequences. Currently, 3D structure information can be generated for up to 56% of all known proteins. However, there is considerable controversy concerning the real value of homology models for drug design. Despite the numerous uncertainties that are associated with homology modeling,