Figure 1. Field-focused library design methodology. (All figures: Cresset Biomolecular Discovery Ltd.)

Screening compound libraries is a cornerstone of the current hit and lead discovery process. Screening techniques have developed rapidly over the last few years, moving away from expensive and wasteful “shotgun” screening to targeted screening of smaller libraries, which potentially offer greater novelty, higher hit rates, and lower costs.

There is a delicate balance to be struck between identifying all potentially active chemical scaffolds and retaining a library that is tractable in scale and cost-effective for routine screening. With this in mind, two main library design strategies may be employed. The first is targeted sub-selection of existing in-house compound collections, so-called “rear-view mirror” libraries.1 The second, de novo library design, selects compounds from targeted synthesis campaigns or massive virtual compound collections for their predicted activity at a specific target. This approach typically offers more innovation and diversity, but at a higher cost. The two strategies offer different mixes of innovation potential and up-front/downstream costs, and both leave the key question unanswered: which compounds should be included in the library to offer the best mix of novelty, IPR potential, and downstream development tractability?


Figure 2. The generation and descriptive power of a molecular field template for H3. 

When designing libraries against specific targets, a wide range of activities (therapeutic, off-target, and toxicity) must be predicted. Experience has shown that a compound’s biological activity cannot be predicted solely from its 2D structure, so a more detailed description of the molecule is needed. At its most basic level, activity is determined by the interactions of the molecular fields (surfaces) of the target protein with those of the ligand in their respective binding conformations. The target protein and its ligand both present a range of complementary fields, and the interactions between these fields determine whether a ligand can fit well into the target’s active site.2

Field-based tools compare the patterns of these fields to predict the similarity between the properties and activities of compounds. The most important regions of the fields—the extrema, where molecular interactions are likely to be strongest—are first summarized by a field point pattern as shown in Figure 2. Any compound that is capable of presenting a complementary set of field points—in a 3D conformation that is accessible under physiological conditions—is very likely to have the same biological activity and properties as the natural ligand. This pattern of field points becomes a template for the specific biological activity. As fields can be computed from the structures of active ligands alone, they can be used to identify diverse potential new lead structures even when the X-ray crystal structure of the target protein is not known.

Field templates can be used to predict activity at both therapeutic targets and known toxicity targets such as CYP 2D6 and hERG. A range of templates can be derived and used to counterscreen an aggregated library of compounds derived from multiple congeneric series and other sources, as shown in Figure 1.


Figure 3. Filtering of the H3 library to remove hERG liabilities (Inset: hERG field template). 

One such tool3 was used in the example shown in Figure 2 to select a diverse library of potential H3 antagonists. A series of seven highly active H3 antagonists was identified from the literature and aligned in their bioactive conformations. The fields around these molecules were then compared and a consensus field template was determined. As confirmation of the predictive capability of this template, the field match score was compared against the known activity (Ki) scores of 68 further H3 antagonists described in the scientific literature and outside the original training set. A good match of fields to activity was confirmed and a threshold field similarity score of 0.6 was established as the baseline above which compounds could confidently be predicted to exhibit H3 antagonism.
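The validation step above can be illustrated with a rank correlation between field match scores and measured affinities. This is only a minimal sketch: the article does not say which statistic was used, so the choice of Spearman correlation, the conversion of Ki to pKi, and all function names here are assumptions made for illustration.

```python
import math


def pki_from_ki_nm(ki_nm):
    """Convert a Ki in nanomolar to pKi (higher means more potent)."""
    return 9.0 - math.log10(ki_nm)


def average_ranks(values):
    """Return 1-based ranks, giving tied values the average of their ranks."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j over the run of values tied with position i.
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        avg = (i + j) / 2 + 1
        for k in range(i, j + 1):
            ranks[order[k]] = avg
        i = j + 1
    return ranks


def pearson(x, y):
    """Pearson correlation of two equal-length sequences."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sx = math.sqrt(sum((a - mx) ** 2 for a in x))
    sy = math.sqrt(sum((b - my) ** 2 for b in y))
    return cov / (sx * sy)


def spearman(field_scores, ki_nm_values):
    """Spearman rank correlation between field scores and pKi values."""
    pki = [pki_from_ki_nm(k) for k in ki_nm_values]
    return pearson(average_ranks(field_scores), average_ranks(pki))
```

A correlation close to 1 would indicate that compounds with higher field similarity also tend to bind more tightly, which is the kind of agreement needed before a score threshold can be trusted.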

The H3 template was then used to counterscreen an existing compound collection to identify potential H3 antagonist structures. A large number of matches were identified, spanning 68 distinct chemical scaffolds. Since chemical scaffolds that can be expected to show liabilities for serious off-target or toxicity effects should be avoided, the compound matches were also screened against toxicity templates for CYP 2D6 and hERG activity (Figure 3). Approximately 4% of the compounds were rejected due to potential CYP 2D6 toxicity and another 8% due to potential hERG toxicity (evaluated on similarity score thresholds of >0.59 and >0.62 for the respective toxicity templates).
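The triage logic described above can be sketched in a few lines. The thresholds are the ones quoted in the text (0.6 for the H3 template, 0.59 for CYP 2D6, 0.62 for hERG); the `Scores` record, the function names, and the order in which the checks are applied are hypothetical and are not taken from the software used in the study.

```python
from dataclasses import dataclass

H3_MIN = 0.60       # field similarity above this predicts H3 antagonism
CYP2D6_MAX = 0.59   # similarity above this flags a CYP 2D6 liability
HERG_MAX = 0.62     # similarity above this flags a hERG liability


@dataclass(frozen=True)
class Scores:
    """Hypothetical record of precomputed field similarity scores."""
    name: str
    h3: float
    cyp2d6: float
    herg: float


def triage(compound):
    """Classify one compound against the activity and toxicity templates."""
    if compound.h3 <= H3_MIN:
        return "no_h3_match"
    if compound.cyp2d6 > CYP2D6_MAX:
        return "reject_cyp2d6"
    if compound.herg > HERG_MAX:
        return "reject_herg"
    return "keep"


def filter_library(compounds):
    """Keep only predicted H3 antagonists with no flagged liability."""
    return [c for c in compounds if triage(c) == "keep"]
```

Returning a reason code rather than a simple yes/no makes it easy to tally how many compounds each toxicity template removes.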

Although this set of structures is interesting and useful as an H3 screening library, it only takes into account existing compounds, which may have low innovation and IPR potential. Field-based methods can also be used to predict novel bioisosteric compounds that will exhibit the same activity when key fragments of their structure are replaced. Such a tool4 was used to replace the central core of some of the 68 compounds resulting from the above screening (Figure 4).


Figure 4. Novel predicted H3 antagonists showing significant structural diversity and high predicted activity. 

The structures highlighted in red in Figure 4 represent some of the most active H3 antagonists known from the literature, while the blue structures are novel compounds generated by the software. The graph shows a number of novel compounds with diverse central cores that have significantly higher predicted activities at H3 (as shown by their higher field similarity scores). Five of the more interesting compounds, all of which are novel and have high similarity scores, are highlighted in yellow. These compounds would be ideal candidates for inclusion in the final library as they combine innovation with chemical tractability and high predicted activity.

The 2D-similarity score of the majority of the dataset, including all of the highlighted molecules (measured against the canonical structure shown top left), is less than 0.7, which is a de facto cut-off for 2D-based scoring methods. This means that most of these structures would be very unlikely to be considered in a traditional library design process as there would be no reliable way to predict their activity.
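To make the contrast concrete, the sketch below flags compounds that pass the field similarity threshold but fall under the 0.7 2D cut-off. The article does not name the 2D metric, so the Tanimoto coefficient on fingerprint bit sets is an assumption, as are the function names and the tuple layout of the candidates.

```python
def tanimoto(fp_a, fp_b):
    """Tanimoto coefficient of two fingerprints given as sets of on-bit indices."""
    union = len(fp_a | fp_b)
    if union == 0:
        return 0.0
    return len(fp_a & fp_b) / union


def missed_by_2d(reference_fp, candidates, cutoff_2d=0.7, field_threshold=0.6):
    """Names of candidates predicted active by fields but below the 2D cut-off.

    Each candidate is a (name, fingerprint_set, field_score) tuple.
    """
    return [
        name
        for name, fp, field_score in candidates
        if field_score > field_threshold
        and tanimoto(reference_fp, fp) < cutoff_2d
    ]
```

Compounds returned by `missed_by_2d` are the ones a purely 2D-driven selection would discard even though the field template predicts them to be active.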

Using fields to derive an activity template is both more inclusive of structural diversity and more specific about the features that drive the desired activity. These twin attributes generate more useful screening libraries with a wider set of chemical starting points. Field-focused libraries are highly novel, containing leads that are generally more patentable and diverse than traditional targeted libraries. This is critical to help move away from “me-too” chemistry with confidence, enhancing innovation and activity while avoiding a range of known toxicity issues. Molecular field-based methods can, therefore, provide library designers with new tools to inform all types of library design,5 overcoming the structural bias that has constrained library designers in the past.

1. Hulme C. Presented at 237th ACS National Meeting, Salt Lake City; 2009.
2. Cheeseright T, Mackey M, Rose S, Vinter JG. Molecular field extrema as descriptors of biological activity. J Chem Inf Model. 2006;46(2):665-676.
3. FieldScreen:
4. FieldStere:
5. Harris J. Letting the target determine your compound acquisition strategy, DDW; Spring, 2009.