9th June 2015 @ 03:35

Last semester at Sydney Uni, Alice Williamson created a new lab course in what's known as the "Special Studies Program" in Chemistry, in which high-achieving undergrads are given the freedom to try something new and challenging. Alice, working with Adam Bridgeman and Peter Rutledge in the School of Chemistry, designed a set of experiments based on where we're up to with the chemistry in OSM's Series 4 - a research project funded by the ARC and MMV.

The lab manual is openly available, and the students had to keep open lab notebooks. Here they are! I'm not sure there's ever been a lab class quite like it. Students engaged in real research, where everything is shared. We're now going to incorporate what they discovered into OSM itself.

Excitingly, I think this would scale. One can imagine lab courses based around current needs in any open source drug discovery and development project, meaning we could, with proper mentorship, bring to bear very large levels of skilled human resource to tackle actual project needs, with global coordination between cohorts. Undergrads in other countries have already contributed to OSM.

The students had to make short videos talking about any aspect of their project - they were given complete creative control. I had the pleasure of watching these during a showing in one of our lecture theatres last month - I was deeply impressed (and, occasionally, slightly disturbed). Here's the full playlist.

SSP OSM Playlist

These students have done a fantastic job, and Alice deserves enormous kudos for driving this through from scratch. Well done!

16th April 2015 @ 01:36

One of the key issues to solve in open source drug discovery and development programs is how to obtain physical samples of molecules for evaluation. A number of different methods can be envisaged, ranging from direct funding of synthesis (e.g. from grant money, hiring local researchers) through student classes to individual pro bono contributions from academia or industry. A possibility we’d like to build with community help this year is the Molecular Craigslist. We were speaking with Assay Depot about this idea - CEO Kevin Lustig has for a long time been pioneering new ways to source inputs for research projects.

But what about direct purchase of compounds from contract research organizations (CROs)? I believe that such an approach is gaining in popularity with NGOs requiring rapid synthesis of small molecules as part of in-house or sponsored projects. To get a feel for the commercial cost of single inputs to open source research projects (where the data are needed along with the compound), I decided to run one molecule, required by OSM, through Assay Depot’s suite of synthetic chemistry contract research organizations. The requirements we listed were for a few mg of the sample suitable for biological evaluation, plus all the preparative and characterization data obtained during the molecule’s preparation. There is ample synthetic precedent for the compounds in this series, but the specific analog requested (below) contains an oxetane, representing a not-completely-trivial extension of that methodology.


OSM Compounds Requested via Assay Depot

While we agreed not to share details of the companies providing quotations for this request, we can provide the figures quoted. I will be gradually assembling those over on GitHub so we can decide whether to go ahead with one of the quotes, and to what extent this represents a general solution for analog synthesis in cases where we have already established precedent for the methodology.






24th March 2015 @ 23:48

There is evidence from Kiaran Kirk's lab that Series 4 of the OSM Consortium inhibits ATP4, an ion homeostasis pump that has become an important target in antimalarial drug discovery. This importance stems from it being an apparently validated target with a candidate, KAE609, in the clinical pipeline. There are recent reports that other, very promising, compounds also hit this target (1, 2).

But because it's an integral membrane protein, there's no structure yet. So there's no direct physical evidence for a binding interaction. There are resistance studies (grow parasite with drug, develop resistance, sequence genome, observe mutations in gene relevant to ATP4) but in a recent paper it was found that resistance is also related to mutations in a second gene (for the protein PfCDPK5). A homology model has been used to build a structure for the protein in silico; analysis of where the mutations occurred (Fig 2 in the paper) seemed to suggest they cluster in a certain region, but to my eye they did not.

In a startling paper that came out last year, Kiaran Kirk, Adele Lehane and others screened the MMV Malaria Box in their phenotypic ion regulation assay. They found that a truly astonishing array of chemotypes were active - those shown below. The assignment list (i.e. which structure, given by the MMV number, belongs to which chemotype, denoted "N") is given at the end of this post.

ATP4 Active Chemotypes

How is this possible? To me it's not clear.

What we need to do is build a pharmacophore model using the known actives and inactives. We had a quick, informal go at this last year in OSM, but we didn't write it up since Kiaran's paper was under embargo. We'll now get this part of the project going in the public domain in the lab notebook and invite everyone to collaborate. We need an in silico model, the list of actives/inactives, the mutations known to be associated with resistance and people willing to digest the numbers.

This issue is important because i) it'd be nice to see how these different chemotypes can bind the same protein and in so doing actually provide evidence that they *do* bind this protein, ii) we could be predictive about which molecules to make to increase the chances of securing a hit against ATP4 and iii) we could gain some clarity on whether these diverse structures possess the same mechanism of action, and therefore whether developing them all is a risk (single mutation confers resistance to multiple drug candidates).


Assignment List

MMV006427 N1
MMV000642 Guy Ch-I
MMV000662 Guy Ch-I
MMV0006429 Guy Ch-I
MMV011567 N2 Ch-II
MMV665805 N2 Ch-II
MMV665878 N3
MMV665800 N4 Ch-III
MMV000648 Guy Ch-I
MMV000653 Guy Ch-I
MMV665918 N5 (possible OSM S4)
MMV007617 N6
MMV665803 N4 Ch-III
MMV665796 N4 Ch-III
MMV665826 N7
MMV665890 N2 Ch-II
MMV396719 N8 Ch-IV spiro?
MMV396715 N8 Ch-IV spiro?
MMV396749 N8 Ch-IV spiro?
MMV020660 N2 Ch-II
MMV665949 N9
MMV000917 N10
MMV666025 N11
MMV008455 Guy?
MMV006764 N12
MMV007275 N13

MMV666124 N14?
MMV006656 N12
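
To start digesting the list above, it can be grouped by chemotype label. This is a minimal Python sketch, assuming each line is an MMV identifier followed by a free-text label and that the first word of that label is the grouping key. The function name `group_by_chemotype` is hypothetical, and uncertain labels such as "N14?" are kept exactly as written rather than merged with anything.

```python
from collections import defaultdict


def group_by_chemotype(lines):
    """Group 'MMV-id label...' lines by the first word of the label."""
    groups = defaultdict(list)
    for line in lines:
        parts = line.split()
        if len(parts) < 2:
            continue  # skip blank lines and lines without a label
        compound_id, label = parts[0], parts[1]
        groups[label].append(compound_id)
    return dict(groups)


# A few entries copied from the assignment list above
sample = [
    "MMV006427 N1",
    "MMV011567 N2 Ch-II",
    "MMV665805 N2 Ch-II",
    "MMV665800 N4 Ch-III",
    "MMV665803 N4 Ch-III",
    "MMV666124 N14?",
]

groups = group_by_chemotype(sample)
for label, ids in groups.items():
    print(label, ids)
```

Grouping on the first word only is a deliberate simplification: the secondary annotations ("Ch-II", "spiro?") are ignored here and would need their own column in a fuller treatment.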


Compound Strings

Representative spiroindolone (KAE609) C[C@@H](C1)N[C@@]2(C(NC3=C2C=C(Cl)C=C3)=O)C4=C1C5=CC(F)=C(Cl)C=C5N4 InChI=1S/C19H14Cl2FN3O/c1-8-4-11-10-6-14(22)13(21)7-16(10)23-17(11)19(25-8)12-5-9(20)2-3-15(12)24-18(19)26/h2-3,5-8,23,25H,4H2,1H3,(H,24,26)/t8-,19+/m0/s1 CKLPLPZSUQEDRT-WPCRTTGESA-N

Representative dihydroisoquinolone (SJ733) O=C1C2=CC=CC=C2[C@H](C(NC3=CC(C#N)=C(F)C=C3)=O)[C@@H](C4=CN=CC=C4)N1CC(F)(F)F InChI=1S/C24H16F4N4O2/c25-19-8-7-16(10-15(19)11-29)31-22(33)20-17-5-1-2-6-18(17)23(34)32(13-24(26,27)28)21(20)14-4-3-9-30-12-14/h1-10,12,20-21H,13H2,(H,31,33)/t20-,21+/m0/s1 VKCPFWKTFZAOTO-LEWJYISDSA-N

Representative aminopyrazole (GNF-Pf-4492) O=C(NC1=CC=C(F)C=C1F)NC2=C(C3=CC=C(Br)C=C3)C(C(F)(F)F)=NN2C InChI=1S/C18H12BrF5N4O/c1-28-16(26-17(29)25-13-7-6-11(20)8-12(13)21)14(15(27-28)18(22,23)24)9-2-4-10(19)5-3-9/h2-8H,1H3,(H2,25,26,29) YDMCNKUCKHWGIM-UHFFFAOYSA-N

Representative OSM Series 4 (OSM-S-202) O=C(NC1=CC(Cl)=CC=C1)C2=CN=CC3=NN=C(C4=CC=C(OC(F)F)C=C4)N32 InChI=1S/C19H12ClF2N5O2/c20-12-2-1-3-13(8-12)24-18(28)15-9-23-10-16-25-26-17(27(15)16)11-4-6-14(7-5-11)29-19(21)22/h1-10,19H,(H,24,28) AJGOFYWOTIIYLR-UHFFFAOYSA-N

MMV006427 O=S1(C2=CC=CC=C2C(SC(C(N(C)C3=C(OC)C=C(OC)C(Cl)=C3)=O)=C4)=C4C1)=O InChI=1S/C21H18ClNO5S2/c1-23(15-9-14(22)16(27-2)10-17(15)28-3)21(24)18-8-12-11-30(25,26)19-7-5-4-6-13(19)20(12)29-18/h4-10H,11H2,1-3H3 RXQJJBRAANNIDY-UHFFFAOYSA-N

MMV011567 O=C(C1=CC(Cl)=C(OC)C=C1)NC2=NON=C2C3=CC=C(OC)C(OC)=C3 InChI=1S/C18H16ClN3O5/c1-24-13-6-5-11(8-12(13)19)18(23)20-17-16(21-27-22-17)10-4-7-14(25-2)15(9-10)26-3/h4-9H,1-3H3,(H,20,22,23) RUPCNQRICRCGRU-UHFFFAOYSA-N

MMV665878 O=C1N([C@H](C(NC2=CC(OC)=CC=C2)=O)C(C)C)C(C3=CC=CC=C3N1)=O InChI=1S/C20H21N3O4/c1-12(2)17(18(24)21-13-7-6-8-14(11-13)27-3)23-19(25)15-9-4-5-10-16(15)22-20(23)26/h4-12,17H,1-3H3,(H,21,24)(H,22,26)/t17-/m0/s1 HGHQOEUVKUMVFT-KRWDZBQOSA-N

MMV665800 CC(C1=CC(OCC)=C(OCC)C=C1Cl)NC(C2=CC=CC=C2)=O InChI=1S/C19H22ClNO3/c1-4-23-17-11-15(16(20)12-18(17)24-5-2)13(3)21-19(22)14-9-7-6-8-10-14/h6-13H,4-5H2,1-3H3,(H,21,22) LAXUZBKJDUGZGQ-UHFFFAOYSA-N

MMV665918 FC1=CC(C2=NN=C3N2N=C(SCC(NC4=CC(OCO5)=C5C=C4)=O)C=C3)=CC=C1 InChI=1S/C20H14FN5O3S/c21-13-3-1-2-12(8-13)20-24-23-17-6-7-19(25-26(17)20)30-10-18(27)22-14-4-5-15-16(9-14)29-11-28-15/h1-9H,10-11H2,(H,22,27) HCHXPMOIMAHZOG-UHFFFAOYSA-N

MMV007617 CC(NCCC1=NC2=C(N1CC3=CC=C(C(C)C)C=C3)C=CC=C2)=O InChI=1S/C21H25N3O/c1-15(2)18-10-8-17(9-11-18)14-24-20-7-5-4-6-19(20)23-21(24)12-13-22-16(3)25/h4-11,15H,12-14H2,1-3H3,(H,22,25) VLJWUPFUQDYBCW-UHFFFAOYSA-N

MMV665826 CC1=C(C#N)C2=C(N1CC(NC3=CC=CC(OC)=C3)=O)C=CC=C2 InChI=1S/C19H17N3O2/c1-13-17(11-20)16-8-3-4-9-18(16)22(13)12-19(23)21-14-6-5-7-15(10-14)24-2/h3-10H,12H2,1-2H3,(H,21,23) MEDFKFHQIXAAGU-UHFFFAOYSA-N

MMV396719 CC(NC1=C2C=CC=C1)(C3=CC=CC(OC)=C3)N(C2=N4)C5=C4C=CC=C5 InChI=1S/C22H19N3O/c1-22(15-8-7-9-16(14-15)26-2)24-18-11-4-3-10-17(18)21-23-19-12-5-6-13-20(19)25(21)22/h3-14,24H,1-2H3 GOFKBYMLZNXKGI-UHFFFAOYSA-N

MMV665949 Cl/C(Cl)=C(C1=CC=C(O)C=C1)/C2=CC=C(O)C=C2 InChI=1S/C14H10Cl2O2/c15-14(16)13(9-1-5-11(17)6-2-9)10-3-7-12(18)8-4-10/h1-8,17-18H OWEYKIWAZBBXJK-UHFFFAOYSA-N

MMV000917 O=C1N2C3=CC=CC=C3N=C(C4=CC=C(C)C=C4)CC2C5=C1C(OC)=C(OC)C=C5 InChI=1S/C25H22N2O3/c1-15-8-10-16(11-9-15)19-14-21-17-12-13-22(29-2)24(30-3)23(17)25(28)27(21)20-7-5-4-6-18(20)26-19/h4-13,21H,14H2,1-3H3 CFQPOWRJAPPJEW-UHFFFAOYSA-N

MMV666025 BrC1=CC(C(C=C(Br)C=C2)=C2N3CC(CN4N=NC5=C4C=CC=C5)O)=C3C=C1 InChI=1S/C21H16Br2N4O/c22-13-5-7-19-16(9-13)17-10-14(23)6-8-20(17)26(19)11-15(28)12-27-21-4-2-1-3-18(21)24-25-27/h1-10,15,28H,11-12H2 UKHSRZJVKOZGAD-UHFFFAOYSA-N

MMV006764 CN1C=C(C(OC)=O)C(C2=CC(OC)=C(OCC)C(Br)=C2)C(C(OC)=O)=C1 InChI=1S/C19H22BrNO6/c1-6-27-17-14(20)7-11(8-15(17)24-3)16-12(18(22)25-4)9-21(2)10-13(16)19(23)26-5/h7-10,16H,6H2,1-5H3 GHWXGIMICMXCPZ-UHFFFAOYSA-N

MMV007275 ClC(C=C1C(NC2=C(C)C=CC(F)=C2)=O)=CC=C1NC3=CC=CC=C3 InChI=1S/C20H16ClFN2O/c1-13-7-9-15(22)12-19(13)24-20(25)17-11-14(21)8-10-18(17)23-16-5-3-2-4-6-16/h2-12,23H,1H3,(H,24,25) YDYIMBIBJGKUCE-UHFFFAOYSA-N

MMV666124 O=C(NC1=CC=C(C(C)=O)C=C1)CSC2=NC(C3=C(Cl)C=CC=C3)=NS2 InChI=1S/C18H14ClN3O2S2/c1-11(23)12-6-8-13(9-7-12)20-16(24)10-25-18-21-17(22-26-18)14-4-2-3-5-15(14)19/h2-9H,10H2,1H3,(H,20,24) QHKAIZZULJHVRC-UHFFFAOYSA-N
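
For anyone wanting to load the strings above into a script, here is a rough Python sketch. It assumes each entry is laid out as name, SMILES, InChI, InChIKey separated by spaces, with only the name allowed to contain spaces, so the line is read from the right. The function name `parse_compound_line` is hypothetical; the InChIKey check is purely a format check (14 letters, 10 letters, 1 letter) and does not verify that the key matches the structure.

```python
import re

# Format check only: 14 letters, hyphen, 10 letters, hyphen, 1 letter
INCHIKEY_RE = re.compile(r"^[A-Z]{14}-[A-Z]{10}-[A-Z]$")


def parse_compound_line(line):
    """Split 'name SMILES InChI InChIKey' into a dict, reading from the right."""
    tokens = line.split()
    if len(tokens) < 4:
        raise ValueError("expected name, SMILES, InChI and InChIKey")
    *name_tokens, smiles, inchi, inchikey = tokens
    if not inchi.startswith("InChI="):
        raise ValueError("third-from-last token is not an InChI")
    if not INCHIKEY_RE.match(inchikey):
        raise ValueError("last token does not look like an InChIKey")
    return {
        "name": " ".join(name_tokens),
        "smiles": smiles,
        "inchi": inchi,
        "inchikey": inchikey,
        # The molecular formula is the second '/'-separated layer of an InChI
        "formula": inchi.split("/")[1],
    }


# One entry copied verbatim from the list above
line = (
    "MMV665949 Cl/C(Cl)=C(C1=CC=C(O)C=C1)/C2=CC=C(O)C=C2 "
    "InChI=1S/C14H10Cl2O2/c15-14(16)13(9-1-5-11(17)6-2-9)"
    "10-3-7-12(18)8-4-10/h1-8,17-18H OWEYKIWAZBBXJK-UHFFFAOYSA-N"
)
record = parse_compound_line(line)
print(record["name"], record["formula"], record["inchikey"])
```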

24th November 2014 @ 10:40

Solubility is becoming an increasingly crucial aspect of design in OSM Series 4, and a major outcome of the latest meeting

OSM Online Project Meeting 9 (24th Nov 2014)

was to ensure that all compounds being made have a healthy predicted logP value of below 3.5 or so.

It's important we are predicting these values accurately for proposed molecules. I'll shortly post what we know of the correlation between reality and prediction for OSM compounds. We have Chemdraw available to us, and are typically using this, and typically looking at the CLogP value. In tinkering with these calculations I previously noticed that drawing in explicit H's can give striking differences in these predicted values, in what is a known Chemdraw bug.

Explicit H's and Chemdraw ClogP

In tinkering a little more just now I also found that the predicted values are sensitive to explicit H's on heteroatoms too, i.e. whether the H is drawn as attached (N-H) or implicit (NH). Which tautomer is drawn also has a significant impact on the calculated value, though less so for LogP than for CLogP. This is more reasonable, but given that we have, below, four representations of what is one compound in solution, it's important we are aware of such variance (3.32 to 6.50) when using calculated values as a criterion for whether to make a given compound.

Chemdraw CLogP and Tautomers
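
One way to guard against this when applying the logP criterion is to calculate the value for every reasonable drawing of a compound and look at the range rather than a single number. This is a minimal Python sketch of that idea, not something we currently run. The function name `logp_screen` is hypothetical, the cutoff of 3.5 is the rough figure from the meeting, and only the two extremes quoted above (3.32 and 6.50) are used because the individual values for the four drawings are not listed in this post.

```python
def logp_screen(predictions, cutoff=3.5):
    """Summarise predicted logP values for several drawings of one compound."""
    low, high = min(predictions), max(predictions)
    if high < cutoff:
        verdict = "pass"        # every drawing is below the cutoff
    elif low >= cutoff:
        verdict = "fail"        # every drawing is at or above the cutoff
    else:
        verdict = "ambiguous"   # the drawings disagree about the cutoff
    return {
        "min": low,
        "max": high,
        "spread": round(high - low, 2),
        "verdict": verdict,
    }


# Only the extremes of the range are quoted in the post
result = logp_screen([3.32, 6.50])
print(result)
```

An "ambiguous" verdict is the useful output here: it flags compounds where the decision to make or not make depends on how the structure happened to be drawn.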

24th October 2013 @ 13:28

How do we best view a collection of molecular structures and their biological activities?

This is a central challenge we have in OSM. It would be good if the casual medicinal chemist could simply browse the project's structures.

Contributors are making molecules every day and, less frequently but still regularly, receiving potencies or other bio/chemdata. We need to be able to share the structures and the activities most effectively. We want the data to be easily shared but also easily browsed. So we need a sheet/something with:

1) Structures, i.e. 2D pictures of the molecules that are human-friendly
2) Associated informatics data (e.g. InChI) that are machine-friendly
3) Potencies or other data
4) Any associated ID numbers
5) A weblink or two to where the molecule is featured/made
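
To make the five items concrete, here is one possible shape for a single record, sketched in Python. All names here (`CompoundRecord`, `image_path`, `missing_items` and so on) are hypothetical and the image filename is a placeholder. The InChI is the one for OSM-S-202, and no potency is filled in since none is given here.

```python
from dataclasses import dataclass, field


@dataclass
class CompoundRecord:
    image_path: str = ""                            # 1) human-friendly 2D picture
    inchi: str = ""                                 # 2) machine-friendly identifier
    potencies: dict = field(default_factory=dict)   # 3) assay name -> value
    ids: list = field(default_factory=list)         # 4) associated ID numbers
    links: list = field(default_factory=list)       # 5) weblinks to where it is made

    def missing_items(self):
        """Names of the five items that are still empty for this record."""
        return [name for name, value in vars(self).items() if not value]


record = CompoundRecord(
    image_path="osm-s-202.png",  # placeholder filename
    inchi=(
        "InChI=1S/C19H12ClF2N5O2/c20-12-2-1-3-13(8-12)24-18(28)15-9-23-10-16-"
        "25-26-17(27(15)16)11-4-6-14(7-5-11)29-19(21)22/h1-10,19H,(H,24,28)"
    ),
    ids=["OSM-S-202"],
)
print(record.missing_items())
```

A check like `missing_items` reflects the way data arrive in the project: structures come first and potencies later, so an incomplete record is normal rather than an error.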

It's useful I think for the project to have a discrete place where the data are kept, just to maintain identity - i.e. not just to be subsumed by a larger database. Or at least for the project's structures to be group-able if they are part of a larger database. But really this is a problem about human visualization.

The initial solution was a Google sheet, but we found that beyond about 50 structures the sheet didn't handle the images well.

An alternative is a shared Excel sheet, but as I understand it we would need a plugin to handle the chemical structures. That's do-able, but not if we expect all the readers to have the same plugin.

The current solution is an sd file - a succinct and easy-to-update text file that contains all the information. HOWEVER, reading the data (i.e. browsing the structures) is not easy to do for the casual observer.

So what is needed? Well, we're batch-uploading data to Chembl. If we could do an auto-upload to Chembl (daily) then this would be problem solved, since Chembl is very cool and is doing cool things with visualization.

But another possible solution is for us to be able to set up a system where: the sd file is displayed on a webpage with a static address (can be bookmarked). When the sd file is updated, that would lead to a new rendering of the webpage when it is loaded. The page would need to have the structures, and be displayed in an active way such that the data can be re-ordered on demand, like in a spreadsheet.
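
As a rough illustration of the data half of that idea, the Python sketch below reads the data fields out of an sd file and writes them as an HTML table sorted on a chosen column. It is only a sketch under stated assumptions: the function names and the `ID` and `VALUE` field names are hypothetical, the two records are dummies with empty structures and made-up values, and drawing the structures themselves is not attempted, since that would need a cheminformatics toolkit. Re-ordering on demand in the browser would also need some JavaScript on top of this.

```python
import html
import re

FIELD_RE = re.compile(r"^>.*<([^<>]+)>")  # matches '> <FIELD_NAME>' header lines


def parse_sd_data(text):
    """Return one dict of data fields per record; the structure block is skipped."""
    records, fields, current, in_data = [], {}, None, False
    for line in text.splitlines():
        if line.strip() == "$$$$":            # record delimiter
            records.append(fields)
            fields, current, in_data = {}, None, False
        elif line.startswith("M  END"):       # end of the connection table
            in_data = True
        elif in_data:
            header = FIELD_RE.match(line)
            if header:
                current = header.group(1)
                fields[current] = ""
            elif not line.strip():
                current = None                # blank line closes a data item
            elif current is not None:
                fields[current] = (fields[current] + "\n" + line).strip()
    return records


def render_table(records, sort_by, columns):
    """Render records as an HTML table, sorted numerically on one field."""
    def sort_key(rec):
        try:
            return (0, float(rec.get(sort_by, "")))
        except ValueError:
            return (1, 0.0)                   # missing or non-numeric values go last
    head = "".join(f"<th>{html.escape(c)}</th>" for c in columns)
    body = "".join(
        "<tr>" + "".join(f"<td>{html.escape(r.get(c, ''))}</td>" for c in columns) + "</tr>"
        for r in sorted(records, key=sort_key)
    )
    return f"<table><tr>{head}</tr>{body}</table>"


def _dummy_record(name, value):
    """Build a placeholder record with an empty structure (illustration only)."""
    return "\n".join([
        name, "  sketch", "",
        "  0  0  0  0  0  0  0  0  0  0999 V2000",
        "M  END",
        "> <ID>", name, "",
        "> <VALUE>", value, "",
        "$$$$",
    ])


sample = _dummy_record("example-1", "2.5") + "\n" + _dummy_record("example-2", "1.5") + "\n"
records = parse_sd_data(sample)
page = render_table(records, sort_by="VALUE", columns=["ID", "VALUE"])
print(page)
```

If something like this were run each time the sd file changed, the output could be written to a fixed address so that the bookmarked page always reflects the latest file.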

The main sd file has now been joined by an Excel sheet and sd file of lots of exciting new molecules for the latest series the project's looking at:

We coincidentally need to combine these two sd files, and we need to browse the new structures because we need to think about which molecules to make next in that new series.
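
Combining the two files is mostly a question of deciding what counts as the same molecule. Here is a minimal Python sketch that merges two lists of records (one dict of fields per molecule) on a shared key. The function name `merge_records`, the choice of an `InChIKey` field as the key and the `Note` field are all assumptions for illustration; the real files may use different field names.

```python
def merge_records(main, new, key="InChIKey"):
    """Combine two lists of record dicts, de-duplicating on `key`."""
    merged = {}
    unkeyed = []
    for rec in list(main) + list(new):
        k = rec.get(key)
        if not k:
            unkeyed.append(rec)      # keep records we cannot match, unchanged
            continue
        if k in merged:
            # Keep existing values; fill in only the fields that were missing
            for name, value in rec.items():
                merged[k].setdefault(name, value)
        else:
            merged[k] = dict(rec)
    return list(merged.values()) + unkeyed


main_file = [{"InChIKey": "AJGOFYWOTIIYLR-UHFFFAOYSA-N", "ID": "OSM-S-202"}]
new_file = [
    {"InChIKey": "AJGOFYWOTIIYLR-UHFFFAOYSA-N", "Note": "placeholder"},
    {"InChIKey": "HCHXPMOIMAHZOG-UHFFFAOYSA-N", "ID": "MMV665918"},
]
combined = merge_records(main_file, new_file)
print(combined)
```

Values already in the main file win over values in the new one, which is a design choice rather than a necessity; conflicting values could instead be flagged for someone to check.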

I know that there are solutions that are appropriate for cheminformaticians. We need solutions for people who are happy with email and web browsers only.

Any ideas?

Egon Willighagen spoke about possible solutions using the sd file during the previous OSM project meeting (

