14th July 2015 @ 01:20

Hi,

I'm Sebastien, currently a second-year undergraduate at the University of Sydney. I've been helping out with data management online and have started doing volunteer work in the lab with Alice.

My goal over the coming weeks is to separate and purify four compounds, SSP-1 to SSP-4, synthesised by students in the Special Students Programme at the University of Sydney.

 SSP compounds

1H NMR will be used to confirm that the compounds were correctly separated and purified.



Strings:

SSP-1:
C12=NN=C(C3=CC=CC=C3)N1C(OCCC4=CC=CC=C4)=CN=C2

InChI=1S/C19H16N4O/c1-3-7-15(8-4-1)11-12-24-18-14-20-13-17-21-22-19(23(17)18)16-9-5-2-6-10-16/h1-10,13-14H,11-12H2

SSP-2:
ClC(C=CC=C1)=C1C2=NN=C3C=NC=C(OCCC4=CC=CC=C4)N32
 
InChI=1S/C19H15ClN4O/c20-16-9-5-4-8-15(16)19-23-22-17-12-21-13-18(24(17)19)25-11-10-14-6-2-1-3-7-14/h1-9,12-13H,10-11H2
 
SSP-3:
ClC1=CC=CC(C2=NN=C3C=NC=C(OCCC4=CC=CC=C4)N32)=C1
 
InChI=1S/C19H15ClN4O/c20-16-8-4-7-15(11-16)19-23-22-17-12-21-13-18(24(17)19)25-10-9-14-5-2-1-3-6-14/h1-8,11-13H,9-10H2
 
SSP-4:
ClC(C=C1)=CC=C1C2=NN=C3C=NC=C(OCCC4=CC=CC=C4)N32
 
InChI=1S/C19H15ClN4O/c20-16-8-6-15(7-9-16)19-23-22-17-12-21-13-18(24(17)19)25-11-10-14-4-2-1-3-5-14/h1-9,12-13H,10-11H2
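
As a quick sanity check on the strings above, the molecular formula can be read straight out of each InChI (it is the layer between the first two slashes). The sketch below is only that, a sketch: it uses the Python standard library, the helper names are my own rather than part of any InChI toolkit, and the atomic weights are rounded standard values. It suggests that SSP-2, SSP-3 and SSP-4 share one formula, as expected for positional isomers of the chloro substituent, so formula or mass alone would not tell them apart.

```python
import re

# Rounded standard atomic weights; only the elements present in SSP-1 to SSP-4.
ATOMIC_WEIGHT = {"C": 12.011, "H": 1.008, "N": 14.007, "O": 15.999, "Cl": 35.45}

# InChI strings copied from this notebook entry.
INCHI = {
    "SSP-1": "InChI=1S/C19H16N4O/c1-3-7-15(8-4-1)11-12-24-18-14-20-13-17-21-22-19(23(17)18)16-9-5-2-6-10-16/h1-10,13-14H,11-12H2",
    "SSP-2": "InChI=1S/C19H15ClN4O/c20-16-9-5-4-8-15(16)19-23-22-17-12-21-13-18(24(17)19)25-11-10-14-6-2-1-3-7-14/h1-9,12-13H,10-11H2",
    "SSP-3": "InChI=1S/C19H15ClN4O/c20-16-8-4-7-15(11-16)19-23-22-17-12-21-13-18(24(17)19)25-10-9-14-5-2-1-3-6-14/h1-8,11-13H,9-10H2",
    "SSP-4": "InChI=1S/C19H15ClN4O/c20-16-8-6-15(7-9-16)19-23-22-17-12-21-13-18(24(17)19)25-11-10-14-4-2-1-3-5-14/h1-9,12-13H,10-11H2",
}


def formula_from_inchi(inchi):
    """Return the formula layer: the text between the first two slashes."""
    return inchi.split("/")[1]


def parse_formula(formula):
    """Turn a simple formula such as 'C19H15ClN4O' into {'C': 19, 'H': 15, ...}."""
    counts = {}
    for symbol, number in re.findall(r"([A-Z][a-z]?)(\d*)", formula):
        counts[symbol] = counts.get(symbol, 0) + (int(number) if number else 1)
    return counts


def molar_mass(formula):
    """Average molar mass in g/mol from the rounded weights above."""
    return sum(ATOMIC_WEIGHT[el] * n for el, n in parse_formula(formula).items())


if __name__ == "__main__":
    for name, inchi in INCHI.items():
        formula = formula_from_inchi(inchi)
        print(f"{name}: {formula}  {molar_mass(formula):.2f} g/mol")
```

Because the three chloro compounds come out with the same formula, a structural method such as the 1H NMR mentioned above is what actually distinguishes them.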

24th November 2014 @ 10:40

Solubility is becoming an increasingly crucial aspect of design in OSM Series 4, and a major outcome of the latest meeting, OSM Online Project Meeting 9 (24th Nov 2014), was to ensure that all compounds being made have a healthy predicted logP value of below 3.5 or so.

It's important that we predict these values accurately for proposed molecules. I'll shortly post what we know of the correlation between reality and prediction for OSM compounds. We have Chemdraw available to us and typically use it, looking at the CLogP value. In tinkering with these calculations I previously noticed that drawing in explicit H's can give striking differences in the predicted values, which is a known Chemdraw bug.

Explicit H's and Chemdraw ClogP

In tinkering a little more just now I also found that the predicted values are sensitive to explicit H's on heteroatoms too, i.e. whether the H is drawn as an explicitly attached atom (N-H) or left implicit (NH). Which tautomer is drawn also has a significant impact on the calculated value, though less so for LogP than for CLogP. This is more reasonable, but given that we have, below, four representations of what is one compound in solution, it's important we are aware of such variance (3.32 to 6.50) when using calculated values as a criterion for whether to make a given compound.

Chemdraw CLogP and Tautomers
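
To make the consequence concrete, here is a minimal sketch of how one might flag a compound whose pass/fail decision depends on which drawing was fed to the calculator. It uses only the two extremes quoted above (3.32 and 6.50) and the "below 3.5 or so" cutoff from the meeting; the function name and the returned fields are my own invention, not anything Chemdraw provides.

```python
CUTOFF = 3.5  # "below 3.5 or so", taken from the meeting outcome


def assess(predictions, cutoff=CUTOFF):
    """Summarise predicted logP values for several drawings of ONE compound."""
    low, high = min(predictions), max(predictions)
    return {
        "spread": round(high - low, 2),     # how far apart the drawings are
        "all_pass": high < cutoff,          # every drawing is under the cutoff
        "any_pass": low < cutoff,           # at least one drawing is under it
        "ambiguous": low < cutoff <= high,  # the verdict depends on the drawing
    }


if __name__ == "__main__":
    # Only the two extremes reported in this post; intermediate values are not listed here.
    print(assess([3.32, 6.50]))
```

With these two numbers the compound passes for one drawing and fails for another, which is exactly the situation where a calculated value should not be used as a hard criterion without first fixing a drawing convention.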

18th January 2014 @ 23:21

I've just noticed a problem with SMILES that makes me wonder whether we ought to be using it at all for this project. Comments welcome in case I've missed something. (Those with OpenIDs or Google accounts can easily log in and comment below, or I will link this post on G+ here)

OSM recently had a meeting. Several of the compounds inherited in Series 4 contain a difluoromethyl group, which is a pain to make. We were wondering whether we could synthesize a compound with a trifluoromethyl group on it instead, since those should be much easier to make (Action Item is here). It appeared, via quick searches in the meeting, that no such compound had been made. The R = CH3 compound was known. I therefore needed to make sure that the R = CF3 compound was not known, and, as a control, to see whether the R = H compound was known.

 Substitution Analysis

I therefore went to the spreadsheet of knowns in this series, which contains SMILES strings for all the compounds. In Chemdraw I constructed the strings for each compound and searched the sheet, coming up blank. OK, I thought, we should make these compounds. But then I happened to notice that the strings I had generated actually looked *nothing* like the strings in the sheet (compare the red strings below). So I copied one of the strings from the sheet and pasted it into Chemdraw, giving me the structure with circles in the aromatic rings. It seemed to matter how you drew the structure. "Surely not," I laughed, but on quick inspection I can see that others have known about this for some time. This in itself means we should stop using SMILES for this project. This is astonishing, given how widespread the use of SMILES is in medchem. Not to downplay the inherent difficulty of dealing with variations in how molecules are represented, but we can't live with this kind of ambiguity if we are searching compound databases.

 Smiles comparison

But it gets worse. I copied the SMILES string from the sheet and generated a structure in Chemdraw. I then used that structure to generate a new SMILES string from within Chemdraw, which is different to the original string. It seems as though the nature of the string is dependent on which software is used to generate it? Can that be right?

 Smiles generation

Luckily there are other means. Chris Southan has been at pains to point out the benefits to the consortium of using InChI and InChIKey. As you can see above (blue and green respectively), those perform just fine, and are immune to the way the compound is drawn. Unless there are any objections, shall we move to InChIKey from this point on? Is there a benefit to using SMILES in e.g. similarity searching?
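
For anyone who wants to see the search problem in miniature, here is a rough sketch using only the Python standard library. The "sheet" is hypothetical: two rows built from the SSP-1 and SSP-4 strings posted elsewhere in this notebook, with field names I made up. The aromatic query SMILES was written by hand as an alternative way of writing SSP-4 and is not the output of any particular program. No toolkit is called, so the InChI query simply reuses the posted string, on the assumption that a standard InChI generated from either drawing would be that same string.

```python
# Hypothetical "knowns" sheet: two rows, strings copied from this notebook.
KNOWNS = [
    {
        "name": "SSP-1",
        "smiles": "C12=NN=C(C3=CC=CC=C3)N1C(OCCC4=CC=CC=C4)=CN=C2",
        "inchi": "InChI=1S/C19H16N4O/c1-3-7-15(8-4-1)11-12-24-18-14-20-13-17-21-22-19(23(17)18)16-9-5-2-6-10-16/h1-10,13-14H,11-12H2",
    },
    {
        "name": "SSP-4",
        "smiles": "ClC(C=C1)=CC=C1C2=NN=C3C=NC=C(OCCC4=CC=CC=C4)N32",
        "inchi": "InChI=1S/C19H15ClN4O/c20-16-8-6-15(7-9-16)19-23-22-17-12-21-13-18(24(17)19)25-11-10-14-4-2-1-3-5-14/h1-9,12-13H,10-11H2",
    },
]


def find_by(field, value, rows=KNOWNS):
    """Exact-match text search, which is what searching a spreadsheet column does."""
    return [row["name"] for row in rows if row[field] == value]


# Hand-written aromatic (lowercase) SMILES intended to describe SSP-4.
# Illustrative assumption only; not generated by Chemdraw or any other tool.
QUERY_SMILES = "Clc1ccc(cc1)-c1nnc2cncc(OCCc3ccccc3)n12"

# Assumed to be what any standard InChI generator gives for the same structure.
QUERY_INCHI = KNOWNS[1]["inchi"]

if __name__ == "__main__":
    print("SMILES search:", find_by("smiles", QUERY_SMILES))  # no hit
    print("InChI search: ", find_by("inchi", QUERY_INCHI))    # finds SSP-4
```

The SMILES search misses a compound that is in the sheet purely because the text differs, which is the false "unknown" described above. An identifier that is the same however the structure is drawn does not have that failure mode for exact-match lookups.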

(comment below or here) Chemdraws of schemes are below if you want to play. Author of this post: Mat Todd.

Attached Files