Skip to content

Conversation

@nyee
Copy link
Contributor

@nyee nyee commented Jun 14, 2016

This PR is to fix the bug where tree siblings should actually be parent/child. Not quite done with all the databases yet, but I'd like somebody to start reviewing this as its already quite large.

Most of the commits are actually pretty small with the exception of those pertaining to groups.py. Most of these changes are because of 81ae116 and 9e69de8 where I removed all CS groups and subsequently added them back in because of an issue where we could not detect CS properly. I was very careful counting that I removed and added the same number of nodes.

@nyee nyee assigned nyee and KEHANG and unassigned nyee Jun 14, 2016
nyee added 20 commits June 15, 2016 14:45
A unit test has been added to databaseTest to see if any siblings in a tree shuold actually be parents. For now, I have only implemented it for thermo groups. It should be implemented to kinetics, solvation, statmech and transport as well, but I don't think we have any commits of RMG-database that will pass those tests.
Because CS atomType is not currently recognized correctly, I have removed all thermo groups containing CS, and written them out in a separate directory.
Previously every instance of Cds-Cd ligand was not comprehensively represented in the adjlist. There would only be one Cd atomType in the adjList, instead of a Cd connected to another C. This is likely to have occurred because the original benson group did not have many heteroatoms, so it was previously a safe assumption.

This fix will help with making groups work as intended and make groups more exclusive.
Some (CdN3d) ligands were placed underneath (Cd-Cds) because the (Cd-Cds) were previously defined as only Cd in the adjlist.
Used a script to re-order groups that were listed as siblings, but should have been child/parents, moving about 30 groups total. To the best of my knowledge, all these groups with the exception of CO, were previously completely inaccessible.

Below is a list of moved groups with the format of newParent/formersibling, newChild:

CsJ-S, CsJ-SS
CsJ-S, CsJ-SsSsSs
cyclopropane, bicyclo[1.1.0]butane-secondary
cyclopropane, spiro[2.2]pentane-secondary
cyclopropane, bicyclo[2.1.0]pentane-secondary-C3
cyclopropane, bicyclo[2.1.0]pent-2-ene-C5
cyclopropane, tricyclo[1.1.1.0(1,3)]pentane-C2
cyclopropane, bicyclo[3.1.0]hexane-C3
cyclopropane, tricyclo[2.1.1.0(1,4)]hex-2-ene-C5
cyclopropane, tricyclo[2.1.1.0(1,4)]hexane-C5
cyclopropane, bicyclo[4.1.0]heptane-C3-7
cyclopropane, bicyclo[4.1.0]heptane-C3-7
cyclopropane, tricyclo[3.1.1.0(1,5)]heptane-C6
cyclopropane, tricyclo[2.2.1.0(1,4)]heptane-C7
cyclobutane, bicyclo[2.1.0]pentane-secondary-C4
cyclobutane, bicyclo[2.2.0]hexane-secondary
cyclobutane, bicyclo[3.2.0]heptane-C5-6
cyclopentene-4, bicyclo[2.1.1]hex-2-ene-C5
bicyclo[2.1.1]hexane-C2, tricyclo[2.1.1.0(1,4)]hexane-C2
cycloheptane, bicyclo[3.2.0]heptane-C5-2
cycloheptane, bicyclo[3.2.0]heptane-C5-3
bicyclo[3.1.1]heptane-C2, tricyclo[3.1.1.0(1,5)]heptane-C2
bicyclo[3.1.1]heptane-C3, tricyclo[3.1.1.0(1,5)]heptane-C3
bicyclo[2.2.2]octane-C2, tricyclo[2.2.2.0(1,4)]octane-C2
bicyclo[2.1.0]pent-2-ene-C5, tricyclo[2.1.1.0(1,4)]hex-2-ene-C5
CsJ-SS, CsJ-SsSsSs
cyclobutene-vinyl, bicyclo[2.1.0]pent-2-ene-C2
cyclobutene-vinyl, tricyclo[2.1.1.0(1,4)]hex-2-ene-C2
cyclobutene-vinyl, bicyclo[2.2.0]hexa-2,5-diene-C2
cyclopentene-vinyl, bicyclo[2.1.1]hex-2-ene-C2
cyclobutadiene-C1, bicyclo[2.2.0]hexa-1(4),2,5-triene-C2
bicyclo[2.1.0]pent-2-ene-C2, tricyclo[2.1.1.0(1,4)]hex-2-ene-C2
CJ2_singlet, CO
CdJ2_singlet, CCdJ2_singlet
CdJ2_singlet, CdJ2-Sd_singlet
Now that CS atomTypes detection has been fixed, I have added back the groups that were previously removed.
The saveEntry function expects either an Article object or None type. Empty strings cause an error.
Only changes the location of H rad and a formatting inconsistency in the tree. Functionally shouldn't make a difference but will past new suite of unit tests.
All *2 should be *4 under the radadd tree in Intra_RH_Add_Endocyclic and Intra_RH_Add_Exocyclic
I add explicit H atoms to the groups radadd_intra_CsHNd and radadd_intra_CsHDe, renamed and moved to correct place in tree. I'm not sure if this was the original intent, but the previous version had incorrect sibling parent/child relationship. The group labels are not consistent, and there were no rules with sources to help me make the decision, so I just made my best guess

There is only one rule for each of these families anyhow, so this minor change won't have any effect for now.
Clarified adjlist for one group and moved another to correct parent.
Based on the name and other groups on the same level, it looks like Ct-Cd_Ct-Cs was written with the incorrect AdjList.

Also changed one instance of Cd=Sd to CS, which makes it accessible.
Moved two groups Cds-CdH_Cds-TwoDe and Cds-CdCs_Cds-TwoDe in the tree.
It looks like some of these groups were added before we had enabled lone pairs in the Adjlist. As such they tried to incorperate both triplet/singlet and doublet/quartet in the one node. However, this made the tree fail the sibling child/parent check. For now, I have removed the possibility for singlet bi-rads and doublet tri-rads.

Additionally the Nitrites group was just written incorrectly from the start. I have added one extra atom to the Adjlist after consulting with alongd
In addition to moving them, I changed the atomTypes on a few to make them more explicit.
for child2 in node.children[index+1:]:
#Don't check a node against itself
if child1 is child2: continue
nose.tools.assert_false(group.matchNodeToChild(child1, child2),
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Will you need to check reversely as well nose.tools.assert_false(group.matchNodeToChild(child2, child1),...)?

Previously only checked siblings in one directions, now the test checks in both directions
@KEHANG KEHANG merged commit 6ba1c55 into master Jun 15, 2016
@nyee nyee deleted the fixTrees branch June 28, 2016 19:20
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

3 participants