Working with @hyejang, I found an issue thematically related to #98: if an input molecule has an atom map, dataset creation can fail with an AttributeError.
Some internal pathways in QCSubmit run the equivalent of Molecule.from_json(offmol.to_json()), which converts the atom map to a string; later code that expects atom_map to be a dict then fails. Reproducing case:
from openff.qcsubmit.factories import TorsiondriveDatasetFactory
from openff.toolkit.topology import Molecule
mol = Molecule.from_smiles('CCCC([O-])=O')
mol.properties['atom_map'] = {1:1, 2:2, 3:3, 4:4}
print(mol.properties)
print(mol.to_smiles(isomeric=True, explicit_hydrogens=True, mapped=True))
dataset_factory = TorsiondriveDatasetFactory()
dataset = dataset_factory.create_dataset(
    dataset_name="XXXXXXXXXX",
    tagline="XXXXXXXXXX",
    description="XXXXXXXXXX",
    molecules=[mol],
)
{'atom_map': {1: 1, 2: 2, 3: 3, 4: 4}}
[H]C([H])([H])[C:1]([H])([H])[C:2]([H])([H])[C:3](=O)[O-:4]
AttributeError Traceback (most recent call last)
in
9
10
---> 11 dataset = dataset_factory.create_dataset(
12 dataset_name="XXXXXXXXXX",
13 tagline="XXXXXXXXXX",
~/Downloads/for-share/openff-qcsubmit/openff/qcsubmit/factories.py in create_dataset(self, dataset_name, molecules, description, tagline, metadata, processors, verbose)
840 order_mol = molecule.canonical_order_atoms()
841 rotatble_bonds = order_mol.find_rotatable_bonds()
--> 842 attributes = self.create_cmiles_metadata(molecule=order_mol)
843 for bond in rotatble_bonds:
844 # create a torsion to hold as fixed using non-hydrogen atoms
~/Downloads/for-share/openff-qcsubmit/openff/qcsubmit/factories.py in create_cmiles_metadata(self, molecule)
552 """
553
--> 554 return MoleculeAttributes.from_openff_molecule(molecule)
555
556 def create_index(self, molecule: off.Molecule) -> str:
~/Downloads/for-share/openff-qcsubmit/openff/qcsubmit/common_structures.py in from_openff_molecule(cls, molecule)
1108 isomeric=True, explicit_hydrogens=True, mapped=False
1109 ),
-> 1110 "canonical_isomeric_explicit_hydrogen_mapped_smiles": molecule.to_smiles(
1111 isomeric=True, explicit_hydrogens=True, mapped=True
1112 ),
~/projects/OpenForceField/openff-toolkit/openff/toolkit/topology/molecule.py in to_smiles(self, isomeric, explicit_hydrogens, mapped, toolkit_registry)
2300 return self._cached_smiles[smiles_hash]
2301 else:
-> 2302 smiles = to_smiles_method(self, isomeric, explicit_hydrogens, mapped)
2303 self._cached_smiles[smiles_hash] = smiles
2304 return smiles
~/projects/OpenForceField/openff-toolkit/openff/toolkit/utils/toolkits.py in to_smiles(self, molecule, isomeric, explicit_hydrogens, mapped)
1750 if atom_map is not None:
1751 # make sure there are no repeated indices
-> 1752 map_ids = set(atom_map.values())
1753 if len(map_ids) < len(atom_map):
1754 atom_map = None
AttributeError: 'str' object has no attribute 'values'
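The final frame of the traceback can be imitated without QCSubmit or the toolkit installed. The sketch below is only a stand-in: it assumes, as described above, that the round trip leaves atom_map as the string form of the original dict (the exact string format the toolkit produces is an assumption), and unique_map_ids is a hypothetical helper that mirrors the set(atom_map.values()) line shown in the traceback.

```python
# Stand-in for mol.properties before and after the JSON round trip.
# Assumption: the round trip leaves atom_map as the str() of the dict.
original = {"atom_map": {1: 1, 2: 2, 3: 3, 4: 4}}
round_tripped = {"atom_map": str(original["atom_map"])}


def unique_map_ids(atom_map):
    """Hypothetical helper mirroring the failing line: collect the map indices."""
    return set(atom_map.values())


# Works when atom_map is still a dict.
ids_from_dict = unique_map_ids(original["atom_map"])
print(ids_from_dict)

# Fails once atom_map has become a string.
error_message = None
try:
    unique_map_ids(round_tripped["atom_map"])
except AttributeError as err:
    error_message = str(err)
    print(error_message)
```

The second call raises the same kind of error as the report, because a str has no values() method.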
Possible solutions
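One option, sketched here under assumptions rather than proposed as the actual fix, would be to coerce atom_map back to a dict before it is used. normalize_atom_map is a hypothetical helper, not part of QCSubmit or the OpenFF Toolkit, and it assumes the stringified map is a Python-literal dict such as "{1: 1, 2: 2}".

```python
import ast


def normalize_atom_map(atom_map):
    """Hypothetical helper: return atom_map as a dict of int -> int.

    Accepts None, a dict, or the string form of a dict (assumed format).
    """
    if atom_map is None:
        return None
    if isinstance(atom_map, str):
        # literal_eval only parses Python literals, so it does not execute code.
        atom_map = ast.literal_eval(atom_map)
    if not isinstance(atom_map, dict):
        raise TypeError(f"atom_map must be a dict, got {type(atom_map).__name__}")
    # Keys or values may come back as strings after serialization.
    return {int(key): int(value) for key, value in atom_map.items()}


properties = {"atom_map": "{1: 1, 2: 2, 3: 3, 4: 4}"}
properties["atom_map"] = normalize_atom_map(properties["atom_map"])
print(properties["atom_map"])
```

Where such a coercion would belong (in the toolkit's deserialization or in QCSubmit before calling to_smiles) is a design question this sketch does not settle.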
Context