kgcnn.molecule package¶
Subpackages¶
Submodules¶
kgcnn.molecule.base module¶
-
class
kgcnn.molecule.base.
MolGraphInterface
(mol=None, make_directed: bool = False)[source]¶ Bases:
object
The MolGraphInterface defines the base class interface to extract a molecular graph.
Methods that generate a molecule instance from SMILES and similar inputs can be implemented with different backends such as RDKit. The mol instance of a chemical informatics package like RDKit is held by composition. The interface is designed to extract a graph from a mol instance, not to build a mol object from a graph.
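The composition pattern described above can be sketched without any chemistry backend. This is only an illustration, not the kgcnn implementation: `ToyMol` and `ToyMolGraph` are hypothetical names, and the toy backend simply stores atoms and bonds in plain lists.

```python
class ToyMol:
    """Hypothetical stand-in for a backend molecule object."""

    def __init__(self, symbols, bonds):
        self.symbols = symbols  # e.g. ["C", "O"]
        self.bonds = bonds      # e.g. [(0, 1, 3)] as (begin, end, order)


class ToyMolGraph:
    """Holds a backend molecule by composition and exposes graph properties."""

    def __init__(self, mol=None, make_directed=False):
        self.mol = mol
        self._make_directed = make_directed

    @property
    def node_symbol(self):
        # Atomic symbols are read from the wrapped molecule.
        return list(self.mol.symbols)

    @property
    def edge_indices(self):
        # One index pair per bond of the wrapped molecule.
        return [[begin, end] for begin, end, _ in self.mol.bonds]

    @property
    def edge_number(self):
        # Bond order of each bond, in the same order as edge_indices.
        return [order for _, _, order in self.mol.bonds]


graph = ToyMolGraph(ToyMol(["C", "O"], [(0, 1, 3)]))
print(graph.node_symbol, graph.edge_indices, graph.edge_number)
```

The graph class never subclasses the molecule; swapping the backend only means wrapping a different object behind the same properties.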
-
__init__
(mol=None, make_directed: bool = False)[source]¶ Set the mol attribute for composition. This mol instance will be the backend molecule object.
- Parameters
mol – Instance of a molecule from a chemical informatics package.
make_directed (bool) – Whether the edges are directed. Default is False.
-
static
_check_encoder
(encoder: dict, possible_keys: list, raise_error: bool = False)[source]¶ Check that the keys of the encoder dictionary are among the possible properties. If a key has to be removed, a warning is issued.
-
static
_check_properties_list
(properties: list, possible_properties: list, attribute_name: str, raise_error: bool = False)[source]¶ Check that a list of string identifiers matches the expected properties. If an identifier has to be removed, a warning is issued. Non-string properties, i.e. classes or functions that extract properties, are ignored.
- Parameters
properties (list) – List of requested string identifiers. Each key must match a property.
possible_properties (list) – List of allowed string identifiers for properties.
attribute_name (str) – A name for the properties, e.g. bond, node or graph.
raise_error (bool) – Whether to raise an error on a wrong identifier.
- Returns
Cleaned encoder dictionary.
- Return type
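The filtering behaviour described for this check can be sketched as follows. This is a minimal illustration under assumptions, not the library's code: the function name `check_properties_list`, the message text, and the choice of `ValueError` are all hypothetical.

```python
import warnings


def check_properties_list(properties, possible_properties, attribute_name, raise_error=False):
    """Keep known string identifiers and all non-string entries; warn or raise on unknown strings."""
    cleaned = []
    for prop in properties:
        if isinstance(prop, str) and prop not in possible_properties:
            message = "%s property '%s' is not defined, ignoring it." % (attribute_name, prop)
            if raise_error:
                raise ValueError(message)
            warnings.warn(message)
            continue  # Drop the unknown identifier.
        cleaned.append(prop)  # Known strings and callables pass through unchanged.
    return cleaned


with warnings.catch_warnings(record=True) as caught:
    warnings.simplefilter("always")
    result = check_properties_list(["Symbol", "Unknown", len], ["Symbol", "Degree"], "node")
print(result, len(caught))
```

Here `len` stands in for a user-supplied extractor function, which is not validated against the allowed names.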
-
property
edge_indices
¶ Return a list of edge indices of the molecule.
-
property
edge_number
¶ Return a list of edge numbers that represent the bond order.
-
from_mol_block
(mol_block: str, keep_hs: bool = True, **kwargs)[source]¶ Set mol-instance from a more extensive string representation containing coordinates and bond information.
-
from_smiles
(smile: str, **kwargs)[source]¶ Main method to generate a molecule from a SMILES string representation.
- Parameters
smile (str) – SMILES string representation of a molecule.
- Returns
self
-
property
node_coordinates
¶ Return a list of atomic coordinates of the molecule.
-
property
node_number
¶ Return a list of node numbers, which are the atomic numbers of the atoms in the molecule.
-
property
node_symbol
¶ Return a list of atomic symbols of the molecule.
-
kgcnn.molecule.convert module¶
-
class
kgcnn.molecule.convert.
MolConverter
(base_path: Optional[str] = None)[source]¶ Bases:
object
-
__init__
(base_path: Optional[str] = None)[source]¶ Initialize a converter to transform SMILES or coordinates into mol block information.
- Parameters
base_path (str) – Base path for temporary files.
-
smile_to_mol
(smiles_path: str, sdf_path: str, external_program: Optional[dict] = None, num_workers: Optional[int] = None, sanitize: bool = True, add_hydrogen: bool = True, make_conformers: bool = True, optimize_conformer: bool = True, logger=None, batch_size: int = 5000)[source]¶ Convert a SMILES file to an SDF structure file.
- Parameters
smiles_path –
sdf_path –
external_program –
num_workers –
sanitize –
add_hydrogen –
make_conformers –
optimize_conformer –
logger –
batch_size –
- Returns
List of mol-strings.
- Return type
-
-
kgcnn.molecule.convert.
openbabel_smile_to_mol
(smile: str, sanitize: bool = True, add_hydrogen: bool = True, make_conformers: bool = True, optimize_conformer: bool = True, random_seed: int = 42, stop_logging: bool = False)[source]¶
-
kgcnn.molecule.convert.
openbabel_xyz_to_mol
(xyz_string: str, charge: int = 0, stop_logging: bool = False)[source]¶ Convert xyz-string to mol-string.
The order of atoms in the list should be the same as output. Uses openbabel for conversion.
-
kgcnn.molecule.convert.
rdkit_smile_to_mol
(smile: str, sanitize: bool = True, add_hydrogen: bool = True, make_conformers: bool = True, optimize_conformer: bool = True, random_seed: int = 42, stop_logging: bool = False)[source]¶
kgcnn.molecule.encoder module¶
-
class
kgcnn.molecule.encoder.
OneHotEncoder
(categories: list, add_unknown: bool = True, dtype: str = 'int')[source]¶ Bases:
object
Simple One-Hot-Encoding for python lists.
Uses a list of possible values for a one-hot encoding of a single value. The translated values must support the
__eq__
operator. The list of possible values must be set beforehand. It is used as a basic encoder example for
MolecularGraphRDKit
. Categories must not mix different dtypes.
-
__call__
(value)[source]¶ Encode a single feature or value, mapping it to a one-hot python list. E.g. [0, 0, 1, 0]
- Parameters
value – Any object that can be compared to items in
self.one_hot_values
.
- Returns
Python List with 1 at value match. E.g. [0, 0, 1, 0]
- Return type
-
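A one-hot encoding over a fixed category list can be sketched in a few lines. This is a hedged illustration rather than the class above: `SimpleOneHot` is a hypothetical name, and the handling of `add_unknown` (one extra trailing slot for values outside the categories) is an assumption about the intended behaviour.

```python
class SimpleOneHot:
    """Hypothetical one-hot encoder over a fixed list of categories."""

    def __init__(self, categories, add_unknown=True):
        self.categories = list(categories)
        self.add_unknown = add_unknown

    def __call__(self, value):
        # Compare with ==, so any value type supporting __eq__ works.
        encoded = [1 if category == value else 0 for category in self.categories]
        if self.add_unknown:
            # Assumption: a trailing slot flags values outside the categories.
            encoded.append(0 if value in self.categories else 1)
        return encoded


encoder = SimpleOneHot(["C", "N", "O"])
print(encoder("N"))
print(encoder("Xe"))
```

Without the unknown slot, a value that matches no category would encode to an all-zero list.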
kgcnn.molecule.graph_babel module¶
-
class
kgcnn.molecule.graph_babel.
MolecularGraphOpenBabel
(mol=None, make_directed: bool = False)[source]¶ Bases:
kgcnn.molecule.base.MolGraphInterface
A graph object representing a strict molecular graph, i.e. only chemical bonds. This class is an interface to the
OBMol
class to retrieve graph properties.
    import numpy as np
    from kgcnn.molecule.graph_babel import MolecularGraphOpenBabel
    mg = MolecularGraphOpenBabel()
    mg.from_smiles("CC(C)C(C(=O)O)N")
    mg.add_hs()
    mg.make_conformer()
    mg.optimize_conformer()
    mg.compute_partial_charges()
    print(MolecularGraphOpenBabel.atom_fun_dict.keys(), MolecularGraphOpenBabel.bond_fun_dict.keys())
    print(mg.node_coordinates)
    print(mg.edge_indices)
    print(mg.node_attributes(properties=["NumBonds", "GasteigerCharge"], encoder={}))
-
__init__
(mol=None, make_directed: bool = False)[source]¶ Set the mol attribute for composition. This mol instance will be the backend molecule object.
- Parameters
mol (openbabel.OBMol) – OpenBabel molecule.
make_directed (bool) – Whether the edges are directed. Default is False.
-
atom_fun_dict
= dict mapping each atom property name to an extractor function (a MolecularGraphOpenBabel lambda). Keys: 'AtomicMass', 'AtomicNum', 'Coordinate', 'CoordinateIdx', 'Data', 'ExactMass', 'ExplicitDegree', 'ExplicitValence', 'FormalCharge', 'HasAlphaBetaUnsat', 'HasAromaticBond', 'HasBondOfOrder1', 'HasBondOfOrder2', 'HasBondOfOrder3', 'HasDoubleBond', 'HasNonSingleBond', 'HasResidue', 'HasSingleBond', 'HeteroDegree', 'HvyDegree', 'Hyb', 'ImplicitHCount', 'Index', 'IsAmideNitrogen', 'IsAromatic', 'IsAromaticNOxide', 'IsAxial', 'IsCarboxylOxygen', 'IsChiral', 'IsHbondAcceptor', 'IsHbondAcceptorSimple', 'IsHbondDonor', 'IsHbondDonorH', 'IsHetAtom', 'IsHeteroatom', 'IsInRing', 'IsInRingSize5', 'IsInRingSize6', 'IsMetal', 'IsNitroOxygen', 'IsNonPolarHydrogen', 'IsPeriodic', 'IsPhosphateOxygen', 'IsPolarHydrogen', 'IsSulfateOxygen', 'Isotope', 'PartialCharge', 'Residue', 'SpinMultiplicity', 'Title', 'TotalDegree', 'TotalValence', 'Type', 'Vector', 'Visit', 'X', 'Y', 'Z'.
-
bond_fun_dict
= dict mapping each bond property name to an extractor function (a MolecularGraphOpenBabel lambda). Keys: 'Aromatic', 'BeginAtom', 'BeginAtomIdx', 'BondOrder', 'CisOrTrans', 'EndAtom', 'EndAtomIdx', 'EquibLength', 'Flags', 'Id', 'Idx', 'IsAmide', 'IsAromatic', 'IsCarbonyl', 'IsCisOrTrans', 'IsClosure', 'IsDoubleBondGeometry', 'IsEster', 'IsHash', 'IsInRing', 'IsPeriodic', 'IsPrimaryAmide', 'IsTertiaryAmide', 'IsWedge', 'IsWedgeOrHash', 'Length', 'Parent', 'Visit'.
-
property
edge_indices
¶ Return a list of edge indices of the molecule.
-
property
edge_number
¶ Return a list of edge numbers that represent the bond order.
-
from_mol_block
(mol_block: str, keep_hs: bool = True, sanitize: bool = True)[source]¶ Set mol-instance from a string representation containing coordinates and bond information that is MDL mol format equivalent.
-
from_xyz
(xyz_string)[source]¶ Set the mol instance from an external xyz-string. Does not add hydrogens or make conformers.
- Parameters
xyz_string – String of xyz block.
- Returns
self
-
make_conformer
(**kwargs)[source]¶ Make conformer for mol-object.
- Parameters
kwargs – Not used.
- Returns
Whether conformer generation was successful
- Return type
-
mol_fun_dict
= dict mapping each molecule property name to an extractor function (a MolecularGraphOpenBabel lambda). Keys: 'ExactMass', 'NumAtoms', 'NumBonds', 'TotalCharge'.
-
property
node_coordinates
¶ Return a list of atomic coordinates of the molecule.
-
property
node_number
¶ Return a list of node numbers, which are the atomic numbers of the atoms in the molecule.
-
property
node_symbol
¶ Return a list of atomic symbols of the molecule.
-
optimize_conformer
(force_field='mmff94', steps=100, **kwargs)[source]¶ Optimize conformer. Requires an initial conformer. See
make_conformer
.
-
kgcnn.molecule.graph_rdkit module¶
-
class
kgcnn.molecule.graph_rdkit.
MolecularGraphRDKit
(mol=None, make_directed: bool = False)[source]¶ Bases:
kgcnn.molecule.base.MolGraphInterface
A graph object representing a strict molecular graph, i.e. only chemical bonds, using a mol object from the
RDKit
chemical informatics package.
Generates attributes for nodes, edges, and the graph, which in a molecular graph are atoms, bonds, and the molecule itself. The class is used to get a graph from an
RDKit
molecule object but also offers some functionality defined in
MolGraphInterface
.
    import numpy as np
    from kgcnn.molecule.graph_rdkit import MolecularGraphRDKit
    mg = MolecularGraphRDKit()
    mg.from_smiles("CC(C)C(C(=O)O)N")
    mg.add_hs()
    mg.make_conformer()
    mg.optimize_conformer()
    mg.compute_partial_charges()
    print(MolecularGraphRDKit.atom_fun_dict.keys(), MolecularGraphRDKit.bond_fun_dict.keys())
    print(mg.node_coordinates)
    print(mg.edge_indices)
    print(mg.node_attributes(properties=["NumBonds", "GasteigerCharge"], encoder={}))
-
__init__
(mol=None, make_directed: bool = False)[source]¶ Initialize
MolecularGraphRDKit
with a mol object.
- Parameters
mol (rdkit.Chem.rdchem.Mol) – Mol object from rdkit. Default is None.
make_directed (bool) – Whether the edges are directed. Default is False.
-
add_hs
(**kwargs)[source]¶ Add hydrogen atoms.
- Parameters
kwargs – Kwargs for rdkit method, e.g. can specify explicit or implicit.
- Returns
self.
-
atom_fun_dict
= dict mapping each atom property name to an extractor function (a MolecularGraphRDKit lambda). Keys: 'AtomFeatures', 'AtomMapNum', 'AtomicNum', 'CIPCode', 'CIPRank', 'ChiralTag', 'ChiralityPossible', 'Degree', 'DescribeQuery', 'ExplicitValence', 'FormalCharge', 'GasteigerCharge', 'GasteigerHCharge', 'HasOwningMol', 'Hybridization', 'Idx', 'ImplicitValence', 'IsAromatic', 'IsInRing', 'Isotope', 'Mass', 'MassScaled', 'MolFileRLabel', 'MonomerInfo', 'NoImplicit', 'NumBonds', 'NumExplicitHs', 'NumImplicitHs', 'NumRadicalElectrons', 'PDBResidueInfo', 'Rcovalent', 'RcovalentScaled', 'Rvdw', 'RvdwScaled', 'Smarts', 'Symbol', 'TotalDegree', 'TotalNumHs', 'TotalValence'.
-
bond_fun_dict
= dict mapping each bond property name to an extractor function (a MolecularGraphRDKit lambda). Keys: 'BeginAtom', 'BeginAtomIdx', 'BondDir', 'BondType', 'BondTypeAsDouble', 'DescribeQuery', 'EndAtom', 'EndAtomIdx', 'Idx', 'IsAromatic', 'IsConjugated', 'IsInRing', 'Smarts', 'Stereo'.
-
compute_partial_charges
(method='gasteiger', **kwargs)[source]¶ Compute partial charges.
- Parameters
method (str) – Method to compute partial charges. Defaults to ‘gasteiger’.
**kwargs –
- Returns
self
-
edge_attributes
(properties: list, encoder: dict)[source]¶ Return edge or bond attributes together with bond indices of the molecule. If flag
_make_directed
is set to true, only the bonds as defined by RDKit are returned; otherwise a table of sorted undirected bond indices is returned.
-
property
edge_indices
¶ Return edge or bond indices of the molecule. If flag
_make_directed
is set to true, only the bonds as defined by RDKit are returned; otherwise a table of sorted undirected bond indices is returned.
- Returns
Array of bond indices.
- Return type
np.ndarray
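The difference between the directed and the undirected bond table can be sketched with plain lists. This rests on an assumption: "undirected" is taken to mean that every bond appears in both directions, (i, j) and (j, i), before sorting. The helper name `bond_table` is hypothetical.

```python
def bond_table(bonds, make_directed=False):
    """Return bond index pairs; add reversed pairs and sort unless directed."""
    pairs = [[begin, end] for begin, end in bonds]
    if make_directed:
        # Keep only the bonds exactly as the backend defines them.
        return pairs
    # Assumption: an undirected table lists each bond in both directions.
    pairs += [[end, begin] for begin, end in bonds]
    return sorted(pairs)


bonds = [(1, 0), (1, 2)]
print(bond_table(bonds, make_directed=True))
print(bond_table(bonds, make_directed=False))
```

Sorting groups all outgoing edges of an atom together, which is a common layout for message-passing layers.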
-
property
edge_number
¶ Make list of the bond order or type of each bond in the molecule.
-
from_list
(atoms: Union[list, numpy.ndarray], bond_idx: Union[list, numpy.ndarray], bond_order: Union[list, numpy.ndarray], conformer: Optional[Union[list, numpy.ndarray]] = None)[source]¶
- Parameters
atoms –
bond_idx –
bond_order –
conformer –
- Returns
self.
-
from_mol_block
(mol_block, sanitize: bool = True, keep_hs: bool = True, strictParsing: bool = True)[source]¶ Set mol-instance from a mol-block string.
-
from_xyz
(xyz_string: str, charge: Optional[Union[list, int]] = None)[source]¶ Set the mol instance from an external xyz-string. Does not add hydrogens or make conformers.
-
graph_attributes
(properties: list, encoder: dict)[source]¶ Return graph or molecular attributes.
- Parameters
- Returns
List of molecular graph-level properties.
- Return type
-
make_conformer
(**kwargs)[source]¶ Make conformer for mol-object.
- Parameters
kwargs – Kwargs for rdkit
EmbedMolecule
.
- Returns
Whether conformer generation was successful
- Return type
-
mol_fun_dict
= dict mapping each molecule property name to an extractor function (lambda). Keys: 'AtomsIsAromatic', 'AtomsIsInRing', 'BondsIsAromatic', 'BondsIsConjugated', 'C', 'Cl', 'ExactMolWt', 'F', 'FpDensityMorgan3', 'FractionCSP3', 'H', 'MolLogP', 'MolMR', 'N', 'NumAtoms', 'NumBonds', 'NumRotatableBonds', 'O', 'S', 'fr_Al_COO', 'fr_Al_OH', 'fr_Ar_COO', 'fr_Ar_OH', 'fr_C_O_noCOO', 'fr_NH2', 'fr_SH', 'fr_alkyl_halide', 'fr_sulfide'.
-
node_attributes
(properties: list, encoder: dict)[source]¶ Return node or atom attributes.
- Parameters
- Returns
List of atomic properties.
- Return type
-
property
node_coordinates
¶ Return a list or array of atomic coordinates of the molecule.
-
property
node_number
¶ Return a list of node numbers, which are the atomic numbers of the atoms in the molecule.
-
property
node_symbol
¶ Return a list of atomic symbols of the molecule.
-
optimize_conformer
(force_field='mmff94', **kwargs)[source]¶ Optimize conformer. Requires an initial conformer. See
make_conformer
.
-
remove_hs
(**kwargs)[source]¶ Remove hydrogen atoms.
- Parameters
kwargs – Kwargs for rdkit method, e.g. can specify explicit or implicit.
- Returns
self.
-
kgcnn.molecule.io module¶
-
kgcnn.molecule.io.
parse_list_to_xyz_str
(mol: list, comment: str = '', number_coordinates: Optional[int] = None)[source]¶ Convert a list of atoms and coordinates into an xyz-string.
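The xyz layout itself is simple: an atom-count line, a comment line, then one line per atom. The sketch below is not this function's implementation; `list_to_xyz_str` is a hypothetical name, and the input layout `[symbols, coordinates]` as well as the number format are assumptions.

```python
def list_to_xyz_str(mol, comment=""):
    """Format [symbols, coordinates] as an xyz block."""
    symbols, coordinates = mol
    # First line: number of atoms. Second line: free-text comment.
    lines = [str(len(symbols)), comment]
    for symbol, (x, y, z) in zip(symbols, coordinates):
        lines.append("%s %.8f %.8f %.8f" % (symbol, x, y, z))
    return "\n".join(lines) + "\n"


water = [["O", "H", "H"], [[0.0, 0.0, 0.0], [0.0, 0.757, 0.587], [0.0, -0.757, 0.587]]]
xyz_string = list_to_xyz_str(water, comment="water")
print(xyz_string)
```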
-
kgcnn.molecule.io.
parse_mol_str
(mol_str: str)[source]¶ Parse an MDL mol table string into a nested list. Only supports the V2000 format and CTab. It is better to rely on OpenBabel for this; this function was a temporary solution.
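For orientation, a V2000 connection table uses fixed-width columns: three header lines, a counts line, then atom and bond lines. The following sketch reads only the basic fields and is not the parser documented here; `parse_v2000` and the returned layout are hypothetical.

```python
def parse_v2000(mol_block):
    """Read atoms and bonds from a V2000 mol block using its fixed-width columns."""
    lines = mol_block.splitlines()
    counts = lines[3]  # Three header lines precede the counts line.
    num_atoms, num_bonds = int(counts[0:3]), int(counts[3:6])
    atoms, bonds = [], []
    for line in lines[4:4 + num_atoms]:
        # Coordinates occupy three 10-character fields; the symbol follows.
        xyz = [float(line[0:10]), float(line[10:20]), float(line[20:30])]
        atoms.append([line[31:34].strip(), xyz])
    for line in lines[4 + num_atoms:4 + num_atoms + num_bonds]:
        # Bond lines hold 1-based atom indices and the bond type, 3 characters each.
        bonds.append([int(line[0:3]), int(line[3:6]), int(line[6:9])])
    return atoms, bonds


example = "\n".join([
    "carbon monoxide", "  sketch", "",
    "%3d%3d  0  0  0  0  0  0  0  0999 V2000" % (2, 1),
    "%10.4f%10.4f%10.4f %-3s 0  0" % (0.0, 0.0, 0.0, "C"),
    "%10.4f%10.4f%10.4f %-3s 0  0" % (1.128, 0.0, 0.0, "O"),
    "%3d%3d%3d  0" % (1, 2, 3),
    "M  END",
])
atoms, bonds = parse_v2000(example)
print(atoms, bonds)
```

Property lines such as charges (`M  CHG`) and the V3000 format are deliberately left out of this sketch.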
-
kgcnn.molecule.io.
read_mol_list_from_sdf_file
(filepath, line_by_line=False)[source]¶ Simple loader to load an SDF file by only splitting.
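Loading "by only splitting" relies on the SDF convention that each record ends with a line containing `$$$$`. A minimal sketch of that idea, working on a string instead of a file, could look like this; `split_sdf` is a hypothetical helper, not the function above.

```python
def split_sdf(sdf_text):
    """Split SDF text into one record string per molecule at '$$$$' lines."""
    records, current = [], []
    for line in sdf_text.splitlines():
        if line.strip() == "$$$$":
            records.append("\n".join(current))
            current = []
        else:
            current.append(line)
    if any(line.strip() for line in current):
        # Keep a trailing record that lacks a final delimiter.
        records.append("\n".join(current))
    return records


sdf_text = "mol A\n  line\nM  END\n$$$$\nmol B\n  line\nM  END\n$$$$\n"
records = split_sdf(sdf_text)
print(len(records))
```

No chemistry is validated here; each record still has to be parsed as a mol block afterwards.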
-
kgcnn.molecule.io.
read_smiles_file
(file_path)[source]¶ Simple Python function to read SMILES from a file.
-
kgcnn.molecule.io.
read_xyz_file
(file_path, delimiter: Optional[str] = None, line_by_line=False)[source]¶ Simple Python function to read an xyz file and parse it into a nested Python list. Always returns a list of the geometries in the xyz file.
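Parsing concatenated xyz blocks into a list of geometries can be sketched as below. The function name `parse_xyz_text` and the nested layout `[symbols, coordinates]` per geometry are assumptions for illustration; the sketch works on a string rather than a file path.

```python
def parse_xyz_text(text):
    """Parse one or more concatenated xyz blocks into [[symbols, coordinates], ...]."""
    lines = text.splitlines()
    geometries, position = [], 0
    while position < len(lines):
        if not lines[position].strip():
            position += 1  # Skip blank lines between blocks.
            continue
        num_atoms = int(lines[position])
        symbols, coordinates = [], []
        # Atom lines start after the count line and the comment line.
        for line in lines[position + 2:position + 2 + num_atoms]:
            fields = line.split()
            symbols.append(fields[0])
            coordinates.append([float(value) for value in fields[1:4]])
        geometries.append([symbols, coordinates])
        position += 2 + num_atoms
    return geometries


text = "2\nhydrogen\nH 0.0 0.0 0.0\nH 0.0 0.0 0.74\n1\nhelium\nHe 0.0 0.0 0.0\n"
geometries = parse_xyz_text(text)
print(geometries)
```

Even a file with a single molecule yields a list with one entry, matching the "always returns a list" behaviour described above.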
-
kgcnn.molecule.io.
write_list_to_xyz_file
(filepath: str, mol_list: list)[source]¶ Write a list of nested lists of atoms and coordinates to an xyz file. Uses
parse_list_to_xyz_str
.
kgcnn.molecule.methods module¶
-
kgcnn.molecule.methods.
get_connectivity_from_inverse_distance_matrix
(inv_dist_mat, protons, radii_dict=None, k1=16.0, k2=1.3333333333333333, cutoff=0.85, force_bonds=True)[source]¶ Get connectivity table from inverse distance matrix defined at last dimensions (…, N, N) and corresponding bond-radii. Keeps shape with (…, N, N). Covalent radii, from Pyykko and Atsumi, Chem. Eur. J. 15, 2009, 188-197. Values for metals decreased by 10% according to Robert Paton’s Sterimol implementation. Partially based on code from Robert Paton’s Sterimol script, which based this part on Grimme’s D3 code. Vectorized version of the original code for numpy arrays that take atomic numbers as input.
- Parameters
inv_dist_mat (np.ndarray) – Inverse distance matrix defined at last dimensions (…, N, N). Distances must be in Angstrom, not in Bohr.
protons (np.ndarray) – An array of atomic numbers matching the inv_dist_mat (…, N), for which the radii are to be computed.
radii_dict (np.ndarray) – Covalent radii for each element. If
None
, stored values are used. Otherwise, a numpy array with covalent bonding radii is expected. Example:
np.array([0, 0.34, 0.46, 1.2, ...])
for atomic numbers
np.array([0, 1, 2, ...])
that would match
[None, 'H', 'He', 'Li', ...]
.
k1 (float) – K1-value. Defaults to 16.
k2 (float) – K2-value. Defaults to 4.0/3.0.
cutoff (float) – Cutoff value to set values to Zero (no bond). Defaults to 0.85.
force_bonds (bool) – Whether to force at least one bond in the bond table per atom. Default is True.
- Returns
Connectivity table with 1 for chemical bond and zero otherwise of shape (…, N, N).
- Return type
np.ndarray
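The role of `k1`, `k2` and `cutoff` can be illustrated for a single atom pair. The sketch assumes the D3-style counting function, a logistic damping of the scaled radius-to-distance ratio, which is how the referenced Sterimol and D3 code is commonly described. It is a scalar illustration, not the vectorized function above, and it omits `force_bonds`. The name `bond_indicator` is hypothetical.

```python
import math


def bond_indicator(distance, radius_a, radius_b, k1=16.0, k2=4.0 / 3.0, cutoff=0.85):
    """Return 1 if the damped radius-to-distance ratio passes the cutoff, else 0."""
    # Scaled sum of covalent radii relative to the interatomic distance (Angstrom).
    ratio = k2 * (radius_a + radius_b) / distance
    # Assumed logistic damping: close to 1 for bonded pairs, close to 0 otherwise.
    damping = 1.0 / (1.0 + math.exp(-k1 * (ratio - 1.0)))
    return 1 if damping >= cutoff else 0


radius_h = 0.34  # Hydrogen radius taken from the example array in the parameter list.
print(bond_indicator(0.74, radius_h, radius_h))
print(bond_indicator(3.00, radius_h, radius_h))
```

A larger `k1` makes the transition between bonded and non-bonded sharper, while `k2` widens the distance range that still counts as a bond.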
kgcnn.molecule.preprocessor module¶
-
class
kgcnn.molecule.preprocessor.
SetMolAttributes
(*, nodes: Optional[list] = None, edges: Optional[list] = None, graph: Optional[list] = None, encoder_nodes: Optional[dict] = None, encoder_edges: Optional[dict] = None, encoder_graph: Optional[dict] = None, node_coordinates: str = 'node_coordinates', node_symbol: str = 'node_symbol', node_number: str = 'node_number', edge_indices: str = 'edge_indices', edge_number: str = 'edge_number', node_attributes: str = 'node_attributes', edge_attributes: str = 'edge_attributes', graph_attributes: str = 'graph_attributes', name='set_mol_attributes', **kwargs)[source]¶ Bases:
kgcnn.graph.base.GraphPreProcessorBase
Preprocessor to compute molecular attributes from graph arrays that make a valid molecule via a
MolGraphInterface
. See
MoleculeNetDataset
, which uses callbacks but has identical nomenclature.
    from kgcnn.data.datasets.QM7Dataset import QM7Dataset
    from kgcnn.molecule.preprocessor import SetMolAttributes
    ds = QM7Dataset()
    pp = SetMolAttributes()
    print(pp(ds[0]))
- Parameters
nodes (list) – List of atomic properties for attributes.
edges (list) – List of bond properties for attributes.
graph (list) – List of molecular properties for attributes.
encoder_nodes (dict) – Dictionary of node attribute encoders.
encoder_edges (dict) – Dictionary of edge attribute encoders.
encoder_graph (dict) – Dictionary of graph attribute encoders.
node_coordinates (str) – Name of numpy array storing atomic coordinates.
node_symbol (str) – Name of numpy array storing atomic symbol.
node_number (str) – Name of numpy array storing atomic number.
edge_indices (str) – Name of numpy array storing atomic bond indices.
edge_number (str) – Name of numpy array storing atomic bond order.
node_attributes (str) – Name to assign node attributes to.
edge_attributes (str) – Name to assign edge attributes to.
graph_attributes (str) – Name to assign graph attributes to.
name (str) – Name of the preprocessor.
-
call
(nodes: list, edges: list, graph: list, encoder_nodes: dict, encoder_edges: dict, encoder_graph: dict, node_coordinates: numpy.ndarray, node_symbol: numpy.ndarray, node_number: numpy.ndarray, edge_indices: numpy.ndarray, edge_number: numpy.ndarray)[source]¶
-
class
kgcnn.molecule.preprocessor.
SetMolBondIndices
(*, node_coordinates: str = 'node_coordinates', node_symbol: str = 'node_symbol', node_number: str = 'node_number', edge_indices: str = 'edge_indices', edge_number: str = 'edge_number', name='set_mol_bond_indices', **kwargs)[source]¶ Bases:
kgcnn.graph.base.GraphPreProcessorBase
Preprocessor to compute chemical bonds from coordinates via a
MolGraphInterface
.
- Parameters
node_coordinates (str) – Name of atomic coordinates array of shape (N, 3) .
node_symbol (str) – Name of atomic symbol as numpy array of shape (N, ) .
node_number (str) – Name of atomic numbers array of shape (N, ) .
edge_indices (str) – Name to assign edge indices to.
edge_number (str) – Name to assign the edge number/order to.
name (str) – Name of this preprocessor.
-
call
(node_coordinates: numpy.ndarray, node_symbol: numpy.ndarray, node_number: numpy.ndarray)[source]¶