kgcnn.io package

Submodules

kgcnn.io.file module

class kgcnn.io.file.RaggedTensorHDFile(file_path: str, compressed: Optional[bool] = None)[source]

Bases: object

Class representing an HDF ‘.hdf5’ file to store a ragged tensor on disk.

Currently, only ragged tensors of ragged rank one are supported, although arbitrary ragged tensors could be supported in principle.
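To illustrate what "ragged rank one" means, the following is a minimal NumPy sketch of how such a tensor can be represented as a flat array of values plus row splits, the same partitioning that tf.RaggedTensor exposes as values and row_splits. It is not necessarily the on-disk layout used by this class, and get_item is a hypothetical helper introduced only for illustration.

```python
import numpy as np

# Three items with a different first dimension but the same inner shape (2,).
items = [
    np.array([[0, 1], [0, 2]]),
    np.array([[1, 1]]),
    np.array([[0, 1], [2, 2], [0, 3]]),
]

# Flatten along the ragged (first) axis and record where each item starts and ends.
values = np.concatenate(items, axis=0)
row_lengths = np.array([len(x) for x in items])
row_splits = np.concatenate([[0], np.cumsum(row_lengths)])


def get_item(index):
    """Recover a single item from the flat storage (hypothetical helper)."""
    return values[row_splits[index]:row_splits[index + 1]]
```

Only the first axis varies in length here; every item shares the inner shape (2,), which is what allows all values to be kept in a single rectangular array.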

__getitem__(item: int)[source]

Get single item from the ragged tensor on file.

Parameters

item (int) – Index of the item to get.

__init__(file_path: str, compressed: Optional[bool] = None)[source]

Make class for an HDF5 file.

Parameters
  • file_path (str) – Path to file on disk.

  • compressed (bool, optional) – Compression to use. Currently unused.

__len__()[source]

Length of the tensor on file.

append(item)[source]

Append single item to ragged tensor.

Parameters

item (np.ndarray, tf.Tensor) – Item to append.

Returns

None.

append_multiple(items: list)[source]

Append multiple items to ragged tensor.

Parameters

items (list) – List of items to append. Must match in shape.

Returns

None.

exists()[source]

Check whether the file at the path stored in this class exists.

read(return_as_tensor: bool = False)[source]

Read the file into memory.

Parameters

return_as_tensor (bool) – Whether to return a tf.RaggedTensor. Defaults to False.

Returns

Ragged tensor from file.

Return type

tf.RaggedTensor

write(ragged_array: List[numpy.ndarray])[source]

Write ragged array to file.

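from kgcnn.io.file import RaggedTensorHDFile
import numpy as np

# Three items with a different first dimension but the same inner shape (2,).
data = [np.array([[0, 1], [0, 2]]), np.array([[1, 1]]), np.array([[0, 1], [2, 2], [0, 3]])]
f = RaggedTensorHDFile("test.hdf5")
f.write(data)  # Store the ragged array on disk.
print(f.read())  # Load it back into memory.

Parameters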

ragged_array (list, tf.RaggedTensor) – Ragged tensor or list of numpy arrays.

Returns

None.

class kgcnn.io.file.RaggedTensorNumpyFile(file_path: str, compressed: bool = False)[source]

Bases: object

Class representing a NumPy ‘.npz’ file to store a ragged tensor on disk.

Currently, only ragged tensors of ragged rank one are supported, although arbitrary ragged tensors could be supported in principle.
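As a rough illustration of storing a ragged tensor of ragged rank one in an '.npz' archive, the sketch below saves a flat value array together with its row splits via np.savez and restores the items with np.load. The key names values and row_splits, the file name, and the layout itself are assumptions made for this example; the class may organize its file differently.

```python
import os
import tempfile

import numpy as np

items = [np.array([[0, 1], [0, 2]]), np.array([[1, 1]]), np.array([[0, 1], [2, 2], [0, 3]])]
values = np.concatenate(items, axis=0)
row_splits = np.concatenate([[0], np.cumsum([len(x) for x in items])])

with tempfile.TemporaryDirectory() as tmp_dir:
    path = os.path.join(tmp_dir, "ragged_sketch.npz")
    # np.savez_compressed is the compressed counterpart of np.savez.
    np.savez(path, values=values, row_splits=row_splits)
    with np.load(path) as data:
        loaded_values = data["values"]
        loaded_splits = data["row_splits"]

# Slice the flat values back into the individual items.
restored = [loaded_values[i:j] for i, j in zip(loaded_splits[:-1], loaded_splits[1:])]
```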

__getitem__(item)[source]

Get single item from the ragged tensor on file.

Parameters

item (int) – Index of the item to get.

__init__(file_path: str, compressed: bool = False)[source]

Make class for an NPZ file.

Parameters
  • file_path (str) – Path to file on disk.

  • compressed (bool) – Whether to use compression.

__len__()[source]

Length of the tensor on file.

exists()[source]

Check whether the file at the path stored in this class exists.

read(return_as_tensor: bool = False)[source]

Read the file into memory.

Parameters

return_as_tensor (bool) – Whether to return a tf.RaggedTensor. Defaults to False.

Returns

Ragged tensor from file.

Return type

tf.RaggedTensor

write(ragged_array: Union[tensorflow.python.ops.ragged.ragged_tensor.RaggedTensor, List[numpy.ndarray], list])[source]

Write ragged array to file.

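from kgcnn.io.file import RaggedTensorNumpyFile
import numpy as np

# Three items with a different first dimension but the same inner shape (2,).
data = [np.array([[0, 1], [0, 2]]), np.array([[1, 1]]), np.array([[0, 1], [2, 2], [0, 3]])]
f = RaggedTensorNumpyFile("test.npz")
f.write(data)  # Store the ragged array on disk.
print(f.read())  # Load it back into memory.

Parameters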

ragged_array (list, tf.RaggedTensor) – Ragged tensor or list of numpy arrays.

Returns

None.

kgcnn.io.file._check_for_inner_shape(array_list: List[numpy.ndarray]) → Union[None, tuple, list][source]

Simple function to verify inner shape for list of numpy arrays.
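One plausible reading of such an inner-shape check is sketched below: all arrays are compared after dropping their first (ragged) axis, and the shared trailing shape is returned if there is exactly one. The name common_inner_shape and the choice to return None on a mismatch are assumptions for illustration only; this is not the library's implementation.

```python
import numpy as np


def common_inner_shape(array_list):
    """Return the shared trailing shape of all arrays, or None if they differ.

    Hypothetical helper for illustration; not the library's implementation.
    """
    if len(array_list) == 0:
        return None
    # Drop the first (ragged) axis and compare the remaining dimensions.
    inner_shapes = {tuple(np.shape(x)[1:]) for x in array_list}
    return inner_shapes.pop() if len(inner_shapes) == 1 else None
```

For the example data used in this module, where every item has shape (n, 2), such a check would yield (2,).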

kgcnn.io.graphlist module

kgcnn.io.loader module

kgcnn.io.loader.pad_at_axis(x, pad_width, axis=0, **kwargs)[source]

kgcnn.io.loader.tf_dataset_disjoint_generator(graphs, inputs: Union[list, dict], assignment_to_id: Optional[Union[list, dict]] = None, assignment_of_indices: Optional[Union[list, dict]] = None, pos_batch_id: Optional[Union[list, dict]] = None, pos_subgraph_id: Optional[Union[list, dict]] = None, pos_count: Optional[Union[list, dict]] = None, batch_size=32, epochs=None, padded_disjoint=False, shuffle=True, seed=42)[source]

Make a tensorflow dataset for disjoint graph loading.

Currently, IDs can only be generated for properties whose values are part of inputs, because the value tensors (e.g. of nodes or edges) are used to generate the batch IDs.

inputs is a list or dictionary of Keras input layer configs. The names of the layers are linked to the properties in graph.

With assignment_to_id and assignment_of_indices, disjoint indices and attributes can be defined. Their IDs are marked with pos_batch_id etc. One must use a name or index for each general split, since, for example, edge IDs can be used for edge indices, edge attributes and edge relation tensors at the same time. Therefore, one batch ID for edges is enough. One could, however, assign as many IDs as there are disjoint graph properties in graph.
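The idea of a disjoint batch described above can be sketched conceptually in plain NumPy. The snippet does not call tf_dataset_disjoint_generator; the property names node_attributes and edge_indices, and the interpretation of batch IDs (graph membership) and subgraph IDs (position within the own graph), are assumptions made for illustration.

```python
import numpy as np

# Two small graphs given as dictionaries of named properties (hypothetical names).
graphs = [
    {"node_attributes": np.array([[1.0], [2.0]]),
     "edge_indices": np.array([[0, 1], [1, 0]])},
    {"node_attributes": np.array([[3.0], [4.0], [5.0]]),
     "edge_indices": np.array([[0, 2], [2, 1]])},
]

num_nodes = np.array([len(g["node_attributes"]) for g in graphs])
num_edges = np.array([len(g["edge_indices"]) for g in graphs])
node_offsets = np.concatenate([[0], np.cumsum(num_nodes)[:-1]])

# Disjoint values: concatenate all graphs of the batch along the first axis.
nodes = np.concatenate([g["node_attributes"] for g in graphs], axis=0)
# Indices are shifted so that they refer to the concatenated node tensor.
edges = np.concatenate(
    [g["edge_indices"] + off for g, off in zip(graphs, node_offsets)], axis=0
)

# Batch IDs: the graph each node or edge belongs to.
node_batch_id = np.repeat(np.arange(len(graphs)), num_nodes)
edge_batch_id = np.repeat(np.arange(len(graphs)), num_edges)
# Subgraph IDs (assumed meaning): position of each node within its own graph.
node_subgraph_id = np.concatenate([np.arange(n) for n in num_nodes])
```

In this sketch a single edge batch ID serves every edge-level tensor, which mirrors the remark that one batch ID for edges is enough.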

Parameters
  • graphs – List of dictionaries with named graph properties.

  • inputs – List or dict of keras input layer configs.

  • assignment_to_id – Assignment of inputs (if they are disjoint properties) to IDs.

  • assignment_of_indices – Assignment of inputs (if they are indices) to their reference.

  • pos_batch_id – Position or name of batch IDs.

  • pos_subgraph_id – Position or name of subgraph IDs.

  • pos_count – Position or name of counts.

  • batch_size – Batch size.

  • epochs – Expected number of epochs. Only required for padded disjoint.

  • padded_disjoint – Whether padded disjoint tensors should be generated.

  • shuffle – Whether to shuffle each epoch.

  • seed – Seed for shuffle.

Returns

Tensorflow dataset to load disjoint graphs.

Return type

tf.data.Dataset

Module contents