kgcnn.io package¶

Submodules¶

kgcnn.io.file module¶

class kgcnn.io.file.RaggedTensorHDFile(file_path: str, compressed: Optional[bool] = None)[source]¶

Bases: object

Class representing an HDF ‘.hdf5’ file to store a ragged tensor on disk.

Currently, only ragged tensors with a ragged rank of one are supported, although arbitrary ragged tensors could be supported in principle.

__getitem__(item: int)[source]¶

Get single item from the ragged tensor on file.

Parameters: item (int) – Index of the item to get.

__init__(file_path: str, compressed: Optional[bool] = None)[source]¶

Make class for an HDF5 file.

Parameters

file_path (str) – Path to file on disk.
compressed – Compression to use. Not used at the moment.

__len__()[source]¶

Length of the tensor on file.

append(item)[source]¶

Append single item to ragged tensor.

Parameters: item (np.ndarray, tf.Tensor) – Item to append.
Returns: None.

append_multiple(items: list)[source]¶

Append multiple items to ragged tensor.

Parameters: items (list) – List of items to append. Must match in shape.
Returns: None.

exists()[source]¶

Check if file for path information of this class exists.

read(return_as_tensor: bool = False)[source]¶

Read the file into memory.

Parameters: return_as_tensor (bool) – Whether to return a tf.RaggedTensor.
Returns: Ragged tensor from file.
Return type: tf.RaggedTensor

write(ragged_array: List[numpy.ndarray])[source]¶

Write ragged array to file.

from kgcnn.io.file import RaggedTensorHDFile
import numpy as np
data = [np.array([[0, 1],[0, 2]]), np.array([[1, 1]]), np.array([[0, 1],[2, 2], [0, 3]])]
f = RaggedTensorHDFile("test.hdf5")
f.write(data)
print(f.read())

Parameters: ragged_array (list, tf.RaggedTensor) – Ragged tensor or list of numpy arrays.
Returns: None.

class kgcnn.io.file.RaggedTensorNumpyFile(file_path: str, compressed: bool = False)[source]¶

Bases: object

Class representing a NumPy ‘.npz’ file to store a ragged tensor on disk.

Currently, only ragged tensors with a ragged rank of one are supported, although arbitrary ragged tensors could be supported in principle.

__getitem__(item)[source]¶

Get single item from the ragged tensor on file.

Parameters: item (int) – Index of the item to get.

__init__(file_path: str, compressed: bool = False)[source]¶

Make class for an NPZ file.

Parameters

file_path (str) – Path to file on disk.
compressed (bool) – Whether to use compression.

__len__()[source]¶

Length of the tensor on file.

exists()[source]¶

Check if file for path information of this class exists.

read(return_as_tensor: bool = False)[source]¶

Read the file into memory.

Parameters: return_as_tensor (bool) – Whether to return a tf.RaggedTensor.
Returns: Ragged tensor from file.
Return type: tf.RaggedTensor

write(ragged_array: Union[tensorflow.python.ops.ragged.ragged_tensor.RaggedTensor, List[numpy.ndarray], list])[source]¶

Write ragged array to file.

from kgcnn.io.file import RaggedTensorNumpyFile
import numpy as np
data = [np.array([[0, 1],[0, 2]]), np.array([[1, 1]]), np.array([[0, 1],[2, 2], [0, 3]])]
f = RaggedTensorNumpyFile("test.npz")
f.write(data)
print(f.read())

Parameters: ragged_array (list, tf.RaggedTensor) – Ragged tensor or list of numpy arrays.
Returns: None.
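
The idea of keeping a ragged tensor of ragged rank one in a single '.npz' file can be illustrated with plain NumPy. The sketch below is not the implementation of RaggedTensorNumpyFile; it assumes a layout of flat values plus row splits, and the helper names save_ragged and load_ragged_item as well as the file name are hypothetical.

```python
import os
import tempfile

import numpy as np


def save_ragged(path, arrays):
    """Store a list of arrays (ragged along axis 0) as flat values plus row splits.

    Hypothetical helper for illustration; not part of kgcnn.
    """
    values = np.concatenate(arrays, axis=0)
    # values[row_splits[i]:row_splits[i + 1]] holds the rows of item i.
    row_splits = np.concatenate([[0], np.cumsum([len(a) for a in arrays])])
    np.savez(path, values=values, row_splits=row_splits)


def load_ragged_item(path, index):
    """Read a single item back by slicing the flat values with the row splits."""
    with np.load(path) as data:
        start, stop = data["row_splits"][index:index + 2]
        return data["values"][start:stop]


data = [np.array([[0, 1], [0, 2]]), np.array([[1, 1]]), np.array([[0, 1], [2, 2], [0, 3]])]
with tempfile.TemporaryDirectory() as tmp_dir:
    file_path = os.path.join(tmp_dir, "ragged_sketch.npz")
    save_ragged(file_path, data)
    second_item = load_ragged_item(file_path, 1)
    third_item = load_ragged_item(file_path, 2)
```

Storing the row splits next to the values is what makes item access by index possible without rebuilding the list of arrays; the inner shape (here two columns) has to agree across all items for the concatenation to work.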

kgcnn.io.file._check_for_inner_shape(array_list: List[numpy.ndarray]) → Union[None, tuple, list][source]¶

Simple function to verify the inner shape for a list of numpy arrays.
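
What such an inner-shape check might look like is sketched below. This is an assumption based only on the signature and the one-line description, not the actual body of _check_for_inner_shape: the hypothetical helper common_inner_shape returns the shape shared by all arrays after the first (ragged) axis, or None if the arrays disagree.

```python
import numpy as np


def common_inner_shape(array_list):
    """Return the shape[1:] shared by all arrays, or None if they disagree.

    Hypothetical helper for illustration; not part of kgcnn.
    """
    if len(array_list) == 0:
        return None
    # Collect the distinct inner shapes; exactly one means all arrays agree.
    inner_shapes = {tuple(a.shape[1:]) for a in array_list}
    return inner_shapes.pop() if len(inner_shapes) == 1 else None


matching = [np.zeros((2, 3)), np.zeros((5, 3))]
mismatched = [np.zeros((2, 3)), np.zeros((5, 4))]
shape_ok = common_inner_shape(matching)
shape_bad = common_inner_shape(mismatched)
```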

kgcnn.io.graphlist module¶

kgcnn.io.loader module¶

kgcnn.io.loader.pad_at_axis(x, pad_width, axis=0, **kwargs)[source]¶
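
The entry above carries no description. Judging from the name and signature alone, the function presumably pads an array along a single axis; the sketch below shows one way to do that with np.pad. The helper name pad_along_axis and the forwarding of extra keyword arguments to np.pad are assumptions, not confirmed behavior of pad_at_axis.

```python
import numpy as np


def pad_along_axis(x, pad_width, axis=0, **kwargs):
    """Pad `x` only along `axis`; all other axes get zero padding width.

    Hypothetical helper for illustration; not part of kgcnn.
    """
    pads = [(0, 0)] * x.ndim
    pads[axis] = pad_width  # (before, after) for the chosen axis
    return np.pad(x, pad_width=pads, **kwargs)


x = np.ones((2, 3))
# Append two rows of zeros at the end of axis 0.
padded = pad_along_axis(x, (0, 2), axis=0, mode="constant", constant_values=0)
```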

kgcnn.io.loader.tf_dataset_disjoint_generator(graphs, inputs: Union[list, dict], assignment_to_id: Optional[Union[list, dict]] = None, assignment_of_indices: Optional[Union[list, dict]] = None, pos_batch_id: Optional[Union[list, dict]] = None, pos_subgraph_id: Optional[Union[list, dict]] = None, pos_count: Optional[Union[list, dict]] = None, batch_size=32, epochs=None, padded_disjoint=False, shuffle=True, seed=42)[source]¶

Make a tensorflow dataset for disjoint graph loading.

For the moment only IDs that have their values in inputs can be generated, as the value tensors of e.g. node or edge are used to generate batch IDs.

inputs is a list or dictionary of Keras input layer configs. The names of the layers are linked to the properties in each graph.

With assignment_to_id and assignment_of_indices, disjoint indices and attributes can be defined. Their IDs are marked with pos_batch_id etc. One must use a name or index for each general split, since for example edge IDs can be used for edge indices, edge attributes and edge relation tensors at the same time. Therefore, one batch ID for edges is enough. One could, however, assign as many IDs as there are disjoint graph properties in each graph.

Parameters

graphs – List of dictionaries with named graph properties.
inputs – List or dict of keras input layer configs.
assignment_to_id – Assignment of inputs to disjoint properties to IDs.
assignment_of_indices – Assignment of inputs (if they are indices) to their reference.
pos_batch_id – Position or name of batch IDs.
pos_subgraph_id – Position or name of subgraph IDs.
pos_count – Position or name of counts.
batch_size – Batch size.
epochs – Expected number of epochs. Only required for padded disjoint.
padded_disjoint – If padded disjoint tensors should be generated.
shuffle – Whether to shuffle each epoch.
seed – Seed for shuffle.

Returns

Tensorflow dataset to load disjoint graphs.

Return type

tf.data.Dataset
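
The disjoint representation that this loader targets can be sketched with plain NumPy: the graphs of a batch are merged into one large graph, edge indices are shifted by the number of preceding nodes, and a batch ID records which graph each node came from. This is a minimal sketch of the general technique, not the code of tf_dataset_disjoint_generator; the function name make_disjoint_batch and the property names "node_attributes" and "edge_indices" are assumptions chosen for illustration.

```python
import numpy as np


def make_disjoint_batch(graphs):
    """Merge a list of graph dictionaries into one disjoint batch.

    Hypothetical helper for illustration; not part of kgcnn.
    """
    node_attributes, edge_indices, batch_ids = [], [], []
    node_offset = 0
    for graph_id, graph in enumerate(graphs):
        nodes = graph["node_attributes"]
        edges = graph["edge_indices"]
        node_attributes.append(nodes)
        # Shift edge indices so they point into the concatenated node tensor.
        edge_indices.append(edges + node_offset)
        # One batch ID per node, marking the graph the node belongs to.
        batch_ids.append(np.full(len(nodes), graph_id, dtype=np.int64))
        node_offset += len(nodes)
    return (
        np.concatenate(node_attributes, axis=0),
        np.concatenate(edge_indices, axis=0),
        np.concatenate(batch_ids, axis=0),
    )


graphs = [
    {"node_attributes": np.array([[1.0], [2.0]]),
     "edge_indices": np.array([[0, 1], [1, 0]])},
    {"node_attributes": np.array([[3.0], [4.0], [5.0]]),
     "edge_indices": np.array([[0, 2], [2, 1]])},
]
nodes, edges, batch_ids = make_disjoint_batch(graphs)
```

The same batch IDs can serve every node-level tensor, which mirrors the remark above that one batch ID per general split (nodes, edges) is enough.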

Module contents¶