kgcnn.io package
Submodules
kgcnn.io.file module
- class kgcnn.io.file.RaggedTensorHDFile(file_path: str, compressed: Optional[bool] = None)
Bases: object
Class representing an HDF ‘.hdf5’ file to store a ragged tensor on disk.
Currently, only ragged tensors of ragged rank one are supported, although arbitrary ragged ranks could be supported in principle.
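To illustrate what "ragged rank one" means for storage, here is a minimal pure-Python sketch. It assumes a flat layout of concatenated values plus row splits, the same convention `tf.RaggedTensor` uses in memory. The helper names `to_flat`, `get_item` and `append_item` are hypothetical, and the layout this class actually writes to disk may differ.

```python
from typing import List, Tuple

Row = List[int]    # one row of an item, e.g. [0, 1]
Item = List[Row]   # one item: a variable number of rows


def to_flat(items: List[Item]) -> Tuple[List[Row], List[int]]:
    """Concatenate all items and record where each item starts and stops."""
    values: List[Row] = []
    row_splits = [0]
    for item in items:
        values.extend(item)
        row_splits.append(len(values))
    return values, row_splits


def get_item(values: List[Row], row_splits: List[int], index: int) -> Item:
    """Read one item back by slicing the flat values (non-negative index)."""
    start, stop = row_splits[index], row_splits[index + 1]
    return values[start:stop]


def append_item(values: List[Row], row_splits: List[int], item: Item) -> None:
    """Appending only extends the values and adds one new split."""
    values.extend(item)
    row_splits.append(len(values))


data = [[[0, 1], [0, 2]], [[1, 1]], [[0, 1], [2, 2], [0, 3]]]
values, row_splits = to_flat(data)
print(row_splits)                       # [0, 2, 3, 6]
print(get_item(values, row_splits, 1))  # [[1, 1]]
```

With such a layout, reading a single item or appending one does not require loading the whole tensor, which is presumably why item access and appending are offered as separate methods.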
- __getitem__(item: int)
Get a single item from the ragged tensor on file.
- Parameters
item (int) – Index of the item to get.
- __init__(file_path: str, compressed: Optional[bool] = None)
Create the class for an HDF5 file.
- Parameters
file_path (str) – Path to file on disk.
compressed (bool, optional) – Compression to use. Currently unused.
- append(item)
Append a single item to the ragged tensor.
- Parameters
item (np.ndarray, tf.Tensor) – Item to append.
- Returns
None.
- append_multiple(items: list)
Append multiple items to the ragged tensor.
- Parameters
items (list) – List of items to append. Must match in shape.
- Returns
None.
- read(return_as_tensor: bool = False)
Read the file into memory.
- Parameters
return_as_tensor – Whether to return tf.RaggedTensor.
- Returns
Ragged tensor from file.
- Return type
tf.RaggedTensor
- write(ragged_array: List[numpy.ndarray])
Write ragged array to file.
    from kgcnn.io.file import RaggedTensorHDFile
    import numpy as np
    data = [np.array([[0, 1], [0, 2]]), np.array([[1, 1]]), np.array([[0, 1], [2, 2], [0, 3]])]
    f = RaggedTensorHDFile("test.hdf5")
    f.write(data)
    print(f.read())
- Parameters
ragged_array (list, tf.RaggedTensor) – Ragged array to write, given as a list of numpy arrays.
- Returns
None.
- class kgcnn.io.file.RaggedTensorNumpyFile(file_path: str, compressed: bool = False)
Bases: object
Class representing a NumPy ‘.npz’ file to store a ragged tensor on disk.
Currently, only ragged tensors of ragged rank one are supported, although arbitrary ragged ranks could be supported in principle.
- __getitem__(item)
Get a single item from the ragged tensor on file.
- Parameters
item (int) – Index of the item to get.
- read(return_as_tensor: bool = False)
Read the file into memory.
- Parameters
return_as_tensor – Whether to return tf.RaggedTensor.
- Returns
Ragged tensor from file.
- Return type
tf.RaggedTensor
- write(ragged_array: Union[tensorflow.python.ops.ragged.ragged_tensor.RaggedTensor, List[numpy.ndarray], list])
Write ragged array to file.
    from kgcnn.io.file import RaggedTensorNumpyFile
    import numpy as np
    data = [np.array([[0, 1], [0, 2]]), np.array([[1, 1]]), np.array([[0, 1], [2, 2], [0, 3]])]
    f = RaggedTensorNumpyFile("test.npz")
    f.write(data)
    print(f.read())
- Parameters
ragged_array (list, tf.RaggedTensor) – Ragged tensor or list of numpy arrays to write.
- Returns
None.
kgcnn.io.graphlist module
kgcnn.io.loader module
- kgcnn.io.loader.tf_dataset_disjoint_generator(graphs, inputs: Union[list, dict], assignment_to_id: Optional[Union[list, dict]] = None, assignment_of_indices: Optional[Union[list, dict]] = None, pos_batch_id: Optional[Union[list, dict]] = None, pos_subgraph_id: Optional[Union[list, dict]] = None, pos_count: Optional[Union[list, dict]] = None, batch_size=32, epochs=None, padded_disjoint=False, shuffle=True, seed=42)
Make a TensorFlow dataset for disjoint graph loading.
Currently, only IDs whose values are present in inputs can be generated, because the value tensors of, for example, nodes or edges are used to generate the batch IDs.
inputs is a list or dictionary of Keras input layer configs. The names of the layers are linked to the properties in graphs.
With assignment_to_id and assignment_of_indices, disjoint indices and attributes can be defined. Their IDs are marked with pos_batch_id etc. One must use a name or index for each general split, since, for example, edge IDs can be used for edge indices, edge attributes and edge relation tensors at the same time. Therefore, one batch ID for edges is enough. One could, however, assign as many IDs as there are disjoint graph properties in graphs.
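As a rough illustration of what a disjoint batch with batch IDs, subgraph IDs and counts looks like, the following pure-Python sketch merges a list of graph dictionaries into one large graph. It is not the generator's implementation. The property names `node_attributes` and `edge_indices`, the helper `make_disjoint_batch` and the output keys are hypothetical, and "subgraph ID" is interpreted here as the index of a node within its own graph.

```python
from typing import Dict, List


def make_disjoint_batch(graphs: List[Dict[str, list]]) -> Dict[str, list]:
    """Merge graphs into one disjoint graph and derive the ID tensors."""
    nodes, edge_indices = [], []
    node_batch_id, edge_batch_id = [], []   # which graph each entry came from
    node_subgraph_id = []                   # position of a node inside its graph
    node_count, edge_count = [], []         # sizes per graph
    offset = 0                              # number of nodes merged so far
    for graph_id, g in enumerate(graphs):
        n, e = len(g["node_attributes"]), len(g["edge_indices"])
        nodes.extend(g["node_attributes"])
        # Shift indices so they point at the node's position in the merged graph.
        edge_indices.extend([i + offset, j + offset] for i, j in g["edge_indices"])
        node_batch_id.extend([graph_id] * n)
        edge_batch_id.extend([graph_id] * e)
        node_subgraph_id.extend(range(n))
        node_count.append(n)
        edge_count.append(e)
        offset += n
    return {
        "node_attributes": nodes, "edge_indices": edge_indices,
        "node_batch_id": node_batch_id, "edge_batch_id": edge_batch_id,
        "node_subgraph_id": node_subgraph_id,
        "node_count": node_count, "edge_count": edge_count,
    }


graphs = [
    {"node_attributes": [[1.0], [2.0]], "edge_indices": [[0, 1], [1, 0]]},
    {"node_attributes": [[3.0], [4.0], [5.0]], "edge_indices": [[0, 2], [2, 1]]},
]
batch = make_disjoint_batch(graphs)
print(batch["edge_indices"])   # [[0, 1], [1, 0], [2, 4], [4, 3]]
print(batch["node_batch_id"])  # [0, 0, 1, 1, 1]
```

In this sketch a single edge batch ID serves every edge-aligned property, which matches the remark that one batch ID for edges is enough.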
- Parameters
graphs – List of dictionaries with named graph properties.
inputs – List or dict of keras input layer configs.
assignment_to_id – Assignment of inputs (if they are disjoint properties) to IDs.
assignment_of_indices – Assignment of inputs (if they are indices) to their reference.
pos_batch_id – Position or name of batch IDs.
pos_subgraph_id – Position or name of subgraph IDs.
pos_count – Position or name of counts.
batch_size – Batch size.
epochs – Expected number of epochs. Only required for padded disjoint.
padded_disjoint – If padded disjoint tensors should be generated.
shuffle – Whether to shuffle each epoch.
seed – Seed for shuffle.
- Returns
Tensorflow dataset to load disjoint graphs.
- Return type
tf.data.Dataset
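How batch_size, epochs, shuffle and seed typically interact can be sketched with the standard library alone. This is a generic illustration of seeded per-epoch shuffling, not the loader's actual code. The function `batch_indices` is hypothetical, and the real generator may order or seed its batches differently.

```python
import random
from typing import Iterator, List


def batch_indices(num_graphs: int, batch_size: int = 32, epochs: int = 1,
                  shuffle: bool = True, seed: int = 42) -> Iterator[List[int]]:
    """Yield lists of graph indices, one list per mini-batch."""
    rng = random.Random(seed)  # fixed seed makes the order reproducible
    for _ in range(epochs):
        order = list(range(num_graphs))
        if shuffle:
            rng.shuffle(order)  # new permutation for every epoch
        for start in range(0, num_graphs, batch_size):
            # The last batch of an epoch may be smaller than batch_size.
            yield order[start:start + batch_size]


for batch in batch_indices(num_graphs=5, batch_size=2, epochs=2):
    print(batch)
```

Each yielded index list would then select the graphs that are merged into one disjoint batch.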