gravitino.client.generic_model_catalog.GenericModelCatalog

class gravitino.client.generic_model_catalog.GenericModelCatalog(namespace: Namespace, name: str = None, catalog_type: Type = Type.UNSUPPORTED, provider: str = None, comment: str = None, properties: Dict[str, str] = None, audit: AuditDTO = None, rest_client: HTTPClient = None)

Bases: BaseSchemaCatalog

The generic model catalog is a catalog that supports model and model version operations, such as registering models, linking model versions, and listing models and model versions. A model catalog lives under a metalake.

__init__(namespace: Namespace, name: str = None, catalog_type: Type = Type.UNSUPPORTED, provider: str = None, comment: str = None, properties: Dict[str, str] = None, audit: AuditDTO = None, rest_client: HTTPClient = None)

Methods

__init__(namespace[, name, catalog_type, ...])

alter_function(ident, *changes)

Applies FunctionChange changes to a function in the catalog.

alter_model(model_ident, *changes)

Alter the model by applying the changes.

alter_model_version(model_ident, version, ...)

Alter the model version by applying the changes.

alter_model_version_by_alias(model_ident, ...)

Alter the model version identified by the given alias by applying the changes.

alter_schema(schema_name, *changes)

Alter the schema with specified identifier by applying the changes.

as_fileset_catalog()

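Return the FilesetCatalog if the catalog supports fileset operations.

as_function_catalog()

Return the FunctionCatalog if the catalog supports function operations.

as_model_catalog()

Return the ModelCatalog if the catalog supports model operations.

as_schemas()

Return the SupportsSchemas if the catalog supports schema operations.

as_table_catalog()

Return the TableCatalog if the catalog supports table operations.

as_topic_catalog()

Return the TopicCatalog if the catalog supports topic operations.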

audit_info()

builder([name, catalog_type, provider, ...])

comment()

The comment of the catalog.

create_schema([schema_name, comment, properties])

Create a new schema with specified identifier, comment and metadata.

delete_model(model_ident)

Delete the model from the catalog.

delete_model_version(model_ident, version)

Delete the model version from the catalog.

delete_model_version_by_alias(model_ident, alias)

Delete the model version by alias from the catalog.

drop_function(ident)

Drop a function by name.

drop_schema(schema_name, cascade)

Drop the schema with specified identifier.

format_schema_request_path(ns)

function_exists(ident)

Check if a function with the given name exists in the catalog.

get_function(ident)

Get a function by NameIdentifier from the catalog.

get_model(ident)

Get a model by its identifier.

get_model_version(model_ident, version)

Get a model version by its identifier and version.

get_model_version_by_alias(model_ident, alias)

Get a model version by its identifier and alias.

get_model_version_uri(model_ident, version)

Get the URI of the model artifact with a specified version number.

get_model_version_uri_by_alias(model_ident, ...)

Get the URI of the model artifact with a specified version alias.

link_model_version(model_ident, uri, ...)

Link a new model version to the registered model object.

link_model_version_with_multiple_uris(...)

Link a new model version to the registered model object.

list_function_infos(namespace)

List the functions with details in a namespace from the catalog.

list_functions(namespace)

List the functions in a namespace from the catalog.

list_model_version_infos(model_ident)

List all the versions, with their information, of the registered model identified by NameIdentifier in the catalog.

list_model_versions(model_ident)

List all the versions of the registered model identified by NameIdentifier in the catalog.

list_models(namespace)

List the models in a schema namespace from the catalog.

list_schemas()

List all the schemas under the given catalog namespace.

load_schema(schema_name)

Load the schema with specified identifier.

name()

The name of the catalog.

properties()

The properties of the catalog.

provider()

The provider of the catalog.

register_function(ident, comment, ...)

Register a function with one or more definitions (overloads).

register_model(ident, comment, properties)

Register a model in the catalog if it does not already exist; otherwise a ModelAlreadyExistsException is thrown.

register_model_version(ident, uri, aliases, ...)

Register a model in the catalog and link a new model version to it; a ModelAlreadyExistsException is thrown if the model already exists.

register_model_version_with_multiple_uris(...)

Register a model in the catalog and link a new model version with multiple URIs to it; a ModelAlreadyExistsException is thrown if the model already exists.

schema_exists(schema_name)

Check if a schema exists.

to_model_update_request(change)

to_model_version_update_request(change)

to_schema_update_request(change)

type()

The type of the catalog.

validate()

Attributes

PROPERTY_PACKAGE

A reserved property to specify the package location of the catalog.

rest_client

PROPERTY_PACKAGE = 'package'

A reserved property to specify the package location of the catalog. The “package” is a string path to the folder where all the catalog-related dependencies are located. The dependencies under the “package” will be loaded by Gravitino to create the catalog.

The “package” property is not needed if the catalog is a built-in one; Gravitino will search the proper location using “provider” to load the dependencies. The “package” property is needed only when the folder is in a different location.

class Type(value, names=<not given>, *values, module=None, qualname=None, type=None, start=1, boundary=None)

Bases: Enum

The type of the catalog.

FILESET = ('fileset', True)

Catalog Type for Fileset System (including HDFS, S3, etc.), like path/to/file

MESSAGING = ('messaging', False)

Catalog Type for Message Queue, like kafka://topic

MODEL = ('model', True)

Catalog Type for ML model

RELATIONAL = ('relational', False)

Catalog Type for Relational Data Structure, like db.table, catalog.db.table.

UNSUPPORTED = ('unsupported', False)

Catalog Type for test only.

property supports_managed_catalog

A flag to indicate if the catalog type supports managed catalog. Managed catalog is a concept in Gravitino, which means Gravitino will manage the lifecycle of the catalog and its subsidiaries. If the catalog type supports managed catalog, users can create managed catalog of this type without specifying the catalog provider, Gravitino will use the type as the provider to create the managed catalog. If the catalog type does not support managed catalog, users need to specify the provider to create the catalog.

property type_name

The name of the catalog type.
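The member values listed above pair a type name with a managed-catalog flag. The following is a minimal sketch of how such an enum can expose both fields as properties. `CatalogType` and `provider_required` are hypothetical stand-ins written for this illustration; they are not the Gravitino classes themselves.

```python
from enum import Enum


class CatalogType(Enum):
    """Hypothetical stand-in mirroring the (type_name, managed) pairs listed above."""

    RELATIONAL = ("relational", False)
    FILESET = ("fileset", True)
    MESSAGING = ("messaging", False)
    MODEL = ("model", True)
    UNSUPPORTED = ("unsupported", False)

    def __init__(self, type_name: str, supports_managed_catalog: bool):
        # Enum unpacks each member's tuple value into these two arguments.
        self._type_name = type_name
        self._supports_managed_catalog = supports_managed_catalog

    @property
    def type_name(self) -> str:
        return self._type_name

    @property
    def supports_managed_catalog(self) -> bool:
        return self._supports_managed_catalog


def provider_required(catalog_type: CatalogType) -> bool:
    """Hypothetical helper: a provider must be given only for non-managed types."""
    return not catalog_type.supports_managed_catalog


for member in CatalogType:
    print(member.type_name, "provider required:", provider_required(member))
```

Under this reading, a model or fileset catalog could be created without naming a provider, while the other types would need one.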

alter_function(ident: NameIdentifier, *changes: FunctionChange) -> Function

Applies FunctionChange changes to a function in the catalog.

Implementations may reject the changes. If any change is rejected, no changes should be applied to the function.

Args:

ident: The NameIdentifier instance of the function to alter. changes: The FunctionChange instances to apply to the function.

Returns:

The updated Function instance.

Raises:

NoSuchFunctionException: If the function does not exist. IllegalArgumentException: If the change is rejected by the implementation.
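The all-or-nothing rule described above (if any change is rejected, none is applied) can be sketched by applying changes to a copy and returning it only when every change succeeds. Everything here is a hypothetical illustration: changes are modeled as plain callables and `ValueError` stands in for IllegalArgumentException; the real FunctionChange types are not shown.

```python
from typing import Callable, Dict

# Assumption: a change is a callable that mutates a draft and may raise ValueError.
Change = Callable[[Dict[str, str]], None]


def set_comment(text: str) -> Change:
    def apply(draft: Dict[str, str]) -> None:
        draft["comment"] = text

    return apply


def rename(new_name: str) -> Change:
    def apply(draft: Dict[str, str]) -> None:
        if not new_name:
            raise ValueError("name must not be empty")
        draft["name"] = new_name

    return apply


def alter_atomically(entity: Dict[str, str], *changes: Change) -> Dict[str, str]:
    """Apply all changes to a copy; the original is never touched on failure."""
    draft = dict(entity)
    for change in changes:
        change(draft)  # a rejected change aborts before anything is committed
    return draft


original = {"name": "add", "comment": ""}
updated = alter_atomically(original, set_comment("adds two integers"))

try:
    alter_atomically(original, set_comment("ignored"), rename(""))
    rejected = False
except ValueError:
    rejected = True

print(updated, original, rejected)
```

The second call fails on its last change, and the earlier `set_comment` in the same call leaves no trace on `original`.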

alter_model(model_ident: NameIdentifier, *changes: ModelChange) -> Model

Alter the model by applying the changes.

Args:

model_ident: The identifier of the model. changes: The changes to apply to the model.

Raises:

NoSuchSchemaException: If the schema does not exist. IllegalArgumentException: If the changes are invalid.

Returns:

The updated model object.

alter_model_version(model_ident: NameIdentifier, version: int, *changes: ModelVersionChange)

Alter the model version by applying the changes.

Args:

model_ident: The identifier of the model. version: The version of the model version. changes: The changes to apply to the model version.

Raises:

NoSuchModelVersionException: If the model version does not exist. IllegalArgumentException: If the changes are invalid.

Returns:

The updated model version object.

alter_model_version_by_alias(model_ident: NameIdentifier, alias: str, *changes: ModelVersionChange)

Alter the model version identified by the given alias by applying the changes.

Args:

model_ident: The identifier of the model. alias: The alias of the model version. changes: The changes to apply to the model version.

Raises:

NoSuchModelVersionException: If the model version does not exist. IllegalArgumentException: If the changes are invalid.

Returns:

The updated model version object.

alter_schema(schema_name: str, *changes: SchemaChange) -> Schema

Alter the schema with specified identifier by applying the changes.

Args:

schema_name: The name of the schema. changes: The metadata changes to apply.

Raises:

NoSuchSchemaException if the schema with specified identifier does not exist.

Returns:

The altered Schema.

as_fileset_catalog() -> FilesetCatalog
Raises:

UnsupportedOperationException if the catalog does not support fileset operations.

Returns:

the FilesetCatalog if the catalog supports fileset operations.

as_function_catalog()
Returns:

the FunctionCatalog if the catalog supports function operations.

Raises:

UnsupportedOperationException if the catalog does not support function operations.

as_model_catalog()
Returns:

the ModelCatalog if the catalog supports model operations.

Raises:

UnsupportedOperationException if the catalog does not support model operations.

as_schemas()

Return the SupportsSchemas if the catalog supports schema operations.

Raises:

UnsupportedOperationException if the catalog does not support schema operations.

Returns:

The SupportsSchemas if the catalog supports schema operations.

as_table_catalog() -> TableCatalog
Raises:

UnsupportedOperationException if the catalog does not support table operations.

Returns:

the TableCatalog if the catalog supports table operations.

as_topic_catalog() -> TopicCatalog
Returns:

the TopicCatalog if the catalog supports topic operations.

Raises:

UnsupportedOperationException if the catalog does not support topic operations.
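The `as_*_catalog()` accessors above either return a capability view or raise. Below is a minimal sketch of that pattern and of probing for a capability. All class and function names are hypothetical stand-ins, and `UnsupportedOperationError` stands in for the UnsupportedOperationException named above.

```python
class UnsupportedOperationError(Exception):
    """Stand-in for the UnsupportedOperationException named above."""


class CatalogSketch:
    """Hypothetical base: every as_*_catalog() accessor refuses by default."""

    def as_model_catalog(self):
        raise UnsupportedOperationError("Catalog does not support model operations")

    def as_table_catalog(self):
        raise UnsupportedOperationError("Catalog does not support table operations")


class ModelCatalogSketch(CatalogSketch):
    """Hypothetical model catalog: only the model accessor is overridden."""

    def as_model_catalog(self):
        return self


def supports(catalog: CatalogSketch, accessor_name: str) -> bool:
    """Return True if calling the named accessor does not raise."""
    try:
        getattr(catalog, accessor_name)()
        return True
    except UnsupportedOperationError:
        return False


catalog = ModelCatalogSketch()
print("models:", supports(catalog, "as_model_catalog"))
print("tables:", supports(catalog, "as_table_catalog"))
```

Callers that already know the catalog type can call the accessor directly and let the exception propagate.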

comment() -> str

The comment of the catalog. Note: this method returns None if the comment is not set for this catalog.

Returns:

The comment of the catalog.

create_schema(schema_name: str = None, comment: str = None, properties: Dict[str, str] = None) -> Schema

Create a new schema with specified identifier, comment and metadata.

Args:

schema_name: The name of the schema. comment: The comment of the schema. properties: The properties of the schema.

Raises:

NoSuchCatalogException if the catalog with specified namespace does not exist. SchemaAlreadyExistsException if the schema with specified identifier already exists.

Returns:

The created Schema.

delete_model(model_ident: NameIdentifier) -> bool

Delete the model from the catalog. If the model does not exist, return false. If the model is successfully deleted, return true. The deletion of the model will also delete all the model versions linked to this model.

Args:

model_ident: The identifier of the model.

Returns:

True if the model is deleted successfully, False if the model does not exist.
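The return contract above (True on deletion, False when the model is absent, with versions removed alongside the model) can be illustrated with a small in-memory stand-in. `ModelStoreSketch` and the example URI are made up for this sketch, and version numbering by list index is an assumption.

```python
from typing import Dict, List


class ModelStoreSketch:
    """Hypothetical in-memory store; not the Gravitino client."""

    def __init__(self) -> None:
        # model name -> version URIs; the list index doubles as the version number
        self._versions: Dict[str, List[str]] = {}

    def register_model(self, name: str) -> None:
        self._versions.setdefault(name, [])

    def link_model_version(self, name: str, uri: str) -> int:
        self._versions[name].append(uri)
        return len(self._versions[name]) - 1

    def list_model_versions(self, name: str) -> List[int]:
        return list(range(len(self._versions.get(name, []))))

    def delete_model(self, name: str) -> bool:
        # Removing the model entry drops all of its versions with it.
        return self._versions.pop(name, None) is not None


store = ModelStoreSketch()
store.register_model("churn")
store.link_model_version("churn", "s3://example-bucket/churn/v0")

first_delete = store.delete_model("churn")
second_delete = store.delete_model("churn")
print(first_delete, second_delete, store.list_model_versions("churn"))
```

Because a missing model yields False rather than an exception, repeated deletes are safe to issue.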

delete_model_version(model_ident: NameIdentifier, version: int) -> bool

Delete the model version from the catalog. If the model version does not exist, return false. If the model version is successfully deleted, return true.

Args:

model_ident: The identifier of the model. version: The version of the model.

Returns:

True if the model version is deleted successfully, False if the model version does not exist.

delete_model_version_by_alias(model_ident: NameIdentifier, alias: str) -> bool

Delete the model version by alias from the catalog. If the model version does not exist, return false. If the model version is successfully deleted, return true.

Args:

model_ident: The identifier of the model. alias: The alias of the model version.

Returns:

True if the model version is deleted successfully, False if the model version does not exist.

drop_function(ident: NameIdentifier) -> bool

Drop a function by name.

Args:

ident: The name identifier of the function.

Returns:

True if the function is deleted, False if the function does not exist.

drop_schema(schema_name: str, cascade: bool) -> bool

Drop the schema with specified identifier.

Args:

schema_name: The name of the schema. cascade: Whether to drop all the tables under the schema.

Raises:

NonEmptySchemaException if the schema is not empty and cascade is false.

Returns:

true if the schema is dropped successfully, false otherwise.
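The interaction between `cascade` and a non-empty schema can be sketched as follows. This is a hypothetical in-memory model, not the client implementation: `NonEmptySchemaError` stands in for NonEmptySchemaException, and returning False for a missing schema is an assumption about what "false otherwise" covers.

```python
from typing import Dict, Set


class NonEmptySchemaError(Exception):
    """Stand-in for the NonEmptySchemaException named above."""


def drop_schema(schemas: Dict[str, Set[str]], schema_name: str, cascade: bool) -> bool:
    """Drop a schema from the mapping of schema name -> contained object names."""
    if schema_name not in schemas:
        return False  # assumption: a missing schema counts as "not dropped"
    if schemas[schema_name] and not cascade:
        raise NonEmptySchemaError(schema_name)
    del schemas[schema_name]  # contained objects go away with the schema
    return True


schemas = {"empty": set(), "busy": {"model_a"}}

dropped_empty = drop_schema(schemas, "empty", cascade=False)

try:
    drop_schema(schemas, "busy", cascade=False)
    refused = False
except NonEmptySchemaError:
    refused = True

dropped_busy = drop_schema(schemas, "busy", cascade=True)
print(dropped_empty, refused, dropped_busy, schemas)
```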

function_exists(ident: NameIdentifier) -> bool

Check if a function with the given name exists in the catalog.

Args:

ident: The function identifier.

Returns:

True if the function exists, False otherwise.

get_function(ident: NameIdentifier) -> Function

Get a function by NameIdentifier from the catalog.

The identifier only contains the schema and function name. A function may include multiple definitions (overloads) in the result.

Args:

ident: A function identifier.

Returns:

The function with the given name.

Raises:

NoSuchFunctionException: If the function does not exist.

get_model(ident: NameIdentifier) -> Model

Get a model by its identifier.

Args:

ident: The identifier of the model.

Raises:

NoSuchModelException: If the model does not exist.

Returns:

The model object.

get_model_version(model_ident: NameIdentifier, version: int) -> ModelVersion

Get a model version by its identifier and version.

Args:

model_ident: The identifier of the model. version: The version of the model.

Raises:

NoSuchModelVersionException: If the model version does not exist.

Returns:

The model version object.

get_model_version_by_alias(model_ident: NameIdentifier, alias: str) -> ModelVersion

Get a model version by its identifier and alias.

Args:

model_ident: The identifier of the model. alias: The alias of the model version.

Raises:

NoSuchModelVersionException: If the model version does not exist.

Returns:

The model version object.

get_model_version_uri(model_ident: NameIdentifier, version: int, uri_name: str = None)

Get the URI of the model artifact with a specified version number.

Args:

model_ident: The identifier of the model. version: The version of the model. uri_name: The name of the URI. If None, the default URI will be used.

Raises:

NoSuchModelVersionException: If the model version does not exist. NoSuchModelVersionURINameException: If the uri name does not exist.

Returns:

The URI of the model version.

get_model_version_uri_by_alias(model_ident: NameIdentifier, alias: str, uri_name: str = None)

Get the URI of the model artifact with a specified version alias.

Args:

model_ident: The identifier of the model. alias: The alias of the model version. uri_name: The name of the URI. If None, the default URI will be used.

Raises:

NoSuchModelVersionException: If the model version does not exist. NoSuchModelVersionURINameException: If the uri name does not exist.

Returns:

The URI of the model version.
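Both URI getters above take an optional `uri_name` and fall back to a default URI when it is None. The sketch below shows one way such a lookup could behave. How the real service chooses the default is not stated here, so the `default_name` argument is purely an assumption of this sketch, as are the example URIs; `NoSuchUriNameError` stands in for NoSuchModelVersionURINameException.

```python
from typing import Dict, Optional


class NoSuchUriNameError(LookupError):
    """Stand-in for NoSuchModelVersionURINameException."""


def resolve_uri(uris: Dict[str, str], uri_name: Optional[str], default_name: str) -> str:
    """Pick a URI by name, falling back to the assumed default name when None."""
    name = default_name if uri_name is None else uri_name
    try:
        return uris[name]
    except KeyError:
        raise NoSuchUriNameError(name) from None


# Made-up URI names and locations for illustration only.
uris = {"s3": "s3://example-bucket/model", "local": "file:///models/model"}

default_uri = resolve_uri(uris, None, default_name="s3")
local_uri = resolve_uri(uris, "local", default_name="s3")

try:
    resolve_uri(uris, "gcs", default_name="s3")
    missing = False
except NoSuchUriNameError:
    missing = True

print(default_uri, local_uri, missing)
```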

link_model_version(model_ident, uri, ...)

Link a new model version to the registered model object. The new model version is added to the model object. An exception is raised if the model object does not exist or if a version alias already exists in the model.

Args:

model_ident: The identifier of the model. uri: The URI of the model version. aliases: The aliases of the model version. The aliases must be unique within this model; otherwise a ModelVersionAliasesAlreadyExistException is thrown. The aliases are optional and can be empty. comment: The comment of the model version. properties: The properties of the model version.

Raises:

NoSuchModelException: If the model does not exist. ModelVersionAliasesAlreadyExistException: If the aliases of the model version already exist.

link_model_version_with_multiple_uris(...)

Link a new model version to the registered model object. The new model version is added to the model object. An exception is raised if the model object does not exist or if a version alias already exists in the model.

Args:

model_ident: The identifier of the model. uris: The URIs and their names of the model version. aliases: The aliases of the model version. The aliases must be unique within this model; otherwise a ModelVersionAliasesAlreadyExistException is thrown. The aliases are optional and can be empty. comment: The comment of the model version. properties: The properties of the model version.

Raises:

NoSuchModelException: If the model does not exist. ModelVersionAliasesAlreadyExistException: If the aliases of the model version already exist.
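The two checks described for linking a version (the model must exist, and aliases must be unique within the model) can be sketched with an in-memory stand-in. The class, the exception names, and the example URIs are hypothetical. Starting at version 0 follows the register_model_version description on this page, while incrementing by one for later versions is an assumption.

```python
from typing import Dict, List


class NoSuchModelError(Exception):
    """Stand-in for NoSuchModelException."""


class AliasesAlreadyExistError(Exception):
    """Stand-in for ModelVersionAliasesAlreadyExistException."""


class VersionLinkerSketch:
    """Hypothetical in-memory model registry; not the Gravitino client."""

    def __init__(self) -> None:
        self._models: Dict[str, List[dict]] = {}

    def register_model(self, name: str) -> None:
        self._models.setdefault(name, [])

    def link_model_version(self, name: str, uris: Dict[str, str], aliases: List[str]) -> int:
        if name not in self._models:
            raise NoSuchModelError(name)
        versions = self._models[name]
        used = {alias for version in versions for alias in version["aliases"]}
        clash = used.intersection(aliases)
        if clash:
            raise AliasesAlreadyExistError(sorted(clash))
        versions.append({"uris": dict(uris), "aliases": list(aliases)})
        return len(versions) - 1  # assumed numbering: 0, 1, 2, ...


linker = VersionLinkerSketch()
linker.register_model("churn")
v0 = linker.link_model_version("churn", {"s3": "s3://example-bucket/churn/0"}, ["prod"])
v1 = linker.link_model_version("churn", {"s3": "s3://example-bucket/churn/1"}, [])

try:
    linker.link_model_version("churn", {"s3": "s3://example-bucket/churn/2"}, ["prod"])
    alias_clash = False
except AliasesAlreadyExistError:
    alias_clash = True

print(v0, v1, alias_clash)
```

An empty alias list never clashes, which matches the note that aliases are optional.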

list_function_infos(namespace: Namespace) -> List[Function]

List the functions with details in a namespace from the catalog.

Args:

namespace: A namespace.

Returns:

A list of functions in the namespace.

Raises:

NoSuchSchemaException: If the schema does not exist.

list_functions(namespace: Namespace) -> List[NameIdentifier]

List the functions in a namespace from the catalog.

Args:

namespace: A namespace.

Returns:

A list of function identifiers in the namespace.

Raises:

NoSuchSchemaException: If the schema does not exist.

list_model_version_infos(model_ident: NameIdentifier) -> List[ModelVersion]

List all the versions, with their information, of the registered model identified by NameIdentifier in the catalog.

Args:

model_ident: The identifier of the model.

Raises:

NoSuchModelException: If the model does not exist.

Returns:

A list of model versions with their information.

list_model_versions(model_ident: NameIdentifier) -> List[int]

List all the versions of the registered model identified by NameIdentifier in the catalog.

Args:

model_ident: The identifier of the model.

Raises:

NoSuchModelException: If the model does not exist.

Returns:

A list of model versions.

list_models(namespace: Namespace) -> List[NameIdentifier]

List the models in a schema namespace from the catalog.

Args:

namespace: The namespace of the schema.

Raises:

NoSuchSchemaException: If the schema does not exist.

Returns:

A list of NameIdentifier of models under the given namespace.

list_schemas() -> List[str]

List all the schemas under the given catalog namespace.

Raises:

NoSuchCatalogException if the catalog with specified namespace does not exist.

Returns:

A list of schema names under the given catalog namespace.

load_schema(schema_name: str) -> Schema

Load the schema with specified identifier.

Args:

schema_name: The name of the schema.

Raises:

NoSuchSchemaException if the schema with specified identifier does not exist.

Returns:

The Schema with specified identifier.

name() -> str
Returns:

The name of the catalog.

properties() -> Dict[str, str]

The properties of the catalog. Note: this method returns None if the properties are not set.

Returns:

The properties of the catalog.

provider() -> str
Returns:

The provider of the catalog.

register_function(ident: NameIdentifier, comment: str | None, function_type: FunctionType, deterministic: bool, definitions: List[FunctionDefinition]) -> Function

Register a function with one or more definitions (overloads).

Each definition contains its own return type (for scalar/aggregate functions) or return columns (for table-valued functions).

Args:

ident: The function identifier. comment: The optional function comment. function_type: The function type (SCALAR, AGGREGATE, or TABLE). deterministic: Whether the function is deterministic. definitions: The function definitions, each containing parameters, return type/columns, and implementations.

Returns:

The registered function.

Raises:

NoSuchSchemaException: If the schema does not exist. FunctionAlreadyExistsException: If the function already exists.

register_model(ident: NameIdentifier, comment: str, properties: Dict[str, str]) -> Model

Register a model in the catalog if it does not already exist; otherwise a ModelAlreadyExistsException is thrown. The Model object is created when the model is registered. Users can then call ModelCatalog.link_model_version to link a model version to the registered Model.

Args:

ident: The identifier of the model. comment: The comment of the model. properties: The properties of the model.

Raises:

ModelAlreadyExistsException: If the model already exists. NoSuchSchemaException: If the schema does not exist.

Returns:

The registered model object.

register_model_version(ident: NameIdentifier, uri: str, aliases: List[str], comment: str, properties: Dict[str, str]) -> Model

Register a model in the catalog and link a new model version to it. If the model already exists, a ModelAlreadyExistsException is thrown. The Model object is created when the model is registered; at the same time, the model version (version 0) is created and linked to the registered model.

Args:

ident: The identifier of the model. uri: The URI of the model version. aliases: The aliases of the model version. comment: The comment of the model. properties: The properties of the model.

Raises:

ModelAlreadyExistsException: If the model already exists. ModelVersionAliasesAlreadyExistException: If the aliases of the model version already exist.

Returns:

The registered model object.

register_model_version_with_multiple_uris(ident: NameIdentifier, uris: Dict[str, str], aliases: List[str], comment: str, properties: Dict[str, str]) -> Model

Register a model in the catalog and link a new model version with multiple URIs to it. If the model already exists, a ModelAlreadyExistsException is thrown. The Model object is created when the model is registered; at the same time, the model version (version 0) is created and linked to the registered model.

Args:

ident: The identifier of the model. uris: The URIs and their names of the model version. aliases: The aliases of the model version. comment: The comment of the model. properties: The properties of the model.

Raises:

ModelAlreadyExistsException: If the model already exists. ModelVersionAliasesAlreadyExistException: If the aliases of the model version already exist.

Returns:

The registered model object.

schema_exists(schema_name: str) -> bool

Check if a schema exists.

If an entity such as a table, view exists, its parent namespaces must also exist. For example, if table a.b.t exists, this method invoked as schema_exists(a.b) must return true.

Args:

schema_name: The name of the schema.

Returns:

True if the schema exists, false otherwise.
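The invariant stated above (if a.b.t exists, then schema_exists(a.b) must be true) can be checked on dotted names with a short sketch. Both functions are hypothetical helpers that work on plain strings; they do not call the catalog.

```python
from typing import Iterable, List


def parent_namespaces(entity_name: str) -> List[str]:
    """Return every proper dotted prefix of an entity name, shortest first."""
    parts = entity_name.split(".")
    return [".".join(parts[:i]) for i in range(1, len(parts))]


def schema_exists(entities: Iterable[str], schema_name: str) -> bool:
    """A namespace exists if any known entity lives somewhere beneath it."""
    return any(schema_name in parent_namespaces(entity) for entity in entities)


tables = ["a.b.t"]
print(parent_namespaces("a.b.t"))
print(schema_exists(tables, "a.b"))
print(schema_exists(tables, "a.c"))
```

This captures only the implication from entity to parent namespace; an empty schema can still exist without any entity under it.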

type() -> Type
Returns:

The type of the catalog.