gravitino.client.gravitino_metalake.GravitinoMetalake

class gravitino.client.gravitino_metalake.GravitinoMetalake(metalake: MetalakeDTO | None = None, client: HTTPClient | None = None)

Bases: MetalakeDTO, SupportsJobs

A Gravitino metalake is the top-level metadata repository for users. It contains a list of catalogs as sub-level metadata collections. With GravitinoMetalake, users can list, create, load, alter and drop catalogs by name and, through SupportsJobs, manage job templates and jobs.

__init__(metalake: MetalakeDTO | None = None, client: HTTPClient | None = None)

Methods

__init__([metalake, client])

alter_catalog(name, *changes)

Alter the catalog with specified name by applying the changes.

audit_info()

cancel_job(job_id)

Cancels a job by its ID.

comment()

The comment of the metalake.

create_catalog(name, catalog_type, provider, ...)

Create a new catalog with the specified name, catalog type, provider, comment and properties.

delete_job_template(job_template_name)

Deletes a job template by its name.

disable_catalog(name)

Disable the catalog with specified name.

drop_catalog(name[, force])

Drop the catalog with specified name.

enable_catalog(name)

Enable the catalog with specified name.

equals(other)

from_dict(kvs, *[, infer_missing])

from_json(s, *[, parse_float, parse_int, ...])

get_job(job_id)

Retrieves a job by its ID.

get_job_template(job_template_name)

Retrieves a job template by its name.

list_catalogs()

List all the catalogs under this metalake.

list_catalogs_info()

List all the catalogs with their information under this metalake.

list_job_templates()

List all the registered job templates in Gravitino.

list_jobs([job_template_name])

List all the jobs under this metalake.

load_catalog(name)

Load the catalog with specified name.

name()

The name of the metalake.

properties()

The properties of the metalake.

property_equal(p1, p2)

register_job_template(job_template)

Registers the specified job template with Gravitino.

run_job(job_template_name, job_conf)

Runs a job based on the specified job template and configuration.

schema(*[, infer_missing, only, exclude, ...])

to_dict([encode_json])

to_json(*[, skipkeys, ensure_ascii, ...])

Attributes

API_METALAKES_CATALOGS_PATH

API_METALAKES_JOB_RUNS_PATH

API_METALAKES_JOB_TEMPLATES_PATH

dataclass_json_config

rest_client

alter_catalog(name: str, *changes: CatalogChange) Catalog

Alter the catalog with specified name by applying the changes.

Args:

name: the name of the catalog.

changes: the changes to apply to the catalog.

Raises:

NoSuchCatalogException if the catalog with specified name does not exist.

IllegalArgumentException if the changes are invalid.

Returns:

the altered Catalog.

cancel_job(job_id: str) JobHandle

Cancels a job by its ID.

Args:

job_id: The ID of the job to cancel.

Returns:

A JobHandle representing the cancelled job.

Raises:

NoSuchJobException: If no job with the specified ID exists.

comment() str

The comment of the metalake. Note: this method returns None if no comment is set for this metalake.

Returns:

Optional[str]: The comment of the metalake.

create_catalog(name: str, catalog_type: Type, provider: str, comment: str, properties: Dict[str, str]) Catalog

Create a new catalog with the specified name, catalog type, provider, comment and properties.

Args:

name: The name of the catalog.

catalog_type: The type of the catalog.

provider: The provider of the catalog. This parameter can be None if the catalog provides a managed implementation. Currently, the model and fileset catalogs support a None provider. For details, refer to Catalog.Type.

comment: The comment of the catalog.

properties: The properties of the catalog.

Raises:

NoSuchMetalakeException if the metalake does not exist.

CatalogAlreadyExistsException if the catalog with specified name already exists.

Returns:

The created Catalog.
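Because create_catalog raises CatalogAlreadyExistsException when the name is taken, callers sometimes want a get-or-create step. The following is a minimal sketch that uses only methods documented on this page. The helper name ensure_catalog is hypothetical and not part of the Gravitino client. The check-then-create sequence is also not atomic, so a concurrent creator could still cause CatalogAlreadyExistsException.

```python
def ensure_catalog(metalake, name, catalog_type, provider, comment, properties):
    """Return the catalog called `name`, creating it first if it is missing.

    Hypothetical helper, not part of the Gravitino client. `metalake` is
    expected to be a GravitinoMetalake, or any object that exposes
    list_catalogs, load_catalog and create_catalog as documented here.
    """
    if name in metalake.list_catalogs():
        # Already present: load it rather than trigger CatalogAlreadyExistsException.
        return metalake.load_catalog(name)
    # Not present: create it with the arguments in the documented order.
    return metalake.create_catalog(name, catalog_type, provider, comment, properties)
```

For a fileset or model catalog, provider may be passed as None, as described in the Args above.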

delete_job_template(job_template_name: str) bool

Deletes a job template by its name. This removes the job template from Gravitino, so it is no longer available for execution. A job template can be deleted only when all the jobs associated with it are completed, failed or cancelled; otherwise InUseException is raised. Returns False if the job template to be deleted does not exist.

Deleting a job template also deletes all the jobs associated with it.

Args:

job_template_name: The name of the job template to delete.

Returns:

bool: True if the job template was deleted successfully, False if the job template does not exist.

Raises:

InUseException: If the job template is currently in use by any jobs, it cannot be deleted.

disable_catalog(name: str)

Disable the catalog with specified name. If the catalog is already disabled, this method does nothing.

Args:

name: the name of the catalog.

Raises:

NoSuchCatalogException if the catalog with specified name does not exist.

drop_catalog(name: str, force: bool = False) bool

Drop the catalog with specified name.

Args:

name: the name of the catalog.

force: whether to force drop the catalog.

Returns:

True if the catalog is dropped successfully, False if the catalog does not exist.
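A common pattern is to disable a catalog before dropping it. The sketch below assumes the server may refuse to drop a catalog that is still enabled unless force is set. That behavior is not stated on this page, so treat it as an assumption. The helper name disable_and_drop is hypothetical.

```python
def disable_and_drop(metalake, name, force=False):
    """Disable the catalog `name`, then drop it.

    Hypothetical helper. Returns False without calling the server-side
    disable/drop operations when the catalog is not listed, mirroring the
    documented drop_catalog return value for a missing catalog.
    """
    if name not in metalake.list_catalogs():
        # disable_catalog would raise NoSuchCatalogException here, so stop early.
        return False
    metalake.disable_catalog(name)
    return metalake.drop_catalog(name, force)
```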

enable_catalog(name: str)

Enable the catalog with specified name. If the catalog is already enabled, this method does nothing.

Args:

name: the name of the catalog.

Raises:

NoSuchCatalogException if the catalog with specified name does not exist.
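disable_catalog and enable_catalog are naturally paired around a maintenance step. The sketch below wraps them in a context manager so that the catalog is re-enabled even if the step fails. The name catalog_disabled is hypothetical. Since this page documents no way to query the current state, the sketch always re-enables on exit, including when the catalog was already disabled beforehand.

```python
from contextlib import contextmanager


@contextmanager
def catalog_disabled(metalake, name):
    """Keep the catalog `name` disabled for the duration of a `with` block.

    Hypothetical helper. It always calls enable_catalog on exit, so a catalog
    that was disabled before entering the block ends up enabled afterwards.
    """
    metalake.disable_catalog(name)
    try:
        yield
    finally:
        # Runs on normal exit and when the body raises.
        metalake.enable_catalog(name)
```

Usage would look like `with catalog_disabled(metalake, "my_catalog"): ...`, where "my_catalog" is a placeholder name.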

get_job(job_id: str) JobHandle

Retrieves a job by its ID.

Args:

job_id: The ID of the job to retrieve.

Returns:

The JobHandle representing the job if found, otherwise raises an exception.

Raises:

NoSuchJobException: If no job with the specified ID exists.

get_job_template(job_template_name: str) JobTemplate

Retrieves a job template by its name.

Args:

job_template_name: The name of the job template to retrieve.

Returns:

The job template if found, otherwise raises an exception.

Raises:

NoSuchJobTemplateException: If no job template with the specified name exists.

list_catalogs() List[str]

List all the catalogs under this metalake.

Raises:

NoSuchMetalakeException if the metalake with specified namespace does not exist.

Returns:

A list of the catalog names under this metalake.

list_catalogs_info() List[Catalog]

List all the catalogs with their information under this metalake.

Raises:

NoSuchMetalakeException if the metalake with specified namespace does not exist.

Returns:

A list of Catalog under the specified namespace.

list_job_templates() List[JobTemplate]

List all the registered job templates in Gravitino.

Returns:

List of job templates.

list_jobs(job_template_name: str | None = None) List[JobHandle]

List all the jobs under this metalake.

Args:

job_template_name: The name of the job template to filter jobs by. If None, all jobs are listed.

Returns:

A list of JobHandle objects representing the jobs.
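To illustrate the optional filter, the sketch below counts jobs per template by calling list_jobs once for each name. The helper name count_jobs_by_template is hypothetical. How list_jobs behaves for a template name that is not registered is not documented here, so the sketch assumes the caller passes known names.

```python
def count_jobs_by_template(metalake, template_names):
    """Map each job template name to the number of jobs listed for it.

    Hypothetical helper built only on list_jobs(job_template_name).
    Assumes every name in `template_names` refers to a registered template.
    """
    return {name: len(metalake.list_jobs(name)) for name in template_names}
```

Calling metalake.list_jobs() with no argument returns the jobs for all templates, as described in the Args above.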

load_catalog(name: str) Catalog

Load the catalog with specified name.

Args:

name: The name of the catalog to load.

Raises:

NoSuchCatalogException if the catalog with specified name does not exist.

Returns:

The Catalog with specified name.

name() str

The name of the metalake.

Returns:

str: The name of the metalake.

properties() Dict[str, str]

The properties of the metalake. Note: this method returns None if the properties are not set.

Returns:

Optional[Dict[str, str]]: The properties of the metalake.
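Since both comment() and properties() can return None, code that formats or iterates over them needs a fallback. A minimal sketch follows. The helper name metalake_summary and the chosen defaults (empty string, empty dict) are illustrative assumptions.

```python
def metalake_summary(metalake):
    """Collect name, comment and properties with None replaced by empty values.

    Hypothetical helper. Works with any object exposing name(), comment()
    and properties() as documented on this page.
    """
    return {
        "name": metalake.name(),
        # comment() returns None when no comment is set.
        "comment": metalake.comment() or "",
        # properties() returns None when no properties are set; copy otherwise.
        "properties": dict(metalake.properties() or {}),
    }
```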

register_job_template(job_template: JobTemplate) None

Registers the specified job template with Gravitino. The registered job template is maintained in Gravitino so that it can be executed later.

Args:

job_template: The job template to register.

Raises:

JobTemplateAlreadyExists: If a job template with the same name already exists.

run_job(job_template_name: str, job_conf: Dict[str, str]) JobHandle

Runs a job based on the specified job template and configuration.

Args:

job_template_name: The name of the job template to use for running the job.

job_conf: A dictionary containing the configuration for the job.

Returns:

A JobHandle representing the started job.

Raises:

NoSuchJobTemplateException: If no job template with the specified name exists.
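job_conf is typed Dict[str, str], so configuration values held as numbers or booleans need converting before the call. The sketch below does that conversion and then delegates to run_job. The helper name run_job_with_str_conf is hypothetical, and whether the server rejects non-string values is not stated on this page.

```python
def run_job_with_str_conf(metalake, job_template_name, conf):
    """Stringify every key and value in `conf`, then call run_job.

    Hypothetical helper. Uses str() for conversion, so True becomes "True"
    and 3 becomes "3"; adjust if the job template expects another format.
    """
    job_conf = {str(key): str(value) for key, value in conf.items()}
    return metalake.run_job(job_template_name, job_conf)
```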