gravitino.api.expressions.transforms.transforms.Transforms

class gravitino.api.expressions.transforms.transforms.Transforms

Bases: Transform

Helper methods to create logical transforms to pass into Apache Gravitino.

__init__()

Methods

__init__()

apply(name, arguments)

Create a transform that applies a function to the input value.

arguments()

Gets the arguments passed to the transform function.

assignments()

Gets the preassigned partitions in the partitioning.

bucket(num_buckets, *field_names)

Create a transform that returns the bucket of the input value.

children()

Returns a list of the children of this node.

day(field_name)

Create a transform that returns the day of the input value.

hour(field_name)

Create a transform that returns the hour of the input value.

identity(field_name)

Create a transform that returns the input value.

list(*field_names[, assignments])

Create a transform that includes multiple fields in a list.

month(field_name)

Create a transform that returns the month of the input value.

name()

Gets the transform function name.

range(field_name[, assignments])

Create a transform that returns the range of the input value with preassigned range partitions.

references()

Returns a list of fields or columns that are referenced by this expression.

truncate(width, field_name)

Create a transform that returns the truncated value of the input value with the given width.

year(field_name)

Create a transform that returns the year of the input value.

Attributes

EMPTY_EXPRESSION

EMPTY_EXPRESSION is only used as an input when the default children method builds the result.

EMPTY_NAMED_REFERENCE

EMPTY_NAMED_REFERENCE is only used as an input when the default references method builds the result array to avoid repeatedly allocating an empty array.

EMPTY_TRANSFORM

An empty array of transforms.

NAME_OF_BUCKET

The name of the bucket transform.

NAME_OF_DAY

The name of the day transform.

NAME_OF_HOUR

The name of the hour transform.

NAME_OF_IDENTITY

The name of the identity transform.

NAME_OF_LIST

The name of the list transform.

NAME_OF_MONTH

The name of the month transform.

NAME_OF_RANGE

The name of the range transform.

NAME_OF_TRUNCATE

The name of the truncate transform.

NAME_OF_YEAR

The name of the year transform.

class ApplyTransform(name: str, arguments: List[Expression])

Bases: Transform

A transform that applies a function to the input value.

EMPTY_EXPRESSION: List[Expression] = []

EMPTY_EXPRESSION is only used as an input when the default children method builds the result.

EMPTY_NAMED_REFERENCE: List[NamedReference] = []

EMPTY_NAMED_REFERENCE is only used as an input when the default references method builds the result array to avoid repeatedly allocating an empty array.

arguments() List[Expression]

Gets the arguments passed to the transform function.

Returns:

List[Expression]: The arguments passed to the transform function.

assignments() List[Partition]

Gets the preassigned partitions in the partitioning.

Currently, only Transforms.ListTransform and Transforms.RangeTransform need to deal with assignments.

Returns:

List[Partition]: The preassigned partitions in the partitioning.

children() List[Expression]

Returns a list of the children of this node. Children should not change.

name() str

Gets the transform function name.

Returns:

str: The transform function name.

references() List[NamedReference]

Returns a list of fields or columns that are referenced by this expression.

class BucketTransform(num_buckets: Literal[int], fields: List[NamedReference])

Bases: Transform

A transform that returns the bucket of the input value.

EMPTY_EXPRESSION: List[Expression] = []

EMPTY_EXPRESSION is only used as an input when the default children method builds the result.

EMPTY_NAMED_REFERENCE: List[NamedReference] = []

EMPTY_NAMED_REFERENCE is only used as an input when the default references method builds the result array to avoid repeatedly allocating an empty array.

arguments() List[Expression]

Gets the arguments passed to the transform function.

Returns:

List[Expression]: The arguments passed to the transform function.

assignments() List[Partition]

Gets the preassigned partitions in the partitioning.

Currently, only Transforms.ListTransform and Transforms.RangeTransform need to deal with assignments.

Returns:

List[Partition]: The preassigned partitions in the partitioning.

children() List[Expression]

Returns a list of the children of this node. Children should not change.

name() str

Gets the transform function name.

Returns:

str: The transform function name.

references() List[NamedReference]

Returns a list of fields or columns that are referenced by this expression.

class DayTransform(ref: NamedReference)

Bases: SingleFieldTransform

A transform that returns the day of the input value.

EMPTY_EXPRESSION: List[Expression] = []

EMPTY_EXPRESSION is only used as an input when the default children method builds the result.

EMPTY_NAMED_REFERENCE: List[NamedReference] = []

EMPTY_NAMED_REFERENCE is only used as an input when the default references method builds the result array to avoid repeatedly allocating an empty array.

arguments() List[Expression]

Gets the arguments passed to the transform function.

Returns:

List[Expression]: The arguments passed to the transform function.

assignments() List[Partition]

Gets the preassigned partitions in the partitioning.

Currently, only Transforms.ListTransform and Transforms.RangeTransform need to deal with assignments.

Returns:

List[Partition]: The preassigned partitions in the partitioning.

children() List[Expression]

Returns a list of the children of this node. Children should not change.

field_name() List[str]

Gets the referenced field name as a list of string parts.

Returns:

List[str]: The referenced field name as a list of string parts.

name() str

Gets the transform function name.

Returns:

str: The transform function name.

references() List[NamedReference]

Returns a list of fields or columns that are referenced by this expression.

EMPTY_EXPRESSION: List[Expression] = []

EMPTY_EXPRESSION is only used as an input when the default children method builds the result.

EMPTY_NAMED_REFERENCE: List[NamedReference] = []

EMPTY_NAMED_REFERENCE is only used as an input when the default references method builds the result array to avoid repeatedly allocating an empty array.

EMPTY_TRANSFORM: ClassVar[List[Transform]] = []

An empty array of transforms.

class HourTransform(ref: NamedReference)

Bases: SingleFieldTransform

A transform that returns the hour of the input value.

EMPTY_EXPRESSION: List[Expression] = []

EMPTY_EXPRESSION is only used as an input when the default children method builds the result.

EMPTY_NAMED_REFERENCE: List[NamedReference] = []

EMPTY_NAMED_REFERENCE is only used as an input when the default references method builds the result array to avoid repeatedly allocating an empty array.

arguments() List[Expression]

Gets the arguments passed to the transform function.

Returns:

List[Expression]: The arguments passed to the transform function.

assignments() List[Partition]

Gets the preassigned partitions in the partitioning.

Currently, only Transforms.ListTransform and Transforms.RangeTransform need to deal with assignments.

Returns:

List[Partition]: The preassigned partitions in the partitioning.

children() List[Expression]

Returns a list of the children of this node. Children should not change.

field_name() List[str]

Gets the referenced field name as a list of string parts.

Returns:

List[str]: The referenced field name as a list of string parts.

name() str

Gets the transform function name.

Returns:

str: The transform function name.

references() List[NamedReference]

Returns a list of fields or columns that are referenced by this expression.

class IdentityTransform(ref: NamedReference)

Bases: SingleFieldTransform

A transform that returns the input value.

EMPTY_EXPRESSION: List[Expression] = []

EMPTY_EXPRESSION is only used as an input when the default children method builds the result.

EMPTY_NAMED_REFERENCE: List[NamedReference] = []

EMPTY_NAMED_REFERENCE is only used as an input when the default references method builds the result array to avoid repeatedly allocating an empty array.

arguments() List[Expression]

Gets the arguments passed to the transform function.

Returns:

List[Expression]: The arguments passed to the transform function.

assignments() List[Partition]

Gets the preassigned partitions in the partitioning.

Currently, only Transforms.ListTransform and Transforms.RangeTransform need to deal with assignments.

Returns:

List[Partition]: The preassigned partitions in the partitioning.

children() List[Expression]

Returns a list of the children of this node. Children should not change.

field_name() List[str]

Gets the referenced field name as a list of string parts.

Returns:

List[str]: The referenced field name as a list of string parts.

name() str

Gets the transform function name.

Returns:

str: The transform function name.

references() List[NamedReference]

Returns a list of fields or columns that are referenced by this expression.

class ListTransform(fields: List[NamedReference], assignments: List[ListPartition] | None = None)

Bases: Transform

A transform that includes multiple fields in a list.

EMPTY_EXPRESSION: List[Expression] = []

EMPTY_EXPRESSION is only used as an input when the default children method builds the result.

EMPTY_NAMED_REFERENCE: List[NamedReference] = []

EMPTY_NAMED_REFERENCE is only used as an input when the default references method builds the result array to avoid repeatedly allocating an empty array.

arguments() List[Expression]

Gets the arguments passed to the transform function.

Returns:

List[Expression]: The arguments passed to the transform function.

assignments() List[Partition]

Gets the preassigned partitions in the partitioning.

Currently, only Transforms.ListTransform and Transforms.RangeTransform need to deal with assignments.

Returns:

List[Partition]: The preassigned partitions in the partitioning.

children() List[Expression]

Returns a list of the children of this node. Children should not change.

field_names() List[List[str]]

Gets the field names to include in the list.

Returns:

List[List[str]]: The field names to include in the list.

name() str

Gets the transform function name.

Returns:

str: The transform function name.

references() List[NamedReference]

Returns a list of fields or columns that are referenced by this expression.

class MonthTransform(ref: NamedReference)

Bases: SingleFieldTransform

A transform that returns the month of the input value.

EMPTY_EXPRESSION: List[Expression] = []

EMPTY_EXPRESSION is only used as an input when the default children method builds the result.

EMPTY_NAMED_REFERENCE: List[NamedReference] = []

EMPTY_NAMED_REFERENCE is only used as an input when the default references method builds the result array to avoid repeatedly allocating an empty array.

arguments() List[Expression]

Gets the arguments passed to the transform function.

Returns:

List[Expression]: The arguments passed to the transform function.

assignments() List[Partition]

Gets the preassigned partitions in the partitioning.

Currently, only Transforms.ListTransform and Transforms.RangeTransform need to deal with assignments.

Returns:

List[Partition]: The preassigned partitions in the partitioning.

children() List[Expression]

Returns a list of the children of this node. Children should not change.

field_name() List[str]

Gets the referenced field name as a list of string parts.

Returns:

List[str]: The referenced field name as a list of string parts.

name() str

Gets the transform function name.

Returns:

str: The transform function name.

references() List[NamedReference]

Returns a list of fields or columns that are referenced by this expression.

NAME_OF_BUCKET: ClassVar[str] = 'bucket'

The name of the bucket transform. The bucket transform returns the bucket of the input value.

NAME_OF_DAY: ClassVar[str] = 'day'

The name of the day transform. The day transform returns the day of the input value.

NAME_OF_HOUR: ClassVar[str] = 'hour'

The name of the hour transform. The hour transform returns the hour of the input value.

NAME_OF_IDENTITY: ClassVar[str] = 'identity'

The name of the identity transform.

NAME_OF_LIST: ClassVar[str] = 'list'

The name of the list transform. The list transform includes multiple fields in a list.

NAME_OF_MONTH: ClassVar[str] = 'month'

The name of the month transform. The month transform returns the month of the input value.

NAME_OF_RANGE: ClassVar[str] = 'range'

The name of the range transform. The range transform returns the range of the input value.

NAME_OF_TRUNCATE: ClassVar[str] = 'truncate'

The name of the truncate transform. The truncate transform returns the truncated value of the input value with the given width.

NAME_OF_YEAR: ClassVar[str] = 'year'

The name of the year transform. The year transform returns the year of the input value.

class RangeTransform(field: NamedReference, assignments: List[RangePartition] | None = None)

Bases: Transform

A transform that returns the range of the input value.

EMPTY_EXPRESSION: List[Expression] = []

EMPTY_EXPRESSION is only used as an input when the default children method builds the result.

EMPTY_NAMED_REFERENCE: List[NamedReference] = []

EMPTY_NAMED_REFERENCE is only used as an input when the default references method builds the result array to avoid repeatedly allocating an empty array.

arguments() List[Expression]

Gets the arguments passed to the transform function.

Returns:

List[Expression]: The arguments passed to the transform function.

assignments() List[Partition]

Gets the preassigned partitions in the partitioning.

Currently, only Transforms.ListTransform and Transforms.RangeTransform need to deal with assignments.

Returns:

List[Partition]: The preassigned partitions in the partitioning.

children() List[Expression]

Returns a list of the children of this node. Children should not change.

field_name() List[str]

Gets the field name to transform.

Returns:

List[str]: The field name to transform.

name() str

Gets the transform function name.

Returns:

str: The transform function name.

references() List[NamedReference]

Returns a list of fields or columns that are referenced by this expression.

class TruncateTransform(width: Literal[int], field: NamedReference)

Bases: Transform

A transform that returns the truncated value of the input value with the given width.

EMPTY_EXPRESSION: List[Expression] = []

EMPTY_EXPRESSION is only used as an input when the default children method builds the result.

EMPTY_NAMED_REFERENCE: List[NamedReference] = []

EMPTY_NAMED_REFERENCE is only used as an input when the default references method builds the result array to avoid repeatedly allocating an empty array.

arguments() List[Expression]

Gets the arguments passed to the transform function.

Returns:

List[Expression]: The arguments passed to the transform function.

assignments() List[Partition]

Gets the preassigned partitions in the partitioning.

Currently, only Transforms.ListTransform and Transforms.RangeTransform need to deal with assignments.

Returns:

List[Partition]: The preassigned partitions in the partitioning.

children() List[Expression]

Returns a list of the children of this node. Children should not change.

field_name() List[str]

Gets the field name to transform.

Returns:

List[str]: The field name to transform.

name() str

Gets the transform function name.

Returns:

str: The transform function name.

references() List[NamedReference]

Returns a list of fields or columns that are referenced by this expression.

width() int

Gets the width to truncate to.

Returns:

int: The width to truncate to.

class YearTransform(ref: NamedReference)

Bases: SingleFieldTransform

A transform that returns the year of the input value.

EMPTY_EXPRESSION: List[Expression] = []

EMPTY_EXPRESSION is only used as an input when the default children method builds the result.

EMPTY_NAMED_REFERENCE: List[NamedReference] = []

EMPTY_NAMED_REFERENCE is only used as an input when the default references method builds the result array to avoid repeatedly allocating an empty array.

arguments() List[Expression]

Gets the arguments passed to the transform function.

Returns:

List[Expression]: The arguments passed to the transform function.

assignments() List[Partition]

Gets the preassigned partitions in the partitioning.

Currently, only Transforms.ListTransform and Transforms.RangeTransform need to deal with assignments.

Returns:

List[Partition]: The preassigned partitions in the partitioning.

children() List[Expression]

Returns a list of the children of this node. Children should not change.

field_name() List[str]

Gets the referenced field name as a list of string parts.

Returns:

List[str]: The referenced field name as a list of string parts.

name() str

Gets the transform function name.

Returns:

str: The transform function name.

references() List[NamedReference]

Returns a list of fields or columns that are referenced by this expression.

static apply(name: str, arguments: List[Expression]) ApplyTransform

Create a transform that applies a function to the input value.

Args:

name (str): The name of the function to apply

arguments (List[Expression]): The arguments to the function

Returns:

Transforms.ApplyTransform: The created transform

abstract arguments() List[Expression]

Gets the arguments passed to the transform function.

Returns:

List[Expression]: The arguments passed to the transform function.

assignments() List[Partition]

Gets the preassigned partitions in the partitioning.

Currently, only Transforms.ListTransform and Transforms.RangeTransform need to deal with assignments.

Returns:

List[Partition]: The preassigned partitions in the partitioning.

static bucket(num_buckets: int, *field_names: List[str]) BucketTransform

Create a transform that returns the bucket of the input value.

Args:

num_buckets (int): The number of buckets to use

*field_names (List[str]): The field names to transform

Returns:

Transforms.BucketTransform: The created transform
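As a rough illustration of what a bucket transform means for partitioning, the sketch below maps a value to one of `num_buckets` buckets by hashing it. This is not Gravitino's implementation: the `bucket_of` helper and the choice of CRC32 are assumptions made for the example, and the hash function actually used is expected to be defined by the underlying catalog.

```python
import zlib


def bucket_of(value: str, num_buckets: int) -> int:
    """Map a value to a bucket index in [0, num_buckets) (hypothetical helper)."""
    if num_buckets <= 0:
        raise ValueError("num_buckets must be positive")
    # CRC32 is used only because it is deterministic across runs;
    # it is an assumption, not the hash any particular catalog uses.
    return zlib.crc32(value.encode("utf-8")) % num_buckets


# Equal inputs always land in the same bucket.
for user_id in ["alice", "bob", "carol"]:
    print(user_id, bucket_of(user_id, 4))
```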

children() List[Expression]

Returns a list of the children of this node. Children should not change.

static day(field_name: List[str]) DayTransform
static day(field_name: str) DayTransform

Create a transform that returns the day of the input value.

Args:

field_name (Union[str, List[str]]): The field name(s) to transform. Can be a list of field names or a single field name.

Returns:

Transforms.DayTransform: The created transform

static hour(field_name: List[str]) HourTransform
static hour(field_name: str) HourTransform

Create a transform that returns the hour of the input value.

Args:

field_name (Union[str, List[str]]): The field name(s) to transform. Can be a list of field names or a single field name.

Returns:

Transforms.HourTransform: The created transform

static identity(field_name: List[str]) IdentityTransform
static identity(field_name: str) IdentityTransform

Create a transform that returns the input value.

Args:

field_name (Union[str, List[str]]): The field name(s) to transform. Can be a list of field names or a single field name.

Returns:

Transforms.IdentityTransform: The created transform
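The single-field factory methods are overloaded on `field_name`, accepting either a `str` or a `List[str]`, while `field_name()` on the resulting transform returns a `List[str]`. One way to reconcile the two is to wrap a bare string in a one-element list. The sketch below uses a hypothetical `to_name_parts` helper that is not part of the Gravitino API, and it assumes a list is interpreted as the parts of one (possibly nested) field name.

```python
from typing import List, Union


def to_name_parts(field_name: Union[str, List[str]]) -> List[str]:
    """Normalize a field name to a list of name parts (hypothetical helper)."""
    if isinstance(field_name, str):
        # A bare string is treated as a single-part name.
        return [field_name]
    # Copy the list so later changes by the caller do not leak in.
    return list(field_name)


print(to_name_parts("city"))               # ['city']
print(to_name_parts(["address", "city"]))  # ['address', 'city']
```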

static list(*field_names: List[str], assignments: List[ListPartition] | None = None) ListTransform

Create a transform that includes multiple fields in a list.

Args:

*field_names (List[str]): The field names to include in the list

assignments (Optional[List[ListPartition]]): The preassigned list partitions

Returns:

Transforms.ListTransform: The created transform

static month(field_name: List[str]) MonthTransform
static month(field_name: str) MonthTransform

Create a transform that returns the month of the input value.

Args:

field_name (Union[str, List[str]]): The field name(s) to transform. Can be a list of field names or a single field name.

Returns:

Transforms.MonthTransform: The created transform

abstract name() str

Gets the transform function name.

Returns:

str: The transform function name.

static range(field_name: List[str], assignments: List[RangePartition] | None = None) RangeTransform

Create a transform that returns the range of the input value with preassigned range partitions.

Args:

field_name (List[str]): The field name to transform

assignments (Optional[List[RangePartition]], optional): The preassigned range partitions. Defaults to None.

Returns:

Transforms.RangeTransform: The created transform
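To make the idea of preassigned range partitions concrete, the sketch below looks up which partition a value falls into. It is an illustration only: the `RangeSpec` tuple and the `find_range_partition` helper are hypothetical stand-ins rather than Gravitino types, and the inclusive-lower, exclusive-upper bound convention is an assumption.

```python
from typing import List, Optional, Tuple

# (partition name, lower bound inclusive, upper bound exclusive).
# A hypothetical stand-in for a list of preassigned range partitions.
RangeSpec = Tuple[str, int, int]


def find_range_partition(value: int, partitions: List[RangeSpec]) -> Optional[str]:
    """Return the name of the partition whose range contains value, if any."""
    for name, lower, upper in partitions:
        if lower <= value < upper:
            return name
    # No preassigned partition covers this value.
    return None


partitions = [("p_low", 0, 100), ("p_mid", 100, 1000)]
print(find_range_partition(42, partitions))    # p_low
print(find_range_partition(100, partitions))   # p_mid
print(find_range_partition(5000, partitions))  # None
```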

references() List[NamedReference]

Returns a list of fields or columns that are referenced by this expression.

static truncate(width: int, field_name: List[str]) TruncateTransform
static truncate(width: int, field_name: str) TruncateTransform

Create a transform that returns the truncated value of the input value with the given width.

Args:

width (int): The width to truncate to

field_name (Union[str, List[str]]): The column/field name to transform

Returns:

Transforms.TruncateTransform: The created transform
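The effect of truncating to a width can be sketched as follows. The `truncate_value` helper is hypothetical and follows the convention used by Apache Iceberg's truncate partition transform (strings keep their first `width` characters, integers are rounded down to a multiple of `width`). Whether a given catalog behind Gravitino applies exactly these semantics is an assumption.

```python
from typing import Union


def truncate_value(width: int, value: Union[int, str]) -> Union[int, str]:
    """Truncate a value to the given width (hypothetical helper)."""
    if width <= 0:
        raise ValueError("width must be positive")
    if isinstance(value, str):
        # Strings keep at most `width` leading characters.
        return value[:width]
    # Integers are rounded down to the nearest multiple of `width`.
    # Python's % is non-negative for a positive divisor, so this also
    # rounds negative values down.
    return value - (value % width)


print(truncate_value(3, "gravitino"))  # gra
print(truncate_value(10, 47))          # 40
print(truncate_value(10, -7))          # -10
```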

static year(field_name: List[str]) YearTransform
static year(field_name: str) YearTransform

Create a transform that returns the year of the input value.

Args:

field_name (Union[str, List[str]]): The field name(s) to transform. Can be a list of field names or a single field name.

Returns:

Transforms.YearTransform: The created transform
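The year, month, day, and hour transforms differ only in how coarsely they group a temporal value. The sketch below shows that relationship with plain `datetime` objects. The `TEMPORAL_SKETCHES` table is hypothetical, and the string formats are an assumption: the actual representation of partition values depends on the catalog.

```python
from datetime import datetime

# Hypothetical stand-ins keyed by the transform names ('year', 'month',
# 'day', 'hour'). Each maps a timestamp to a coarser partition value.
TEMPORAL_SKETCHES = {
    "year": lambda ts: ts.strftime("%Y"),
    "month": lambda ts: ts.strftime("%Y-%m"),
    "day": lambda ts: ts.strftime("%Y-%m-%d"),
    "hour": lambda ts: ts.strftime("%Y-%m-%d-%H"),
}

ts = datetime(2024, 3, 15, 9, 30)
for name, fn in TEMPORAL_SKETCHES.items():
    print(name, fn(ts))
```

Two timestamps that share a day but differ in hour would map to the same value under `day` and to different values under `hour`.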