gravitino.api.expressions.transforms.transforms.Transforms¶
- class gravitino.api.expressions.transforms.transforms.Transforms¶
Bases:
Transform
Helper methods to create logical transforms to pass into Apache Gravitino.
- __init__()¶
Methods
__init__()
apply(name, arguments): Create a transform that applies a function to the input value.
arguments(): Gets the arguments passed to the transform function.
assignments(): Gets the preassigned partitions in the partitioning.
bucket(num_buckets, *field_names): Create a transform that returns the bucket of the input value.
children(): Returns a list of the children of this node.
day(field_name): Create a transform that returns the day of the input value.
hour(field_name): Create a transform that returns the hour of the input value.
identity(field_name): Create a transform that returns the input value.
list(*field_names[, assignments]): Create a transform that includes multiple fields in a list.
month(field_name): Create a transform that returns the month of the input value.
name(): Gets the transform function name.
range(field_name[, assignments]): Create a transform that returns the range of the input value with preassigned range partitions.
references(): Returns a list of fields or columns that are referenced by this expression.
truncate(width, field_name): Create a transform that returns the truncated value of the input value with the given width.
year(field_name): Create a transform that returns the year of the input value.
Attributes
EMPTY_EXPRESSION is only used as an input when the default children method builds the result.
EMPTY_NAMED_REFERENCE is only used as an input when the default references method builds the result array to avoid repeatedly allocating an empty array.
An empty array of transforms.
NAME_OF_BUCKET: The name of the bucket transform.
NAME_OF_DAY: The name of the day transform.
NAME_OF_HOUR: The name of the hour transform.
NAME_OF_IDENTITY: The name of the identity transform.
NAME_OF_LIST: The name of the list transform.
NAME_OF_MONTH: The name of the month transform.
NAME_OF_RANGE: The name of the range transform.
NAME_OF_TRUNCATE: The name of the truncate transform.
NAME_OF_YEAR: The name of the year transform.
- class ApplyTransform(name: str, arguments: List[Expression])¶
Bases:
Transform
A transform that applies a function to the input value.
- EMPTY_EXPRESSION: List[Expression] = []¶
EMPTY_EXPRESSION is only used as an input when the default children method builds the result.
- EMPTY_NAMED_REFERENCE: List[NamedReference] = []¶
EMPTY_NAMED_REFERENCE is only used as an input when the default references method builds the result array to avoid repeatedly allocating an empty array.
- arguments() List[Expression] ¶
Gets the arguments passed to the transform function.
- Returns:
List[Expression]: The arguments passed to the transform function.
- assignments() List[Partition] ¶
Gets the preassigned partitions in the partitioning.
Currently, only Transforms.ListTransform and Transforms.RangeTransform need to deal with assignments.
- Returns:
List[Partition]: The preassigned partitions in the partitioning.
- children() List[Expression] ¶
Returns a list of the children of this node. Children should not change.
- name() str ¶
Gets the transform function name.
- Returns:
str: The transform function name.
- references() List[NamedReference] ¶
Returns a list of fields or columns that are referenced by this expression.
- class BucketTransform(num_buckets: Literal[int], fields: List[NamedReference])¶
Bases:
Transform
A transform that returns the bucket of the input value.
- EMPTY_EXPRESSION: List[Expression] = []¶
EMPTY_EXPRESSION is only used as an input when the default children method builds the result.
- EMPTY_NAMED_REFERENCE: List[NamedReference] = []¶
EMPTY_NAMED_REFERENCE is only used as an input when the default references method builds the result array to avoid repeatedly allocating an empty array.
- arguments() List[Expression] ¶
Gets the arguments passed to the transform function.
- Returns:
List[Expression]: The arguments passed to the transform function.
- assignments() List[Partition] ¶
Gets the preassigned partitions in the partitioning.
Currently, only Transforms.ListTransform and Transforms.RangeTransform need to deal with assignments.
- Returns:
List[Partition]: The preassigned partitions in the partitioning.
- children() List[Expression] ¶
Returns a list of the children of this node. Children should not change.
- name() str ¶
Gets the transform function name.
- Returns:
str: The transform function name.
- references() List[NamedReference] ¶
Returns a list of fields or columns that are referenced by this expression.
- class DayTransform(ref: NamedReference)¶
Bases:
SingleFieldTransform
A transform that returns the day of the input value.
- EMPTY_EXPRESSION: List[Expression] = []¶
EMPTY_EXPRESSION is only used as an input when the default children method builds the result.
- EMPTY_NAMED_REFERENCE: List[NamedReference] = []¶
EMPTY_NAMED_REFERENCE is only used as an input when the default references method builds the result array to avoid repeatedly allocating an empty array.
- arguments() List[Expression] ¶
Gets the arguments passed to the transform function.
- Returns:
List[Expression]: The arguments passed to the transform function.
- assignments() List[Partition] ¶
Gets the preassigned partitions in the partitioning.
Currently, only Transforms.ListTransform and Transforms.RangeTransform need to deal with assignments.
- Returns:
List[Partition]: The preassigned partitions in the partitioning.
- children() List[Expression] ¶
Returns a list of the children of this node. Children should not change.
- field_name() List[str] ¶
Gets the referenced field name as a list of string parts.
- Returns:
List[str]: The referenced field name as a list of string parts.
- name() str ¶
Gets the transform function name.
- Returns:
str: The transform function name.
- references() List[NamedReference] ¶
Returns a list of fields or columns that are referenced by this expression.
- EMPTY_EXPRESSION: List[Expression] = []¶
EMPTY_EXPRESSION is only used as an input when the default children method builds the result.
- EMPTY_NAMED_REFERENCE: List[NamedReference] = []¶
EMPTY_NAMED_REFERENCE is only used as an input when the default references method builds the result array to avoid repeatedly allocating an empty array.
- class HourTransform(ref: NamedReference)¶
Bases:
SingleFieldTransform
A transform that returns the hour of the input value.
- EMPTY_EXPRESSION: List[Expression] = []¶
EMPTY_EXPRESSION is only used as an input when the default children method builds the result.
- EMPTY_NAMED_REFERENCE: List[NamedReference] = []¶
EMPTY_NAMED_REFERENCE is only used as an input when the default references method builds the result array to avoid repeatedly allocating an empty array.
- arguments() List[Expression] ¶
Gets the arguments passed to the transform function.
- Returns:
List[Expression]: The arguments passed to the transform function.
- assignments() List[Partition] ¶
Gets the preassigned partitions in the partitioning.
Currently, only Transforms.ListTransform and Transforms.RangeTransform need to deal with assignments.
- Returns:
List[Partition]: The preassigned partitions in the partitioning.
- children() List[Expression] ¶
Returns a list of the children of this node. Children should not change.
- field_name() List[str] ¶
Gets the referenced field name as a list of string parts.
- Returns:
List[str]: The referenced field name as a list of string parts.
- name() str ¶
Gets the transform function name.
- Returns:
str: The transform function name.
- references() List[NamedReference] ¶
Returns a list of fields or columns that are referenced by this expression.
- class IdentityTransform(ref: NamedReference)¶
Bases:
SingleFieldTransform
A transform that returns the input value.
- EMPTY_EXPRESSION: List[Expression] = []¶
EMPTY_EXPRESSION is only used as an input when the default children method builds the result.
- EMPTY_NAMED_REFERENCE: List[NamedReference] = []¶
EMPTY_NAMED_REFERENCE is only used as an input when the default references method builds the result array to avoid repeatedly allocating an empty array.
- arguments() List[Expression] ¶
Gets the arguments passed to the transform function.
- Returns:
List[Expression]: The arguments passed to the transform function.
- assignments() List[Partition] ¶
Gets the preassigned partitions in the partitioning.
Currently, only Transforms.ListTransform and Transforms.RangeTransform need to deal with assignments.
- Returns:
List[Partition]: The preassigned partitions in the partitioning.
- children() List[Expression] ¶
Returns a list of the children of this node. Children should not change.
- field_name() List[str] ¶
Gets the referenced field name as a list of string parts.
- Returns:
List[str]: The referenced field name as a list of string parts.
- name() str ¶
Gets the transform function name.
- Returns:
str: The transform function name.
- references() List[NamedReference] ¶
Returns a list of fields or columns that are referenced by this expression.
- class ListTransform(fields: List[NamedReference], assignments: List[ListPartition] | None = None)¶
Bases:
Transform
A transform that includes multiple fields in a list.
- EMPTY_EXPRESSION: List[Expression] = []¶
EMPTY_EXPRESSION is only used as an input when the default children method builds the result.
- EMPTY_NAMED_REFERENCE: List[NamedReference] = []¶
EMPTY_NAMED_REFERENCE is only used as an input when the default references method builds the result array to avoid repeatedly allocating an empty array.
- arguments() List[Expression] ¶
Gets the arguments passed to the transform function.
- Returns:
List[Expression]: The arguments passed to the transform function.
- assignments() List[Partition] ¶
Gets the preassigned partitions in the partitioning.
Currently, only Transforms.ListTransform and Transforms.RangeTransform need to deal with assignments.
- Returns:
List[Partition]: The preassigned partitions in the partitioning.
- children() List[Expression] ¶
Returns a list of the children of this node. Children should not change.
- field_names() List[List[str]] ¶
Gets the field names to include in the list.
- Returns:
List[List[str]]: The field names to include in the list.
- name() str ¶
Gets the transform function name.
- Returns:
str: The transform function name.
- references() List[NamedReference] ¶
Returns a list of fields or columns that are referenced by this expression.
- class MonthTransform(ref: NamedReference)¶
Bases:
SingleFieldTransform
A transform that returns the month of the input value.
- EMPTY_EXPRESSION: List[Expression] = []¶
EMPTY_EXPRESSION is only used as an input when the default children method builds the result.
- EMPTY_NAMED_REFERENCE: List[NamedReference] = []¶
EMPTY_NAMED_REFERENCE is only used as an input when the default references method builds the result array to avoid repeatedly allocating an empty array.
- arguments() List[Expression] ¶
Gets the arguments passed to the transform function.
- Returns:
List[Expression]: The arguments passed to the transform function.
- assignments() List[Partition] ¶
Gets the preassigned partitions in the partitioning.
Currently, only Transforms.ListTransform and Transforms.RangeTransform need to deal with assignments.
- Returns:
List[Partition]: The preassigned partitions in the partitioning.
- children() List[Expression] ¶
Returns a list of the children of this node. Children should not change.
- field_name() List[str] ¶
Gets the referenced field name as a list of string parts.
- Returns:
List[str]: The referenced field name as a list of string parts.
- name() str ¶
Gets the transform function name.
- Returns:
str: The transform function name.
- references() List[NamedReference] ¶
Returns a list of fields or columns that are referenced by this expression.
- NAME_OF_BUCKET: ClassVar[str] = 'bucket'¶
The name of the bucket transform. The bucket transform returns the bucket of the input value.
- NAME_OF_DAY: ClassVar[str] = 'day'¶
The name of the day transform. The day transform returns the day of the input value.
- NAME_OF_HOUR: ClassVar[str] = 'hour'¶
The name of the hour transform. The hour transform returns the hour of the input value.
- NAME_OF_IDENTITY: ClassVar[str] = 'identity'¶
The name of the identity transform.
- NAME_OF_LIST: ClassVar[str] = 'list'¶
The name of the list transform. The list transform includes multiple fields in a list.
- NAME_OF_MONTH: ClassVar[str] = 'month'¶
The name of the month transform. The month transform returns the month of the input value.
- NAME_OF_RANGE: ClassVar[str] = 'range'¶
The name of the range transform. The range transform returns the range of the input value.
- NAME_OF_TRUNCATE: ClassVar[str] = 'truncate'¶
The name of the truncate transform. The truncate transform returns the truncated value of the input value with the given width.
- NAME_OF_YEAR: ClassVar[str] = 'year'¶
The name of the year transform. The year transform returns the year of the input value.
- class RangeTransform(field: NamedReference, assignments: List[RangePartition] | None = None)¶
Bases:
Transform
A transform that returns the range of the input value.
- EMPTY_EXPRESSION: List[Expression] = []¶
EMPTY_EXPRESSION is only used as an input when the default children method builds the result.
- EMPTY_NAMED_REFERENCE: List[NamedReference] = []¶
EMPTY_NAMED_REFERENCE is only used as an input when the default references method builds the result array to avoid repeatedly allocating an empty array.
- arguments() List[Expression] ¶
Gets the arguments passed to the transform function.
- Returns:
List[Expression]: The arguments passed to the transform function.
- assignments() List[Partition] ¶
Gets the preassigned partitions in the partitioning.
Currently, only Transforms.ListTransform and Transforms.RangeTransform need to deal with assignments.
- Returns:
List[Partition]: The preassigned partitions in the partitioning.
- children() List[Expression] ¶
Returns a list of the children of this node. Children should not change.
- field_name() List[str] ¶
Gets the field name to transform.
- Returns:
List[str]: The field name to transform.
- name() str ¶
Gets the transform function name.
- Returns:
str: The transform function name.
- references() List[NamedReference] ¶
Returns a list of fields or columns that are referenced by this expression.
- class TruncateTransform(width: Literal[int], field: NamedReference)¶
Bases:
Transform
A transform that returns the truncated value of the input value with the given width.
- EMPTY_EXPRESSION: List[Expression] = []¶
EMPTY_EXPRESSION is only used as an input when the default children method builds the result.
- EMPTY_NAMED_REFERENCE: List[NamedReference] = []¶
EMPTY_NAMED_REFERENCE is only used as an input when the default references method builds the result array to avoid repeatedly allocating an empty array.
- arguments() List[Expression] ¶
Gets the arguments passed to the transform function.
- Returns:
List[Expression]: The arguments passed to the transform function.
- assignments() List[Partition] ¶
Gets the preassigned partitions in the partitioning.
Currently, only Transforms.ListTransform and Transforms.RangeTransform need to deal with assignments.
- Returns:
List[Partition]: The preassigned partitions in the partitioning.
- children() List[Expression] ¶
Returns a list of the children of this node. Children should not change.
- field_name() List[str] ¶
Gets the field name to transform.
- Returns:
List[str]: The field name to transform.
- name() str ¶
Gets the transform function name.
- Returns:
str: The transform function name.
- references() List[NamedReference] ¶
Returns a list of fields or columns that are referenced by this expression.
- width() int ¶
Gets the width to truncate to.
- Returns:
int: The width to truncate to.
- class YearTransform(ref: NamedReference)¶
Bases:
SingleFieldTransform
A transform that returns the year of the input value.
- EMPTY_EXPRESSION: List[Expression] = []¶
EMPTY_EXPRESSION is only used as an input when the default children method builds the result.
- EMPTY_NAMED_REFERENCE: List[NamedReference] = []¶
EMPTY_NAMED_REFERENCE is only used as an input when the default references method builds the result array to avoid repeatedly allocating an empty array.
- arguments() List[Expression] ¶
Gets the arguments passed to the transform function.
- Returns:
List[Expression]: The arguments passed to the transform function.
- assignments() List[Partition] ¶
Gets the preassigned partitions in the partitioning.
Currently, only Transforms.ListTransform and Transforms.RangeTransform need to deal with assignments.
- Returns:
List[Partition]: The preassigned partitions in the partitioning.
- children() List[Expression] ¶
Returns a list of the children of this node. Children should not change.
- field_name() List[str] ¶
Gets the referenced field name as a list of string parts.
- Returns:
List[str]: The referenced field name as a list of string parts.
- name() str ¶
Gets the transform function name.
- Returns:
str: The transform function name.
- references() List[NamedReference] ¶
Returns a list of fields or columns that are referenced by this expression.
- static apply(name: str, arguments: List[Expression]) ApplyTransform ¶
Create a transform that applies a function to the input value.
- Args:
name (str): The name of the function to apply
arguments (List[Expression]): The arguments to the function
- Returns:
Transforms.ApplyTransform: The created transform
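As a rough illustration of the shape of an apply transform, the sketch below models it with stand-in classes rather than the real Gravitino types. `FieldRef`, `ApplySketch`, and the function name `"to_date"` are hypothetical. The behavior of `children()` and `references()` shown here is an assumption about the defaults and is not confirmed by this page.

```python
from dataclasses import dataclass
from typing import List, Tuple


@dataclass(frozen=True)
class FieldRef:
    """Hypothetical stand-in for a NamedReference: a field name split into parts."""
    parts: Tuple[str, ...]


@dataclass(frozen=True)
class ApplySketch:
    """Hypothetical stand-in for Transforms.ApplyTransform."""
    function_name: str
    args: Tuple[object, ...]

    def name(self) -> str:
        return self.function_name

    def arguments(self) -> List[object]:
        return list(self.args)

    def children(self) -> List[object]:
        # Assumption: the children of a transform are its arguments.
        return self.arguments()

    def references(self) -> List[FieldRef]:
        # Assumption: references are the field references found among the children.
        return [c for c in self.children() if isinstance(c, FieldRef)]


transform = ApplySketch("to_date", (FieldRef(("created_at",)), "yyyy-MM-dd"))
print(transform.name(), transform.references())
```

The point of the sketch is only that an apply transform carries a function name plus an ordered argument list, and that the other accessors can be derived from those two pieces.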
- abstract arguments() List[Expression] ¶
Gets the arguments passed to the transform function.
- Returns:
List[Expression]: The arguments passed to the transform function.
- assignments() List[Partition] ¶
Gets the preassigned partitions in the partitioning.
Currently, only Transforms.ListTransform and Transforms.RangeTransform need to deal with assignments.
- Returns:
List[Partition]: The preassigned partitions in the partitioning.
- static bucket(num_buckets: int, *field_names: List[str]) BucketTransform ¶
Create a transform that returns the bucket of the input value.
- Args:
num_buckets (int): The number of buckets to use
*field_names (List[str]): The field names to transform
- Returns:
Transforms.BucketTransform: The created transform
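To make "the bucket of the input value" concrete, here is a minimal sketch of how a value could be assigned to one of `num_buckets` buckets. It is illustrative only: the hash function actually used depends on the underlying catalog, and `bucket_of` with its CRC32 hash is a hypothetical helper, not part of Gravitino.

```python
import zlib


def bucket_of(value: str, num_buckets: int) -> int:
    """Map a value to a bucket index in [0, num_buckets)."""
    if num_buckets <= 0:
        raise ValueError("num_buckets must be positive")
    # CRC32 is used only because it is deterministic across runs;
    # real catalogs choose their own hash function.
    return zlib.crc32(value.encode("utf-8")) % num_buckets


for user_id in ["alice", "bob", "carol"]:
    print(user_id, bucket_of(user_id, 4))
```

The transform created by `bucket` only records the bucket count and the field names; computing the bucket for each row is left to the engine that stores the table.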
- children() List[Expression] ¶
Returns a list of the children of this node. Children should not change.
- static day(field_name: List[str]) DayTransform ¶
- static day(field_name: str) DayTransform
Create a transform that returns the day of the input value.
- Args:
- field_name (Union[str, List[str]]):
The field name(s) to transform. Can be a list of field names or a single field name.
- Returns:
Transforms.DayTransform: The created transform
- static hour(field_name: List[str]) HourTransform ¶
- static hour(field_name: str) HourTransform
Create a transform that returns the hour of the input value.
- Args:
- field_name (Union[str, List[str]]):
The field name(s) to transform. Can be a list of field names or a single field name.
- Returns:
Transforms.HourTransform: The created transform
- static identity(field_name: List[str]) IdentityTransform ¶
- static identity(field_name: str) IdentityTransform
Create a transform that returns the input value.
- Args:
- field_name (Union[str, List[str]]):
The field name(s) to transform. Can be a list of field names or a single field name.
- Returns:
Transforms.IdentityTransform: The created transform
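The two overloads above accept either a single name or a list of names. One plausible way to unify them is to wrap a bare string in a one-element list, sketched here with a hypothetical `normalize_field_name` helper. This is an assumption and is not taken from the Gravitino source.

```python
from typing import List, Union


def normalize_field_name(field_name: Union[str, List[str]]) -> List[str]:
    """Return the field name as a list of string parts."""
    if isinstance(field_name, str):
        # A single name becomes a one-element list.
        return [field_name]
    # Copy the list so later changes by the caller do not leak in.
    return list(field_name)


print(normalize_field_name("id"))
print(normalize_field_name(["address", "zip"]))
```

Going by the documented signatures, both `Transforms.identity("id")` and `Transforms.identity(["address", "zip"])` are accepted call forms.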
- static list(*field_names: List[str], assignments: List[ListPartition] | None = None) ListTransform ¶
Create a transform that includes multiple fields in a list.
- Args:
- *field_names (List[str]):
The field names to include in the list
- assignments (Optional[List[ListPartition]]):
The preassigned list partitions
- Returns:
Transforms.ListTransform: The created transform
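As a conceptual aid, the sketch below shows what preassigned list partitions amount to: each partition names the explicit values it accepts. The `assignments` dictionary, the partition names, and `find_list_partition` are hypothetical stand-ins and do not use the real `ListPartition` type.

```python
from typing import Dict, List, Optional, Tuple

# Hypothetical preassigned list partitions: partition name -> accepted value tuples.
assignments: Dict[str, List[Tuple[str, ...]]] = {
    "p_emea": [("DE",), ("FR",)],
    "p_apac": [("JP",), ("SG",)],
}


def find_list_partition(values: Tuple[str, ...]) -> Optional[str]:
    """Return the name of the partition whose value list contains `values`."""
    for partition_name, accepted in assignments.items():
        if values in accepted:
            return partition_name
    # No preassigned partition accepts these values.
    return None


print(find_list_partition(("FR",)))
print(find_list_partition(("US",)))
```

Values are modeled as tuples because a list transform can cover several fields at once, so each accepted entry holds one value per field.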
- static month(field_name: List[str]) MonthTransform ¶
- static month(field_name: str) MonthTransform
Create a transform that returns the month of the input value.
- Args:
- field_name (Union[str, List[str]]):
The field name(s) to transform. Can be a list of field names or a single field name.
- Returns:
Transforms.MonthTransform: The created transform
- abstract name() str ¶
Gets the transform function name.
- Returns:
str: The transform function name.
- static range(field_name: List[str], assignments: List[RangePartition] | None = None) RangeTransform ¶
Create a transform that returns the range of the input value with preassigned range partitions.
- Args:
- field_name (List[str]):
The field name to transform
- assignments (Optional[List[RangePartition]], optional):
The preassigned range partitions. Defaults to None.
- Returns:
Transforms.RangeTransform: The created transform
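The idea behind preassigned range partitions can be sketched with a sorted list of bounds and a binary search. Everything here is hypothetical: the bounds, the partition names, and the choice of exclusive upper bounds are assumptions for illustration, and the real `RangePartition` type is not used.

```python
from bisect import bisect_right
from typing import List, Optional

# Hypothetical preassigned range partitions, described by exclusive upper bounds
# in ascending order. The first partition has no lower bound.
upper_bounds: List[int] = [100, 200, 300]
partition_names: List[str] = ["p0", "p1", "p2"]


def find_range_partition(value: int) -> Optional[str]:
    """Return the partition whose range holds `value`, or None if none does."""
    # bisect_right counts the bounds that are <= value, which is exactly
    # the index of the first partition whose upper bound exceeds value.
    index = bisect_right(upper_bounds, value)
    return partition_names[index] if index < len(partition_names) else None


print(find_range_partition(50))
print(find_range_partition(300))
```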
- references() List[NamedReference] ¶
Returns a list of fields or columns that are referenced by this expression.
- static truncate(width: int, field_name: List[str]) TruncateTransform ¶
- static truncate(width: int, field_name: str) TruncateTransform
Create a transform that returns the truncated value of the input value with the given width.
- Args:
width (int): The width to truncate to
field_name (Union[str, List[str]]): The column/field name to transform
- Returns:
Transforms.TruncateTransform: The created transform
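What "truncated value with the given width" means in practice depends on the catalog. The sketch below assumes Iceberg-style semantics: strings keep their first `width` characters, and integers are rounded down to a multiple of `width`. `truncate_value` is a hypothetical helper, not a Gravitino function.

```python
from typing import Union


def truncate_value(width: int, value: Union[int, str]) -> Union[int, str]:
    """Truncate a string to `width` characters or an integer to a multiple of `width`."""
    if width <= 0:
        raise ValueError("width must be positive")
    if isinstance(value, str):
        return value[:width]
    # Python's % is non-negative for a positive width, so subtracting the
    # remainder rounds negative values down as well.
    return value - (value % width)


print(truncate_value(10, 17))
print(truncate_value(4, "gravitino"))
```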
- static year(field_name: List[str]) YearTransform ¶
- static year(field_name: str) YearTransform
Create a transform that returns the year of the input value.
- Args:
- field_name (Union[str, List[str]]):
The field name(s) to transform. Can be a list of field names or a single field name.
- Returns:
Transforms.YearTransform: The created transform
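The `year`, `month`, `day`, and `hour` factory methods describe time-based partitioning at increasing granularity. The sketch below shows the kind of value each granularity could produce for one timestamp. The string formats are an assumption chosen for readability; the actual representation is determined by the underlying catalog.

```python
from datetime import datetime

ts = datetime(2024, 3, 15, 9, 30)

# Hypothetical partition values for a single timestamp at each granularity.
partition_values = {
    "year": ts.strftime("%Y"),
    "month": ts.strftime("%Y-%m"),
    "day": ts.strftime("%Y-%m-%d"),
    "hour": ts.strftime("%Y-%m-%d-%H"),
}

for granularity, value in partition_values.items():
    print(granularity, value)
```

Each finer granularity extends the coarser one, so rows that share an hour partition also share the same day, month, and year partitions.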