Version: 1.3.0

Manage Table Partitions

Introduction

Although many catalogs manage partitions automatically, some scenarios call for manual partition management, such as managing the TTL (Time-To-Live) of partition data, gathering statistics on partition metadata, and optimizing queries through partition pruning. For these scenarios, Apache Gravitino provides partition management capabilities.

Requirements and Limitations

  • Partition management applies only to partitioned tables, so make sure the table you operate on is partitioned.

The following table shows the partition operations supported across various catalogs in Gravitino:

Operation | Hive catalog | Iceberg catalog | Jdbc-MySQL catalog | Jdbc-PostgreSQL catalog | Jdbc-Doris catalog
Add Partition
Get Partition by Name
List Partition Names
List Partitions
Drop Partition
WELCOME FEEDBACK

If you need additional partition management support for a specific catalog, create an issue on the Gravitino repository.

Partition Operations

Add Partition

The type of the partition you add must match the table's partitioning type. Gravitino supports adding the following partition types:

Partition Type | Description
identity | An identity partition represents a result of identity partitioning.
range | A range partition represents a result of range partitioning.
list | A list partition represents a result of list partitioning.

JSON example:

{
  "type": "identity",
  "name": "dt=2008-08-08/country=us",
  "fieldNames": [
    [
      "dt"
    ],
    [
      "country"
    ]
  ],
  "values": [
    {
      "type": "literal",
      "dataType": "date",
      "value": "2008-08-08"
    },
    {
      "type": "literal",
      "dataType": "string",
      "value": "us"
    }
  ]
}
note

The entries in values must appear in the same order as the corresponding entries in fieldNames.

When adding an identity partition to a partitioned Hive table, the specified partition name is ignored. This is because Hive generates the partition name based on field names and values.
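To make the note above concrete, here is a minimal sketch of how a name such as dt=2008-08-08/country=us can be derived from field names and values. The class HivePartitionName and its build method are hypothetical names used only for illustration; this is not Hive's actual implementation, which may additionally escape special characters in the values.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical helper, for illustration only: joins field=value pairs with '/'.
class HivePartitionName {
  static String build(String[] fieldNames, String[] values) {
    // Names and values are paired by position, so the lengths must agree.
    if (fieldNames.length != values.length) {
      throw new IllegalArgumentException("fieldNames and values must have the same length");
    }
    List<String> parts = new ArrayList<>();
    for (int i = 0; i < fieldNames.length; i++) {
      parts.add(fieldNames[i] + "=" + values[i]);
    }
    return String.join("/", parts);
  }

  public static void main(String[] args) {
    // prints dt=2008-08-08/country=us
    System.out.println(build(new String[] {"dt", "country"}, new String[] {"2008-08-08", "us"}));
  }
}
```

Because the name is a function of the fields and values alone, a name supplied by the caller adds no information in the Hive case.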

Java example:

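import com.google.common.collect.Maps;
import java.time.LocalDate;
import org.apache.gravitino.rel.expressions.literals.Literal;
import org.apache.gravitino.rel.expressions.literals.Literals;
import org.apache.gravitino.rel.partitions.Partition;
import org.apache.gravitino.rel.partitions.Partitions;

// Identity partition on (dt, country); values follow the order of the field names.
Partition partition =
    Partitions.identity(
        "dt=2008-08-08/country=us",
        new String[][] {{"dt"}, {"country"}},
        new Literal[] {
          Literals.dateLiteral(LocalDate.parse("2008-08-08")), Literals.stringLiteral("us")
        },
        Maps.newHashMap());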
note

The values are in the same order as the field names.


Add a partition to a partitioned table by sending a POST request to the /api/metalakes/{metalake_name}/catalogs/{catalog_name}/schemas/{schema_name}/tables/{partitioned_table_name}/partitions endpoint or by using the Gravitino Java client. The following is an example of adding an identity partition to a Hive partitioned table:

curl -X POST -H "Accept: application/vnd.gravitino.v1+json" \
-H "Content-Type: application/json" -d '{
  "partitions": [
    {
      "type": "identity",
      "fieldNames": [
        [
          "dt"
        ],
        [
          "country"
        ]
      ],
      "values": [
        {
          "type": "literal",
          "dataType": "date",
          "value": "2008-08-08"
        },
        {
          "type": "literal",
          "dataType": "string",
          "value": "us"
        }
      ]
    }
  ]
}' http://localhost:8090/api/metalakes/metalake/catalogs/catalog/schemas/schema/tables/table/partitions
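The same request can be sketched with the JDK's built-in java.net.http API instead of curl. This is not the Gravitino Java client; AddPartitionRequest and its build method are hypothetical names, and the sketch only constructs the request so that it can be inspected without a running server.

```java
import java.net.URI;
import java.net.http.HttpRequest;

// Hypothetical helper, for illustration only: builds (but does not send) the POST request.
class AddPartitionRequest {
  static HttpRequest build(
      String baseUrl, String metalake, String catalog, String schema, String table, String body) {
    URI uri =
        URI.create(
            String.format(
                "%s/api/metalakes/%s/catalogs/%s/schemas/%s/tables/%s/partitions",
                baseUrl, metalake, catalog, schema, table));
    return HttpRequest.newBuilder(uri)
        .header("Accept", "application/vnd.gravitino.v1+json")
        .header("Content-Type", "application/json")
        .POST(HttpRequest.BodyPublishers.ofString(body))
        .build();
  }

  public static void main(String[] args) {
    // Same payload as the curl example, written compactly.
    String body =
        "{\"partitions\":[{\"type\":\"identity\","
            + "\"fieldNames\":[[\"dt\"],[\"country\"]],"
            + "\"values\":["
            + "{\"type\":\"literal\",\"dataType\":\"date\",\"value\":\"2008-08-08\"},"
            + "{\"type\":\"literal\",\"dataType\":\"string\",\"value\":\"us\"}]}]}";
    HttpRequest request =
        build("http://localhost:8090", "metalake", "catalog", "schema", "table", body);
    System.out.println(request.method() + " " + request.uri());
    // To send it against a running server:
    // HttpClient.newHttpClient().send(request, HttpResponse.BodyHandlers.ofString());
  }
}
```

Separating request construction from sending keeps the URL and header logic easy to check in isolation.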

Get a Partition by Name

Get a partition by its name by sending a GET request to the /api/metalakes/{metalake_name}/catalogs/{catalog_name}/schemas/{schema_name}/tables/{partitioned_table_name}/partitions/{partition_name} endpoint or by using the Gravitino Java client. The following is an example of getting a partition by its name:

curl -X GET -H "Accept: application/vnd.gravitino.v1+json" \
-H "Content-Type: application/json" \
http://localhost:8090/api/metalakes/metalake/catalogs/catalog/schemas/schema/tables/table/partitions/p20200321
tip

If the partition name contains special characters, URL-encode it. For example, if the partition name is dt=2008-08-08/country=us, use dt%3D2008-08-08%2Fcountry%3Dus in the URL.
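When the URL is built in code, the encoding can be done with the JDK's URLEncoder, as in the sketch below. PartitionNameEncoder is a hypothetical name. URLEncoder targets form encoding and emits '+' for spaces, so the sketch rewrites '+' to %20 to make the result safe as a path segment; whether partition names with spaces occur in practice is an assumption.

```java
import java.net.URLEncoder;
import java.nio.charset.StandardCharsets;

// Hypothetical helper, for illustration only: encodes a partition name for use in a URL path.
class PartitionNameEncoder {
  static String encode(String partitionName) {
    // A literal '+' in the input is already encoded as %2B, so every '+' left in the
    // output stands for a space and can be replaced with %20.
    return URLEncoder.encode(partitionName, StandardCharsets.UTF_8).replace("+", "%20");
  }

  public static void main(String[] args) {
    // prints dt%3D2008-08-08%2Fcountry%3Dus
    System.out.println(encode("dt=2008-08-08/country=us"));
  }
}
```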

List Partition Names Under a Partitioned Table

List all partition names under a partitioned table by sending a GET request to the /api/metalakes/{metalake_name}/catalogs/{catalog_name}/schemas/{schema_name}/tables/{partitioned_table_name}/partitions endpoint or by using the Gravitino Java client. The following is an example of listing all the partition names under a partitioned table:

curl -X GET -H "Accept: application/vnd.gravitino.v1+json" \
-H "Content-Type: application/json" \
http://localhost:8090/api/metalakes/metalake/catalogs/catalog/schemas/schema/tables/table/partitions

List Partitions Under a Partitioned Table

To get detailed information about the partitions rather than only their names, send a GET request to the /api/metalakes/{metalake_name}/catalogs/{catalog_name}/schemas/{schema_name}/tables/{partitioned_table_name}/partitions endpoint with the details=true query parameter, or use the Gravitino Java client. The following is an example of listing all the partitions under a partitioned table:

curl -X GET -H "Accept: application/vnd.gravitino.v1+json" \
-H "Content-Type: application/json" \
"http://localhost:8090/api/metalakes/metalake/catalogs/catalog/schemas/schema/tables/table/partitions?details=true"
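In the curl examples, the two listing calls hit the same path and differ only in the details=true query parameter. A small sketch of a URL builder that captures this follows; ListPartitionsUri and its build method are hypothetical names, not part of the Gravitino client.

```java
import java.net.URI;

// Hypothetical helper, for illustration only: builds the URL for either listing call.
class ListPartitionsUri {
  static URI build(
      String baseUrl,
      String metalake,
      String catalog,
      String schema,
      String table,
      boolean details) {
    String path =
        String.format(
            "/api/metalakes/%s/catalogs/%s/schemas/%s/tables/%s/partitions",
            metalake, catalog, schema, table);
    // Without the flag the URL matches the partition-names example;
    // with it, the URL matches the detailed listing example.
    return URI.create(baseUrl + path + (details ? "?details=true" : ""));
  }

  public static void main(String[] args) {
    String base = "http://localhost:8090";
    System.out.println(build(base, "metalake", "catalog", "schema", "table", false));
    System.out.println(build(base, "metalake", "catalog", "schema", "table", true));
  }
}
```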

Drop a Partition by Name

Drop a partition by its name by sending a DELETE request to the /api/metalakes/{metalake_name}/catalogs/{catalog_name}/schemas/{schema_name}/tables/{partitioned_table_name}/partitions/{partition_name} endpoint or by using the Gravitino Java client. The following is an example of dropping a partition by its name:

curl -X DELETE -H "Accept: application/vnd.gravitino.v1+json" \
-H "Content-Type: application/json" \
http://localhost:8090/api/metalakes/metalake/catalogs/catalog/schemas/schema/tables/table/partitions/p20200321
tip

If the partition name contains special characters, URL-encode it. For example, if the partition name is dt=2008-08-08/country=us, use dt%3D2008-08-08%2Fcountry%3Dus in the URL.