Apache Doris catalog
Introduction
Apache Gravitino can manage Apache Doris metadata through a JDBC connection.
Gravitino saves some system information in schema and table comments, such as (From Gravitino, DO NOT EDIT: gravitino.v1.uid1078334182909406185). Please don't change or remove this message.
Catalog
Catalog capabilities
- A Gravitino catalog corresponds to a Doris instance.
- Supports metadata management of Doris (1.2.x).
- Supports table index.
- Supports column default value.
Catalog properties
You can pass to a Doris data source any property that isn't defined by Gravitino by adding the gravitino.bypass. prefix to its name and setting it as a catalog property. For example, the catalog property gravitino.bypass.maxWaitMillis passes maxWaitMillis to the data source.
You can check the relevant data source configuration in data source properties for more details.
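The prefix rule can be illustrated with a minimal sketch. The class and method names below are hypothetical and this is not Gravitino's actual implementation; it only shows the mapping described above, where a prefixed catalog property becomes a data source property with the prefix removed.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical helper, for illustration only.
final class BypassProperties {
    static final String BYPASS_PREFIX = "gravitino.bypass.";

    // Collects every catalog property that carries the bypass prefix
    // and returns it under its name without the prefix.
    static Map<String, String> extractBypass(Map<String, String> catalogProperties) {
        Map<String, String> dataSourceProperties = new HashMap<>();
        for (Map.Entry<String, String> entry : catalogProperties.entrySet()) {
            String key = entry.getKey();
            if (key.startsWith(BYPASS_PREFIX)) {
                dataSourceProperties.put(key.substring(BYPASS_PREFIX.length()), entry.getValue());
            }
        }
        return dataSourceProperties;
    }

    public static void main(String[] args) {
        Map<String, String> catalogProperties = new HashMap<>();
        catalogProperties.put("jdbc-url", "jdbc:mysql://localhost:9030");
        catalogProperties.put("gravitino.bypass.maxWaitMillis", "5000"); // example value
        System.out.println(extractBypass(catalogProperties)); // prints {maxWaitMillis=5000}
    }
}
```

Properties without the prefix, such as jdbc-url, are left for Gravitino itself to interpret.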
Besides the common catalog properties, the Doris catalog has the following properties:
Configuration item | Description | Default value | Required | Since Version |
---|---|---|---|---|
jdbc-url | JDBC URL for connecting to the database. For example, jdbc:mysql://localhost:9030 | (none) | Yes | 0.5.0 |
jdbc-driver | The driver of the JDBC connection. For example, com.mysql.jdbc.Driver. | (none) | Yes | 0.5.0 |
jdbc-user | The JDBC user name. | (none) | Yes | 0.5.0 |
jdbc-password | The JDBC password. | (none) | Yes | 0.5.0 |
jdbc.pool.min-size | The minimum number of connections in the pool. 2 by default. | 2 | No | 0.5.0 |
jdbc.pool.max-size | The maximum number of connections in the pool. 10 by default. | 10 | No | 0.5.0 |
replication_num | The number of replicas for the table. If not specified and the number of backend servers is less than 3, the default value is 1. If not specified and the number of backend servers is greater than or equal to 3, the Doris server default (3) is used. For more details, please see the doc. | 1 or 3 | No | 0.6.0-incubating |
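The replication_num default described in the table can be written out as a small decision function. This is a sketch of the documented rule only; the class and method names are hypothetical, and the value 3 is the Doris server default quoted in the table.

```java
// Hypothetical helper that mirrors the replication_num rule from the table.
final class ReplicationDefault {
    // configured: the replication_num set by the user, or null if not specified.
    // backendCount: the number of Doris backend servers.
    static int effectiveReplicationNum(Integer configured, int backendCount) {
        if (configured != null) {
            return configured;            // an explicit setting always wins
        }
        return backendCount < 3 ? 1 : 3;  // fewer than 3 backends: 1, otherwise the Doris default of 3
    }

    public static void main(String[] args) {
        System.out.println(effectiveReplicationNum(null, 1)); // prints 1
        System.out.println(effectiveReplicationNum(null, 4)); // prints 3
    }
}
```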
Before using the Doris catalog, you must download the corresponding JDBC driver to the catalogs/jdbc-doris/libs directory.
Gravitino doesn't package the JDBC driver for Doris due to licensing issues.
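As a rough sketch of how the properties above fit together, the following builds a property map and reports which required keys are still missing. The class name, the helper, and the sample values (user name, pool size) are assumptions for illustration; only the property keys come from the table.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Hypothetical pre-flight check, not part of the Gravitino API.
final class DorisCatalogProperties {
    // Keys marked as required in the properties table.
    static final List<String> REQUIRED_KEYS =
        List.of("jdbc-url", "jdbc-driver", "jdbc-user", "jdbc-password");

    // Returns the required keys that are absent or blank in the given properties.
    static List<String> missingRequired(Map<String, String> properties) {
        List<String> missing = new ArrayList<>();
        for (String key : REQUIRED_KEYS) {
            String value = properties.get(key);
            if (value == null || value.trim().isEmpty()) {
                missing.add(key);
            }
        }
        return missing;
    }

    public static void main(String[] args) {
        Map<String, String> properties = new LinkedHashMap<>();
        properties.put("jdbc-url", "jdbc:mysql://localhost:9030");
        properties.put("jdbc-driver", "com.mysql.jdbc.Driver");
        properties.put("jdbc-user", "root");        // placeholder user name
        properties.put("jdbc.pool.max-size", "10"); // optional setting
        System.out.println(missingRequired(properties)); // prints [jdbc-password]
    }
}
```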
Catalog operations
Refer to Manage Relational Metadata Using Gravitino for more details.
Schema
Schema capabilities
- Gravitino's schema concept corresponds to the Doris database.
- Supports creating schema.
- Supports dropping schema.
Schema properties
- Supports schema properties, including Doris database properties and user-defined properties.
Schema operations
Please refer to Manage Relational Metadata Using Gravitino for more details.
Table
Table capabilities
- Gravitino's table concept corresponds to the Doris table.
- Supports index.
- Supports column default value.
Table column types
Gravitino Type | Doris Type |
---|---|
Boolean | Boolean |
Byte | TinyInt |
Short | SmallInt |
Integer | Int |
Long | BigInt |
Float | Float |
Double | Double |
Decimal | Decimal |
Date | Date |
Timestamp | Datetime |
VarChar | VarChar |
FixedChar | Char |
String | String |
Doris doesn't support the Gravitino Fixed, Struct, List, Map, Timestamp_tz, IntervalDay, IntervalYear, Union, and UUID types.
Since 0.5.0, data types other than those listed above are mapped to Gravitino's Unparsed Type, which represents an unresolvable data type.
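The mapping table can be expressed as a simple lookup, which may help when checking a table definition before creating it. This is a sketch with hypothetical names that uses plain type-name strings instead of Gravitino's type classes; the entries are copied from the table above.

```java
import java.util.Map;
import java.util.Optional;

// Hypothetical lookup built from the type mapping table.
final class DorisTypeMapping {
    static final Map<String, String> GRAVITINO_TO_DORIS = Map.ofEntries(
        Map.entry("Boolean", "Boolean"),
        Map.entry("Byte", "TinyInt"),
        Map.entry("Short", "SmallInt"),
        Map.entry("Integer", "Int"),
        Map.entry("Long", "BigInt"),
        Map.entry("Float", "Float"),
        Map.entry("Double", "Double"),
        Map.entry("Decimal", "Decimal"),
        Map.entry("Date", "Date"),
        Map.entry("Timestamp", "Datetime"),
        Map.entry("VarChar", "VarChar"),
        Map.entry("FixedChar", "Char"),
        Map.entry("String", "String"));

    // Returns the Doris type name, or an empty Optional for types without a mapping.
    // The argument must not be null.
    static Optional<String> toDoris(String gravitinoType) {
        return Optional.ofNullable(GRAVITINO_TO_DORIS.get(gravitinoType));
    }

    public static void main(String[] args) {
        System.out.println(toDoris("Timestamp").orElse("unsupported")); // prints Datetime
        System.out.println(toDoris("UUID").orElse("unsupported"));      // prints unsupported
    }
}
```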
Table column auto-increment
Unsupported for now.
Table properties
- Doris supports table properties, and you can set them in the table properties.
- Only supports Doris table properties and doesn't support user-defined properties.
Table indexes
- Supports PRIMARY_KEY. Please be aware that the index can only apply to a single column.

Json:

    {
      "indexes": [
        {
          "indexType": "primary_key",
          "name": "PRIMARY",
          "fieldNames": [["id"]]
        }
      ]
    }

Java:

    // A primary key index named "PRIMARY" on the single column "id"
    Index[] indexes = new Index[] {
        Indexes.of(IndexType.PRIMARY_KEY, "PRIMARY", new String[][]{{"id"}})
    };
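The single-column restriction can be checked on the client side before a request is sent. The sketch below assumes that fieldNames has the shape used in the example above, with one inner array per indexed column; the class and method names are hypothetical.

```java
// Hypothetical client-side check, not part of the Gravitino API.
final class PrimaryKeyIndexCheck {
    // Assumption: each inner array of fieldNames names one indexed column,
    // so a single-column index has exactly one non-empty inner array.
    static boolean isSingleColumn(String[][] fieldNames) {
        return fieldNames != null
            && fieldNames.length == 1
            && fieldNames[0] != null
            && fieldNames[0].length > 0;
    }

    public static void main(String[] args) {
        System.out.println(isSingleColumn(new String[][] {{"id"}}));           // prints true
        System.out.println(isSingleColumn(new String[][] {{"id"}, {"name"}})); // prints false
    }
}
```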
Table partitioning
The Doris catalog supports partitioned tables. Users can create partitioned tables in the Doris catalog with specific partitioning attributes, and can also pre-assign partitions when creating Doris tables. Note that although Gravitino supports several partitioning strategies, Apache Doris itself supports only these two:
- RANGE
- LIST
The fieldName specified in the partitioning attributes must be the name of a column defined in the table.
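The two partitioning constraints above (a supported strategy, and a fieldName that matches a table column) can be sketched as a small validation step. The names below are hypothetical and the check works on plain strings, not on Gravitino's partitioning classes.

```java
import java.util.List;
import java.util.Set;

// Hypothetical validation of partitioning attributes for a Doris table.
final class DorisPartitionCheck {
    // Strategies that Doris supports, as listed above.
    static final Set<String> SUPPORTED_STRATEGIES = Set.of("RANGE", "LIST");

    // True when the strategy is supported and the partition field is a table column.
    static boolean isValid(String strategy, String fieldName, List<String> tableColumns) {
        return strategy != null
            && SUPPORTED_STRATEGIES.contains(strategy)
            && tableColumns.contains(fieldName);
    }

    public static void main(String[] args) {
        List<String> columns = List.of("id", "created_at"); // example column names
        System.out.println(isValid("RANGE", "created_at", columns)); // prints true
        System.out.println(isValid("RANGE", "missing", columns));    // prints false
    }
}
```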
Table distribution
Users can also specify the distribution strategy when creating tables in the Doris catalog. Currently, the Doris catalog supports the following distribution strategies:
- HASH
- RANDOM
Gravitino uses EVEN to represent the RANDOM distribution strategy. More information about the distribution strategies defined in Gravitino can be found here.
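The naming difference between the two systems can be captured in a tiny translation function. This sketch uses plain strings and hypothetical names; it only encodes the HASH and RANDOM/EVEN correspondence stated above.

```java
// Hypothetical translation between Doris and Gravitino distribution strategy names.
final class DorisDistributionNames {
    static String toGravitino(String dorisStrategy) {
        switch (dorisStrategy) {
            case "HASH":
                return "HASH";
            case "RANDOM":
                return "EVEN"; // Gravitino represents Doris RANDOM as EVEN
            default:
                throw new IllegalArgumentException(
                    "unsupported Doris distribution strategy: " + dorisStrategy);
        }
    }

    public static void main(String[] args) {
        System.out.println(toGravitino("RANDOM")); // prints EVEN
    }
}
```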
Table operations
Please refer to Manage Relational Metadata Using Gravitino for more details.
Alter table operations
Gravitino supports these table alteration operations:
- RenameTable
- UpdateComment
- AddColumn
- DeleteColumn
- UpdateColumnType
- UpdateColumnPosition
- UpdateColumnComment
- SetProperty
Please be aware that:
- Not all table alteration operations can be processed in batches.
- Schema changes, such as adding, modifying, or dropping columns, can be processed in batches.
- Supports modifying multiple column comments at the same time.
- Doesn't support modifying the column type and column comment at the same time.
- Schema alteration in Doris is asynchronous. You might get an outdated schema if you query the schema immediately after the alteration, so it is recommended to pause briefly first. Gravitino will add the schema alteration status to the schema information in an upcoming version to solve this problem.
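Because the alteration is asynchronous, a caller can re-read the schema a few times with a short pause instead of relying on a single fixed wait. The helper below is a generic sketch with hypothetical names; the load function stands in for whatever call the client uses to fetch the table, and the attempt count and pause are arbitrary example values.

```java
import java.time.Duration;
import java.util.List;
import java.util.function.Predicate;
import java.util.function.Supplier;

// Hypothetical polling helper for waiting until an asynchronous schema change is visible.
final class SchemaChangePoller {
    // Calls load up to maxAttempts times, pausing between attempts,
    // and returns the first result that satisfies the ready predicate.
    static <T> T awaitUntil(Supplier<T> load, Predicate<T> ready, int maxAttempts, Duration pause) {
        for (int attempt = 1; attempt <= maxAttempts; attempt++) {
            T latest = load.get();
            if (ready.test(latest)) {
                return latest;
            }
            if (attempt < maxAttempts) {
                try {
                    Thread.sleep(pause.toMillis());
                } catch (InterruptedException e) {
                    Thread.currentThread().interrupt(); // preserve the interrupt flag
                    throw new IllegalStateException("interrupted while waiting for schema change", e);
                }
            }
        }
        throw new IllegalStateException("schema change not visible after " + maxAttempts + " attempts");
    }

    public static void main(String[] args) {
        // Stand-in for a real schema lookup: the new column is already visible here.
        List<String> columns = awaitUntil(
            () -> List.of("id", "name"),
            cols -> cols.contains("name"),
            5,
            Duration.ofMillis(200));
        System.out.println(columns); // prints [id, name]
    }
}
```

A bounded number of attempts keeps the caller from waiting forever if the alteration fails on the Doris side.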