Version: 0.7.0-incubating

Apache Doris catalog

Introduction

Apache Gravitino can manage Apache Doris metadata through a JDBC connection.

caution

Gravitino stores some system information in schema and table comments, for example (From Gravitino, DO NOT EDIT: gravitino.v1.uid1078334182909406185). Don't change or remove this message.
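If a comment has to be edited outside Gravitino, the marker should survive the edit. The following is a minimal sketch of that idea. The regular expression is inferred from the single example above, and the helper name replace_comment_text is hypothetical, so treat both as assumptions rather than the format Gravitino guarantees.

```python
import re

# Assumption: the marker always looks like the example in the caution above.
GRAVITINO_MARKER = re.compile(
    r"\(From Gravitino, DO NOT EDIT: gravitino\.v1\.uid\d+\)"
)


def replace_comment_text(old_comment, new_text):
    """Return new_text with the marker found in old_comment re-appended.

    If old_comment carries no marker, new_text is returned unchanged.
    """
    match = GRAVITINO_MARKER.search(old_comment)
    if match is None:
        return new_text
    return f"{new_text} {match.group(0)}"
```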

Catalog

Catalog capabilities

  • Gravitino catalog corresponds to the Doris instance.
  • Supports metadata management of Doris (1.2.x).
  • Supports table index.
  • Supports column default value.

Catalog properties

You can pass any property that Gravitino doesn't define to the Doris data source by adding the gravitino.bypass. prefix to it as a catalog property. For example, the catalog property gravitino.bypass.maxWaitMillis passes maxWaitMillis to the data source.

You can check the relevant data source configuration in data source properties for more details.
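The prefix rule can be illustrated with a short sketch. It only mimics the described behavior on a plain dictionary; the function name data_source_properties is hypothetical and this is not Gravitino's own implementation.

```python
BYPASS_PREFIX = "gravitino.bypass."


def data_source_properties(catalog_properties):
    """Keep only bypass properties and strip the prefix from their names."""
    return {
        key[len(BYPASS_PREFIX):]: value
        for key, value in catalog_properties.items()
        if key.startswith(BYPASS_PREFIX)
    }


catalog_properties = {
    "jdbc-user": "root",
    "gravitino.bypass.maxWaitMillis": "5000",
}
print(data_source_properties(catalog_properties))
```

Properties without the prefix, such as jdbc-user here, are left for Gravitino itself to interpret.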

Besides the common catalog properties, the Doris catalog has the following properties:

| Configuration item | Description | Default value | Required | Since Version |
|--------------------|-------------|---------------|----------|---------------|
| jdbc-url | JDBC URL for connecting to the database. For example, jdbc:mysql://localhost:9030 | (none) | Yes | 0.5.0 |
| jdbc-driver | The driver of the JDBC connection. For example, com.mysql.jdbc.Driver. | (none) | Yes | 0.5.0 |
| jdbc-user | The JDBC user name. | (none) | Yes | 0.5.0 |
| jdbc-password | The JDBC password. | (none) | Yes | 0.5.0 |
| jdbc.pool.min-size | The minimum number of connections in the pool. 2 by default. | 2 | No | 0.5.0 |
| jdbc.pool.max-size | The maximum number of connections in the pool. 10 by default. | 10 | No | 0.5.0 |
| replication_num | The number of replicas for the table. If not specified and the number of backend servers is less than 3, the default value is 1. If not specified and the number of backend servers is greater than or equal to 3, the Doris server default (3) is used. For more, please see the doc | 1 or 3 | No | 0.6.0-incubating |
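As a rough illustration of the table above, the sketch below checks the required properties, fills in the documented pool defaults, and reproduces the replication_num fallback rule. The function names are hypothetical, and the code only restates the table; Gravitino and Doris perform these steps internally.

```python
REQUIRED = ("jdbc-url", "jdbc-driver", "jdbc-user", "jdbc-password")
POOL_DEFAULTS = {"jdbc.pool.min-size": "2", "jdbc.pool.max-size": "10"}


def resolve_catalog_properties(props):
    """Reject missing required keys, then apply the documented pool defaults."""
    missing = [key for key in REQUIRED if key not in props]
    if missing:
        raise ValueError(f"missing required catalog properties: {missing}")
    # Explicit values in props override the defaults.
    return {**POOL_DEFAULTS, **props}


def default_replication_num(backend_count):
    """Documented fallback used when replication_num is not specified."""
    return 1 if backend_count < 3 else 3


props = resolve_catalog_properties({
    "jdbc-url": "jdbc:mysql://localhost:9030",
    "jdbc-driver": "com.mysql.jdbc.Driver",
    "jdbc-user": "root",          # placeholder credentials
    "jdbc-password": "secret",
})
print(props["jdbc.pool.max-size"], default_replication_num(backend_count=1))
```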

Before using the Doris catalog, you must download the corresponding JDBC driver to the catalogs/jdbc-doris/libs directory. Gravitino doesn't package the JDBC driver for Doris due to licensing issues.

Catalog operations

Refer to Manage Relational Metadata Using Gravitino for more details.

Schema

Schema capabilities

  • Gravitino's schema concept corresponds to the Doris database.
  • Supports creating schema.
  • Supports dropping schema.

Schema properties

  • Supports schema properties, including Doris database properties and user-defined properties.

Schema operations

Please refer to Manage Relational Metadata Using Gravitino for more details.

Table

Table capabilities

  • Gravitino's table concept corresponds to the Doris table.
  • Supports index.
  • Supports column default value.

Table column types

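| Gravitino Type | Doris Type |
|----------------|------------|
| Boolean | Boolean |
| Byte | TinyInt |
| Short | SmallInt |
| Integer | Int |
| Long | BigInt |
| Float | Float |
| Double | Double |
| Decimal | Decimal |
| Date | Date |
| Timestamp | Datetime |
| VarChar | VarChar |
| FixedChar | Char |
| String | String |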

Doris doesn't support the Gravitino Fixed, Struct, List, Map, Timestamp_tz, IntervalDay, IntervalYear, Union, and UUID types. Since 0.5.0, Doris data types other than those listed above are mapped to Gravitino's Unparsed type, which represents an unresolvable data type.
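The mapping above can be written as a lookup table. The sketch below is only an illustration of the documented rules, with hypothetical function names; it ignores type parameters such as length or precision and is not how Gravitino converts types internally.

```python
GRAVITINO_TO_DORIS = {
    "Boolean": "Boolean", "Byte": "TinyInt", "Short": "SmallInt",
    "Integer": "Int", "Long": "BigInt", "Float": "Float",
    "Double": "Double", "Decimal": "Decimal", "Date": "Date",
    "Timestamp": "Datetime", "VarChar": "VarChar", "FixedChar": "Char",
    "String": "String",
}
UNSUPPORTED = {
    "Fixed", "Struct", "List", "Map", "Timestamp_tz",
    "IntervalDay", "IntervalYear", "Union", "UUID",
}
# Reverse lookup, keyed by the lower-cased Doris type name.
DORIS_TO_GRAVITINO = {d.lower(): g for g, d in GRAVITINO_TO_DORIS.items()}


def to_doris_type(gravitino_type):
    """Map a Gravitino type name to Doris, rejecting unsupported types."""
    if gravitino_type in UNSUPPORTED:
        raise ValueError(f"Doris doesn't support Gravitino type {gravitino_type}")
    return GRAVITINO_TO_DORIS[gravitino_type]


def to_gravitino_type(doris_type):
    """Map a Doris type name to Gravitino; unlisted types become Unparsed."""
    base = doris_type.split("(")[0].strip().lower()  # drop "(10,2)" etc.
    return DORIS_TO_GRAVITINO.get(base, "Unparsed")
```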

Table column auto-increment

Unsupported for now.

Table properties

  • Doris supports table properties; set them through the table properties in Gravitino.
  • Only Doris table properties are supported. User-defined properties are not.

Table indexes

  • Supports PRIMARY_KEY

    Please be aware that the index can only apply to a single column.

    {
      "indexes": [
        {
          "indexType": "primary_key",
          "name": "PRIMARY",
          "fieldNames": [["id"]]
        }
      ]
    }
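The single-column restriction can be checked before a request is sent. This is a minimal sketch that works on the JSON structure shown above; the function name check_single_column_index is hypothetical and the check is a client-side convenience, not part of Gravitino.

```python
import json


def check_single_column_index(index):
    """Return the index unchanged if it names exactly one column."""
    fields = index.get("fieldNames", [])
    if len(fields) != 1:
        raise ValueError(
            f"index {index.get('name')!r} must apply to a single column, "
            f"got {len(fields)}"
        )
    return index


payload = {
    "indexes": [
        check_single_column_index(
            {"indexType": "primary_key", "name": "PRIMARY", "fieldNames": [["id"]]}
        )
    ]
}
print(json.dumps(payload, indent=2))
```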

Table partitioning

The Doris catalog supports partitioned tables. Users can create partitioned tables with specific partitioning attributes and can pre-assign partitions when creating Doris tables. Although Gravitino supports several partitioning strategies, Apache Doris supports only these two:

  • RANGE
  • LIST

caution

The fieldName specified in the partitioning attributes must be the name of a column defined in the table.
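Both constraints, the supported strategies and the column-name rule, can be validated up front. The sketch below is an assumption-laden illustration: the function name check_partitioning and its arguments are hypothetical, and it does not reproduce the request format Gravitino expects.

```python
SUPPORTED_STRATEGIES = {"range", "list"}


def check_partitioning(strategy, field_names, table_columns):
    """Validate a partitioning choice against the rules described above.

    Returns the normalized (strategy, field_names) pair or raises ValueError.
    """
    normalized = strategy.lower()
    if normalized not in SUPPORTED_STRATEGIES:
        raise ValueError(f"Doris supports only RANGE and LIST, got {strategy!r}")
    unknown = [name for name in field_names if name not in table_columns]
    if unknown:
        raise ValueError(f"partition fields are not table columns: {unknown}")
    return normalized, list(field_names)


print(check_partitioning("RANGE", ["dt"], table_columns=["id", "dt", "city"]))
```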

Table distribution

Users can also specify the distribution strategy when creating tables in the Doris catalog. Currently, the Doris catalog supports the following distribution strategies:

  • HASH
  • RANDOM

Gravitino represents the RANDOM distribution strategy as EVEN. More information about the distribution strategy defined in Gravitino can be found here.
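The naming difference between the two systems amounts to a small translation table. The sketch below shows it with a hypothetical helper, gravitino_strategy; it covers only the two strategies listed above.

```python
# Doris strategy name -> name Gravitino uses for it.
DORIS_TO_GRAVITINO_DISTRIBUTION = {"HASH": "HASH", "RANDOM": "EVEN"}


def gravitino_strategy(doris_strategy):
    """Translate a Doris distribution strategy to its Gravitino name."""
    try:
        return DORIS_TO_GRAVITINO_DISTRIBUTION[doris_strategy.upper()]
    except KeyError:
        raise ValueError(
            f"unsupported distribution strategy: {doris_strategy!r}"
        ) from None


print(gravitino_strategy("random"))
```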

Table operations

Please refer to Manage Relational Metadata Using Gravitino for more details.

Alter table operations

Gravitino supports these table alteration operations:

  • RenameTable
  • UpdateComment
  • AddColumn
  • DeleteColumn
  • UpdateColumnType
  • UpdateColumnPosition
  • UpdateColumnComment
  • SetProperty

Please be aware that:

  • Not all table alteration operations can be processed in batches.
  • Schema changes, such as adding, modifying, or dropping columns, can be processed in batches.
  • Supports modifying multiple column comments at the same time.
  • Doesn't support modifying the column type and column comment at the same time.
  • Schema alteration in Doris is asynchronous. A schema query issued immediately after the alteration might return an outdated schema, so pause briefly after altering the schema. An upcoming Gravitino version will add the schema alteration status to the schema information to address this.
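One way to handle the asynchronous behavior is to poll until the expected columns appear instead of sleeping for a fixed time. The following is a minimal sketch under stated assumptions: fetch_columns is a hypothetical callable that you supply (for example, one that loads the table through Gravitino and returns its column names), and the timeout values are arbitrary.

```python
import time


def wait_for_columns(fetch_columns, expected, timeout=30.0, interval=1.0):
    """Poll fetch_columns() until every expected column name is present.

    Returns True once all expected columns are visible, or False if the
    timeout elapses first.
    """
    deadline = time.monotonic() + timeout
    wanted = set(expected)
    while True:
        if wanted <= set(fetch_columns()):
            return True
        if time.monotonic() >= deadline:
            return False
        time.sleep(interval)
```

A caller would issue the alteration, then call wait_for_columns with the names of the newly added columns before reading the schema.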