Version: 0.6.1-incubating

Flink connector hive catalog

With the Apache Gravitino Flink connector, you can access data and manage metadata in Hive catalogs, and run federated queries across different Hive catalogs.

Capabilities

The connector supports most DDL and DML operations in Flink SQL, except the following:

  • Function operations
  • Partition operations
  • View operations
  • Querying UDFs
  • LOAD clause
  • UNLOAD clause
  • CREATE TABLE LIKE clause
  • TRUNCATE TABLE clause
  • UPDATE clause
  • DELETE clause
  • CALL clause

Requirements

  • Hive metastore 2.x
  • HDFS 2.x or 3.x

SQL example


-- Suppose hive_a is the Hive catalog name managed by Gravitino
USE hive_a;

CREATE DATABASE IF NOT EXISTS mydatabase;
USE mydatabase;

-- Create table
CREATE TABLE IF NOT EXISTS employees (
    id INT,
    name STRING,
    date INT
)
PARTITIONED BY (date);

DESC TABLE EXTENDED employees;

INSERT INTO TABLE employees VALUES (1, 'John Doe', 20240101), (2, 'Jane Smith', 20240101);
SELECT * FROM employees WHERE date = 20240101;

Catalog properties

The Gravitino Flink Hive connector takes the same configuration as the original Flink Hive connector. Gravitino catalog properties whose names start with the prefix flink.bypass. are passed to the Flink Hive connector. For example, use flink.bypass.hive-conf-dir to pass hive-conf-dir to the Flink Hive connector. The validated catalog properties are listed below. The Gravitino Flink connector ignores any other Gravitino catalog property with the flink.bypass. prefix.

| Property name in Gravitino catalog properties | Flink Hive connector configuration | Description | Since Version |
|---|---|---|---|
| flink.bypass.default-database | default-database | Hive default database | 0.6.0 |
| flink.bypass.hive-conf-dir | hive-conf-dir | Hive conf dir | 0.6.0 |
| flink.bypass.hive-version | hive-version | Hive version | 0.6.0 |
| flink.bypass.hadoop-conf-dir | hadoop-conf-dir | Hadoop conf dir | 0.6.0 |
| metastore.uris | hive.metastore.uris | Hive metastore uri | 0.6.0 |

Caution

You can also set other Hadoop properties (with the prefix hadoop., dfs., fs., or hive.) in the Gravitino catalog properties. If set, they override the configuration from hive-conf-dir and hadoop-conf-dir.
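
The property rules in this section can be sketched in a few lines of Python. This is an illustration only, not the connector's actual implementation: the function names to_flink_options and effective_hadoop_conf, the host name, the paths, and the flink.bypass.some-unlisted-option key are all hypothetical. The mapping is copied from the property table in this section, and the sketch assumes that the catalog-level hadoop., dfs., fs., and hive. properties are simply overlaid on the values read from the conf directories.

```python
# Illustrative sketch only; NOT the Gravitino Flink connector's real code.
# The mapping mirrors the validated catalog properties listed in this section.
GRAVITINO_TO_FLINK = {
    "flink.bypass.default-database": "default-database",
    "flink.bypass.hive-conf-dir": "hive-conf-dir",
    "flink.bypass.hive-version": "hive-version",
    "flink.bypass.hadoop-conf-dir": "hadoop-conf-dir",
    "metastore.uris": "hive.metastore.uris",
}

# Prefixes of Hadoop/Hive properties that may be set directly on the catalog.
OVERRIDE_PREFIXES = ("hadoop.", "dfs.", "fs.", "hive.")


def to_flink_options(catalog_props):
    """Translate Gravitino catalog properties into Flink Hive connector options."""
    return {
        GRAVITINO_TO_FLINK[key]: value
        for key, value in catalog_props.items()
        if key in GRAVITINO_TO_FLINK  # unlisted flink.bypass.* keys are dropped
    }


def effective_hadoop_conf(conf_dir_settings, catalog_props):
    """Overlay catalog-level Hadoop/Hive properties on the conf-dir settings."""
    merged = dict(conf_dir_settings)
    for key, value in catalog_props.items():
        if key.startswith(OVERRIDE_PREFIXES):
            merged[key] = value  # the catalog property wins (assumed overlay)
    return merged


catalog_props = {
    "metastore.uris": "thrift://metastore-host:9083",  # hypothetical host
    "flink.bypass.hive-conf-dir": "/etc/hive/conf",    # hypothetical path
    "flink.bypass.some-unlisted-option": "x",          # hypothetical, ignored
    "dfs.replication": "2",
}

flink_options = to_flink_options(catalog_props)
hadoop_conf = effective_hadoop_conf({"dfs.replication": "3"}, catalog_props)
print(flink_options)
print(hadoop_conf)
```

In this sketch the unlisted flink.bypass. key never reaches the Flink options, and dfs.replication from the catalog properties replaces the value that would otherwise come from the conf directory.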