Apache Gravitino Trino connector development
This document is to guide users through the development of the Apache Gravitino Trino connector for Trino locally.
Prerequisites
Before you start developing the Gravitino Trino connector, you need to have the following prerequisites:
- You need to start the Gravitino server locally, for more information, please refer to the start Gravitino server
- Create a catalog in the Gravitino server, for more information, please refer to the Gravitino metadata management. Assuming we have just created a MySQL catalog using the following command:
curl -X POST -H "Content-Type: application/json" -d '{"name":"test","comment":"comment","properties":{}}' http://localhost:8090/api/metalakes
curl -X POST -H "Content-Type: application/json" -d '{"name":"mysql_catalog3","type":"RELATIONAL","comment":"comment","provider":"jdbc-mysql", "properties":{
"jdbc-url": "jdbc:mysql://127.0.0.1:3306?useSSL=false&allowPublicKeyRetrieval=true",
"jdbc-user": "root",
"jdbc-password": "123456",
"jdbc-driver": "com.mysql.cj.jdbc.Driver"
}}' http://localhost:8090/api/metalakes/test/catalogs
Please change the above localhost
, port
and the names of metalake and catalogs accordingly.
Development environment
To develop the Gravitino Trino connector locally, you need to do the following steps:
IDEA
- Clone the Trino repository from the GitHub repository. The released version Trino-435 is the least version that Gravitino supports.
- Open the Trino project in your IDEA.
- Create a new module for the Gravitino Trino connector in the Trino project as the following picture (we will use the name
trino-gravitino
as the module name in the following steps). - Add a soft link to the Gravitino Trino connector module in the Trino project. Assuming the src java main directory of the Gravitino Trino connector in project Gravitino is
gravitino/path/to/gravitino-trino-connector/src/main/java
, and the src java main directory of trino-gravitino in the Trino project istrino/path/to/trino-gravitino/src/main/java
, you can use the following command to create a soft link:
ln -s gravitino/path/to/trino-connector/src/main/java trino/path/to/trino-gravitino/src/main/java
then you can see the gravitino-trino-connecor
source files and directories in the trino-gravitino
module as follows:
- Add
<module>plugin/trino-gravitino</module>
totrino/pom.xml
and change thepom.xml
file in thetrino-gravitino
module accordingly. This is an example content of thepom.xml
file in thetrino-gravitino
module. Ensure that the version of trino-root is identical to the version of trino.
<?xml version="1.0" encoding="UTF-8"?>
<project xmlns="http://maven.apache.org/POM/4.0.0" xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance" xsi:schemaLocation="http://maven.apache.org/POM/4.0.0 http://maven.apache.org/xsd/maven-4.0.0.xsd">
<modelVersion>4.0.0</modelVersion>
<parent>
<groupId>io.trino</groupId>
<artifactId>trino-root</artifactId>
<version>435</version>
<relativePath>../../pom.xml</relativePath>
</parent>
<artifactId>trino-gravitino</artifactId>
<packaging>trino-plugin</packaging>
<description>Trino - Gravitino Connector</description>
<properties>
<air.main.basedir>${project.parent.basedir}</air.main.basedir>
</properties>
<dependencies>
<!--
You can switch to the snapshot version as you like, for example,
if you want to use the jar of latest main branch,
you can execute the following command to install Gravitino `client-java-runtime` jar locally.
./gradlew publishToMavenLocal
-->
<dependency>
<groupId>org.apache.gravitino</groupId>
<artifactId>catalog-common</artifactId>
<version><GRAVITINO_VERSION></version>
<exclusions>
<exclusion>
<groupId>io.dropwizard.metrics</groupId>
<artifactId>metrics-core</artifactId>
</exclusion>
<exclusion>
<groupId>io.netty</groupId>
<artifactId>netty</artifactId>
</exclusion>
<exclusion>
<groupId>org.apache.logging.log4j</groupId>
<artifactId>log4j-core</artifactId>
</exclusion>
</exclusions>
</dependency>
<dependency>
<groupId>org.apache.gravitino</groupId>
<artifactId>client-java-runtime</artifactId>
<version><GRAVITINO_VERSION></version>
</dependency>
<dependency>
<groupId>io.airlift</groupId>
<artifactId>json</artifactId>
</dependency>
<dependency>
<groupId>io.airlift.resolver</groupId>
<artifactId>resolver</artifactId>
<version>1.6</version>
</dependency>
<dependency>
<groupId>io.trino</groupId>
<artifactId>trino-client</artifactId>
</dependency>
<dependency>
<groupId>io.trino</groupId>
<artifactId>trino-jdbc</artifactId>
</dependency>
<dependency>
<groupId>joda-time</groupId>
<artifactId>joda-time</artifactId>
</dependency>
<dependency>
<groupId>org.apache.commons</groupId>
<artifactId>commons-collections4</artifactId>
<version>4.4</version>
</dependency>
<dependency>
<groupId>org.apache.commons</groupId>
<artifactId>commons-lang3</artifactId>
</dependency>
<dependency>
<groupId>org.codehaus.plexus</groupId>
<artifactId>plexus-xml</artifactId>
<version>4.0.2</version>
</dependency>
<dependency>
<groupId>org.slf4j</groupId>
<artifactId>slf4j-api</artifactId>
<version>2.0.9</version>
</dependency>
<dependency>
<groupId>org.apache.logging.log4j</groupId>
<artifactId>log4j-slf4j2-impl</artifactId>
<version>2.22.0</version>
</dependency>
<dependency>
<groupId>org.apache.logging.log4j</groupId>
<artifactId>log4j-api</artifactId>
<version>2.22.0</version>
</dependency>
<dependency>
<groupId>org.apache.logging.log4j</groupId>
<artifactId>log4j-core</artifactId>
<version>2.22.0</version>
</dependency>
<dependency>
<groupId>com.fasterxml.jackson.core</groupId>
<artifactId>jackson-annotations</artifactId>
<scope>provided</scope>
</dependency>
<dependency>
<groupId>io.opentelemetry</groupId>
<artifactId>opentelemetry-api</artifactId>
<scope>provided</scope>
</dependency>
<dependency>
<groupId>io.trino</groupId>
<artifactId>trino-spi</artifactId>
<scope>provided</scope>
</dependency>
</dependencies>
</project>
- Try to compile module
trino-gravitino
to see if there are any errors.
# build the whole trino project
./mvnw -pl '!core/trino-server-rpm' package -DskipTests -Dair.check.skip-all=true
# build the trino-gravitino module if we change the code in the trino-gravitino module
./mvnw clean -pl 'plugin/trino-gravitino' package -DskipTests -Dcheckstyle.skip -Dair.check.skip-checkstyle=true -DskipTests -Dair.check.skip-all=true
If a compile error occurs due to The following artifacts could not be resolved: org.apache.gravitino:xxx:jar
, which can be resolved by executing ./gradlew publishToMavenLocal
in gravitino beforehand.
- Set up the configuration for the Gravitino Trino connector in the Trino project. You can do as the following picture shows:
The corresponding configuration files are here:
- Gravitino properties file:
gravitino.properties
# the connector name is always 'gravitino'
connector.name=gravitino
# uri of the gravitino server, you need to change it according to your environment
gravitino.uri=http://localhost:8090
# The name of the metalake to which the connector is connected, you need to change it according to your environment
gravitino.metalake=test
- Trino configuration file:
config.properties
#
# WARNING
# ^^^^^^^
# This configuration file is for development only and should NOT be used
# in production. For example configuration, see the Trino documentation.
# sample nodeId to provide consistency across test runs
node.id=ffffffff-ffff-ffff-ffff-ffffffffffff
node.environment=test
node.internal-address=localhost
experimental.concurrent-startup=true
# Default port is 8080, We change it to 8180
http-server.http.port=8180
discovery.uri=http://localhost:8180
exchange.http-client.max-connections=1000
exchange.http-client.max-connections-per-server=1000
exchange.http-client.connect-timeout=1m
exchange.http-client.idle-timeout=1m
scheduler.http-client.max-connections=1000
scheduler.http-client.max-connections-per-server=1000
scheduler.http-client.connect-timeout=1m
scheduler.http-client.idle-timeout=1m
query.client.timeout=5m
query.min-expire-age=30m
# We removed several catalogs that won't be used in Gravitino
plugin.bundles=\
../../plugin/trino-iceberg/pom.xml,\
../../plugin/trino-hive/pom.xml,\
../../plugin/trino-local-file/pom.xml, \
../../plugin/trino-mysql/pom.xml,\
../../plugin/trino-postgresql/pom.xml, \
../../plugin/trino-exchange-filesystem/pom.xml, \
../../plugin/trino-gravitino/pom.xml
node-scheduler.include-coordinator=true
# Note: The Gravitino Trino connector olny supports with The dynamic catalog manager
catalog.management=dynamic
Remove the file /etc/catalogs/xxx.properties
if the corresponding plugin/trino-xxx/pom.xml
is not recorded in the /etc/config.properties
. For the hive plugin, please use plugin/trino-hive/pom.xml
after release version 435. Others should use plugin/trino-hive-hadoop2/pom.xml
.
- Start the Trino server and connect to the Gravitino server.
- If
DevelopmentServer
has started successfully, you can connect to the Trino server using thetrino-cli
and run the following command to see if the Gravitino Trino connector is available:
java -jar trino-cli-429-executable.jar --server localhost:8180
The trino-cli-429-executable.jar
is the Trino CLI jar file, you can download it from the Trino release page. Users can use the version of the Trino CLI jar file according to the version of the Trino server.
- If nothing goes wrong, you can start developing the Gravitino Trino connector in the Gravitino project and debug it in the Trino project.