How to install Apache Gravitino
Install Apache Gravitino from scratch
Apache Gravitino runs on Java 17. Later Java versions should also work but are not fully tested. Make sure Java is installed and
JAVA_HOME is configured correctly. To confirm the Java version, run the
${JAVA_HOME}/bin/java -version command.
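The version check above can be scripted. The following is a minimal sketch, not part of the Gravitino distribution: the helper name java_major_version is hypothetical, and it assumes the first line of the java -version output contains the version in double quotes, as OpenJDK and Oracle JDK builds usually print it.

```shell
#!/usr/bin/env bash
# Hypothetical helper: extract the major version from the first line of
# `java -version` output. Handles the legacy "1.8.0_392" scheme and the
# modern "17.0.9" scheme.
java_major_version() {
  local version_line="$1"
  local version
  # The version string is the first double-quoted token on the line.
  version=$(printf '%s\n' "$version_line" | sed -n 's/^[^"]*"\([^"]*\)".*$/\1/p')
  case "$version" in
    1.*) version="${version#1.}"; printf '%s\n' "${version%%.*}" ;;
    *)   printf '%s\n' "${version%%[.+-]*}" ;;
  esac
}

# Run the check only when JAVA_HOME points to an executable java binary.
if [ -n "${JAVA_HOME:-}" ] && [ -x "${JAVA_HOME}/bin/java" ]; then
  # `java -version` writes to stderr, so redirect it into the pipe.
  major=$(java_major_version "$("${JAVA_HOME}/bin/java" -version 2>&1 | head -n 1)")
  if [ "${major:-0}" -ge 17 ]; then
    echo "Java ${major} detected: OK"
  else
    echo "Java ${major:-unknown} detected: Gravitino needs Java 17 or later" >&2
  fi
else
  echo "JAVA_HOME is not set or does not point to a Java installation" >&2
fi
```

Parsing the major version separately keeps the comparison numeric, so Java 21 passes while Java 8 or 11 is reported as too old.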
The Gravitino package comprises both the Gravitino server and the Gravitino Iceberg REST server. You can manage these servers independently or run them concurrently on a single server.
Get the Apache Gravitino binary distribution package
Before installing Gravitino, make sure you have the Gravitino binary distribution package. You can download the latest Gravitino binary distribution package from GitHub. You can also build it yourself by following the instructions in How to Build Gravitino.
- If you build Gravitino yourself using the ./gradlew compileDistribution command, you can find the Gravitino binary distribution package in the distribution/package and distribution/package-all directories. The package-all package contains all catalogs, including those under the catalogs-contrib folder, while the package package contains only the main catalogs under the catalogs folder.
- If you build Gravitino yourself using the ./gradlew assembleDistribution command, you get the compressed Gravitino binary distribution package gravitino-<version>-bin.tar.gz in the distribution directory, together with the SHA-256 checksum file gravitino-<version>-bin.tar.gz.sha256. You also get the complete compressed package gravitino-<version>-bin-all.tar.gz in the same directory, together with the SHA-256 checksum file gravitino-<version>-bin-all.tar.gz.sha256. The -all package contains all catalogs, including those under the catalogs-contrib folder, while the normal package contains only the main catalogs under the catalogs folder.
Note: Apache Gravitino publishes only the gravitino-<version>-bin.tar.gz packages on GitHub releases. The gravitino-<version>-bin-all.tar.gz packages are available only to users who build Gravitino from source.
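Before unpacking a tarball, it is worth checking it against its .sha256 file. The sketch below makes several assumptions: the helper name verify_sha256 and the GRAVITINO_VERSION variable are hypothetical, the sha256sum tool from GNU coreutils is available (on macOS, shasum -a 256 is the usual substitute), and the checksum file starts with the hex digest, optionally followed by a file name.

```shell
#!/usr/bin/env bash
# Hypothetical helper: succeed only if the tarball's SHA-256 digest matches
# the digest stored in the accompanying checksum file.
verify_sha256() {
  local tarball="$1"
  local checksum_file="${2:-$1.sha256}"
  local expected actual
  # Take the first whitespace-separated field so that both "<hash>" and
  # "<hash>  <filename>" layouts are accepted.
  expected=$(awk 'NR == 1 { print $1 }' "$checksum_file") || return 1
  actual=$(sha256sum "$tarball" | awk '{ print $1 }') || return 1
  [ -n "$expected" ] && [ "$expected" = "$actual" ]
}

# Runs only when GRAVITINO_VERSION (an assumed variable) is set.
if [ -n "${GRAVITINO_VERSION:-}" ]; then
  package="gravitino-${GRAVITINO_VERSION}-bin.tar.gz"
  if verify_sha256 "$package"; then
    tar -xzf "$package"
  else
    echo "Checksum mismatch or missing file: ${package}" >&2
  fi
fi
```

Comparing only the digest field, rather than calling sha256sum -c, avoids depending on the exact layout of the checksum file, which this page does not specify.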
The Gravitino binary distribution package contains the following files:
|── ...
└── distribution/package
  |── bin/
  |   ├── gravitino.sh                        # Gravitino server launch script.
  |   └── gravitino-iceberg-rest-server.sh    # Gravitino Iceberg REST server launch script.
  |── catalogs
  |   ├── fileset/                            # Fileset catalog dependencies and configurations.
  |   ├── hive/                               # Apache Hive catalog dependencies and configurations.
  |   ├── jdbc-doris/                         # JDBC Apache Doris catalog dependencies and configurations.
  |   ├── jdbc-mysql/                         # JDBC MySQL catalog dependencies and configurations.
  |   ├── jdbc-starrocks/                     # JDBC StarRocks catalog dependencies and configurations.
  |   ├── jdbc-postgresql/                    # JDBC PostgreSQL catalog dependencies and configurations.
  |   ├── jdbc-hudi/                          # Hudi catalog dependencies and configurations.
  |   ├── kafka/                              # Apache Kafka catalog dependencies and configurations.
  |   ├── lakehouse-iceberg/                  # Apache Iceberg catalog dependencies and configurations.
  |   ├── lakehouse-paimon/                   # Apache Paimon catalog dependencies and configurations.
  |   └── model/                              # Model catalog dependencies and configurations.
  |── conf/                                   # All configurations for Gravitino.
  |   ├── gravitino.conf                      # Gravitino server and Gravitino Iceberg REST server configuration.
  |   ├── gravitino-iceberg-rest-server.conf  # Gravitino Iceberg REST server configuration.
  |   ├── gravitino-env.sh                    # Environment variables, e.g., JAVA_HOME, GRAVITINO_HOME, and more.
  |   └── log4j2.properties                   # log4j configuration for the Gravitino server and Gravitino Iceberg REST server.
  |── libs/                                   # Gravitino server dependency libraries.
  |── logs/                                   # Gravitino server and Gravitino Iceberg REST server logs. Automatically created after the server starts.
  |── data/                                   # Default directory for the Gravitino server to store data.
  |── iceberg-rest-server/                    # Gravitino Iceberg REST server package and dependency libraries.
  └── scripts/                                # Extra scripts for Gravitino.
Since 1.2.0, the OceanBase and ClickHouse catalogs are not included in the Gravitino binary distribution package shown above by default, due to package size limitations and license compatibility issues.
To use these two catalogs, build the Gravitino binary distribution package yourself and use the gravitino-<version>-bin-all.tar.gz tarball, which contains all catalogs, including those in the catalogs-contrib module.
For more details, refer to Reorg catalogs structure.
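To see which catalogs a given package actually bundles, you can list the subdirectories of its catalogs folder. This is a minimal sketch: the helper name list_bundled_catalogs is hypothetical, and it assumes each catalog lives in its own directory under catalogs/, as in the layout shown earlier.

```shell
#!/usr/bin/env bash
# Hypothetical helper: print the name of every catalog directory found
# under <package_dir>/catalogs, one per line.
list_bundled_catalogs() {
  local package_dir="$1"
  local dir
  for dir in "$package_dir"/catalogs/*/; do
    # Skip the unexpanded glob pattern when no directory matches.
    [ -d "$dir" ] || continue
    basename "$dir"
  done
}

# Runs only when GRAVITINO_HOME is set and contains a catalogs folder.
if [ -n "${GRAVITINO_HOME:-}" ] && [ -d "${GRAVITINO_HOME}/catalogs" ]; then
  list_bundled_catalogs "$GRAVITINO_HOME"
fi
```

Running it against both the normal and the -all package makes the difference between the two visible without unpacking anything by hand.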
Initialize the RDBMS (Optional)
To use the relational backend storage, initialize the RDBMS first. For details on initializing the RDBMS, see How to use relational backend storage.
Configure the Apache Gravitino server
The Gravitino server configuration file is conf/gravitino.conf. You configure the Gravitino server by modifying this file, which already contains the basic settings. All configuration options are listed in Gravitino Server Configurations.
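Before editing the file, it can help to see which settings are currently active. The sketch below assumes that conf/gravitino.conf uses key = value lines with # for comments. The helper name list_active_settings is hypothetical and is not shipped with Gravitino.

```shell
#!/usr/bin/env bash
# Hypothetical helper: print the active (non-comment, non-blank) settings
# of a configuration file in normalized "key=value" form.
list_active_settings() {
  local conf_file="$1"
  # Drop blank lines and comment lines, then trim surrounding whitespace
  # and the whitespace around the first "=" on each line.
  grep -Ev '^[[:space:]]*(#|$)' "$conf_file" \
    | sed -E 's/^[[:space:]]+//; s/[[:space:]]*=[[:space:]]*/=/; s/[[:space:]]+$//'
}

# Runs only when GRAVITINO_HOME is set and the configuration file exists.
if [ -n "${GRAVITINO_HOME:-}" ] && [ -f "${GRAVITINO_HOME}/conf/gravitino.conf" ]; then
  list_active_settings "${GRAVITINO_HOME}/conf/gravitino.conf"
fi
```

Saving the output before and after a change and comparing the two with diff gives a quick record of what was modified.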