User Docker images
Apache Gravitino Docker image
You can deploy the service with the Gravitino Docker image.
Container startup commands
docker run --rm -d -p 8090:8090 -p 9001:9001 apache/gravitino:0.7.0-incubating
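The command above pins the 0.7.0-incubating tag, while the changelog lists several other tags. As a minimal sketch, the tag can be passed as a parameter. The helper name `gravitino_run_cmd` is hypothetical and not part of Gravitino, and it only prints the command so that it can be reviewed before running.

```shell
# Hypothetical helper (not part of Gravitino): print the startup command for a tag.
# Port 8090 serves the Gravitino Web UI and port 9001 the Iceberg REST service.
gravitino_run_cmd() {
  tag="$1"
  echo "docker run --rm -d -p 8090:8090 -p 9001:9001 apache/gravitino:${tag}"
}

# Print the command for the 0.8.0-incubating image listed in the changelog.
gravitino_run_cmd "0.8.0-incubating"

# To actually start the container, evaluate the printed command, for example:
# eval "$(gravitino_run_cmd 0.8.0-incubating)"
```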
Changelog
-
apache/gravitino:0.8.0-incubating
- Based on Gravitino 0.8.0-incubating. For more information, see the 0.8.0-incubating release notes.
-
apache/gravitino:0.7.0-incubating
- Based on Gravitino 0.7.0-incubating. For more information, see the 0.7.0-incubating release notes.
- Place bundle jars (gravitino-aws-bundle.jar, gravitino-gcp-bundle.jar, gravitino-aliyun-bundle.jar) in the ${GRAVITINO_HOME}/catalogs/hadoop/libs folder to support the cloud storage catalog without manually adding the jars to the classpath.
-
apache/gravitino:0.6.1-incubating
- Based on Gravitino 0.6.1-incubating. For more information, see the 0.6.1-incubating release notes.
-
apache/gravitino:0.6.0-incubating (Switch to Apache official DockerHub repository)
- Use the latest Gravitino version 0.6.0-incubating source code to build the image.
-
datastrato/gravitino:0.5.1
- Based on Gravitino 0.5.1. For more information, see the 0.5.1 release notes.
-
datastrato/gravitino:0.5.0
- Based on Gravitino 0.5.0. For more information, see the 0.5.0 release notes.
-
datastrato/gravitino:0.4.0
- Based on Gravitino 0.4.0. For more information, see the 0.4.0 release notes.
-
datastrato/gravitino:0.3.1
- Fix some issues
-
datastrato/gravitino:0.3.0
- Docker image datastrato/gravitino:0.3.0
- Gravitino Server
- Expose ports:
  - 8090: Gravitino Web UI
  - 9001: Iceberg REST service
Apache Gravitino Iceberg REST server Docker image
You can deploy the standalone Gravitino Iceberg REST server with the Docker image.
Container startup commands
docker run --rm -d -p 9001:9001 apache/gravitino-iceberg-rest:0.7.0-incubating
Changelog
-
apache/gravitino-iceberg-rest:0.8.0-incubating
- Supports OSS and ADLS storage.
- Supports event listener.
- Supports audit log.
-
apache/gravitino-iceberg-rest:0.7.0-incubating
- Uses the JDBC catalog backend.
- Supports S3 and GCS storage.
- Supports credential vending.
- Supports changing the configuration through environment variables.
-
apache/gravitino-iceberg-rest:0.6.1-incubating
- Based on Gravitino 0.6.1-incubating. For more information, see the 0.6.1-incubating release notes.
-
apache/gravitino-iceberg-rest:0.6.0-incubating
- Gravitino Iceberg REST Server with memory catalog backend.
- Expose ports:
  - 9001: Iceberg REST service
Playground Docker image
You can use the playground to try out the whole Gravitino system together with other components.
The playground consists of multiple Docker images, each preconfigured so that you can try it out without further setup.
Apache Hive image
Changelog
-
apache/gravitino-playground:hive-2.7.3 (Switch to Apache official DockerHub repository)
- Use the datastrato/hive:2.7.3-no-yarn Dockerfile to rebuild the image.
-
datastrato/hive:2.7.3-no-yarn
- Docker image datastrato/hive:2.7.3-no-yarn
- hadoop-2.7.3
- hive-2.3.9
- Does not start YARN when the container starts
Trino image
Changelog
-
apache/gravitino-playground:trino-435-gravitino-0.8.0-incubating
- Use Gravitino release 0.8.0-incubating Dockerfile to build the image.
-
apache/gravitino-playground:trino-435-gravitino-0.7.0-incubating
- Use Gravitino release 0.7.0-incubating Dockerfile to build the image.
-
apache/gravitino-playground:trino-435-gravitino-0.6.1-incubating
- Use Gravitino release 0.6.1-incubating Dockerfile to build the image.
-
apache/gravitino-playground:trino-435-gravitino-0.6.0-incubating (Switch to Apache official DockerHub repository)
- Use Gravitino release 0.6.0 Dockerfile to build the image.
-
datastrato/trino:435-gravitino-0.5.1
- Based on Gravitino 0.5.1. For more information, see the 0.5.1 release notes.
-
datastrato/trino:426-gravitino-0.5.0
- Based on Gravitino 0.5.0. For more information, see the 0.5.0 release notes.
-
datastrato/trino:426-gravitino-0.4.0
- Based on Gravitino 0.4.0. For more information, see the 0.4.0 release notes.
-
datastrato/trino:426-gravitino-0.3.1
- Fix some issues
-
datastrato/trino:426-gravitino-0.3.0
- Docker image datastrato/trino:426-gravitino-0.3.0
- Based on trino:426
- Added the Gravitino trino-connector-0.3.0 libraries to /usr/lib/trino/plugin/gravitino
Developer Docker images
You can use these Docker images for integration testing of the catalog and connector modules within Gravitino.
Apache Gravitino CI Apache Hive image with Kerberos enabled
You can use this image to test the Apache Hive catalog with Kerberos enabled.
Changelog
-
apache/gravitino-ci:kerberos-hive-0.1.5 (Switch to Apache official DockerHub repository)
- Use Gravitino release 0.6.0 Dockerfile to build the image.
-
datastrato/gravitino-ci-kerberos-hive:0.1.5
- Start another HMS for the Hive cluster in the container with port 19083. This is to test whether Kerberos authentication works for a Kerberos-enabled Hive cluster with multiple HMS.
- Refresh SSH keys in the startup script.
- Add test logic to log in to localhost via SSH without a password.
-
datastrato/gravitino-ci-kerberos-hive:0.1.4
- Increase the total check time for the status of DataNode to 150s.
- Output the DataNode log if the DataNode fails to start.
-
datastrato/gravitino-ci-kerberos-hive:0.1.3
- Add more proxy users in the core-site.xml file.
- Fix bugs in the start.sh script.
-
datastrato/gravitino-ci-kerberos-hive:0.1.2
- Add ${HOSTNAME} >> /root/.ssh/known_hosts to the startup script.
- Add a check for the status of the DataNode: if the DataNode is not running or ready within 100s, the container exits.
-
datastrato/gravitino-ci-kerberos-hive:0.1.1
- Add a principal for Gravitino web server named 'HTTP/localhost@HADOOPKRB'.
- Fix bugs in the configuration of proxy users.
-
datastrato/gravitino-ci-kerberos-hive:0.1.0
- Set up a Hive cluster with Kerberos enabled.
- Install a KDC server and create a principal for Hive. For more details, see kerberos-hive.
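Several entries above describe a startup check that waits for the DataNode and exits the container if it is not ready within a time limit (100s, later raised to 150s). The following is a minimal sketch of that kind of bounded wait. It is not the image's actual start.sh logic. The `wait_until` helper is hypothetical, and the demo uses `true` as a stand-in for a real readiness check.

```shell
# Hypothetical helper: wait_until TIMEOUT_SECONDS INTERVAL_SECONDS COMMAND...
# Re-runs COMMAND until it succeeds; returns 1 once TIMEOUT_SECONDS have elapsed.
wait_until() {
  timeout="$1"
  interval="$2"
  shift 2
  elapsed=0
  while ! "$@"; do
    if [ "$elapsed" -ge "$timeout" ]; then
      return 1
    fi
    sleep "$interval"
    elapsed=$((elapsed + interval))
  done
  return 0
}

# Demo with a check that succeeds immediately; a startup script would pass its
# own DataNode check and exit on failure, e.g.:
#   wait_until 150 5 my_datanode_check || exit 1
wait_until 3 1 true && echo "ready"
```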
Apache Gravitino CI Apache Hive image
You can use this kind of image to test the catalog of Apache Hive.
Changelog
-
apache/gravitino-ci:hive-0.1.17
- Add support for JDBC SQL standard authorization
  - Add JDBC SQL standard authorization related configuration in hive-site-for-sql-base-auth.xml and hiveserver2-site-for-sql-base-auth.xml
-
apache/gravitino-ci:hive-0.1.16
- Add GCS related configuration in the hive-site.xml file.
- Add the GCS bundle jar in ${HADOOP_HOME}/share/hadoop/common/lib/
-
apache/gravitino-ci:hive-0.1.15
- Add Azure Blob Storage (ADLS) related configurations in the hive-site.xml file.
-
apache/gravitino-ci:hive-0.1.14
- Add Amazon S3 related configurations in the hive-site.xml file.
  - fs.s3a.access.key: The access key for the S3 bucket.
  - fs.s3a.secret.key: The secret key for the S3 bucket.
  - fs.s3a.endpoint: The endpoint for the S3 bucket.
-
apache/gravitino-ci:hive-0.1.13 (Switch to Apache official DockerHub repository)
- Use Gravitino release 0.6.0 Dockerfile to build the image.
-
datastrato/gravitino-ci-hive:0.1.13
- Support Hive 2.3.9 and HDFS 2.7.3
  - Docker environment variables:
    - HIVE_RUNTIME_VERSION: hive2 (default)
- Support Hive 3.1.3, HDFS 3.1.0 and Ranger plugin version 2.4.0
  - Docker environment variables:
    - HIVE_RUNTIME_VERSION: hive3
    - RANGER_SERVER_URL: Ranger admin URL
    - RANGER_HIVE_REPOSITORY_NAME: Hive repository name in Ranger admin
    - RANGER_HDFS_REPOSITORY_NAME: HDFS repository name in Ranger admin
    - To enable the Hive Ranger plugin, set both the RANGER_SERVER_URL and RANGER_HIVE_REPOSITORY_NAME environment variables. Hive Ranger audit logs are stored in /tmp/root/ranger-hive-audit.log.
    - To enable the HDFS Ranger plugin, set both the RANGER_SERVER_URL and RANGER_HDFS_REPOSITORY_NAME environment variables. HDFS Ranger audit logs are stored in /usr/local/hadoop/logs/ranger-hdfs-audit.log.
    - Example: docker run -e HIVE_RUNTIME_VERSION='hive3' -e RANGER_SERVER_URL='http://ranger-server:6080' -e RANGER_HIVE_REPOSITORY_NAME='hiveDev' -e RANGER_HDFS_REPOSITORY_NAME='hdfsDev' ... datastrato/gravitino-ci-hive:0.1.13
-
datastrato/gravitino-ci-hive:0.1.12
- Shrink the Hive Docker image size by 420MB
-
datastrato/gravitino-ci-hive:0.1.11
- Remove yarn from the startup script; remove the yarn-site.xml and yarn-env.sh files.
- Change the value of mapreduce.framework.name from yarn to local in the mapred-site.xml file.
-
datastrato/gravitino-ci-hive:0.1.10
- Remove SSH service from the startup script.
- Use hadoop-daemon.sh to start HDFS services.
-
datastrato/gravitino-ci-hive:0.1.9
- Remove cache after installing packages.
-
datastrato/gravitino-ci-hive:0.1.8
- Change the value of hive.server2.enable.doAs to true
-
datastrato/gravitino-ci-hive:0.1.7
- Download MySQL JDBC driver before building the Docker image
- Set hdfs as the HDFS superuser group
-
datastrato/gravitino-ci-hive:0.1.6
- Do not start YARN when the container starts
- Removed exposed ports:
  - 22: SSH
  - 8088: YARN Service
-
datastrato/gravitino-ci-hive:0.1.5
- Roll back the change "Map container hostname to 127.0.0.1 before starting Hadoop" from datastrato/gravitino-ci-hive:0.1.4
-
datastrato/gravitino-ci-hive:0.1.4
- Configure the HDFS DataNode data transfer address to be 0.0.0.0:50010
- Map the container hostname to 127.0.0.1 before starting Hadoop
- Expose port 50010 for the HDFS DataNode
-
datastrato/gravitino-ci-hive:0.1.3
- Change the MySQL bind-address from 127.0.0.1 to 0.0.0.0
- Add iceberg to the MySQL users with password iceberg
- Expose port 3306 for MySQL
-
datastrato/gravitino-ci-hive:0.1.2
- Based on datastrato/gravitino-ci-hive:0.1.1
- Modify fs.defaultFS from local to 0.0.0.0 in the core-site.xml file.
- Expose port 9000 in the Dockerfile.
-
datastrato/gravitino-ci-hive:0.1.1
- Based on datastrato/gravitino-ci-hive:0.1.0
- Modify HDFS/YARN/HIVE MaxPermSize from 8GB to 128MB
- Modify HADOOP_HEAPSIZE from 8192 to 128
-
datastrato/gravitino-ci-hive:0.1.0
- Docker image datastrato/gravitino-ci-hive:0.1.0
- hadoop-2.7.3
- hive-2.3.9
- Expose ports:
  - 22: SSH
  - 9000: HDFS defaultFS
  - 50070: HDFS NameNode
  - 50075: HDFS DataNode HTTP server
  - 50010: HDFS DataNode data transfer
  - 8088: YARN Service
  - 9083: Hive metastore
  - 10000: HiveServer2
  - 10002: HiveServer2 HTTP
Apache Gravitino CI Trino image
You can use this image to test Trino.
Changelog
-
apache/gravitino-ci:trino-0.1.6 (Switch to Apache official DockerHub repository)
- Use Gravitino release 0.6.0 Dockerfile to build the image.
-
datastrato/gravitino-ci-trino:0.1.6
- Upgrade trino:426 to trino:435
-
datastrato/gravitino-ci-trino:0.1.5
- Add check for the version of gravitino-trino-connector
-
datastrato/gravitino-ci-trino:0.1.4
- Change -Xmx1G to -Xmx2G in the config file /etc/trino/jvm.config
-
datastrato/gravitino-ci-trino:0.1.3
- Remove the step that copies the content of the gravitino-trino-connector folder to the plugin folder /usr/lib/trino/plugin/gravitino
-
datastrato/gravitino-ci-trino:0.1.2
- Copy the JDBC drivers 'mysql-connector-java' and 'postgres' to the /usr/lib/trino/iceberg/ folder
-
datastrato/gravitino-ci-trino:0.1.0
- Docker image datastrato/gravitino-ci-trino:0.1.0
- Based on trinodb/trino:426 with some unused plugins removed.
- Expose ports:
  - 8080: Trino JDBC port
Apache Gravitino CI Doris image
You can use this image to test Apache Doris.
Changelog
-
apache/gravitino-ci:doris-0.1.5 (Switch to Apache official DockerHub repository)
- Use Gravitino release 0.6.0 Dockerfile to build the image.
-
datastrato/gravitino-ci-doris:0.1.5
- Remove the chmod command in the Dockerfile to decrease the size of the Docker image.
-
datastrato/gravitino-ci-doris:0.1.4
- Remove chmod from start.sh to speed up startup.
-
datastrato/gravitino-ci-doris:0.1.3
- To adapt to the CI framework, the container no longer exits when startup fails, and logs are no longer printed to stdout.
- Add the report_disk_state_interval_seconds config to decrease the report interval.
-
datastrato/gravitino-ci-doris:0.1.2
- Add a check for the status of the Doris BE and add a retry for adding BE nodes.
-
datastrato/gravitino-ci-doris:0.1.1
- Optimize start.sh: add a disk space check before starting Doris, exit when the FE or BE fails to start, and add logging to stdout.
-
datastrato/gravitino-ci-doris:0.1.0
- Docker image datastrato/gravitino-ci-doris:0.1.0
- Start Doris BE & FE in one container
- Please set the table property "replication_num" = "1" when creating a table in Doris, because the default replication number is 3, but the Doris container only has one BE.
- Username: root, Password: N/A (password is empty)
- Expose ports:
  - 8030: Doris FE HTTP port
  - 9030: Doris FE MySQL server port
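The replication note above can be illustrated with a short sketch. The database and table names (`demo_db`, `demo_tbl`) and the column layout are hypothetical, and the commented `mysql` invocation assumes that port 9030 of the container is published on 127.0.0.1. The script only prints the SQL so that it can be reviewed first.

```shell
# Hypothetical example table; the relevant part is the "replication_num" property,
# which must be 1 because the container runs a single BE.
SQL='CREATE DATABASE IF NOT EXISTS demo_db;
CREATE TABLE IF NOT EXISTS demo_db.demo_tbl (
  id INT,
  name VARCHAR(32)
)
DISTRIBUTED BY HASH(id) BUCKETS 1
PROPERTIES ("replication_num" = "1");'

# Print the statements for review.
printf '%s\n' "$SQL"

# To run them against the container (user root, empty password), assuming
# port 9030 is published on localhost:
# printf '%s\n' "$SQL" | mysql -h 127.0.0.1 -P 9030 -u root
```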
Apache Gravitino CI Apache Ranger image
You can use this image to control Trino's permissions.
Changelog
-
apache/gravitino-ci:ranger-0.1.1 (Switch to Apache official DockerHub repository)
- Use Gravitino release 0.6.0 Dockerfile to build the image.
-
datastrato/gravitino-ci-ranger:0.1.1
- Docker image datastrato/gravitino-ci-ranger:0.1.1
- Use the ranger-admin release from datastrato/apache-ranger:2.4.0 to build the Docker image.
- Remove an unnecessary hack in start-ranger-service.sh.
- Reduce the Docker image build time from ~1h to ~5min.
- How to debug the Ranger admin service:
  - Use docker exec -it <container_id> bash to enter the Docker container.
  - Add the line export JAVA_OPTS=-agentlib:jdwp=transport=dt_socket,server=y,suspend=n,address=*:5001 to /opt/ranger-admin/ews/webapp/WEB-INF/classes/conf/ranger-admin-env-debug.sh in the Docker container.
  - Execute ./opt/ranger-admin/stop-ranger-admin.sh and ./opt/ranger-admin/start-ranger-admin.sh to restart Ranger admin.
  - Clone the Apache Ranger project from GitHub and check out the 2.4.0 release.
  - Create a remote debug configuration (Use model classpath = EmbeddedServer) in your IDE and connect to the Ranger admin container.
-
datastrato/gravitino-ci-ranger:0.1.0
- Docker image datastrato/gravitino-ci-ranger:0.1.0
- Support Apache Ranger 2.4.0
- Use the environment variable RANGER_PASSWORD to set the Apache Ranger admin password. Note that the password must be at least 8 characters long and contain at least one letter and one digit.
- Expose ports:
  - 6080: Apache Ranger admin port
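The password rule above can be checked before the container is started. The following is a minimal sketch under stated assumptions: the `is_valid_ranger_password` helper and the sample password are hypothetical, and the check only mirrors the rule as worded here (at least 8 characters, one letter, one digit), not Ranger's own validation. The script prints the startup command instead of running it.

```shell
# Hypothetical helper: succeeds if the password has at least 8 characters
# and contains at least one ASCII letter and one digit.
is_valid_ranger_password() {
  pw="$1"
  [ "${#pw}" -ge 8 ] || return 1
  case "$pw" in *[A-Za-z]*) ;; *) return 1 ;; esac
  case "$pw" in *[0-9]*) ;; *) return 1 ;; esac
  return 0
}

# Sample value for illustration only; use your own password.
RANGER_PASSWORD="${RANGER_PASSWORD:-rangerR0cks}"

if is_valid_ranger_password "$RANGER_PASSWORD"; then
  # Print the command for review; port 6080 is the Ranger admin port.
  echo "docker run --rm -d -p 6080:6080 -e RANGER_PASSWORD=${RANGER_PASSWORD} datastrato/gravitino-ci-ranger:0.1.0"
else
  echo "RANGER_PASSWORD does not meet the password rule" >&2
fi
```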