Background
With the Apache Gravitino Trino connector and the Gravitino Trino cascading connector, you can implement cascading queries in Trino.
These connectors allow you to treat other Trino clusters as data sources for the current Trino cluster,
enabling queries across catalogs in different Trino clusters.
This mechanism prioritizes executing queries in the Trino cluster located in the same region as the data, based on how the data is distributed across the catalogs. Doing so significantly reduces the amount of data transferred over the network. It addresses a performance problem common to traditional federated query engines, which must transmit large volumes of data across networks.
Deploying Trino
To set up the Trino cascading query environment, you should first deploy at least two Trino environments.
Next, install the Apache Gravitino Trino connector plugin and Gravitino Trino cascading connector plugin into Trino.
For detailed steps, please refer to the Deploying Trino documentation.
Follow these steps:
- Download the Apache Gravitino Trino connector tarball and unpack it. The tarball contains a single top-level directory named gravitino-trino-connector-<version>. Rename this directory to gravitino.
- Download the Gravitino Trino cascading connector tarball and unpack it. This tarball also contains a single top-level directory named gravitino-trino-cascading-connector-<version>. Rename this directory to trino.
- Copy both connector directories to Trino's plugin directory. Typically, this directory is located at trino-server-<version>/plugin and contains the other plugins used by Trino.
Ensure that the plugin directory includes the gravitino and trino subdirectories.
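The unpack, rename, and copy steps above can be sketched as a small shell helper. This is only an illustration: the function name install_plugin and the variables in the usage comment are hypothetical, and it assumes each tarball really contains exactly one top-level directory, as described above.

```shell
# install_plugin TARBALL PLUGIN_DIR TARGET_NAME  (hypothetical helper)
# Unpacks TARBALL, takes its single top-level directory, and places it
# under PLUGIN_DIR with the name TARGET_NAME (for example "gravitino").
install_plugin() {
  local tarball="$1" plugin_dir="$2" target_name="$3"
  local workdir top rc

  # Do not silently replace a plugin that is already installed.
  if [ -e "$plugin_dir/$target_name" ]; then
    echo "refusing to overwrite $plugin_dir/$target_name" >&2
    return 1
  fi

  workdir="$(mktemp -d)" || return 1
  tar -xf "$tarball" -C "$workdir" || { rm -rf "$workdir"; return 1; }

  # The single top-level directory, e.g. gravitino-trino-connector-<version>.
  top="$(find "$workdir" -mindepth 1 -maxdepth 1 -type d | head -n 1)"
  if [ -z "$top" ]; then
    echo "no top-level directory found in $tarball" >&2
    rm -rf "$workdir"
    return 1
  fi

  # Rename while moving into the plugin directory.
  mkdir -p "$plugin_dir" && mv "$top" "$plugin_dir/$target_name"
  rc=$?
  rm -rf "$workdir"
  return "$rc"
}

# Usage (paths are placeholders you must set yourself):
#   install_plugin "$CONNECTOR_TARBALL" "$TRINO_HOME/plugin" gravitino
#   install_plugin "$CASCADING_TARBALL" "$TRINO_HOME/plugin" trino
```

The helper refuses to overwrite an existing plugin directory, so remove the old one first if you are upgrading.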
Verify the network connectivity between the machines hosting the two Trino clusters, identified as c1-trino and c2-trino.
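One way to check connectivity is to request Trino's /v1/info endpoint on the other cluster. The sketch below assumes plain HTTP on port 8080, that the host names c1-trino and c2-trino resolve from each machine, and that curl is installed; the function names are hypothetical. Adjust the scheme and port if your clusters use TLS or a different port.

```shell
# Build the URL of Trino's node-info endpoint for HOST and an optional PORT
# (port 8080 is assumed when none is given).
trino_info_url() {
  printf 'http://%s:%s/v1/info\n' "$1" "${2:-8080}"
}

# Succeed only if the Trino server at HOST:PORT answers within 5 seconds.
check_trino() {
  curl --silent --fail --max-time 5 "$(trino_info_url "$1" "${2:-8080}")" >/dev/null
}

# Usage, run on c1-trino and then the other way round on c2-trino:
#   check_trino c2-trino && echo "c2-trino is reachable"
```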
Deploying Trino in Containers
Download the Apache Gravitino Trino connector tarball and Gravitino Trino cascading connector tarball, then unpack them.
After unpacking, you will find the directories named gravitino-trino-connector-<version>
and gravitino-trino-cascading-connector-<version>.
To start Trino on the host c1-trino and mount the plugins, execute the following command:
docker run --name c1-trino -d -p 8080:8080 \
  -v "$(pwd)/gravitino-trino-connector-<version>":/usr/lib/trino/plugin/gravitino \
  -v "$(pwd)/gravitino-trino-cascading-connector-<version>":/usr/lib/trino/plugin/trino \
  <image-name>
Similarly, to start Trino on the host c2-trino and mount the plugins, use:
docker run --name c2-trino -d -p 8080:8080 \
  -v "$(pwd)/gravitino-trino-connector-<version>":/usr/lib/trino/plugin/gravitino \
  -v "$(pwd)/gravitino-trino-cascading-connector-<version>":/usr/lib/trino/plugin/trino \
  <image-name>
After starting the Trino containers, ensure the configuration directory /etc/trino is correctly set up.
Also, verify that the Trino containers on c1-trino and c2-trino can communicate with each other over the network.
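As a rough check that the mounts took effect, you can list the plugin directory inside each container and confirm that both connector directories are present. The helper name has_cascading_plugins is hypothetical; the container names and paths are the ones used in the commands above.

```shell
# Reads a directory listing on stdin and succeeds only if both the
# "gravitino" and "trino" plugin directories appear as whole lines.
has_cascading_plugins() {
  local listing
  listing="$(cat)"
  printf '%s\n' "$listing" | grep -qx 'gravitino' &&
    printf '%s\n' "$listing" | grep -qx 'trino'
}

# Usage on the host c1-trino (repeat with c2-trino on the other host):
#   docker exec c1-trino ls /usr/lib/trino/plugin | has_cascading_plugins \
#     && echo "both plugins are mounted"
#   docker exec c1-trino ls /etc/trino   # inspect the configuration directory
```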