Class FileFetcher

java.lang.Object
org.apache.gravitino.utils.FileFetcher

public final class FileFetcher extends Object
Singleton that copies the file referenced by a URI to a local destination. Supports the file, http, https, ftp, and hdfs schemes. It is the single shared implementation used by the job manager and by the Kerberos clients of the Hive, Iceberg, Hadoop, and Paimon catalogs.

The hdfs scheme is resolved reflectively against org.apache.hadoop.fs.FileSystem so that this class can live in the common module, which does not declare a compile-time dependency on Hadoop. Callers that never use hdfs URIs (for example the job manager and the Paimon catalog) simply pass null for the Hadoop configuration.
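
The reflective resolution described above can be illustrated with a minimal sketch. It is not the code used by FileFetcher; the class ReflectiveLookupSketch and its methods are hypothetical names introduced here, and only standard java.lang.reflect calls are used. It shows how a class can be probed and a static method invoked by name, so that the calling module needs no compile-time dependency on the target library.

```java
import java.lang.reflect.Method;

// Hypothetical helper, for illustration only; not part of Gravitino.
final class ReflectiveLookupSketch {
  private ReflectiveLookupSketch() {}

  /** Returns true if the named class can be loaded at runtime. */
  static boolean isClassAvailable(String className) {
    try {
      // initialize=false: only check that the class is loadable, do not run static initializers.
      Class.forName(className, false, ReflectiveLookupSketch.class.getClassLoader());
      return true;
    } catch (ClassNotFoundException | LinkageError e) {
      return false;
    }
  }

  /** Invokes a public static method by name, without a compile-time reference to its class. */
  static Object invokeStatic(
      String className, String methodName, Class<?>[] paramTypes, Object... args)
      throws ReflectiveOperationException {
    Class<?> clazz = Class.forName(className);
    Method method = clazz.getMethod(methodName, paramTypes);
    // A null receiver is correct for static methods.
    return method.invoke(null, args);
  }
}
```

A caller could first check isClassAvailable("org.apache.hadoop.fs.FileSystem") and fail with a clear message when Hadoop is not on the classpath, rather than surfacing a raw ClassNotFoundException.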

  • Field Details

    • BLOCK_UNSAFE_REMOTE_URI_CONFIG

      public static final String BLOCK_UNSAFE_REMOTE_URI_CONFIG
      The server configuration property that controls whether unsafe remote URIs are blocked.
  • Method Details

    • get

      public static FileFetcher get()
      Returns the singleton file fetcher.
      Returns:
      the singleton file fetcher
    • initialize

      public void initialize(boolean blockUnsafeRemoteUri)
      Initializes the file fetcher, setting whether unsafe remote URIs are blocked.
      Parameters:
      blockUnsafeRemoteUri - whether to block remote URIs that resolve to unsafe addresses
    • fetchFileFromUri

      public String fetchFileFromUri(String fileUri, File destFile, int timeoutMs, @Nullable Object hadoopConf) throws IOException
      Fetches the file referenced by fileUri into destFile.
      Parameters:
      fileUri - the source URI; a missing scheme is treated as file
      destFile - the local destination file
      timeoutMs - the connect/read timeout in milliseconds, applied to remote (http/https/ftp) downloads
      hadoopConf - an org.apache.hadoop.conf.Configuration instance, required only for the hdfs scheme; may be null when no hdfs URI is fetched
      Returns:
      the absolute path of destFile
      Throws:
      IOException - if the file cannot be fetched
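
The documented contract of fetchFileFromUri for local sources (a missing scheme is treated as file, and the absolute path of destFile is returned) can be sketched as follows. This is an illustration under stated assumptions, not the FileFetcher implementation: the class LocalFetchSketch and its methods are hypothetical, only the file scheme is handled, and timeouts, remote schemes, and unsafe-address blocking are left out.

```java
import java.io.File;
import java.io.IOException;
import java.net.URI;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.Paths;
import java.nio.file.StandardCopyOption;
import java.util.Locale;

// Hypothetical helper, for illustration only; not part of Gravitino.
final class LocalFetchSketch {
  private LocalFetchSketch() {}

  /** Returns the lower-cased URI scheme, defaulting to "file" when the URI has none. */
  static String schemeOf(String fileUri) {
    // URI.create rejects characters that are illegal in URIs (for example spaces);
    // this sketch does not try to handle such inputs.
    String scheme = URI.create(fileUri).getScheme();
    return scheme == null ? "file" : scheme.toLowerCase(Locale.ROOT);
  }

  /** Copies a local source to destFile and returns the absolute path of destFile. */
  static String fetchLocal(String fileUri, File destFile) throws IOException {
    if (!"file".equals(schemeOf(fileUri))) {
      throw new IOException("This sketch only handles the file scheme: " + fileUri);
    }
    URI uri = URI.create(fileUri);
    // A scheme-less value is read as a plain file system path; a file: URI is converted directly.
    Path source = uri.getScheme() == null ? Paths.get(fileUri) : Paths.get(uri);
    Files.copy(source, destFile.toPath(), StandardCopyOption.REPLACE_EXISTING);
    return destFile.getAbsolutePath();
  }
}
```

With the real class, the corresponding call based on the signatures documented above would be FileFetcher.get().fetchFileFromUri(fileUri, destFile, timeoutMs, null), where null is passed for hadoopConf because no hdfs URI is involved.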