This guide serves as an introduction to several key entities that can be managed with Apache Polaris (Incubating), describes how to build and deploy Polaris locally, and finally includes examples of how to use Polaris with Apache Spark™.
This guide covers building Polaris, deploying it locally or via Docker, and interacting with it using the command-line interface and Apache Spark. Before proceeding with Polaris, be sure to satisfy the relevant prerequisites listed here.
To get the latest Polaris code, you'll need to clone the repository using git. You can install git using homebrew:
brew install git
Then, use git to clone the Polaris repo:
cd ~
git clone https://github.com/apache/polaris.git
If you plan to deploy Polaris inside Docker, you'll need to install docker itself. For example, this can be done using homebrew:
brew install --cask docker
Once installed, make sure Docker is running.
If you plan to build Polaris from source yourself, you will need to satisfy a few prerequisites first.
Polaris is built using gradle and is compatible with Java 21. We recommend the use of jenv to manage multiple Java versions. For example, to install Java 21 via homebrew and configure it with jenv:
cd ~/polaris
brew install openjdk@21 jenv
jenv add $(brew --prefix openjdk@21)
jenv local 21
Polaris is compatible with any Apache Iceberg client that supports the REST API. Depending on the client you plan to use, refer to the prerequisites below.
If you want to connect to Polaris with Apache Spark, you'll need to start by cloning Spark. As above, make sure git is installed first. You can install it with homebrew:
brew install git
Then, clone Spark and check out a versioned branch. This guide uses Spark 3.5.
cd ~
git clone https://github.com/apache/spark.git
cd ~/spark
git checkout branch-3.5
Polaris can be deployed via a lightweight docker image or as a standalone process. Before starting, be sure that you've satisfied the relevant prerequisites detailed above.
To start using Polaris in Docker, launch Polaris while Docker is running:
cd ~/polaris
docker compose -f docker-compose.yml up --build
Once the polaris-polaris container is up, you can continue to Defining a Catalog.
Run Polaris locally with:
cd ~/polaris
./gradlew runApp
You should see output for some time as Polaris builds and starts up. Eventually, the log output will stop, and the last messages should resemble the following:
INFO [...] [main] [] o.e.j.s.handler.ContextHandler: Started i.d.j.MutableServletContextHandler@...
INFO [...] [main] [] o.e.j.server.AbstractConnector: Started application@...
INFO [...] [main] [] o.e.j.server.AbstractConnector: Started admin@...
INFO [...] [main] [] o.eclipse.jetty.server.Server: Started Server@...
At this point, Polaris is running.
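If you capture the startup output in a file, a small check like the one below can tell a script when the server is up. This is only a sketch: the helper name `polaris_started` and the log file `polaris.log` are assumptions for illustration, and the matched text is taken from the sample output above, so it may differ in other Polaris versions.

```shell
# Hypothetical helper: succeed once a captured log file contains the final
# startup message shown above. The log file is an assumption; by default
# Polaris writes these messages to the terminal.
polaris_started() {
  grep -qF 'o.eclipse.jetty.server.Server: Started Server@' "$1"
}

# Example usage, assuming output was captured with:
#   ./gradlew runApp > polaris.log 2>&1 &
# until polaris_started polaris.log; do sleep 2; done
```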
For this tutorial, we'll launch an instance of Polaris that stores entities only in-memory. This means that any entities that you define will be destroyed when Polaris is shut down. It also means that Polaris will automatically bootstrap itself with root credentials. For more information on how to configure Polaris for production usage, see the docs.
When Polaris is launched using in-memory mode the root principal credentials can be found in stdout on initial startup. For example:
realm: default-realm root principal credentials: <client-id>:<client-secret>
Be sure to note these credentials, as we'll be using them below. You can also set these credentials as environment variables for use with the Polaris CLI:
export CLIENT_ID=<client-id>
export CLIENT_SECRET=<client-secret>
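Rather than copying the values by hand, you could split the credentials line with shell parameter expansion. The following is a minimal sketch that assumes the line has exactly the format shown above; `parse_root_credentials` is a hypothetical helper, and the sample values are made up.

```shell
# Hypothetical helper: split the credentials line printed at startup.
# Assumes the format "... root principal credentials: <client-id>:<client-secret>".
parse_root_credentials() {
  line=$1
  creds=${line##*credentials: }   # drop everything up to and including "credentials: "
  CLIENT_ID=${creds%%:*}          # text before the first colon
  CLIENT_SECRET=${creds#*:}       # text after the first colon
  export CLIENT_ID CLIENT_SECRET
}

# Example with made-up values:
parse_root_credentials "realm: default-realm root principal credentials: abc123:s3cr3t"
echo "$CLIENT_ID"   # prints abc123
```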
In Polaris, the catalog is the top-level entity that objects like tables and views are organized under. With a Polaris service running, you can create a catalog like so:
cd ~/polaris
./polaris \
--client-id ${CLIENT_ID} \
--client-secret ${CLIENT_SECRET} \
catalogs \
create \
--storage-type s3 \
--default-base-location ${DEFAULT_BASE_LOCATION} \
--role-arn ${ROLE_ARN} \
quickstart_catalog
This will create a new catalog called quickstart_catalog.
The DEFAULT_BASE_LOCATION you provide will be the default location that objects in this catalog should be stored in, and the ROLE_ARN you provide should be a Role ARN with access to read and write data in that location. These credentials will be provided to engines reading data from the catalog once they have authenticated with Polaris using credentials that have access to those resources.
If you’re using a storage type other than S3, such as Azure, you’ll provide a different type of credential than a Role ARN. For more details on supported storage types, see the docs.
Additionally, if Polaris is running somewhere other than localhost:8181, you can specify the correct hostname and port by providing --host and --port flags. For the full set of options supported by the CLI, please refer to the docs.
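Because every CLI call repeats the same connection and credential flags, it can be convenient to wrap them in a shell function. The sketch below is one possible approach, not part of Polaris: `polaris_cli`, `POLARIS_BIN`, `POLARIS_HOST`, and `POLARIS_PORT` are names invented for this example, while the flags themselves are the ones described in this guide.

```shell
# Hypothetical wrapper around the Polaris CLI. POLARIS_BIN, POLARIS_HOST and
# POLARIS_PORT are illustrative variable names, not settings the CLI reads.
polaris_cli() {
  "${POLARIS_BIN:-./polaris}" \
    --host "${POLARIS_HOST:-localhost}" \
    --port "${POLARIS_PORT:-8181}" \
    --client-id "${CLIENT_ID}" \
    --client-secret "${CLIENT_SECRET}" \
    "$@"
}

# Example usage from ~/polaris, with CLIENT_ID and CLIENT_SECRET exported:
#   polaris_cli principals create quickstart_user
```

Setting `POLARIS_BIN=echo` prints the assembled arguments instead of running the CLI, which is a cheap way to check the command before sending it.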
With a catalog created, we can create a principal that has access to manage that catalog. For details on how to configure the Polaris CLI, see the section above or refer to the docs.
./polaris \
--client-id ${CLIENT_ID} \
--client-secret ${CLIENT_SECRET} \
principals \
create \
quickstart_user
./polaris \
--client-id ${CLIENT_ID} \
--client-secret ${CLIENT_SECRET} \
principal-roles \
create \
quickstart_user_role
./polaris \
--client-id ${CLIENT_ID} \
--client-secret ${CLIENT_SECRET} \
catalog-roles \
create \
--catalog quickstart_catalog \
quickstart_catalog_role
Be sure to provide the necessary credentials, hostname, and port as before.
When the principals create command completes successfully, it will return the credentials for this new principal. Be sure to note these down for later. For example:
./polaris ... principals create example
{"clientId": "XXXX", "clientSecret": "YYYY"}
Now, we grant the principal role we created to the principal, and grant the catalog role we created to that principal role. For more information on these entities, please refer to the linked documentation.
./polaris \
--client-id ${CLIENT_ID} \
--client-secret ${CLIENT_SECRET} \
principal-roles \
grant \
--principal quickstart_user \
quickstart_user_role
./polaris \
--client-id ${CLIENT_ID} \
--client-secret ${CLIENT_SECRET} \
catalog-roles \
grant \
--catalog quickstart_catalog \
--principal-role quickstart_user_role \
quickstart_catalog_role
Now, we’ve linked our principal to the catalog via roles like so:
In order to give this principal the ability to interact with the catalog, we must assign some privileges. For the time being, we will give this principal the ability to fully manage content in our new catalog. We can do this with the CLI like so:
./polaris \
--client-id ${CLIENT_ID} \
--client-secret ${CLIENT_SECRET} \
privileges \
catalog \
grant \
--catalog quickstart_catalog \
--catalog-role quickstart_catalog_role \
CATALOG_MANAGE_CONTENT
This grants the catalog privilege CATALOG_MANAGE_CONTENT to our catalog role, linking everything together like so:
CATALOG_MANAGE_CONTENT grants create/list/read/write privileges on all entities within the catalog. The same privilege could be granted to a namespace, in which case the principal could create/list/read/write any entity under that namespace.
At this point, we’ve created a principal and granted it the ability to manage a catalog. We can now use an external engine to assume that principal, access our catalog, and store data in that catalog using Apache Iceberg.
To use a Polaris-managed catalog in Apache Spark, we can configure Spark to use the Iceberg catalog REST API.
This guide uses Apache Spark 3.5, but be sure to find the appropriate iceberg-spark package for your Spark version. From a local Spark clone on the branch-3.5 branch, we can run the following:
Note: the credentials provided here are those for our principal, not the root credentials.
bin/spark-shell \
--packages org.apache.iceberg:iceberg-spark-runtime-3.5_2.12:1.5.2,org.apache.hadoop:hadoop-aws:3.4.0 \
--conf spark.sql.extensions=org.apache.iceberg.spark.extensions.IcebergSparkSessionExtensions \
--conf spark.sql.catalog.quickstart_catalog.warehouse=quickstart_catalog \
--conf spark.sql.catalog.quickstart_catalog.header.X-Iceberg-Access-Delegation=vended-credentials \
--conf spark.sql.catalog.quickstart_catalog=org.apache.iceberg.spark.SparkCatalog \
--conf spark.sql.catalog.quickstart_catalog.catalog-impl=org.apache.iceberg.rest.RESTCatalog \
--conf spark.sql.catalog.quickstart_catalog.uri=http://localhost:8181/api/catalog \
--conf spark.sql.catalog.quickstart_catalog.credential='XXXX:YYYY' \
--conf spark.sql.catalog.quickstart_catalog.scope='PRINCIPAL_ROLE:ALL' \
--conf spark.sql.catalog.quickstart_catalog.token-refresh-enabled=true
Replace XXXX and YYYY with the client ID and client secret generated when you created the quickstart_user principal.
Similar to the CLI commands above, this configures Spark to use the Polaris instance running at localhost:8181. If your Polaris server is running elsewhere, be sure to update the configuration appropriately.
Finally, note that we include the hadoop-aws package here. If your table is using a different filesystem, be sure to include the appropriate dependency.
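All of the catalog settings in the command above share the prefix spark.sql.catalog.<catalog name>. As a sketch of how they fit together, the hypothetical helper below prints the same key/value pairs for a given catalog name, server URI, and credential. The function name and its arguments are assumptions for illustration; the keys and values are the ones used in this guide.

```shell
# Hypothetical helper: print the catalog settings used above, one per line.
# Arguments: catalog name, Polaris catalog URI, "<client-id>:<client-secret>".
spark_catalog_confs() {
  catalog=$1
  uri=$2
  credential=$3
  prefix="spark.sql.catalog.${catalog}"
  cat <<EOF
${prefix}=org.apache.iceberg.spark.SparkCatalog
${prefix}.catalog-impl=org.apache.iceberg.rest.RESTCatalog
${prefix}.uri=${uri}
${prefix}.warehouse=${catalog}
${prefix}.credential=${credential}
${prefix}.scope=PRINCIPAL_ROLE:ALL
${prefix}.header.X-Iceberg-Access-Delegation=vended-credentials
${prefix}.token-refresh-enabled=true
EOF
}

# Example: print the settings for the quickstart catalog.
spark_catalog_confs quickstart_catalog http://localhost:8181/api/catalog 'XXXX:YYYY'
```

Each printed line can be passed to spark-shell as a separate --conf argument.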
Once the Spark session starts, we can create a namespace and table within the catalog:
spark.sql("USE quickstart_catalog")
spark.sql("CREATE NAMESPACE IF NOT EXISTS quickstart_namespace")
spark.sql("CREATE NAMESPACE IF NOT EXISTS quickstart_namespace.schema")
spark.sql("USE NAMESPACE quickstart_namespace.schema")
spark.sql("""
CREATE TABLE IF NOT EXISTS quickstart_table (
id BIGINT, data STRING
)
USING ICEBERG
""")
We can now use this table like any other:
spark.sql("INSERT INTO quickstart_table VALUES (1, 'some data')")
spark.sql("SELECT * FROM quickstart_table").show(false)
. . .
+---+---------+
|id |data |
+---+---------+
|1 |some data|
+---+---------+
If at any time access is revoked...
./polaris \
--client-id ${CLIENT_ID} \
--client-secret ${CLIENT_SECRET} \
privileges \
catalog \
revoke \
--catalog quickstart_catalog \
--catalog-role quickstart_catalog_role \
CATALOG_MANAGE_CONTENT
Spark will lose access to the table:
spark.sql("SELECT * FROM quickstart_table").show(false)
org.apache.iceberg.exceptions.ForbiddenException: Forbidden: Principal 'quickstart_user' with activated PrincipalRoles '[]' and activated ids '[6, 7]' is not authorized for op LOAD_TABLE_WITH_READ_DELEGATION
Apache Polaris (Incubating) is a catalog implementation for Apache Iceberg™ tables and is built on the open source Apache Iceberg™ REST protocol.
With Polaris, you can provide centralized, secure read and write access to your Iceberg tables across different REST-compatible query engines.
This section introduces key concepts associated with using Apache Polaris (Incubating).
In the following diagram, a sample Apache Polaris (Incubating) structure with nested namespaces is shown for Catalog1. No tables or namespaces have been created yet for Catalog2 or Catalog3.
In Polaris, you can create one or more catalog resources to organize Iceberg tables.
Configure your catalog by setting values in the storage configuration for S3, Azure, or Google Cloud Storage. An Iceberg catalog enables a query engine to manage and organize tables. The catalog forms the first architectural layer in the Apache Iceberg™ table specification and must support the following tasks:
Storing the current metadata pointer for one or more Iceberg tables. A metadata pointer maps a table name to the location of that table's current metadata file.
Performing atomic operations so that you can update the current metadata pointer for a table to the metadata pointer of a new version of the table.
To learn more about Iceberg catalogs, see the Apache Iceberg™ documentation.
A catalog can be one of the following two types:
Internal: The catalog is managed by Polaris. Tables from this catalog can be read and written in Polaris.
External: The catalog is externally managed by another Iceberg catalog provider (for example, Snowflake, Glue, Dremio Arctic). Tables from this catalog are synced to Polaris. These tables are read-only in Polaris. In the current release, only a Snowflake external catalog is provided.
A catalog is configured with a storage configuration that can point to S3, Azure storage, or GCS.
You create namespaces to logically group Iceberg tables within a catalog. A catalog can have multiple namespaces. You can also create nested namespaces. Iceberg tables belong to namespaces.
In an internal catalog, an Iceberg table is registered in Polaris, but read and written via query engines. The table data and metadata are stored in your external cloud storage. The table uses Polaris as the Iceberg catalog.
If you have tables housed in another Iceberg catalog, you can sync these tables to an external catalog in Polaris. If you sync this catalog to Polaris, it appears as an external catalog in Polaris. Clients connecting to the external catalog can read from or write to these tables. However, clients connecting to Polaris will only be able to read from these tables.
Important
For the access privileges defined for a catalog to be enforced correctly, the following conditions must be met:
- The directory only contains the data files that belong to a single table.
- The directory hierarchy matches the namespace hierarchy for the catalog.
For example, if a catalog includes the following items:
- Top-level namespace namespace1
- Nested namespace namespace1a
- A customers table, which is grouped under nested namespace namespace1a
- An orders table, which is grouped under nested namespace namespace1a
The directory hierarchy for the catalog must follow this structure:
- /namespace1/namespace1a/customers/<files for the customers table *only*>
- /namespace1/namespace1a/orders/<files for the orders table *only*>
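The mapping from namespace hierarchy to directory hierarchy can be sketched in a few lines of shell. The helper below is hypothetical and only illustrates the naming rule described above; it assumes namespace levels are written with dots and are relative to the catalog's base location.

```shell
# Hypothetical helper: map a dotted namespace and a table name to the
# directory (relative to the catalog's base location) that should hold
# that table's files, following the structure described above.
table_dir() {
  namespace_path=$(printf '%s' "$1" | tr '.' '/')
  printf '/%s/%s\n' "$namespace_path" "$2"
}

table_dir namespace1.namespace1a customers   # prints /namespace1/namespace1a/customers
table_dir namespace1.namespace1a orders      # prints /namespace1/namespace1a/orders
```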
A service principal is an entity that you create in Polaris. Each service principal encapsulates credentials that you use to connect to Polaris.
Query engines use service principals to connect to catalogs.
Polaris generates a Client ID and Client Secret pair for each service principal.
The following table displays example service principals that you might create in Polaris:
Service connection name | Purpose |
---|---|
Flink ingestion | For Apache Flink® to ingest streaming data into Apache Iceberg™ tables. |
Spark ETL pipeline | For Apache Spark™ to run ETL pipeline jobs on Iceberg tables. |
Snowflake data pipelines | For Snowflake to run data pipelines for transforming data in Apache Iceberg™ tables. |
Trino BI dashboard | For Trino to run BI queries for powering a dashboard. |
Snowflake AI team | For Snowflake to run AI jobs on data in Apache Iceberg™ tables. |
A service connection represents a REST-compatible engine (such as Apache Spark™, Apache Flink®, or Trino) that can read from and write to Polaris Catalog. When creating a new service connection, the Polaris administrator grants the service principal that is created with the new service connection either a new or existing principal role. A principal role is a resource in Polaris that you can use to logically group Polaris service principals together and grant privileges on securable objects. For more information, see Principal role. Polaris uses a role-based access control (RBAC) model to grant service principals access to resources. For more information, see Access control. For a diagram of this model, see RBAC model.
If the Polaris administrator grants the service principal for the new service connection a new principal role, the service principal doesn't have any privileges granted to it yet. When securing the catalog that the new service connection will connect to, the Polaris administrator grants privileges to catalog roles and then grants these catalog roles to the new principal role. As a result, the service principal for the new service connection has these privileges. For more information about catalog roles, see Catalog role.
If the Polaris administrator grants an existing principal role to the service principal for the new service connection, the service principal has the same privileges granted to the catalog roles that are granted to the existing principal role. If needed, the Polaris administrator can grant additional catalog roles to the existing principal role or remove catalog roles from it to adjust the privileges bestowed to the service principal. For an example of how RBAC works in Polaris, see RBAC example.
A storage configuration stores a generated identity and access management (IAM) entity for your external cloud storage and is created when you create a catalog. The storage configuration is used to set the values to connect Polaris to your cloud storage. During the catalog creation process, an IAM entity is generated and used to create a trust relationship between the cloud storage provider and Polaris Catalog.
When you create a catalog, you supply the following information about your external cloud storage:
Cloud storage provider | Information |
---|---|
Amazon S3 |
|
Google Cloud Storage (GCS) |
|
Azure |
|
In the following example workflow, Bob creates an Apache Iceberg™ table named Table1 and Alice reads data from Table1.
Bob uses Apache Spark™ to create the Table1 table under the Namespace1 namespace in the Catalog1 catalog and insert values into Table1.
Bob can create Table1 and insert data into it because he is using a service connection with a service principal that has the privileges to perform these actions.
Alice uses Trino to read data from Table1.
Alice can read data from Table1 because she is using a service connection with a service principal that has the privileges to perform this action.
This section describes security and access control.
To secure interactions with service connections, Polaris vends temporary storage credentials to the query engine during query execution. These credentials allow the query engine to run the query without requiring access to your external cloud storage for Iceberg tables. This process is called credential vending.
Polaris uses the identity and access management (IAM) entity to securely connect to your storage for accessing table data, Iceberg metadata, and manifest files that store the table schema, partitions, and other metadata. Polaris retains the IAM entity for your storage location.
Polaris enforces the access control that you configure across all tables registered with the service and governs security for all queries from query engines in a consistent manner.
Polaris uses a role-based access control (RBAC) model that lets you centrally configure access for Polaris service principals to catalogs, namespaces, and tables.
Polaris RBAC uses two different role types to delegate privileges:
Principal roles: Granted to Polaris service principals and analogous to roles in other access control systems that you grant to service principals.
Catalog roles: Configured with certain privileges on Polaris catalog resources and granted to principal roles.
For more information, see Access control.
Apache®, Apache Iceberg™, Apache Spark™, Apache Flink®, and Flink® are either registered trademarks or trademarks of the Apache Software Foundation in the United States and/or other countries.
This page documents various entities that can be managed in Apache Polaris (Incubating).
A catalog is a top-level entity in Polaris that may contain other entities like namespaces and tables. These map directly to Apache Iceberg catalogs.
For information on managing catalogs with the REST API or for more information on what data can be associated with a catalog, see the API docs.
All catalogs in Polaris are associated with a storage type. Valid Storage Types are S3, Azure, and GCS. The FILE type is also available for testing. Each of these types relates to a different storage provider where data within the catalog may reside. Depending on the storage type, various other configurations may be set for a catalog, including credentials to be used when accessing data inside the catalog.
For details on how to use Storage Types in the REST API, see the API docs.
A namespace is a logical entity that resides within a catalog and can contain other entities such as tables or views. Some other systems may refer to namespaces as schemas or databases.
In Polaris, namespaces can be nested. For example, a.b.c.d.e.f.g is a valid namespace. b is said to reside within a, and so on.
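As a small illustration of this nesting, the parent of a namespace is everything before its last dot. The helper below is a hypothetical sketch using plain shell parameter expansion; it is not part of Polaris.

```shell
# Hypothetical helper: print the parent of a dotted namespace name.
# Prints nothing for a top-level namespace, which has no parent.
parent_namespace() {
  case $1 in
    *.*) printf '%s\n' "${1%.*}" ;;   # strip the last ".level"
    *)   ;;                           # top-level namespace
  esac
}

parent_namespace a.b.c.d.e.f.g   # prints a.b.c.d.e.f
parent_namespace a.b             # prints a
```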
For information on managing namespaces with the REST API or for more information on what data can be associated with a namespace, see the API docs.
Polaris tables are entities that map to Apache Iceberg tables.
For information on managing tables with the REST API or for more information on what data can be associated with a table, see the API docs.
Polaris views are entities that map to Apache Iceberg views.
For information on managing views with the REST API or for more information on what data can be associated with a view, see the API docs.
Polaris principals are unique identities that can be used to represent users or services. Each principal may have one or more principal roles assigned to it for the purpose of accessing catalogs and the entities within them.
For information on managing principals with the REST API or for more information on what data can be associated with a principal, see the API docs.
Polaris principal roles are labels that may be granted to principals. Each principal may have one or more principal roles, and the same principal role may be granted to multiple principals. Principal roles may be assigned based on the persona or responsibilities of a given principal, or on how that principal will need to access different entities within Polaris.
For information on managing principal roles with the REST API or for more information on what data can be associated with a principal role, see the API docs.
Polaris catalog roles are labels that may be granted to catalogs. Each catalog may have one or more catalog roles, and the same catalog role may be granted to multiple catalogs. Catalog roles may be assigned based on the nature of data that will reside in a catalog, or by the groups of users and services that might need to access that data.
Each catalog role may have multiple privileges granted to it, and each catalog role can be granted to one or more principal roles. This is the mechanism by which principals are granted access to entities inside a catalog such as namespaces and tables.
Polaris privileges are granted to catalog roles in order to grant principals with a given principal role some degree of access to catalogs with a given catalog role. When a privilege is granted to a catalog role, any principal roles granted that catalog role receive the privilege. In turn, any principals who are granted that principal role receive it.
A privilege can be scoped to any entity inside a catalog, including the catalog itself.
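The chain from principal to privilege can be modeled with a few lines of shell and awk. This is a toy illustration of the grant relationships described above, not how Polaris stores or evaluates them; the record format and the `privileges_for` helper are assumptions made for this sketch, and the entity names are reused from the quickstart.

```shell
# Toy model of the grant chain. Each record is "<grantee kind> <grantee> <granted>".
# The single-pass lookup below assumes records are listed in this order:
# principal grants, then principal role grants, then catalog role grants.
grants='principal quickstart_user quickstart_user_role
principal_role quickstart_user_role quickstart_catalog_role
catalog_role quickstart_catalog_role CATALOG_MANAGE_CONTENT'

# Hypothetical helper: print the privileges a principal receives through its roles.
privileges_for() {
  printf '%s\n' "$grants" | awk -v p="$1" '
    $1 == "principal"      && $2 == p        { proles[$3] = 1 }  # principal roles held
    $1 == "principal_role" && ($2 in proles) { croles[$3] = 1 }  # catalog roles reached
    $1 == "catalog_role"   && ($2 in croles) { print $3 }        # resulting privileges
  '
}

privileges_for quickstart_user   # prints CATALOG_MANAGE_CONTENT
```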
For a list of supported privileges for each privilege class, see the API docs:
This section provides information about how access control works for Apache Polaris (Incubating).
Polaris uses a role-based access control (RBAC) model in which the Polaris administrator assigns access privileges to catalog roles and then grants access to resources to service principals by assigning catalog roles to principal roles.
These are the key concepts to understanding access control in Polaris:
A securable object is an object to which access can be granted. Polaris has the following securable objects:
A principal role is a resource in Polaris that you can use to logically group Polaris service principals together and grant privileges on securable objects.
Polaris supports a many-to-one relationship between service principals and principal roles. For example, to grant the same privileges to multiple service principals, you can grant a single principal role to those service principals. A service principal can be granted one principal role. When registering a service connection, the Polaris administrator specifies the principal role that is granted to the service principal.
You don't grant privileges directly to a principal role. Instead, you configure object permissions at the catalog role level, and then grant catalog roles to a principal role.
The following table shows examples of principal roles that you might configure in Polaris:
Principal role name | Description |
---|---|
Data_engineer | A role that is granted to multiple service principals for running data engineering jobs. |
Data_scientist | A role that is granted to multiple service principals for running data science or AI jobs. |
A catalog role belongs to a particular catalog resource in Polaris and specifies a set of permissions for actions on the catalog or objects in the catalog, such as catalog namespaces or tables. You can create one or more catalog roles for a catalog.
You grant privileges to a catalog role and then grant the catalog role to a principal role to bestow the privileges to one or more service principals.
Note
If you update the privileges bestowed to a service principal, the updates won't take effect for up to one hour. This means that if you revoke or grant some privileges for a catalog, the updated privileges won't take effect on any service principal with access to that catalog for up to one hour.
Polaris also supports a many-to-many relationship between catalog roles and principal roles. You can grant the same catalog role to one or more principal roles. Likewise, a principal role can be granted one or more catalog roles.
The following table displays examples of catalog roles that you might configure in Polaris:
Example Catalog role | Description |
---|---|
Catalog administrators | A role that has been granted multiple privileges to emulate full access to the catalog. Principal roles that have been granted this role are permitted to create, alter, read, write, and drop tables in the catalog. |
Catalog readers | A role that has been granted read-only privileges to tables in the catalog. Principal roles that have been granted this role are allowed to read from tables in the catalog. |
Catalog contributor | A role that has been granted read and write access privileges to all tables that belong to the catalog. Principal roles that have been granted this role are allowed to perform read and write operations on tables in the catalog. |
The following diagram illustrates the RBAC model used by Polaris. For each catalog, the Polaris administrator assigns access privileges to catalog roles and then grants service principals access to resources by assigning catalog roles to principal roles. Polaris supports a many-to-one relationship between service principals and principal roles.
This section describes the privileges that are available in the Polaris access control model. Privileges are granted to catalog roles, catalog roles are granted to principal roles, and principal roles are granted to service principals to specify the operations that service principals can perform on objects in Polaris.
Important
You can only grant privileges at the catalog level. Fine-grained access controls are not available. For example, you can grant read privileges to all tables in a catalog but not to an individual table in the catalog.
To grant the full set of privileges (drop, list, read, write, etc.) on an object, you can use the full privilege option.
Privilege | Description |
---|---|
TABLE_CREATE | Enables registering a table with the catalog. |
TABLE_DROP | Enables dropping a table from the catalog. |
TABLE_LIST | Enables listing any tables in the catalog. |
TABLE_READ_PROPERTIES | Enables reading properties of the table. |
TABLE_WRITE_PROPERTIES | Enables configuring properties for the table. |
TABLE_READ_DATA | Enables reading data from the table by receiving short-lived read-only storage credentials from the catalog. |
TABLE_WRITE_DATA | Enables writing data to the table by receiving short-lived read+write storage credentials from the catalog. |
TABLE_FULL_METADATA | Grants all table privileges, except TABLE_READ_DATA and TABLE_WRITE_DATA, which need to be granted individually. |
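Because TABLE_FULL_METADATA does not include the two data privileges, a catalog role that needs full table access must be granted all three. The sketch below loops over a list of privileges using the same privileges catalog grant form as the CLI examples elsewhere in this guide. The function `grant_catalog_privileges` and the `POLARIS_BIN` variable are hypothetical, and the sketch assumes the CLI accepts table privileges in that command form.

```shell
# Hypothetical helper: grant several privileges to one catalog role.
# Arguments: catalog name, catalog role name, then one or more privileges.
grant_catalog_privileges() {
  catalog=$1
  role=$2
  shift 2
  for privilege in "$@"; do
    "${POLARIS_BIN:-./polaris}" \
      --client-id "${CLIENT_ID}" \
      --client-secret "${CLIENT_SECRET}" \
      privileges catalog grant \
      --catalog "$catalog" \
      --catalog-role "$role" \
      "$privilege"
  done
}

# Example usage from ~/polaris, with CLIENT_ID and CLIENT_SECRET exported:
#   grant_catalog_privileges quickstart_catalog quickstart_catalog_role \
#     TABLE_FULL_METADATA TABLE_READ_DATA TABLE_WRITE_DATA
```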
Privilege | Description |
---|---|
VIEW_CREATE | Enables registering a view with the catalog. |
VIEW_DROP | Enables dropping a view from the catalog. |
VIEW_LIST | Enables listing any views in the catalog. |
VIEW_READ_PROPERTIES | Enables reading all the view properties. |
VIEW_WRITE_PROPERTIES | Enables configuring view properties. |
VIEW_FULL_METADATA | Grants all view privileges. |
Privilege | Description |
---|---|
NAMESPACE_CREATE | Enables creating a namespace in a catalog. |
NAMESPACE_DROP | Enables dropping the namespace from the catalog. |
NAMESPACE_LIST | Enables listing any object in the namespace, including nested namespaces and tables. |
NAMESPACE_READ_PROPERTIES | Enables reading all the namespace properties. |
NAMESPACE_WRITE_PROPERTIES | Enables configuring namespace properties. |
NAMESPACE_FULL_METADATA | Grants all namespace privileges. |
Privilege | Description |
---|---|
CATALOG_MANAGE_ACCESS | Includes the ability to grant or revoke privileges on objects in a catalog to catalog roles, and the ability to grant or revoke catalog roles to or from principal roles. |
CATALOG_MANAGE_CONTENT | Enables full management of content for the catalog. This privilege encompasses the following privileges:
|
CATALOG_MANAGE_METADATA | Enables full management of the catalog, catalog roles, namespaces, and tables. |
CATALOG_READ_PROPERTIES | Enables listing catalogs and reading properties of the catalog. |
CATALOG_WRITE_PROPERTIES | Enables configuring catalog properties. |
The following diagram illustrates how RBAC works in Polaris and includes the following users:
Alice: A service admin who signs up for Polaris. Alice can create service principals. She can also create catalogs and namespaces and configure access control for Polaris resources.
Bob: A data engineer who uses Apache Spark™ to interact with Polaris.
Alice has created a service principal for Bob. It has been granted the Data_engineer principal role, which in turn has been granted the following catalog roles: Catalog contributor and Data administrator (for both the Silver and Gold zone catalogs in the following diagram).
The Catalog contributor role grants permission to create namespaces and tables in the Bronze zone catalog.
The Data administrator roles grant full administrative rights to the Silver zone catalog and Gold zone catalog.
Mark: A data scientist who trains models with data managed by Polaris.
Alice has created a service principal for Mark. It has been granted the Data_scientist principal role, which in turn has been granted the catalog role named Catalog reader.
The Catalog reader role grants read-only access for a catalog named Gold zone catalog.
The default polaris-server.yml configuration is intended for development and testing. When deploying Polaris in production, there are several best practices to keep in mind.
Notable configuration options used to secure a Polaris deployment are outlined below.
[!WARNING]
Ensure that the tokenBroker setting reflects the token broker specified in authenticator below.
TestInlineBearerTokenPolarisAuthenticator option and uncomment the DefaultPolarisAuthenticator authenticator option beneath it.
[!WARNING]
Ensure that the tokenBroker setting reflects the token broker specified in oauth2 above.
[!IMPORTANT]
The default in-memory implementation for metastoreManager is meant for testing and not suitable for production usage. Instead, consider an implementation such as eclipse-link which allows you to store metadata in a remote database.
A Metastore Manager should be configured with an implementation that durably persists Polaris entities. Use the configuration metaStoreManager to configure a MetastoreManager implementation where Polaris entities will be persisted.
Be sure to secure your metastore backend since it will be storing credentials and catalog metadata.
To use EclipseLink for metastore management, specify the configuration metaStoreManager.conf-file to point to an EclipseLink persistence.xml file. This file, local to the Polaris service, contains details of the database used for metastore management and the connection settings. For more information, refer to the metastore documentation.
Before using Polaris when using a metastore manager other than in-memory
, you must bootstrap the metastore manager. This is a manual operation that must be performed only once in order to prepare the metastore manager to integrate with Polaris. When the metastore manager is bootstrapped, any existing Polaris entities in the metastore manager may be purged.
To bootstrap Polaris, run:
java -jar /path/to/jar/polaris-service-all.jar bootstrap polaris-server.yml
Afterwards, Polaris can be launched normally:
java -jar /path/to/jar/polaris-service-all.jar server polaris-server.yml
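If the two steps are scripted, the ordering (bootstrap once, then launch the server) can be kept explicit by building both command lines in one place. The sketch below only assembles and prints the commands shown above; the helper name polaris_commands is hypothetical, and the jar path is the same placeholder used in this guide.

```python
import shlex


def polaris_commands(jar_path, config_path="polaris-server.yml"):
    """Return the bootstrap and server argument lists, in the order they must run."""
    base = ["java", "-jar", jar_path]
    return [base + ["bootstrap", config_path], base + ["server", config_path]]


commands = polaris_commands("/path/to/jar/polaris-service-all.jar")
for argv in commands:
    # Printing only; pass argv to subprocess.run(argv, check=True) to execute it.
    print(shlex.join(argv))
```

Because bootstrapping may purge existing entities, a real script would normally run the first command only during initial setup.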
When deploying Polaris in production, consider adjusting the following configurations:
FILE storage type. This should be disabled for production systems.
List all catalogs in this Polaris service
{
  "catalogs": [
    {
      "type": "INTERNAL",
      "name": "string",
      "properties": {
        "default-base-location": "string",
        "property1": "string",
        "property2": "string"
      },
      "createTimestamp": 0,
      "lastUpdateTimestamp": 0,
      "entityVersion": 0,
      "storageConfigInfo": {
        "storageType": "S3",
        "allowedLocations": "For AWS [s3://bucketname/prefix/], for AZURE [abfss://container@storageaccount.blob.core.windows.net/prefix/], for GCP [gs://bucketname/prefix/]"
      }
    }
  ]
}
Add a new Catalog
The Catalog to create
required | object (Polaris_Management_Service_Catalog) A catalog object. A catalog may be internal or external. External catalogs are managed entirely by an external catalog interface. Third party catalogs may be other Iceberg REST implementations or other services with their own proprietary APIs |
{
  "catalog": {
    "type": "INTERNAL",
    "name": "string",
    "properties": {
      "default-base-location": "string",
      "property1": "string",
      "property2": "string"
    },
    "createTimestamp": 0,
    "lastUpdateTimestamp": 0,
    "entityVersion": 0,
    "storageConfigInfo": {
      "storageType": "S3",
      "allowedLocations": "For AWS [s3://bucketname/prefix/], for AZURE [abfss://container@storageaccount.blob.core.windows.net/prefix/], for GCP [gs://bucketname/prefix/]"
    }
  }
}
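As a rough illustration of how a client might assemble this request, the sketch below builds the JSON body and an HTTP request object without sending it. Several things here are assumptions rather than facts from this page: the base URL and the /catalogs path, the bearer token, the catalog name and bucket, and the choice to pass allowedLocations as a list of location prefixes (the sample above only describes the expected values in a string). A real deployment may also require storage fields that are not shown here.

```python
import json
import urllib.request

# Assumed management API base URL; adjust to your deployment.
BASE_URL = "http://localhost:8181/api/management/v1"


def build_create_catalog_request(name, base_location, allowed_locations, token):
    """Build (but do not send) a POST request that creates an internal catalog."""
    body = {
        "catalog": {
            "type": "INTERNAL",
            "name": name,
            "properties": {"default-base-location": base_location},
            "storageConfigInfo": {
                "storageType": "S3",
                # Assumption: a list of allowed location prefixes.
                "allowedLocations": allowed_locations,
            },
        }
    }
    return urllib.request.Request(
        f"{BASE_URL}/catalogs",
        data=json.dumps(body).encode("utf-8"),
        headers={
            "Content-Type": "application/json",
            "Authorization": f"Bearer {token}",
        },
        method="POST",
    )


request = build_create_catalog_request(
    "gold_zone",
    "s3://bucketname/prefix/",
    ["s3://bucketname/prefix/"],
    token="example-token",
)
print(request.get_method(), request.full_url)
```

Sending it would be a matter of passing the request to urllib.request.urlopen against a running server.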
Get the details of a catalog
catalogName required | string [ 1 .. 256 ] characters ^(?!\s*[s|S][y|Y][s|S][t|T][e|E][m|M]\$).*$ The name of the catalog |
{
  "type": "INTERNAL",
  "name": "string",
  "properties": {
    "default-base-location": "string",
    "property1": "string",
    "property2": "string"
  },
  "createTimestamp": 0,
  "lastUpdateTimestamp": 0,
  "entityVersion": 0,
  "storageConfigInfo": {
    "storageType": "S3",
    "allowedLocations": "For AWS [s3://bucketname/prefix/], for AZURE [abfss://container@storageaccount.blob.core.windows.net/prefix/], for GCP [gs://bucketname/prefix/]"
  }
}
Update an existing catalog
catalogName required | string [ 1 .. 256 ] characters ^(?!\s*[s|S][y|Y][s|S][t|T][e|E][m|M]\$).*$ The name of the catalog |
The catalog details to use in the update
currentEntityVersion | integer The version of the object onto which this update is applied; if the object changed, the update will fail and the caller should retry after fetching the latest version. |
object | |
object (Polaris_Management_Service_StorageConfigInfo) A storage configuration used by catalogs |
{
  "currentEntityVersion": 0,
  "properties": {
    "property1": "string",
    "property2": "string"
  },
  "storageConfigInfo": {
    "storageType": "S3",
    "allowedLocations": "For AWS [s3://bucketname/prefix/], for AZURE [abfss://container@storageaccount.blob.core.windows.net/prefix/], for GCP [gs://bucketname/prefix/]"
  }
}
{
  "type": "INTERNAL",
  "name": "string",
  "properties": {
    "default-base-location": "string",
    "property1": "string",
    "property2": "string"
  },
  "createTimestamp": 0,
  "lastUpdateTimestamp": 0,
  "entityVersion": 0,
  "storageConfigInfo": {
    "storageType": "S3",
    "allowedLocations": "For AWS [s3://bucketname/prefix/], for AZURE [abfss://container@storageaccount.blob.core.windows.net/prefix/], for GCP [gs://bucketname/prefix/]"
  }
}
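The currentEntityVersion field implies an optimistic-concurrency pattern: read the entity, send the update with the version that was read, and retry from a fresh read if the version has moved on. The sketch below illustrates that loop against an in-memory stand-in. FakeCatalogStore, VersionConflict, and the rule that each update increments the version by one are all assumptions made for the illustration, not Polaris behavior confirmed by this page.

```python
class VersionConflict(Exception):
    """Raised when currentEntityVersion does not match the stored version."""


class FakeCatalogStore:
    """In-memory stand-in for the server, used only to illustrate the version check."""

    def __init__(self):
        self.entity = {"name": "gold", "properties": {}, "entityVersion": 1}

    def get(self):
        return {**self.entity, "properties": dict(self.entity["properties"])}

    def update(self, current_entity_version, properties):
        if current_entity_version != self.entity["entityVersion"]:
            raise VersionConflict("entity changed; fetch the latest version and retry")
        self.entity["properties"] = dict(properties)
        self.entity["entityVersion"] += 1  # assumed versioning rule
        return self.get()


def update_with_retry(store, new_properties, max_attempts=3):
    """Fetch the latest version, apply the change, and retry on a version conflict."""
    for _ in range(max_attempts):
        latest = store.get()
        merged = {**latest["properties"], **new_properties}
        try:
            return store.update(latest["entityVersion"], merged)
        except VersionConflict:
            continue
    raise VersionConflict("gave up after repeated conflicts")


store = FakeCatalogStore()
updated = update_with_retry(store, {"owner": "alice"})
print(updated["entityVersion"], updated["properties"])
```

The same loop applies to the principal, principal role, and catalog role update operations, which all take a currentEntityVersion.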
List the principals for the current catalog
{
  "principals": [
    {
      "name": "string",
      "clientId": "string",
      "properties": {
        "property1": "string",
        "property2": "string"
      },
      "createTimestamp": 0,
      "lastUpdateTimestamp": 0,
      "entityVersion": 0
    }
  ]
}
Create a principal
The principal to create
object (Polaris_Management_Service_Principal) A Polaris principal. | |
credentialRotationRequired | boolean If true, the initial credentials can only be used to call rotateCredentials |
{
  "principal": {
    "name": "string",
    "clientId": "string",
    "properties": {
      "property1": "string",
      "property2": "string"
    },
    "createTimestamp": 0,
    "lastUpdateTimestamp": 0,
    "entityVersion": 0
  },
  "credentialRotationRequired": true
}
{
  "principal": {
    "name": "string",
    "clientId": "string",
    "properties": {
      "property1": "string",
      "property2": "string"
    },
    "createTimestamp": 0,
    "lastUpdateTimestamp": 0,
    "entityVersion": 0
  },
  "credentials": {
    "clientId": "string",
    "clientSecret": "pa$$word"
  }
}
Get the principal details
principalName required | string [ 1 .. 256 ] characters ^(?!\s*[s|S][y|Y][s|S][t|T][e|E][m|M]\$).*$ The principal name |
{
  "name": "string",
  "clientId": "string",
  "properties": {
    "property1": "string",
    "property2": "string"
  },
  "createTimestamp": 0,
  "lastUpdateTimestamp": 0,
  "entityVersion": 0
}
Update an existing principal
principalName required | string [ 1 .. 256 ] characters ^(?!\s*[s|S][y|Y][s|S][t|T][e|E][m|M]\$).*$ The principal name |
The principal details to use in the update
currentEntityVersion required | integer The version of the object onto which this update is applied; if the object changed, the update will fail and the caller should retry after fetching the latest version. |
required | object |
{
  "currentEntityVersion": 0,
  "properties": {
    "property1": "string",
    "property2": "string"
  }
}
{
  "name": "string",
  "clientId": "string",
  "properties": {
    "property1": "string",
    "property2": "string"
  },
  "createTimestamp": 0,
  "lastUpdateTimestamp": 0,
  "entityVersion": 0
}
Rotate a principal's credentials. The new credentials will be returned in the response. This is the only API, aside from createPrincipal, that returns the user's credentials. This API is not idempotent.
principalName required | string [ 1 .. 256 ] characters ^(?!\s*[s|S][y|Y][s|S][t|T][e|E][m|M]\$).*$ The user name |
{
  "principal": {
    "name": "string",
    "clientId": "string",
    "properties": {
      "property1": "string",
      "property2": "string"
    },
    "createTimestamp": 0,
    "lastUpdateTimestamp": 0,
    "entityVersion": 0
  },
  "credentials": {
    "clientId": "string",
    "clientSecret": "pa$$word"
  }
}
List the roles assigned to the principal
principalName required | string [ 1 .. 256 ] characters ^(?!\s*[s|S][y|Y][s|S][t|T][e|E][m|M]\$).*$ The name of the target principal |
{
  "roles": [
    {
      "name": "string",
      "properties": {
        "property1": "string",
        "property2": "string"
      },
      "createTimestamp": 0,
      "lastUpdateTimestamp": 0,
      "entityVersion": 0
    }
  ]
}
Add a role to the principal
principalName required | string [ 1 .. 256 ] characters ^(?!\s*[s|S][y|Y][s|S][t|T][e|E][m|M]\$).*$ The name of the target principal |
The principal role to assign
object (Polaris_Management_Service_PrincipalRole) |
{
  "principalRole": {
    "name": "string",
    "properties": {
      "property1": "string",
      "property2": "string"
    },
    "createTimestamp": 0,
    "lastUpdateTimestamp": 0,
    "entityVersion": 0
  }
}
Remove a role from a catalog principal
principalName required | string [ 1 .. 256 ] characters ^(?!\s*[s|S][y|Y][s|S][t|T][e|E][m|M]\$).*$ The name of the target principal |
principalRoleName required | string [ 1 .. 256 ] characters ^(?!\s*[s|S][y|Y][s|S][t|T][e|E][m|M]\$).*$ The name of the role |
List the principal roles
{
  "roles": [
    {
      "name": "string",
      "properties": {
        "property1": "string",
        "property2": "string"
      },
      "createTimestamp": 0,
      "lastUpdateTimestamp": 0,
      "entityVersion": 0
    }
  ]
}
Create a principal role
The principal role to create
object (Polaris_Management_Service_PrincipalRole) |
{
  "principalRole": {
    "name": "string",
    "properties": {
      "property1": "string",
      "property2": "string"
    },
    "createTimestamp": 0,
    "lastUpdateTimestamp": 0,
    "entityVersion": 0
  }
}
Get the principal role details
principalRoleName required | string [ 1 .. 256 ] characters ^(?!\s*[s|S][y|Y][s|S][t|T][e|E][m|M]\$).*$ The principal role name |
{
  "name": "string",
  "properties": {
    "property1": "string",
    "property2": "string"
  },
  "createTimestamp": 0,
  "lastUpdateTimestamp": 0,
  "entityVersion": 0
}
Update an existing principalRole
principalRoleName required | string [ 1 .. 256 ] characters ^(?!\s*[s|S][y|Y][s|S][t|T][e|E][m|M]\$).*$ The principal role name |
The principalRole details to use in the update
currentEntityVersion required | integer The version of the object onto which this update is applied; if the object changed, the update will fail and the caller should retry after fetching the latest version. |
required | object |
{
  "currentEntityVersion": 0,
  "properties": {
    "property1": "string",
    "property2": "string"
  }
}
{
  "name": "string",
  "properties": {
    "property1": "string",
    "property2": "string"
  },
  "createTimestamp": 0,
  "lastUpdateTimestamp": 0,
  "entityVersion": 0
}
List the Principals to whom the target principal role has been assigned
principalRoleName required | string [ 1 .. 256 ] characters ^(?!\s*[s|S][y|Y][s|S][t|T][e|E][m|M]\$).*$ The principal role name |
{
  "principals": [
    {
      "name": "string",
      "clientId": "string",
      "properties": {
        "property1": "string",
        "property2": "string"
      },
      "createTimestamp": 0,
      "lastUpdateTimestamp": 0,
      "entityVersion": 0
    }
  ]
}
Get the catalog roles mapped to the principal role
principalRoleName required | string [ 1 .. 256 ] characters ^(?!\s*[s|S][y|Y][s|S][t|T][e|E][m|M]\$).*$ The principal role name |
catalogName required | string [ 1 .. 256 ] characters ^(?!\s*[s|S][y|Y][s|S][t|T][e|E][m|M]\$).*$ The name of the catalog where the catalogRoles reside |
{
  "roles": [
    {
      "name": "string",
      "properties": {
        "property1": "string",
        "property2": "string"
      },
      "createTimestamp": 0,
      "lastUpdateTimestamp": 0,
      "entityVersion": 0
    }
  ]
}
Assign a catalog role to a principal role
principalRoleName required | string [ 1 .. 256 ] characters ^(?!\s*[s|S][y|Y][s|S][t|T][e|E][m|M]\$).*$ The principal role name |
catalogName required | string [ 1 .. 256 ] characters ^(?!\s*[s|S][y|Y][s|S][t|T][e|E][m|M]\$).*$ The name of the catalog where the catalogRoles reside |
The catalog role to assign
object (Polaris_Management_Service_CatalogRole) |
{
  "catalogRole": {
    "name": "string",
    "properties": {
      "property1": "string",
      "property2": "string"
    },
    "createTimestamp": 0,
    "lastUpdateTimestamp": 0,
    "entityVersion": 0
  }
}
Remove a catalog role from a principal role
principalRoleName required | string [ 1 .. 256 ] characters ^(?!\s*[s|S][y|Y][s|S][t|T][e|E][m|M]\$).*$ The principal role name |
catalogName required | string [ 1 .. 256 ] characters ^(?!\s*[s|S][y|Y][s|S][t|T][e|E][m|M]\$).*$ The name of the catalog that contains the role to revoke |
catalogRoleName required | string [ 1 .. 256 ] characters ^(?!\s*[s|S][y|Y][s|S][t|T][e|E][m|M]\$).*$ The name of the catalog role that should be revoked |
List existing roles in the catalog
catalogName required | string [ 1 .. 256 ] characters ^(?!\s*[s|S][y|Y][s|S][t|T][e|E][m|M]\$).*$ The catalog for which we are reading/updating roles |
{
  "roles": [
    {
      "name": "string",
      "properties": {
        "property1": "string",
        "property2": "string"
      },
      "createTimestamp": 0,
      "lastUpdateTimestamp": 0,
      "entityVersion": 0
    }
  ]
}
Create a new role in the catalog
catalogName required | string [ 1 .. 256 ] characters ^(?!\s*[s|S][y|Y][s|S][t|T][e|E][m|M]\$).*$ The catalog for which we are reading/updating roles |
object (Polaris_Management_Service_CatalogRole) |
{
  "catalogRole": {
    "name": "string",
    "properties": {
      "property1": "string",
      "property2": "string"
    },
    "createTimestamp": 0,
    "lastUpdateTimestamp": 0,
    "entityVersion": 0
  }
}
Get the details of an existing role
catalogName required | string [ 1 .. 256 ] characters ^(?!\s*[s|S][y|Y][s|S][t|T][e|E][m|M]\$).*$ The catalog for which we are retrieving roles |
catalogRoleName required | string [ 1 .. 256 ] characters ^(?!\s*[s|S][y|Y][s|S][t|T][e|E][m|M]\$).*$ The name of the role |
{
  "name": "string",
  "properties": {
    "property1": "string",
    "property2": "string"
  },
  "createTimestamp": 0,
  "lastUpdateTimestamp": 0,
  "entityVersion": 0
}
Update an existing role in the catalog
catalogName required | string [ 1 .. 256 ] characters ^(?!\s*[s|S][y|Y][s|S][t|T][e|E][m|M]\$).*$ The catalog for which we are retrieving roles |
catalogRoleName required | string [ 1 .. 256 ] characters ^(?!\s*[s|S][y|Y][s|S][t|T][e|E][m|M]\$).*$ The name of the role |
currentEntityVersion required | integer The version of the object onto which this update is applied; if the object changed, the update will fail and the caller should retry after fetching the latest version. |
required | object |
{
  "currentEntityVersion": 0,
  "properties": {
    "property1": "string",
    "property2": "string"
  }
}
{
  "name": "string",
  "properties": {
    "property1": "string",
    "property2": "string"
  },
  "createTimestamp": 0,
  "lastUpdateTimestamp": 0,
  "entityVersion": 0
}
Delete an existing role from the catalog. All associated grants will also be deleted
catalogName required | string [ 1 .. 256 ] characters ^(?!\s*[s|S][y|Y][s|S][t|T][e|E][m|M]\$).*$ The catalog for which we are retrieving roles |
catalogRoleName required | string [ 1 .. 256 ] characters ^(?!\s*[s|S][y|Y][s|S][t|T][e|E][m|M]\$).*$ The name of the role |
List the PrincipalRoles to which the target catalog role has been assigned
catalogName required | string [ 1 .. 256 ] characters ^(?!\s*[s|S][y|Y][s|S][t|T][e|E][m|M]\$).*$ The name of the catalog where the catalog role resides |
catalogRoleName required | string [ 1 .. 256 ] characters ^(?!\s*[s|S][y|Y][s|S][t|T][e|E][m|M]\$).*$ The name of the catalog role |
{
  "roles": [
    {
      "name": "string",
      "properties": {
        "property1": "string",
        "property2": "string"
      },
      "createTimestamp": 0,
      "lastUpdateTimestamp": 0,
      "entityVersion": 0
    }
  ]
}
List the grants the catalog role holds
catalogName required | string [ 1 .. 256 ] characters ^(?!\s*[s|S][y|Y][s|S][t|T][e|E][m|M]\$).*$ The name of the catalog where the role will receive the grant |
catalogRoleName required | string [ 1 .. 256 ] characters ^(?!\s*[s|S][y|Y][s|S][t|T][e|E][m|M]\$).*$ The name of the role receiving the grant (must exist) |
{
  "grants": [
    {
      "type": "catalog"
    }
  ]
}
Add a new grant to the catalog role
catalogName required | string [ 1 .. 256 ] characters ^(?!\s*[s|S][y|Y][s|S][t|T][e|E][m|M]\$).*$ The name of the catalog where the role will receive the grant |
catalogRoleName required | string [ 1 .. 256 ] characters ^(?!\s*[s|S][y|Y][s|S][t|T][e|E][m|M]\$).*$ The name of the role receiving the grant (must exist) |
object (Polaris_Management_Service_GrantResource) |
{
  "grant": {
    "type": "catalog"
  }
}
Delete a specific grant from the role. This may be a subset or a superset of the grants the role has. In case of a subset, the role will retain the grants not specified. If the cascade parameter is true, grant revocation will have a cascading effect; that is, if a principal has specific grants on a subresource, and grants are revoked on a parent resource, the grants present on the subresource will be revoked as well. By default, this behavior is disabled and grant revocation only affects the specified resource.
catalogName required | string [ 1 .. 256 ] characters ^(?!\s*[s|S][y|Y][s|S][t|T][e|E][m|M]\$).*$ The name of the catalog where the role will receive the grant |
catalogRoleName required | string [ 1 .. 256 ] characters ^(?!\s*[s|S][y|Y][s|S][t|T][e|E][m|M]\$).*$ The name of the role receiving the grant (must exist) |
cascade | boolean Default: false If true, the grant revocation cascades to all subresources. |
object (Polaris_Management_Service_GrantResource) |
{
  "grant": {
    "type": "catalog"
  }
}
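The difference between the default behavior and cascade=true can be pictured with a toy model in which each grant is a (resource path, privilege) pair. Everything in this sketch is an assumption for illustration: the tuple representation, the READ and WRITE privilege labels, and in particular the choice to cascade only the same privilege to subresources. It does not describe how Polaris stores or evaluates grants.

```python
def revoke(grants, resource, privilege, cascade=False):
    """Return the grants that remain after revoking `privilege` on `resource`.

    Resource paths are tuples such as ("gold",) or ("gold", "sales").
    """

    def is_target(grant_resource):
        if grant_resource == resource:
            return True
        # With cascade, anything nested under the resource is affected as well.
        return cascade and grant_resource[: len(resource)] == resource

    return {(r, p) for (r, p) in grants if not (p == privilege and is_target(r))}


grants = {
    (("gold",), "READ"),
    (("gold", "sales"), "READ"),
    (("gold", "sales"), "WRITE"),
}

without_cascade = revoke(grants, ("gold",), "READ")
with_cascade = revoke(grants, ("gold",), "READ", cascade=True)
print(sorted(without_cascade))
print(sorted(with_cascade))
```

Without cascade only the grant on the parent is removed; with cascade the matching grant on the nested resource goes too, while unrelated grants are retained.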
All REST clients should first call this route to get catalog configuration properties from the server to configure the catalog and its HTTP client. Configuration from the server consists of two sets of key/value pairs: defaults and overrides.
Catalog configuration is constructed by setting the defaults, then client-provided configuration, and finally overrides. The final property set is then used to configure the catalog.
For example, a default configuration property might set the size of the client pool, which can be replaced with a client-specific setting. An override might be used to set the warehouse location, which is stored on the server rather than in client configuration.
Common catalog configuration settings are documented at https://iceberg.apache.org/docs/latest/configuration/#catalog-properties
warehouse | string Warehouse location or identifier to request from the service |
{
  "overrides": {
    "warehouse": "s3://bucket/warehouse/"
  },
  "defaults": {
    "clients": "4"
  }
}
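The precedence described above (defaults first, then client-provided configuration, then overrides) amounts to three successive dictionary merges. The following sketch uses the sample response shown here; the function name resolve_catalog_config and the client-side values are hypothetical.

```python
def resolve_catalog_config(defaults, client_config, overrides):
    """Apply server defaults, then client settings, then server overrides."""
    resolved = dict(defaults)
    resolved.update(client_config)
    resolved.update(overrides)
    return resolved


server_response = {
    "overrides": {"warehouse": "s3://bucket/warehouse/"},
    "defaults": {"clients": "4"},
}
# Hypothetical client-side configuration.
client_config = {"clients": "8", "warehouse": "s3://local/test/"}

resolved = resolve_catalog_config(
    server_response["defaults"], client_config, server_response["overrides"]
)
print(resolved)
```

The client's pool size replaces the server default, while the server's warehouse override replaces the value the client supplied.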
Exchange credentials for a token using the OAuth2 client credentials flow or token exchange.
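This endpoint is used for three purposes: (1) exchanging client credentials for an access token, (2) exchanging a user token for a more specific access token, and (3) refreshing a token that is about to expire.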
For example, a catalog client may be configured with client credentials from the OAuth2 Authorization flow. This client would exchange its client ID and secret for an access token using the client credentials request with this endpoint (1). Subsequent requests would then use that access token.
Some clients may also handle sessions that have additional user context. These clients would use the token exchange flow to exchange a user token (the "subject" token) from the session for a more specific access token for that user, using the catalog's access token as the "actor" token (2). The user ID token is the "subject" token and can be any token type allowed by the OAuth2 token exchange flow, including an unsecured JWT token with a sub claim. This request should use the catalog's bearer token in the "Authorization" header.
Clients may also use the token exchange flow to refresh a token that is about to expire by sending a token exchange request (3). The request's "subject" token should be the expiring token. This request should use the subject token in the "Authorization" header.
Authorization | string |
grant_type required | string Value: "client_credentials" |
scope | string |
client_id required | string Client ID This can be sent in the request body, but OAuth2 recommends sending it in a Basic Authorization header. |
client_secret required | string Client secret This can be sent in the request body, but OAuth2 recommends sending it in a Basic Authorization header. |
{
  "access_token": "string",
  "token_type": "bearer",
  "expires_in": 0,
  "issued_token_type": "urn:ietf:params:oauth:token-type:access_token",
  "refresh_token": "string",
  "scope": "string"
}
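For the client credentials case (1), a request could be assembled roughly as follows. The sketch sends the client ID and secret in a Basic Authorization header, as the parameter descriptions recommend, and puts grant_type in a form-encoded body. The token URL, the credential values, and the helper name build_token_request are assumptions; the request is built but not sent.

```python
import base64
import urllib.parse
import urllib.request


def build_token_request(token_url, client_id, client_secret, scope=None):
    """Build (but do not send) a client-credentials token request."""
    form = {"grant_type": "client_credentials"}
    if scope is not None:
        form["scope"] = scope
    basic = base64.b64encode(
        f"{client_id}:{client_secret}".encode("utf-8")
    ).decode("ascii")
    return urllib.request.Request(
        token_url,
        data=urllib.parse.urlencode(form).encode("ascii"),
        headers={
            "Authorization": f"Basic {basic}",
            "Content-Type": "application/x-www-form-urlencoded",
        },
        method="POST",
    )


# Assumed token endpoint location and placeholder credentials.
token_request = build_token_request(
    "http://localhost:8181/api/catalog/v1/oauth/tokens",
    "my-client-id",
    "my-client-secret",
)
print(token_request.get_method(), token_request.data)
```

The access_token field of the response would then be sent as a bearer token in the Authorization header of subsequent catalog requests.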
List all namespaces at a certain level, optionally starting from a given parent namespace. If table accounting.tax.paid.info exists, using 'SELECT NAMESPACE IN accounting' would translate into GET /namespaces?parent=accounting and must return a namespace, ["accounting", "tax"] only. Using 'SELECT NAMESPACE IN accounting.tax' would translate into GET /namespaces?parent=accounting%1Ftax and must return a namespace, ["accounting", "tax", "paid"]. If parent is not provided, all top-level namespaces should be listed.
prefix required | string An optional prefix in the path |
pageToken | string or null (Apache_Iceberg_REST_Catalog_API_PageToken) An opaque token that allows clients to make use of pagination for list APIs (e.g. ListTables). Clients may initiate the first paginated request by sending an empty query parameter |
pageSize | integer >= 1 For servers that support pagination, this signals an upper bound of the number of results that a client will receive. For servers that do not support pagination, clients may receive results larger than the indicated |
parent | string Example: parent=accounting%1Ftax An optional namespace, underneath which to list namespaces. If not provided or empty, all top-level namespaces should be listed. If parent is a multipart namespace, the parts must be separated by the unit separator ( |
{
  "namespaces": [
    [
      "accounting",
      "tax"
    ],
    [
      "accounting",
      "credits"
    ]
  ]
}
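The %1F in the parent example is the percent-encoded unit separator character (0x1F) that joins the parts of a multipart namespace. A small sketch of encoding and decoding such a value is shown below; the helper names are hypothetical.

```python
from urllib.parse import quote, unquote

UNIT_SEPARATOR = "\x1f"


def encode_namespace(parts):
    """Join namespace parts with the unit separator and percent-encode the result."""
    return quote(UNIT_SEPARATOR.join(parts), safe="")


def decode_namespace(encoded):
    """Reverse encode_namespace: percent-decode, then split on the unit separator."""
    return unquote(encoded).split(UNIT_SEPARATOR)


encoded = encode_namespace(["accounting", "tax"])
print(encoded)  # accounting%1Ftax
print(decode_namespace(encoded))
```

The same encoding applies wherever a multipart namespace appears in a path or query parameter.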
Create a namespace, with an optional set of properties. The server might also add properties, such as last_modified_time.
prefix required | string An optional prefix in the path |
namespace required | Array of strings (Apache_Iceberg_REST_Catalog_API_Namespace) Reference to one or more levels of a namespace |
object Default: {} Configured string to string map of properties for the namespace |
{
  "namespace": [
    "accounting",
    "tax"
  ],
  "properties": {
    "owner": "Hank Bendickson"
  }
}
{
  "namespace": [
    "accounting",
    "tax"
  ],
  "properties": {
    "owner": "Ralph",
    "created_at": "1452120468"
  }
}
Return all stored metadata properties for a given namespace
prefix required | string An optional prefix in the path |
namespace required | string Examples:
A namespace identifier as a single string. Multipart namespace parts should be separated by the unit separator ( |
{
  "namespace": [
    "accounting",
    "tax"
  ],
  "properties": {
    "owner": "Ralph",
    "transient_lastDdlTime": "1452120468"
  }
}
Check if a namespace exists. The response does not contain a body.
prefix required | string An optional prefix in the path |
namespace required | string Examples:
A namespace identifier as a single string. Multipart namespace parts should be separated by the unit separator ( |
{
  "error": {
    "message": "Malformed request",
    "type": "BadRequestException",
    "code": 400
  }
}
prefix required | string An optional prefix in the path |
namespace required | string Examples:
A namespace identifier as a single string. Multipart namespace parts should be separated by the unit separator ( |
{
  "error": {
    "message": "Malformed request",
    "type": "BadRequestException",
    "code": 400
  }
}
Set and/or remove properties on a namespace. The request body specifies a list of properties to remove and a map of key value pairs to update. Properties that are not in the request are not modified or removed by this call. Server implementations are not required to support namespace properties.
prefix required | string An optional prefix in the path |
namespace required | string Examples:
A namespace identifier as a single string. Multipart namespace parts should be separated by the unit separator ( |
removals | Array of strings unique |
object |
{
  "removals": [
    "foo",
    "bar"
  ],
  "updates": {
    "owner": "Raoul"
  }
}
{
  "updated": [
    "owner"
  ],
  "removed": [
    "foo"
  ],
  "missing": [
    "bar"
  ]
}
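The relationship between the request (removals and updates) and the response (updated, removed, and missing) can be reproduced with a small local simulation. This is only an illustration of the semantics described above, run against an assumed starting property set. It is not the server implementation.

```python
def apply_namespace_update(properties, removals, updates):
    """Apply removals and updates to a property map and build the response lists."""
    new_properties = dict(properties)
    removed, missing = [], []
    for key in removals:
        if key in new_properties:
            del new_properties[key]
            removed.append(key)
        else:
            # Requested for removal but not present on the namespace.
            missing.append(key)
    new_properties.update(updates)
    response = {"updated": list(updates), "removed": removed, "missing": missing}
    return new_properties, response


# Assumed starting state: "foo" exists on the namespace, "bar" does not.
existing = {"owner": "Ralph", "foo": "1"}
new_props, response = apply_namespace_update(
    existing, removals=["foo", "bar"], updates={"owner": "Raoul"}
)
print(response)
```

With that starting state, the simulated response matches the sample: owner is updated, foo is removed, and bar is reported as missing.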
Return all table identifiers under this namespace
prefix required | string An optional prefix in the path |
namespace required | string Examples:
A namespace identifier as a single string. Multipart namespace parts should be separated by the unit separator ( |
pageToken | string or null (Apache_Iceberg_REST_Catalog_API_PageToken) An opaque token that allows clients to make use of pagination for list APIs (e.g. ListTables). Clients may initiate the first paginated request by sending an empty query parameter |
pageSize | integer >= 1 For servers that support pagination, this signals an upper bound of the number of results that a client will receive. For servers that do not support pagination, clients may receive results larger than the indicated |
{
  "identifiers": [
    {
      "namespace": [
        "accounting",
        "tax"
      ],
      "name": "paid"
    },
    {
      "namespace": [
        "accounting",
        "tax"
      ],
      "name": "owed"
    }
  ]
}
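A client that uses pageToken and pageSize typically loops until the server stops returning a continuation token. The sketch below shows that loop against a fake page source. The response field name next-page-token, the integer-offset token format, and the extra table name refunds are assumptions made for the illustration, since this page does not show a paginated response.

```python
def list_all_tables(fetch_page, page_size=2):
    """Collect identifiers from every page returned by fetch_page."""
    identifiers = []
    page_token = ""  # an empty token starts the first paginated request
    while True:
        page = fetch_page(page_token, page_size)
        identifiers.extend(page["identifiers"])
        page_token = page.get("next-page-token")  # assumed field name
        if not page_token:
            return identifiers


def fake_fetch_page(page_token, page_size):
    """Stand-in for the HTTP call; the token here is just an offset."""
    tables = ["paid", "owed", "refunds"]
    start = int(page_token) if page_token else 0
    chunk = tables[start:start + page_size]
    body = {
        "identifiers": [
            {"namespace": ["accounting", "tax"], "name": name} for name in chunk
        ]
    }
    if start + page_size < len(tables):
        body["next-page-token"] = str(start + page_size)
    return body


all_tables = list_all_tables(fake_fetch_page)
print([table["name"] for table in all_tables])
```

Clients should treat a real token as opaque and pass it back unchanged; only the fake server in this sketch interprets it.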
Create a table or start a create transaction, like atomic CTAS.
If stage-create is false, the table is created immediately.
If stage-create is true, the table is not created, but table metadata is initialized and returned. The service should prepare as needed for a commit to the table commit endpoint to complete the create transaction. The client uses the returned metadata to begin a transaction. To commit the transaction, the client sends all create and subsequent changes to the table commit route. Changes from the table create operation include changes like AddSchemaUpdate and SetCurrentSchemaUpdate that set the initial table state.
prefix required | string An optional prefix in the path |
namespace required | string Examples:
A namespace identifier as a single string. Multipart namespace parts should be separated by the unit separator ( |
X-Iceberg-Access-Delegation | string Enum: "vended-credentials" "remote-signing" Example: vended-credentials,remote-signing Optional signal to the server that the client supports delegated access via a comma-separated list of access mechanisms. The server may choose to supply access via any or none of the requested mechanisms. Specific properties and handling for The protocol and specification for |
name required | string |
location | string |
required | object (Apache_Iceberg_REST_Catalog_API_Schema) |
object (Apache_Iceberg_REST_Catalog_API_PartitionSpec) | |
object (Apache_Iceberg_REST_Catalog_API_SortOrder) | |
stage-create | boolean |
object |
{
  "name": "string",
  "location": "string",
  "schema": {
    "type": "struct",
    "fields": [
      {
        "id": 0,
        "name": "string",
        "type": ["long", "string", "fixed[16]", "decimal(10,2)"],
        "required": true,
        "doc": "string"
      }
    ],
    "identifier-field-ids": [0]
  },
  "partition-spec": {
    "fields": [
      {
        "field-id": 0,
        "source-id": 0,
        "name": "string",
        "transform": ["identity", "year", "month", "day", "hour", "bucket[256]", "truncate[16]"]
      }
    ]
  },
  "write-order": {
    "fields": [
      {
        "source-id": 0,
        "transform": ["identity", "year", "month", "day", "hour", "bucket[256]", "truncate[16]"],
        "direction": "asc",
        "null-order": "nulls-first"
      }
    ]
  },
  "stage-create": true,
  "properties": {
    "property1": "string",
    "property2": "string"
  }
}
{
  "metadata-location": "string",
  "metadata": {
    "format-version": 1,
    "table-uuid": "string",
    "location": "string",
    "last-updated-ms": 0,
    "properties": {
      "property1": "string",
      "property2": "string"
    },
    "schemas": [
      {
        "type": "struct",
        "fields": [
          {
            "id": 0,
            "name": "string",
            "type": ["long", "string", "fixed[16]", "decimal(10,2)"],
            "required": true,
            "doc": "string"
          }
        ],
        "schema-id": 0,
        "identifier-field-ids": [0]
      }
    ],
    "current-schema-id": 0,
    "last-column-id": 0,
    "partition-specs": [
      {
        "spec-id": 0,
        "fields": [
          {
            "field-id": 0,
            "source-id": 0,
            "name": "string",
            "transform": ["identity", "year", "month", "day", "hour", "bucket[256]", "truncate[16]"]
          }
        ]
      }
    ],
    "default-spec-id": 0,
    "last-partition-id": 0,
    "sort-orders": [
      {
        "order-id": 0,
        "fields": [
          {
            "source-id": 0,
            "transform": ["identity", "year", "month", "day", "hour", "bucket[256]", "truncate[16]"],
            "direction": "asc",
            "null-order": "nulls-first"
          }
        ]
      }
    ],
    "default-sort-order-id": 0,
    "snapshots": [
      {
        "snapshot-id": 0,
        "parent-snapshot-id": 0,
        "sequence-number": 0,
        "timestamp-ms": 0,
        "manifest-list": "string",
        "summary": {
          "operation": "append",
          "property1": "string",
          "property2": "string"
        },
        "schema-id": 0
      }
    ],
    "refs": {
      "property1": {
        "type": "tag",
        "snapshot-id": 0,
        "max-ref-age-ms": 0,
        "max-snapshot-age-ms": 0,
        "min-snapshots-to-keep": 0
      },
      "property2": {
        "type": "tag",
        "snapshot-id": 0,
        "max-ref-age-ms": 0,
        "max-snapshot-age-ms": 0,
        "min-snapshots-to-keep": 0
      }
    },
    "current-snapshot-id": 0,
    "last-sequence-number": 0,
    "snapshot-log": [
      {
        "snapshot-id": 0,
        "timestamp-ms": 0
      }
    ],
    "metadata-log": [
      {
        "metadata-file": "string",
        "timestamp-ms": 0
      }
    ],
    "statistics-files": [
      {
        "snapshot-id": 0,
        "statistics-path": "string",
        "file-size-in-bytes": 0,
        "file-footer-size-in-bytes": 0,
        "blob-metadata": [
          {
            "type": "string",
            "snapshot-id": 0,
            "sequence-number": 0,
            "fields": [0],
            "properties": {}
          }
        ]
      }
    ],
    "partition-statistics-files": [
      {
        "snapshot-id": 0,
        "statistics-path": "string",
        "file-size-in-bytes": 0
      }
    ]
  },
  "config": {
    "property1": "string",
    "property2": "string"
  }
}
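A minimal create-table request needs only a name and a schema. The sketch below assembles such a body, with stage-create exposed as a flag. The helper name, the table and column names, and the decision to number field IDs from 1 are assumptions for illustration. Partition spec, write order, and properties are omitted.

```python
import json


def build_create_table_body(name, fields, stage_create=False):
    """Build a create-table request body from (name, type, required) tuples."""
    schema = {
        "type": "struct",
        "fields": [
            {"id": field_id, "name": field_name, "type": field_type, "required": required}
            for field_id, (field_name, field_type, required) in enumerate(fields, start=1)
        ],
    }
    return {"name": name, "schema": schema, "stage-create": stage_create}


body = build_create_table_body(
    "paid",
    [("id", "long", True), ("note", "string", False)],
    stage_create=True,
)
print(json.dumps(body, indent=2))
```

With stage-create set to true, the response metadata would be used to begin a transaction that is later completed through the table commit route, as described above.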
Register a table using the given metadata file location.
prefix required | string An optional prefix in the path |
namespace required | string Examples:
A namespace identifier as a single string. Multipart namespace parts should be separated by the unit separator ( |
name required | string |
metadata-location required | string |
{
  "name": "string",
  "metadata-location": "string"
}
{
  "metadata-location": "string",
  "metadata": {
    "format-version": 1,
    "table-uuid": "string",
    "location": "string",
    "last-updated-ms": 0,
    "properties": {
      "property1": "string",
      "property2": "string"
    },
    "schemas": [
      {
        "type": "struct",
        "fields": [
          {
            "id": 0,
            "name": "string",
            "type": ["long", "string", "fixed[16]", "decimal(10,2)"],
            "required": true,
            "doc": "string"
          }
        ],
        "schema-id": 0,
        "identifier-field-ids": [0]
      }
    ],
    "current-schema-id": 0,
    "last-column-id": 0,
    "partition-specs": [
      {
        "spec-id": 0,
        "fields": [
          {
            "field-id": 0,
            "source-id": 0,
            "name": "string",
            "transform": ["identity", "year", "month", "day", "hour", "bucket[256]", "truncate[16]"]
          }
        ]
      }
    ],
    "default-spec-id": 0,
    "last-partition-id": 0,
    "sort-orders": [
      {
        "order-id": 0,
        "fields": [
          {
            "source-id": 0,
            "transform": ["identity", "year", "month", "day", "hour", "bucket[256]", "truncate[16]"],
            "direction": "asc",
            "null-order": "nulls-first"
          }
        ]
      }
    ],
    "default-sort-order-id": 0,
    "snapshots": [
      {
        "snapshot-id": 0,
        "parent-snapshot-id": 0,
        "sequence-number": 0,
        "timestamp-ms": 0,
        "manifest-list": "string",
        "summary": {
          "operation": "append",
          "property1": "string",
          "property2": "string"
        },
        "schema-id": 0
      }
    ],
    "refs": {
      "property1": {
        "type": "tag",
        "snapshot-id": 0,
        "max-ref-age-ms": 0,
        "max-snapshot-age-ms": 0,
        "min-snapshots-to-keep": 0
      },
      "property2": {
        "type": "tag",
        "snapshot-id": 0,
        "max-ref-age-ms": 0,
        "max-snapshot-age-ms": 0,
        "min-snapshots-to-keep": 0
      }
    },
    "current-snapshot-id": 0,
    "last-sequence-number": 0,
    "snapshot-log": [
      {
        "snapshot-id": 0,
        "timestamp-ms": 0
      }
    ],
    "metadata-log": [
      {
        "metadata-file": "string",
        "timestamp-ms": 0
      }
    ],
    "statistics-files": [
      {
        "snapshot-id": 0,
        "statistics-path": "string",
        "file-size-in-bytes": 0,
        "file-footer-size-in-bytes": 0,
        "blob-metadata": [
          {
            "type": "string",
            "snapshot-id": 0,
            "sequence-number": 0,
            "fields": [0],
            "properties": {}
          }
        ]
      }
    ],
    "partition-statistics-files": [
      {
        "snapshot-id": 0,
        "statistics-path": "string",
        "file-size-in-bytes": 0
      }
    ]
  },
  "config": {
    "property1": "string",
    "property2": "string"
  }
}
Load a table from the catalog.
The response contains both configuration and table metadata. The configuration, if non-empty, is used as additional configuration for the table and overrides the catalog configuration. For example, this configuration may change the FileIO implementation to be used for the table.
The response also contains the table's full metadata, matching the table metadata JSON file.
The catalog configuration may contain credentials that should be used for subsequent requests for the table. The configuration key "token" is used to pass an access token to be used as a bearer token for table requests. Otherwise, a token may be passed using an RFC 8693 token type as a configuration key. For example, "urn:ietf:params:oauth:token-type:jwt=
prefix required | string An optional prefix in the path |
namespace required | string Examples:
A namespace identifier as a single string. Multipart namespace parts should be separated by the unit separator ( |
table required | string Example: sales A table name |
snapshots | string Enum: "all" "refs" The snapshots to return in the body of the metadata. Setting the value to |
X-Iceberg-Access-Delegation | string Enum: "vended-credentials" "remote-signing" Example: vended-credentials,remote-signing Optional signal to the server that the client supports delegated access via a comma-separated list of access mechanisms. The server may choose to supply access via any or none of the requested mechanisms. Specific properties and handling for The protocol and specification for |
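The parameters above can be combined into a request roughly as follows. This is a minimal sketch, not a client implementation: the path layout (/v1/{prefix}/namespaces/{namespace}/tables/{table}) is assumed from the Iceberg REST specification, and the function name and the prefix "polaris" are hypothetical. The allowed values for snapshots and X-Iceberg-Access-Delegation are the enums listed above.

```python
from urllib.parse import quote, urlencode

ALLOWED_SNAPSHOTS = {"all", "refs"}
ALLOWED_DELEGATION = {"vended-credentials", "remote-signing"}


def load_table_request(prefix, namespace_parts, table, snapshots=None, delegation=()):
    """Return (path_with_query, headers) for a load-table call."""
    if snapshots is not None and snapshots not in ALLOWED_SNAPSHOTS:
        raise ValueError(f"snapshots must be one of {sorted(ALLOWED_SNAPSHOTS)}")
    for mechanism in delegation:
        if mechanism not in ALLOWED_DELEGATION:
            raise ValueError(f"unknown access mechanism: {mechanism}")

    # Multipart namespaces are joined with the unit separator (0x1F), then encoded.
    namespace = quote("\x1f".join(namespace_parts), safe="")
    path = (
        f"/v1/{quote(prefix, safe='')}"
        f"/namespaces/{namespace}/tables/{quote(table, safe='')}"
    )
    if snapshots is not None:
        path += "?" + urlencode({"snapshots": snapshots})

    headers = {}
    if delegation:
        # The header carries a comma-separated list of supported mechanisms.
        headers["X-Iceberg-Access-Delegation"] = ",".join(delegation)
    return path, headers


if __name__ == "__main__":
    print(load_table_request("polaris", ["accounting", "tax"], "sales",
                             snapshots="refs", delegation=["vended-credentials"]))
```

Because the server may supply access through any or none of the requested mechanisms, a client should not assume the response contains delegated credentials just because the header was sent.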
{- "metadata-location": "string",
- "metadata": {
- "format-version": 1,
- "table-uuid": "string",
- "location": "string",
- "last-updated-ms": 0,
- "properties": {
- "property1": "string",
- "property2": "string"
}, - "schemas": [
- {
- "type": "struct",
- "fields": [
- {
- "id": 0,
- "name": "string",
- "type": [
- "long",
- "string",
- "fixed[16]",
- "decimal(10,2)"
], - "required": true,
- "doc": "string"
}
], - "schema-id": 0,
- "identifier-field-ids": [
- 0
]
}
], - "current-schema-id": 0,
- "last-column-id": 0,
- "partition-specs": [
- {
- "spec-id": 0,
- "fields": [
- {
- "field-id": 0,
- "source-id": 0,
- "name": "string",
- "transform": [
- "identity",
- "year",
- "month",
- "day",
- "hour",
- "bucket[256]",
- "truncate[16]"
]
}
]
}
], - "default-spec-id": 0,
- "last-partition-id": 0,
- "sort-orders": [
- {
- "order-id": 0,
- "fields": [
- {
- "source-id": 0,
- "transform": [
- "identity",
- "year",
- "month",
- "day",
- "hour",
- "bucket[256]",
- "truncate[16]"
], - "direction": "asc",
- "null-order": "nulls-first"
}
]
}
], - "default-sort-order-id": 0,
- "snapshots": [
- {
- "snapshot-id": 0,
- "parent-snapshot-id": 0,
- "sequence-number": 0,
- "timestamp-ms": 0,
- "manifest-list": "string",
- "summary": {
- "operation": "append",
- "property1": "string",
- "property2": "string"
}, - "schema-id": 0
}
], - "refs": {
- "property1": {
- "type": "tag",
- "snapshot-id": 0,
- "max-ref-age-ms": 0,
- "max-snapshot-age-ms": 0,
- "min-snapshots-to-keep": 0
}, - "property2": {
- "type": "tag",
- "snapshot-id": 0,
- "max-ref-age-ms": 0,
- "max-snapshot-age-ms": 0,
- "min-snapshots-to-keep": 0
}
}, - "current-snapshot-id": 0,
- "last-sequence-number": 0,
- "snapshot-log": [
- {
- "snapshot-id": 0,
- "timestamp-ms": 0
}
], - "metadata-log": [
- {
- "metadata-file": "string",
- "timestamp-ms": 0
}
], - "statistics-files": [
- {
- "snapshot-id": 0,
- "statistics-path": "string",
- "file-size-in-bytes": 0,
- "file-footer-size-in-bytes": 0,
- "blob-metadata": [
- {
- "type": "string",
- "snapshot-id": 0,
- "sequence-number": 0,
- "fields": [
- 0
], - "properties": { }
}
]
}
], - "partition-statistics-files": [
- {
- "snapshot-id": 0,
- "statistics-path": "string",
- "file-size-in-bytes": 0
}
]
}, - "config": {
- "property1": "string",
- "property2": "string"
}
}
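One way a client might consume the "config" object of a load-table response is sketched below: the table's configuration is overlaid on the catalog configuration, and a "token" entry, if present, becomes a bearer header. All function names and the example keys are hypothetical. For RFC 8693 token-type keys the sketch only surfaces the key and value, because how such a token is used is up to the client.

```python
RFC8693_TOKEN_TYPE_PREFIX = "urn:ietf:params:oauth:token-type:"


def effective_table_config(catalog_config, table_config):
    """Overlay the table's config on the catalog config; table keys win."""
    merged = dict(catalog_config)
    merged.update(table_config or {})
    return merged


def pick_credential(config):
    """Return (kind, value) for the credential found in config, or None.

    "token" takes precedence. Otherwise the first key (in sorted order) that
    starts with the RFC 8693 token-type prefix is returned unchanged.
    """
    if config.get("token"):
        return ("bearer", config["token"])
    for key in sorted(config):
        if key.startswith(RFC8693_TOKEN_TYPE_PREFIX) and config[key]:
            return (key, config[key])
    return None


def authorization_header(config):
    """Build an Authorization header only for the plain "token" case."""
    credential = pick_credential(config)
    if credential and credential[0] == "bearer":
        return {"Authorization": f"Bearer {credential[1]}"}
    return {}


if __name__ == "__main__":
    # Hypothetical configuration values, for illustration only.
    config = effective_table_config(
        {"io-impl": "catalog.default.FileIO"},
        {"io-impl": "table.specific.FileIO", "token": "example-token"},
    )
    print(config["io-impl"], authorization_header(config))
```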
Commit updates to a table.
Commits have two parts, requirements and updates. Requirements are assertions that will be validated before attempting to make and commit changes. For example, assert-ref-snapshot-id will check that a named ref's snapshot ID has a certain value.
Updates are changes to make to table metadata. For example, after asserting that the current main ref is at the expected snapshot, a commit may add a new child snapshot and set the ref to the new snapshot id.
Create table transactions that are started by createTable with stage-create set to true are committed using this route. Transactions should include all changes to the table, including table initialization, like AddSchemaUpdate and SetCurrentSchemaUpdate. The assert-create requirement is used to ensure that the table was not created concurrently.
prefix required | string An optional prefix in the path |
namespace required | string Examples:
A namespace identifier as a single string. Multipart namespace parts should be separated by the unit separator ( |
table required | string Example: sales A table name |
object (Apache_Iceberg_REST_Catalog_API_TableIdentifier) | |
required | Array of objects (Apache_Iceberg_REST_Catalog_API_TableRequirement) |
required | Array of Apache_Iceberg_REST_Catalog_API_AssignUUIDUpdate (object) or Apache_Iceberg_REST_Catalog_API_UpgradeFormatVersionUpdate (object) or Apache_Iceberg_REST_Catalog_API_AddSchemaUpdate (object) or Apache_Iceberg_REST_Catalog_API_SetCurrentSchemaUpdate (object) or Apache_Iceberg_REST_Catalog_API_AddPartitionSpecUpdate (object) or Apache_Iceberg_REST_Catalog_API_SetDefaultSpecUpdate (object) or Apache_Iceberg_REST_Catalog_API_AddSortOrderUpdate (object) or Apache_Iceberg_REST_Catalog_API_SetDefaultSortOrderUpdate (object) or Apache_Iceberg_REST_Catalog_API_AddSnapshotUpdate (object) or Apache_Iceberg_REST_Catalog_API_SetSnapshotRefUpdate (object) or Apache_Iceberg_REST_Catalog_API_RemoveSnapshotsUpdate (object) or Apache_Iceberg_REST_Catalog_API_RemoveSnapshotRefUpdate (object) or Apache_Iceberg_REST_Catalog_API_SetLocationUpdate (object) or Apache_Iceberg_REST_Catalog_API_SetPropertiesUpdate (object) or Apache_Iceberg_REST_Catalog_API_RemovePropertiesUpdate (object) or Apache_Iceberg_REST_Catalog_API_SetStatisticsUpdate (object) or Apache_Iceberg_REST_Catalog_API_RemoveStatisticsUpdate (object) (Apache_Iceberg_REST_Catalog_API_TableUpdate) |
{
  "identifier": {
    "namespace": [
      "accounting",
      "tax"
    ],
    "name": "string"
  },
  "requirements": [
    {
      "type": "string"
    }
  ],
  "updates": [
    {
      "action": "string",
      "uuid": "string"
    }
  ]
}
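The generic sample above does not show what a concrete requirement or update looks like. The sketch below builds the commit described in the text: assert that a ref is at the expected snapshot, then add a child snapshot and move the ref to it. The field names inside each requirement and update ("ref", "ref-name", "snapshot", and the "add-snapshot" and "set-snapshot-ref" actions) follow the Iceberg REST OpenAPI schema as we understand it and should be treated as assumptions; the function names are hypothetical.

```python
def build_append_commit(namespace, table, expected_snapshot_id, new_snapshot, ref="main"):
    """Build a commit body that appends new_snapshot on top of the expected snapshot."""
    return {
        "identifier": {"namespace": list(namespace), "name": table},
        "requirements": [
            # Validated before any change is applied.
            {
                "type": "assert-ref-snapshot-id",
                "ref": ref,
                "snapshot-id": expected_snapshot_id,
            }
        ],
        "updates": [
            {"action": "add-snapshot", "snapshot": new_snapshot},
            {
                "action": "set-snapshot-ref",
                "ref-name": ref,
                "type": "branch",
                "snapshot-id": new_snapshot["snapshot-id"],
            },
        ],
    }


def build_staged_create_commit(namespace, table, initial_updates):
    """Build the commit that finishes a staged create; assert-create guards against races."""
    return {
        "identifier": {"namespace": list(namespace), "name": table},
        "requirements": [{"type": "assert-create"}],
        "updates": list(initial_updates),
    }


if __name__ == "__main__":
    # Hypothetical snapshot values, for illustration only.
    snapshot = {
        "snapshot-id": 2,
        "parent-snapshot-id": 1,
        "sequence-number": 2,
        "timestamp-ms": 1700000000000,
        "manifest-list": "s3://example-bucket/metadata/snap-2.avro",
        "summary": {"operation": "append"},
    }
    print(build_append_commit(["accounting", "tax"], "paid", 1, snapshot))
```

If another writer has already moved the ref, the requirement no longer holds and the server is expected to reject the commit, at which point the client can reload the table and retry.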
{- "metadata-location": "string",
- "metadata": {
- "format-version": 1,
- "table-uuid": "string",
- "location": "string",
- "last-updated-ms": 0,
- "properties": {
- "property1": "string",
- "property2": "string"
}, - "schemas": [
- {
- "type": "struct",
- "fields": [
- {
- "id": 0,
- "name": "string",
- "type": [
- "long",
- "string",
- "fixed[16]",
- "decimal(10,2)"
], - "required": true,
- "doc": "string"
}
], - "schema-id": 0,
- "identifier-field-ids": [
- 0
]
}
], - "current-schema-id": 0,
- "last-column-id": 0,
- "partition-specs": [
- {
- "spec-id": 0,
- "fields": [
- {
- "field-id": 0,
- "source-id": 0,
- "name": "string",
- "transform": [
- "identity",
- "year",
- "month",
- "day",
- "hour",
- "bucket[256]",
- "truncate[16]"
]
}
]
}
], - "default-spec-id": 0,
- "last-partition-id": 0,
- "sort-orders": [
- {
- "order-id": 0,
- "fields": [
- {
- "source-id": 0,
- "transform": [
- "identity",
- "year",
- "month",
- "day",
- "hour",
- "bucket[256]",
- "truncate[16]"
], - "direction": "asc",
- "null-order": "nulls-first"
}
]
}
], - "default-sort-order-id": 0,
- "snapshots": [
- {
- "snapshot-id": 0,
- "parent-snapshot-id": 0,
- "sequence-number": 0,
- "timestamp-ms": 0,
- "manifest-list": "string",
- "summary": {
- "operation": "append",
- "property1": "string",
- "property2": "string"
}, - "schema-id": 0
}
], - "refs": {
- "property1": {
- "type": "tag",
- "snapshot-id": 0,
- "max-ref-age-ms": 0,
- "max-snapshot-age-ms": 0,
- "min-snapshots-to-keep": 0
}, - "property2": {
- "type": "tag",
- "snapshot-id": 0,
- "max-ref-age-ms": 0,
- "max-snapshot-age-ms": 0,
- "min-snapshots-to-keep": 0
}
}, - "current-snapshot-id": 0,
- "last-sequence-number": 0,
- "snapshot-log": [
- {
- "snapshot-id": 0,
- "timestamp-ms": 0
}
], - "metadata-log": [
- {
- "metadata-file": "string",
- "timestamp-ms": 0
}
], - "statistics-files": [
- {
- "snapshot-id": 0,
- "statistics-path": "string",
- "file-size-in-bytes": 0,
- "file-footer-size-in-bytes": 0,
- "blob-metadata": [
- {
- "type": "string",
- "snapshot-id": 0,
- "sequence-number": 0,
- "fields": [
- 0
], - "properties": { }
}
]
}
], - "partition-statistics-files": [
- {
- "snapshot-id": 0,
- "statistics-path": "string",
- "file-size-in-bytes": 0
}
]
}
}
Remove a table from the catalog.
prefix required | string An optional prefix in the path |
namespace required | string Examples:
A namespace identifier as a single string. Multipart namespace parts should be separated by the unit separator ( |
table required | string Example: sales A table name |
purgeRequested | boolean Default: false Whether the user requested to purge the underlying table's data and metadata |
{
  "error": {
    "message": "Malformed request",
    "type": "BadRequestException",
    "code": 400
  }
}
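Error responses throughout this reference share the envelope shown above: a single "error" object with "message", "type", and "code". A client might map it onto an exception as in the following sketch; the class and function names are hypothetical.

```python
import json


class CatalogError(Exception):
    """Carries the fields of the error envelope returned by the catalog."""

    def __init__(self, message, error_type, code):
        super().__init__(f"{error_type} ({code}): {message}")
        self.message = message
        self.error_type = error_type
        self.code = code


def parse_error(body):
    """Turn a JSON error body into a CatalogError, tolerating missing fields."""
    error = json.loads(body).get("error", {})
    return CatalogError(
        error.get("message", ""),
        error.get("type", "UnknownError"),
        error.get("code", 0),
    )


if __name__ == "__main__":
    sample = '{"error": {"message": "Malformed request", "type": "BadRequestException", "code": 400}}'
    print(parse_error(sample))
```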
Check if a table exists within a given namespace. The response does not contain a body.
prefix required | string An optional prefix in the path |
namespace required | string Examples:
A namespace identifier as a single string. Multipart namespace parts should be separated by the unit separator ( |
table required | string Example: sales A table name |
{
  "error": {
    "message": "Malformed request",
    "type": "BadRequestException",
    "code": 400
  }
}
Rename a table from one identifier to another. It's valid to move a table across namespaces, but the server implementation is not required to support it.
prefix required | string An optional prefix in the path |
Current table identifier to rename and new table identifier to rename to
required | object (Apache_Iceberg_REST_Catalog_API_TableIdentifier) |
required | object (Apache_Iceberg_REST_Catalog_API_TableIdentifier) |
{
  "source": {
    "namespace": [
      "accounting",
      "tax"
    ],
    "name": "paid"
  },
  "destination": {
    "namespace": [
      "accounting",
      "tax"
    ],
    "name": "owed"
  }
}
{
  "error": {
    "message": "Malformed request",
    "type": "BadRequestException",
    "code": 400
  }
}
prefix required | string An optional prefix in the path |
namespace required | string Examples:
A namespace identifier as a single string. Multipart namespace parts should be separated by the unit separator ( |
table required | string Example: sales A table name |
The request containing the metrics report to be sent
table-name required | string |
snapshot-id required | integer <int64> |
required | Apache_Iceberg_REST_Catalog_API_AndOrExpression (object) or Apache_Iceberg_REST_Catalog_API_NotExpression (object) or Apache_Iceberg_REST_Catalog_API_SetExpression (object) or Apache_Iceberg_REST_Catalog_API_LiteralExpression (object) or Apache_Iceberg_REST_Catalog_API_UnaryExpression (object) (Apache_Iceberg_REST_Catalog_API_Expression) |
schema-id required | integer |
projected-field-ids required | Array of integers |
projected-field-names required | Array of strings |
required | object (Apache_Iceberg_REST_Catalog_API_Metrics) |
object | |
report-type required | string |
{- "table-name": "string",
- "snapshot-id": 0,
- "filter": {
- "type": [
- "eq",
- "and",
- "or",
- "not",
- "in",
- "not-in",
- "lt",
- "lt-eq",
- "gt",
- "gt-eq",
- "not-eq",
- "starts-with",
- "not-starts-with",
- "is-null",
- "not-null",
- "is-nan",
- "not-nan"
], - "left": null,
- "right": null
}, - "schema-id": 0,
- "projected-field-ids": [
- 0
], - "projected-field-names": [
- "string"
], - "metrics": {
- "metrics": {
- "total-planning-duration": {
- "count": 1,
- "time-unit": "nanoseconds",
- "total-duration": 2644235116
}, - "result-data-files": {
- "unit": "count",
- "value": 1
}, - "result-delete-files": {
- "unit": "count",
- "value": 0
}, - "total-data-manifests": {
- "unit": "count",
- "value": 1
}, - "total-delete-manifests": {
- "unit": "count",
- "value": 0
}, - "scanned-data-manifests": {
- "unit": "count",
- "value": 1
}, - "skipped-data-manifests": {
- "unit": "count",
- "value": 0
}, - "total-file-size-bytes": {
- "unit": "bytes",
- "value": 10
}, - "total-delete-file-size-bytes": {
- "unit": "bytes",
- "value": 0
}
}
}, - "metadata": {
- "property1": "string",
- "property2": "string"
}, - "report-type": "string"
}
{
  "error": {
    "message": "Malformed request",
    "type": "BadRequestException",
    "code": 400
  }
}
prefix required | string An optional prefix in the path |
namespace required | string Examples:
A namespace identifier as a single string. Multipart namespace parts should be separated by the unit separator ( |
table required | string Example: sales A table name |
The request containing the notification to be sent.
For each table, Polaris will reject any notification where the timestamp in the request body is older than or equal to the most recent time Polaris has already processed for the table. The caller of the API is responsible for ensuring the correct order of timestamps for a sequence of notifications. This includes managing potential clock skew or inconsistencies when notifications are sent from multiple sources.
A VALIDATE request behaves like a dry run of a CREATE or UPDATE request up to, but not including, loading the contents of a metadata file; this includes validating permissions, checking that the specified metadata path is within ALLOWED_LOCATIONS, checking that the catalog is EXTERNAL, etc. The intended use case for a VALIDATE notification is to allow a remote catalog to pre-validate the general settings of a receiving catalog against an intended new table location before creating, in the remote catalog, any table intended for sending notifications. For a VALIDATE request, the specified metadata-location can be either a prospective full metadata file path or a relevant parent directory of the intended table to validate against ALLOWED_LOCATIONS.
notification-type required | string (Apache_Iceberg_REST_Catalog_API_NotificationType) Enum: "UNKNOWN" "CREATE" "UPDATE" "DROP" "VALIDATE" |
required | object (Apache_Iceberg_REST_Catalog_API_TableUpdateNotification) |
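The timestamp rule described above can be modeled in a few lines. The sketch below is an illustrative model of the documented behavior, not Polaris's implementation: a tracker rejects any notification whose timestamp is older than or equal to the last one accepted for the same table, and a caller-side helper picks a timestamp that stays strictly increasing even if the local clock goes backwards. All names are hypothetical.

```python
class NotificationTracker:
    """Models the per-table ordering rule: timestamps must strictly increase."""

    def __init__(self):
        self._last_seen = {}  # table name -> last accepted timestamp

    def accept(self, table, timestamp):
        """Return True and record the timestamp if it is newer than the last one."""
        last = self._last_seen.get(table)
        if last is not None and timestamp <= last:
            return False  # older than or equal to what was already processed
        self._last_seen[table] = timestamp
        return True


def next_timestamp(last_sent, now):
    """Pick a send timestamp that is strictly greater than the previous one."""
    if last_sent is None:
        return now
    return max(now, last_sent + 1)


if __name__ == "__main__":
    tracker = NotificationTracker()
    print(tracker.accept("sales", 100))  # first notification for the table
    print(tracker.accept("sales", 100))  # equal timestamp, rejected
    print(tracker.accept("sales", next_timestamp(100, 90)))  # clock went backwards
```

When several sources send notifications for the same table, a helper like next_timestamp is only sufficient if those sources share the last-sent value; otherwise ordering has to be coordinated some other way.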
{- "notification-type": "UNKNOWN",
- "payload": {
- "table-name": "string",
- "timestamp": 0,
- "table-uuid": "string",
- "metadata-location": "string",
- "metadata": {
- "format-version": 1,
- "table-uuid": "string",
- "location": "string",
- "last-updated-ms": 0,
- "properties": {
- "property1": "string",
- "property2": "string"
}, - "schemas": [
- {
- "type": "struct",
- "fields": [
- {
- "id": 0,
- "name": "string",
- "type": [
- "long",
- "string",
- "fixed[16]",
- "decimal(10,2)"
], - "required": true,
- "doc": "string"
}
], - "identifier-field-ids": [
- 0
]
}
], - "current-schema-id": 0,
- "last-column-id": 0,
- "partition-specs": [
- {
- "fields": [
- {
- "field-id": 0,
- "source-id": 0,
- "name": "string",
- "transform": [
- "identity",
- "year",
- "month",
- "day",
- "hour",
- "bucket[256]",
- "truncate[16]"
]
}
]
}
], - "default-spec-id": 0,
- "last-partition-id": 0,
- "sort-orders": [
- {
- "fields": [
- {
- "source-id": 0,
- "transform": [
- "identity",
- "year",
- "month",
- "day",
- "hour",
- "bucket[256]",
- "truncate[16]"
], - "direction": "asc",
- "null-order": "nulls-first"
}
]
}
], - "default-sort-order-id": 0,
- "snapshots": [
- {
- "snapshot-id": 0,
- "parent-snapshot-id": 0,
- "sequence-number": 0,
- "timestamp-ms": 0,
- "manifest-list": "string",
- "summary": {
- "operation": "append",
- "property1": "string",
- "property2": "string"
}, - "schema-id": 0
}
], - "refs": {
- "property1": {
- "type": "tag",
- "snapshot-id": 0,
- "max-ref-age-ms": 0,
- "max-snapshot-age-ms": 0,
- "min-snapshots-to-keep": 0
}, - "property2": {
- "type": "tag",
- "snapshot-id": 0,
- "max-ref-age-ms": 0,
- "max-snapshot-age-ms": 0,
- "min-snapshots-to-keep": 0
}
}, - "current-snapshot-id": 0,
- "last-sequence-number": 0,
- "snapshot-log": [
- {
- "snapshot-id": 0,
- "timestamp-ms": 0
}
], - "metadata-log": [
- {
- "metadata-file": "string",
- "timestamp-ms": 0
}
], - "statistics-files": [
- {
- "snapshot-id": 0,
- "statistics-path": "string",
- "file-size-in-bytes": 0,
- "file-footer-size-in-bytes": 0,
- "blob-metadata": [
- {
- "type": "string",
- "snapshot-id": 0,
- "sequence-number": 0,
- "fields": [
- 0
], - "properties": { }
}
]
}
], - "partition-statistics-files": [
- {
- "snapshot-id": 0,
- "statistics-path": "string",
- "file-size-in-bytes": 0
}
]
}
}
}
{
  "error": {
    "message": "Malformed request",
    "type": "BadRequestException",
    "code": 400
  }
}
prefix required | string An optional prefix in the path |
Commit updates to multiple tables in an atomic operation.
A commit for a single table consists of a table identifier with requirements and updates. Requirements are assertions that will be validated before attempting to make and commit changes. For example, assert-ref-snapshot-id will check that a named ref's snapshot ID has a certain value.
Updates are changes to make to table metadata. For example, after asserting that the current main ref is at the expected snapshot, a commit may add a new child snapshot and set the ref to the new snapshot id.
required | Array of objects (Apache_Iceberg_REST_Catalog_API_CommitTableRequest) |
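A multi-table commit is a list of single-table commits wrapped in "table-changes". The sketch below assembles that wrapper and insists that every change names its table, on the assumption that the server cannot otherwise tell which table a change targets. The function name is hypothetical.

```python
def build_transaction(changes):
    """Wrap single-table commits into one multi-table commit body."""
    table_changes = []
    for change in changes:
        identifier = change.get("identifier")
        if not identifier or "name" not in identifier:
            raise ValueError("each table change needs an identifier with a name")
        table_changes.append(
            {
                "identifier": identifier,
                "requirements": list(change.get("requirements", [])),
                "updates": list(change.get("updates", [])),
            }
        )
    return {"table-changes": table_changes}


if __name__ == "__main__":
    body = build_transaction(
        [
            {
                "identifier": {"namespace": ["accounting", "tax"], "name": "paid"},
                "requirements": [{"type": "string"}],
                "updates": [{"action": "string", "uuid": "string"}],
            },
            {"identifier": {"namespace": ["accounting", "tax"], "name": "owed"}},
        ]
    )
    print(body)
```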
{
  "table-changes": [
    {
      "identifier": {
        "namespace": [
          "accounting",
          "tax"
        ],
        "name": "string"
      },
      "requirements": [
        {
          "type": "string"
        }
      ],
      "updates": [
        {
          "action": "string",
          "uuid": "string"
        }
      ]
    }
  ]
}
{
  "error": {
    "message": "Malformed request",
    "type": "BadRequestException",
    "code": 400
  }
}
Return all view identifiers under this namespace.
prefix required | string An optional prefix in the path |
namespace required | string Examples:
A namespace identifier as a single string. Multipart namespace parts should be separated by the unit separator ( |
pageToken | string or null (Apache_Iceberg_REST_Catalog_API_PageToken) An opaque token that allows clients to make use of pagination for list APIs (e.g. ListTables). Clients may initiate the first paginated request by sending an empty query parameter |
pageSize | integer >= 1 For servers that support pagination, this signals an upper bound of the number of results that a client will receive. For servers that do not support pagination, clients may receive results larger than the indicated |
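A pagination loop over this endpoint might look like the sketch below. It assumes the server returns the continuation token in a "next-page-token" field of the response body, as the Iceberg REST specification describes; that field is not shown on this page, so treat it as an assumption. The fetch_page callable is a hypothetical stand-in for the actual HTTP call.

```python
def list_all_identifiers(fetch_page, page_size=None):
    """Collect identifiers across pages.

    fetch_page(page_token, page_size) must return the decoded JSON response.
    """
    identifiers = []
    page_token = ""  # an empty token starts a paginated listing
    while True:
        page = fetch_page(page_token, page_size)
        identifiers.extend(page.get("identifiers", []))
        page_token = page.get("next-page-token")
        if not page_token:
            # No token means the listing is complete (or the server does not paginate).
            return identifiers


if __name__ == "__main__":
    # In-memory stand-in for the server, for illustration only.
    pages = {
        "": {
            "identifiers": [{"namespace": ["accounting", "tax"], "name": "paid"}],
            "next-page-token": "page-2",
        },
        "page-2": {
            "identifiers": [{"namespace": ["accounting", "tax"], "name": "owed"}],
        },
    }
    result = list_all_identifiers(lambda token, size: pages[token], page_size=1)
    print([item["name"] for item in result])
```

Because a server without pagination support may return more results than pageSize, the loop does not rely on the page length to decide when to stop.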
{
  "identifiers": [
    {
      "namespace": [
        "accounting",
        "tax"
      ],
      "name": "paid"
    },
    {
      "namespace": [
        "accounting",
        "tax"
      ],
      "name": "owed"
    }
  ]
}
Create a view in the given namespace.
prefix required | string An optional prefix in the path |
namespace required | string Examples:
A namespace identifier as a single string. Multipart namespace parts should be separated by the unit separator ( |
name required | string |
location | string |
required | object (Apache_Iceberg_REST_Catalog_API_Schema) |
required | object (Apache_Iceberg_REST_Catalog_API_ViewVersion) |
required | object |
{- "name": "string",
- "location": "string",
- "schema": {
- "type": "struct",
- "fields": [
- {
- "id": 0,
- "name": "string",
- "type": [
- "long",
- "string",
- "fixed[16]",
- "decimal(10,2)"
], - "required": true,
- "doc": "string"
}
], - "identifier-field-ids": [
- 0
]
}, - "view-version": {
- "version-id": 0,
- "timestamp-ms": 0,
- "schema-id": 0,
- "summary": {
- "property1": "string",
- "property2": "string"
}, - "representations": [
- {
- "type": "string",
- "sql": "string",
- "dialect": "string"
}
], - "default-catalog": "string",
- "default-namespace": [
- "accounting",
- "tax"
]
}, - "properties": {
- "property1": "string",
- "property2": "string"
}
}
{- "metadata-location": "string",
- "metadata": {
- "view-uuid": "string",
- "format-version": 1,
- "location": "string",
- "current-version-id": 0,
- "versions": [
- {
- "version-id": 0,
- "timestamp-ms": 0,
- "schema-id": 0,
- "summary": {
- "property1": "string",
- "property2": "string"
}, - "representations": [
- {
- "type": "string",
- "sql": "string",
- "dialect": "string"
}
], - "default-catalog": "string",
- "default-namespace": [
- "accounting",
- "tax"
]
}
], - "version-log": [
- {
- "version-id": 0,
- "timestamp-ms": 0
}
], - "schemas": [
- {
- "type": "struct",
- "fields": [
- {
- "id": 0,
- "name": "string",
- "type": [
- "long",
- "string",
- "fixed[16]",
- "decimal(10,2)"
], - "required": true,
- "doc": "string"
}
], - "schema-id": 0,
- "identifier-field-ids": [
- 0
]
}
], - "properties": {
- "property1": "string",
- "property2": "string"
}
}, - "config": {
- "property1": "string",
- "property2": "string"
}
}
Load a view from the catalog.
The response contains both configuration and view metadata. The configuration, if non-empty, is used as additional configuration for the view and overrides the catalog configuration.
The response also contains the view's full metadata, matching the view metadata JSON file.
The catalog configuration may contain credentials that should be used for subsequent requests for the view. The configuration key "token" is used to pass an access token to be used as a bearer token for view requests. Otherwise, a token may be passed using an RFC 8693 token type as a configuration key. For example, "urn:ietf:params:oauth:token-type:jwt=
prefix required | string An optional prefix in the path |
namespace required | string Examples:
A namespace identifier as a single string. Multipart namespace parts should be separated by the unit separator ( |
view required | string Example: sales A view name |
{- "metadata-location": "string",
- "metadata": {
- "view-uuid": "string",
- "format-version": 1,
- "location": "string",
- "current-version-id": 0,
- "versions": [
- {
- "version-id": 0,
- "timestamp-ms": 0,
- "schema-id": 0,
- "summary": {
- "property1": "string",
- "property2": "string"
}, - "representations": [
- {
- "type": "string",
- "sql": "string",
- "dialect": "string"
}
], - "default-catalog": "string",
- "default-namespace": [
- "accounting",
- "tax"
]
}
], - "version-log": [
- {
- "version-id": 0,
- "timestamp-ms": 0
}
], - "schemas": [
- {
- "type": "struct",
- "fields": [
- {
- "id": 0,
- "name": "string",
- "type": [
- "long",
- "string",
- "fixed[16]",
- "decimal(10,2)"
], - "required": true,
- "doc": "string"
}
], - "schema-id": 0,
- "identifier-field-ids": [
- 0
]
}
], - "properties": {
- "property1": "string",
- "property2": "string"
}
}, - "config": {
- "property1": "string",
- "property2": "string"
}
}
Commit updates to a view.
prefix required | string An optional prefix in the path |
namespace required | string Examples:
A namespace identifier as a single string. Multipart namespace parts should be separated by the unit separator ( |
view required | string Example: sales A view name |
object (Apache_Iceberg_REST_Catalog_API_TableIdentifier) | |
Array of objects (Apache_Iceberg_REST_Catalog_API_ViewRequirement) | |
required | Array of Apache_Iceberg_REST_Catalog_API_AssignUUIDUpdate (object) or Apache_Iceberg_REST_Catalog_API_UpgradeFormatVersionUpdate (object) or Apache_Iceberg_REST_Catalog_API_AddSchemaUpdate (object) or Apache_Iceberg_REST_Catalog_API_SetLocationUpdate (object) or Apache_Iceberg_REST_Catalog_API_SetPropertiesUpdate (object) or Apache_Iceberg_REST_Catalog_API_RemovePropertiesUpdate (object) or Apache_Iceberg_REST_Catalog_API_AddViewVersionUpdate (object) or Apache_Iceberg_REST_Catalog_API_SetCurrentViewVersionUpdate (object) (Apache_Iceberg_REST_Catalog_API_ViewUpdate) |
{
  "identifier": {
    "namespace": [
      "accounting",
      "tax"
    ],
    "name": "string"
  },
  "requirements": [
    {
      "type": "string"
    }
  ],
  "updates": [
    {
      "action": "string",
      "uuid": "string"
    }
  ]
}
{- "metadata-location": "string",
- "metadata": {
- "view-uuid": "string",
- "format-version": 1,
- "location": "string",
- "current-version-id": 0,
- "versions": [
- {
- "version-id": 0,
- "timestamp-ms": 0,
- "schema-id": 0,
- "summary": {
- "property1": "string",
- "property2": "string"
}, - "representations": [
- {
- "type": "string",
- "sql": "string",
- "dialect": "string"
}
], - "default-catalog": "string",
- "default-namespace": [
- "accounting",
- "tax"
]
}
], - "version-log": [
- {
- "version-id": 0,
- "timestamp-ms": 0
}
], - "schemas": [
- {
- "type": "struct",
- "fields": [
- {
- "id": 0,
- "name": "string",
- "type": [
- "long",
- "string",
- "fixed[16]",
- "decimal(10,2)"
], - "required": true,
- "doc": "string"
}
], - "schema-id": 0,
- "identifier-field-ids": [
- 0
]
}
], - "properties": {
- "property1": "string",
- "property2": "string"
}
}, - "config": {
- "property1": "string",
- "property2": "string"
}
}
Remove a view from the catalog.
prefix required | string An optional prefix in the path |
namespace required | string Examples:
A namespace identifier as a single string. Multipart namespace parts should be separated by the unit separator ( |
view required | string Example: sales A view name |
{
  "error": {
    "message": "Malformed request",
    "type": "BadRequestException",
    "code": 400
  }
}
Check if a view exists within a given namespace. This request does not return a response body.
prefix required | string An optional prefix in the path |
namespace required | string Examples:
A namespace identifier as a single string. Multipart namespace parts should be separated by the unit separator ( |
view required | string Example: sales A view name |
{
  "error": {
    "message": "Credentials have timed out",
    "type": "AuthenticationTimeoutException",
    "code": 419
  }
}
Rename a view from one identifier to another. It's valid to move a view across namespaces, but the server implementation is not required to support it.
prefix required | string An optional prefix in the path |
Current view identifier to rename and new view identifier to rename to
required | object (Apache_Iceberg_REST_Catalog_API_TableIdentifier) |
required | object (Apache_Iceberg_REST_Catalog_API_TableIdentifier) |
{
  "source": {
    "namespace": [
      "accounting",
      "tax"
    ],
    "name": "paid-view"
  },
  "destination": {
    "namespace": [
      "accounting",
      "tax"
    ],
    "name": "owed-view"
  }
}
{
  "error": {
    "message": "Malformed request",
    "type": "BadRequestException",
    "code": 400
  }
}