
Accessing Iceberg data securely

ClickHouse Cloud supports secure role-based access to Iceberg data stored in object storage (typically S3) by using an ARN-based AWS IAM trust relationship. This guide follows the same secure-setup pattern as Accessing S3 data securely, and adds Iceberg-specific configuration in ClickHouse.

Overview

  • Obtain the ClickHouse Cloud service role ID (IAM).
  • Create an IAM role in your AWS account that ClickHouse can assume.
  • Attach Iceberg-specific object and catalog policies to the role.
  • Use Iceberg table functions or the IcebergS3 table engine with role-based credentials.

Obtain the ClickHouse service role ID (ARN)

1. Log in to your ClickHouse Cloud account.

2. Select the ClickHouse service where you want to query Iceberg data.

3. Go to the Settings tab.

4. Scroll to Network security information.

5. Copy the Service role ID (IAM) value.

This ARN is required for the trust policy on the AWS IAM role that will access your Iceberg data.

Set up IAM assume role

1. Log in to AWS and go to the IAM service.

2. Select Roles then Create role.

For Trusted entity type, select Custom trust policy, then enter the trust policy from step 3.

3. Add the Trust and IAM policies.

Replace {service-role-id} with the Service Role ID (IAM) from your ClickHouse instance.

Trust policy:

{
  "Version": "2012-10-17",
  "Statement": [
    {
      "Sid": "ClickHouseServiceRoleTrustPolicy",
      "Effect": "Allow",
      "Action": "sts:AssumeRole",
      "Principal": {
        "AWS": "{service-role-id}"
      }
    }
  ]
}

IAM policy (attach it to the role as a permissions policy, replacing {your-bucket}, {region}, and {account-id} with your values; the Glue statement is needed only if you use the Glue Data Catalog):

{
  "Version": "2012-10-17",
  "Statement": [
    {
      "Sid": "ReadOnlyIcebergS3IAMPolicy",
      "Effect": "Allow",
      "Action": [
        "s3:GetBucketLocation",
        "s3:ListBucket",
        "s3:GetObject",
        "s3:ListMultipartUploadParts",
        "s3:GetObjectVersion",
        "s3:ListBucketVersions"
      ],
      "Resource": [
        "arn:aws:s3:::{your-bucket}",
        "arn:aws:s3:::{your-bucket}/*"
      ]
    },
    {
      "Sid": "OptionalGlueDataCatalogIAMPolicy",
      "Effect": "Allow",
      "Action": [
        "glue:GetDatabase",
        "glue:GetDatabases",
        "glue:GetTable",
        "glue:GetTables",
        "glue:GetPartition",
        "glue:GetPartitions"
      ],
      "Resource": "arn:aws:glue:{region}:{account-id}:*"
    }
  ]
}
Note

For read/write workloads, the IAM policy must also include s3:PutObject and s3:DeleteObject, plus any actions needed to modify Iceberg metadata. The sample above is deliberately read-only.

If you need stronger isolation, require requests to originate from ClickHouse Cloud VPC endpoints. For more information on this option, review Secure S3 advanced action control.

4. Finish role creation.

a. Click Next, then Next again through the permission assignment screen.

b. Add a name (e.g. iceberg-role-for-clickhouse) and description.

c. Add tags (optional).

d. Review the policies.

e. Select Create role.

5. Copy the new IAM role ARN after creation.
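
The trust policy from step 3 can also be generated rather than edited by hand, which helps avoid leaving the {service-role-id} placeholder in place or pasting a malformed ARN. The following is a minimal Python sketch, not part of any ClickHouse or AWS tooling: the function name build_trust_policy and the ARN pattern are assumptions made for illustration, and the example ARN is a placeholder.

```python
import json
import re

# Assumption: the Service role ID (IAM) is an IAM role or user ARN in the
# standard "aws" partition with a 12-digit account ID.
_IAM_ARN = re.compile(r"arn:aws:iam::\d{12}:(?:role|user)/[\w+=,.@/-]+")


def build_trust_policy(service_role_id):
    """Return the trust policy from step 3 with {service-role-id} filled in."""
    if not _IAM_ARN.fullmatch(service_role_id):
        raise ValueError(f"not an IAM role/user ARN: {service_role_id!r}")
    return {
        "Version": "2012-10-17",
        "Statement": [
            {
                "Sid": "ClickHouseServiceRoleTrustPolicy",
                "Effect": "Allow",
                "Action": "sts:AssumeRole",
                "Principal": {"AWS": service_role_id},
            }
        ],
    }


if __name__ == "__main__":
    # Placeholder ARN; use the value copied from the Settings tab instead.
    policy = build_trust_policy("arn:aws:iam::123456789012:role/example-service-role")
    print(json.dumps(policy, indent=2))
```

The printed JSON can be pasted into the Custom trust policy editor. An unreplaced placeholder such as {service-role-id} fails the pattern check and raises ValueError.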

Configure Iceberg access in ClickHouse Cloud

Option A: Iceberg table function with role ARN

Use the icebergS3 table function with the NOSIGN option and role-based credentials. ClickHouse Cloud will call STS to assume the role.

SELECT count(*)
FROM icebergS3(
  'https://{your-bucket}.s3.{region}.amazonaws.com/{iceberg-path}/',
  'NOSIGN',
  extra_credentials(role_arn='arn:aws:iam::{account-id}:role/iceberg-role-for-clickhouse', role_session_name='iceberg-session')
);
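
If you run the same query against several buckets or roles, the placeholders can be filled in programmatically. The sketch below is a hypothetical helper (render_iceberg_query is not a ClickHouse API) that only builds the SQL text shown above. It assumes the virtual-hosted S3 URL form used in this guide and rejects values containing quotes or backslashes instead of trying to escape them.

```python
def render_iceberg_query(bucket, region, path, role_arn, session_name="iceberg-session"):
    """Build the Option A query text from concrete values (hypothetical helper)."""
    for value in (bucket, region, path, role_arn, session_name):
        # Refuse characters that could break out of a single-quoted SQL literal.
        if "'" in value or "\\" in value:
            raise ValueError(f"unsupported character in {value!r}")
    url = f"https://{bucket}.s3.{region}.amazonaws.com/{path.strip('/')}/"
    return (
        "SELECT count(*)\n"
        "FROM icebergS3(\n"
        f"  '{url}',\n"
        "  'NOSIGN',\n"
        f"  extra_credentials(role_arn='{role_arn}', role_session_name='{session_name}')\n"
        ");"
    )


if __name__ == "__main__":
    # All values here are placeholders for illustration.
    print(render_iceberg_query(
        "example-bucket",
        "us-east-1",
        "warehouse/db/table",
        "arn:aws:iam::123456789012:role/iceberg-role-for-clickhouse",
    ))
```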

Option B: Persistent Iceberg table engine

CREATE TABLE iceberg_secure (
  id UInt64,
  event_date Date,
  data String
)
ENGINE = IcebergS3(
  'https://{your-bucket}.s3.{region}.amazonaws.com/{iceberg-path}/',
  'NOSIGN',
  extra_credentials(role_arn='arn:aws:iam::{account-id}:role/iceberg-role-for-clickhouse')
);

Option C: Glue catalog + IcebergS3

CREATE TABLE my_db.my_table
ENGINE = IcebergS3(
  's3://{your-bucket}/warehouse/{db}/{table}/',
  'NOSIGN',
  extra_credentials(role_arn='arn:aws:iam::{account-id}:role/iceberg-role-for-clickhouse')
)
SETTINGS
  catalog_type = 'glue',
  warehouse = '{your-warehouse}',
  storage_endpoint = 's3://{your-bucket}',
  region = '{region}',
  aws_role_arn = 'arn:aws:iam::{account-id}:role/iceberg-role-for-clickhouse';

Note: When using the Glue catalog, ensure your IAM role has both S3 and Glue read/list permissions.

Option D: DataLake Catalog for Glue

Note

DataLake Catalog for Glue is coming in version 26.2.

CREATE DATABASE glue_test2
ENGINE = DataLakeCatalog
SETTINGS
    catalog_type = 'glue',
    region = '{region}',
    aws_role_arn = 'arn:aws:iam::{account-id}:role/iceberg-role-for-clickhouse',
    aws_role_session_name = '{session-name}'
SETTINGS
    allow_database_glue_catalog = 1;

Validate access

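  1. Run a simple query that uses the role:
SELECT * FROM icebergS3('https://{your-bucket}.s3.{region}.amazonaws.com/{iceberg-path}/', 'NOSIGN',
  extra_credentials(role_arn='arn:aws:iam::{account-id}:role/iceberg-role-for-clickhouse'))
LIMIT 5;
  2. If the query fails, check the error message for IAM errors such as AccessDenied or InvalidAccessKeyId.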
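
When validating many tables, scanning error messages by eye gets tedious. The sketch below shows one possible way to flag the two error codes named above in an error string. The helper name iam_hints and the hint wording are assumptions for illustration, not output produced by ClickHouse or AWS.

```python
# Error codes named in the validation step, each mapped to a suggested check.
# The hint text is illustrative and not taken from ClickHouse or AWS output.
IAM_HINTS = {
    "AccessDenied": "check the role's trust policy and its S3 (and Glue, if used) permissions",
    "InvalidAccessKeyId": "AWS did not recognize the credentials; verify the role ARN passed to extra_credentials",
}


def iam_hints(error_message):
    """Return a hint for every known IAM error code found in the message."""
    return [f"{code}: {hint}" for code, hint in IAM_HINTS.items() if code in error_message]


if __name__ == "__main__":
    # Hypothetical error text for illustration only.
    for line in iam_hints("Exception: AccessDenied while listing objects"):
        print(line)
```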

Troubleshooting

  • Verify that the ARN in the trust policy matches the Service role ID (IAM) shown in your ClickHouse Cloud service settings.
  • Keep your bucket in the same region as your ClickHouse Cloud service to reduce latency and cost.
  • Confirm that the Iceberg table path points to a valid Iceberg metadata location (metadata/v1/... files under the table root).
  • For catalog mode, check Glue metadata and partition visibility in the AWS Glue console.
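
To check the table path mentioned above without opening the S3 console, you can list the object keys under the table root with any S3 client and filter them. The sketch below assumes the common Iceberg layout in which table metadata files end in .metadata.json inside a metadata/ directory under the table root. The helper name find_metadata_files and the sample keys are hypothetical.

```python
def find_metadata_files(keys, table_root):
    """Return the keys that look like Iceberg metadata files under table_root.

    Assumption: metadata files live in <table_root>/metadata/ and end in
    ".metadata.json".
    """
    root = table_root.strip("/")
    prefix = f"{root}/metadata/" if root else "metadata/"
    return sorted(k for k in keys if k.startswith(prefix) and k.endswith(".metadata.json"))


if __name__ == "__main__":
    # Hypothetical object keys, as an S3 listing might return them.
    sample_keys = [
        "warehouse/db/table/data/part-0001.parquet",
        "warehouse/db/table/metadata/v1.metadata.json",
        "warehouse/db/table/metadata/snap-1.avro",
    ]
    found = find_metadata_files(sample_keys, "warehouse/db/table/")
    print(found or "no metadata files found: the path is probably not a table root")
```

An empty result suggests the path points at a parent or data directory rather than at the table root.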