Amazon Redshift Data Connection

  • Last update: September 24, 2024
  • Overview 

    Version 

FineDataLink Version | Functional Change
4.0.28 | /
4.1.11.4 | Supported data writing into Amazon Redshift using real-time tasks.

    Application Scenario 

Amazon Redshift is a data warehouse product with an MPP architecture provided by AWS. You may want FineDataLink to support this kind of database at the source end of pipeline tasks and at both the source and target ends of scheduled tasks.

    Function Description 

FineDataLink supports connecting to Amazon Redshift for data reading and writing via scheduled tasks, and for data writing via pipeline tasks and real-time tasks. When Amazon Redshift serves as the target data source, both COPY-based high-speed loading and JDBC loading are supported.

    This article describes how to connect FineDataLink to Amazon Redshift.

    Limitation 

The parameter describe_field_name_in_uppercase in the Amazon Redshift database specifies whether column names returned by SELECT statements are uppercase or lowercase, and it takes precedence over the corresponding setting in General Configuration. The parameter defaults to off, that is, column names are returned in lowercase by default. For details, see the official documentation.

    Preparation 

    Version and Driver 

Driver Package to Download | Driver Name
You can download the driver from the official website of AWS and upload it to FineDataLink. | com.amazon.redshift.jdbc.Driver

Note:
Both the cluster-provisioned version and Amazon Redshift Serverless are supported.
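Once the driver is uploaded, FineDataLink connects to Amazon Redshift over JDBC. As a rough illustration, the JDBC URL accepted by the Redshift driver (com.amazon.redshift.jdbc.Driver) follows the form sketched below; the helper function, host name, and defaults here are illustrative, not part of the product.

```python
def redshift_jdbc_url(host: str, port: int = 5439, database: str = "dev") -> str:
    """Build a JDBC URL in the form the Amazon Redshift driver expects.

    5439 is Redshift's default port; replace host, port, and database with
    the values collected in "Connection Information Collection".
    """
    return f"jdbc:redshift://{host}:{port}/{database}"


# Hypothetical cluster endpoint, for illustration only:
url = redshift_jdbc_url("example-cluster.abc123.us-east-1.redshift.amazonaws.com")
```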


    Connection Information Collection 

    Collect the following information before connecting FineDataLink to the database.

    • IP address and port number of the database server  

    • Username and password of the database

    • Database mode


Note:
    To achieve high-speed data loading into the data source via AWS S3 using the COPY command, you need to prepare and fill in AWS authentication information for uploading files to the S3 bucket.
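COPY-based loading works by staging data files in an S3 bucket and then issuing a COPY statement against Redshift that references those files together with the AWS credentials. The sketch below shows roughly what such a statement looks like when temporary credentials are passed inline; the function, table name, and bucket path are hypothetical, and FineDataLink generates the real statement internally.

```python
def build_copy_statement(table: str, s3_uri: str,
                         access_key: str, secret_key: str,
                         session_token: str = "") -> str:
    """Sketch of a Redshift COPY statement that loads a staged CSV file
    from S3 using inline (optionally temporary) credentials."""
    creds = f"aws_access_key_id={access_key};aws_secret_access_key={secret_key}"
    if session_token:
        # Temporary credentials additionally require the session token.
        creds += f";token={session_token}"
    return (f"COPY {table} FROM '{s3_uri}' "
            f"CREDENTIALS '{creds}' FORMAT AS CSV;")
```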


    Connecting to Amazon Redshift for Common Data Reading/Writing 

    1. Log in to FineDataLink as the admin, choose System Management > Data Connection > Data Connection Management, and click New Data Connection, as shown in the following figure.

Note:
If you are not the admin, you can configure data connections only after the admin assigns you permission on Data Connection under Permission Management > System Management. For details, see Data Connection Management Permission.

    2. Find the Amazon Redshift icon, as shown in the following figure.

    3. Click Custom, select the driver you uploaded, and enter the connection information collected in the section "Connection Information Collection."

Note:
    You can optionally modify other advanced settings. For details, see Data Source Creation and Management.

    4. Click Test Connection. If the connection is successful, click Save to save the data connection.

    Connecting to Amazon Redshift for High-Speed Writing 

    Prerequisite 

Amazon Redshift can access port 22 on the FineDataLink server.

    The public key has been added to the FineDataLink server if you are using an Amazon Redshift cluster. For details, see steps 1, 2, and 3 of Loading Data from Remote Hosts.


    Configuration Item Description 

    To achieve high-speed data loading into Amazon Redshift via AWS S3 using the COPY command, you need to prepare and fill in AWS authentication information for uploading files to the S3 bucket, as shown in the following figure.

    If you set Credential Reading Mode to Use In-Environment Default Credential:

Configuration Item | Required or Not | Description
Credential Reading Mode | Required | Select Use In-Environment Default Credential. The SDK then uses its default credential provider chain (the standard method) to load temporary credentials. For details about the in-environment configurations, see Set Up AWS Temporary Credentials and AWS Region for Development.
Region | Required | Fill in the region code where the AWS S3 bucket is located, such as cn-northwest-1 and cn-north-1.
Bucket Name | Required | Fill in the S3 bucket name.
Directory for Writing Temporary File | Optional (empty by default) | Fill in the directory used to hold temporary files. If it is empty, the root directory of the bucket is used.
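Use In-Environment Default Credential means FineDataLink lets the AWS SDK resolve credentials through its default provider chain, which checks environment variables first and then shared configuration files, among other sources. The stdlib-only sketch below mimics just the first two links of that chain to show where the SDK looks; it is an illustration, not the SDK's actual implementation.

```python
import configparser
import os
from pathlib import Path


def resolve_default_credentials(env=None, credentials_file=None):
    """Mimic the first links of the AWS default credential provider chain:
    1. the AWS_ACCESS_KEY_ID / AWS_SECRET_ACCESS_KEY environment variables;
    2. the [default] profile in ~/.aws/credentials.
    Returns an (access_key, secret_key) tuple, or None if nothing is found."""
    env = os.environ if env is None else env
    key = env.get("AWS_ACCESS_KEY_ID")
    secret = env.get("AWS_SECRET_ACCESS_KEY")
    if key and secret:
        return key, secret
    path = Path(credentials_file or Path.home() / ".aws" / "credentials")
    if path.exists():
        parser = configparser.ConfigParser()
        parser.read(path)
        if parser.has_section("default"):
            return (parser.get("default", "aws_access_key_id", fallback=None),
                    parser.get("default", "aws_secret_access_key", fallback=None))
    return None
```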

    If you set Credential Reading Mode to Specify Credential Manually:

Configuration Item | Required or Not | Description
Credential Reading Mode | Required | Select Specify Credential Manually. To load temporary credentials, the SDK creates a BasicSessionCredentials object and uses the specified credential when initializing the S3 client.
AccessKeyID | Required | Fill in the AWS_ACCESS_KEY_ID authenticated by Identity and Access Management (IAM).
SecretAccessKey | Required | Fill in the AWS_SECRET_ACCESS_KEY authenticated by IAM.
Region | Required | Fill in the region code where the AWS S3 bucket is located, such as cn-northwest-1 and cn-north-1.
Bucket Name | Required | Fill in the S3 bucket name.
Directory for Writing Temporary File | Optional (empty by default) | Fill in the directory used to hold temporary files. If it is empty, the root directory of the bucket is used.
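When Credential Reading Mode is Specify Credential Manually, the manually entered items play the role that BasicSessionCredentials plays in the AWS SDK for Java: they are grouped together and handed to the S3 client. The Python sketch below merely groups and derives values from the same form fields; the class and its names are hypothetical, for illustration only.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ManualS3Credential:
    """Groups the items required when Credential Reading Mode is
    'Specify Credential Manually' (names mirror the form fields)."""
    access_key_id: str
    secret_access_key: str
    region: str               # e.g. cn-northwest-1
    bucket_name: str
    temp_directory: str = ""  # empty means the bucket root is used

    def temp_prefix(self) -> str:
        """S3 key prefix under which temporary files would be staged."""
        return self.temp_directory.strip("/") + "/" if self.temp_directory else ""
```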

Note:

1. For details about IAM, see the official documentation.

2. On the AWS console page, only the AccessKeyID is visible. The SecretAccessKey is displayed only when you create the AccessKey.

3. After you enable high-speed loading in FineDataLink, temporary files are written to the /web-inf/temp path in the FineDataLink installation directory.

    Data Source Usage 

    After you have configured the data source, you can read/write data from/into it using Scheduled Task.  

    • Read data from Amazon Redshift.

• Write data into Amazon Redshift. After configuring high-speed loading as described in the section "Connecting to Amazon Redshift for High-Speed Writing," you can choose whether to enable High-Speed Loading on the Write Method tab page, as shown in the following figure.

• Write data into Amazon Redshift using Pipeline Task. After configuring high-speed loading as described in the section "Connecting to Amazon Redshift for High-Speed Writing," you can choose whether to enable High-Speed Loading on the Write Method tab page, as shown in the following figure.

