Extraction Cluster - FineBI Help Document

Last updated: May 19, 2023

Overview

Version

FineBI Version	Functional Change
6.0	/
6.0.8	Disabled data synchronization in ASYNC node mode

Functions

FineBI 6.0 supports a high-concurrency, high-availability extract cluster. If your project has data extraction enabled, it is recommended that you build the cluster with this solution.

Features

The features of the FineBI 6.0 extract cluster are as follows:

1. High Availability of Business

The cluster is composed of multiple synchronous (sync) and asynchronous (async) nodes. As long as at least one sync node is running, the cluster can provide complete data and keep data queries highly available.

When an exception occurs during a data update (such as downtime or node cleanup), the recovery mechanism restores the ongoing task to its pre-execution state and re-executes it, which keeps updates highly available.
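The recovery behavior described above (restore the task to its pre-execution state, then re-execute it) can be illustrated with a minimal Python sketch. This is not FineBI code: the names `run_with_recovery` and `FlakyUpdate`, the dictionary used as state, and the retry limit are all assumptions made for illustration.

```python
import copy


def run_with_recovery(state, task, max_attempts=3):
    """Run task(state); on failure, restore the pre-execution state and retry.

    Hypothetical helper: returns the number of the attempt that succeeded.
    """
    for attempt in range(1, max_attempts + 1):
        snapshot = copy.deepcopy(state)  # remember the pre-execution state
        try:
            task(state)
            return attempt
        except Exception:
            # Roll back any partial changes before the next attempt.
            state.clear()
            state.update(snapshot)
    raise RuntimeError(f"update task failed after {max_attempts} attempts")


class FlakyUpdate:
    """Hypothetical update task that fails partway through on its first call."""

    def __init__(self):
        self.calls = 0

    def __call__(self, state):
        self.calls += 1
        state["rows"] = "partial"  # simulate a half-written update
        if self.calls == 1:
            raise ConnectionError("node went down mid-update")
        state["rows"] = "new"
```

For example, `run_with_recovery({"rows": "old"}, FlakyUpdate())` succeeds on the second attempt, and the half-written value from the first attempt is never left behind.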

2. High Concurrency for Query

Multiple nodes provide query services simultaneously, which balances the load across them. Query concurrency increases approximately linearly with the number of nodes, providing scale-out query capability.

Note: The cluster does not improve the concurrency of an individual node.

3. Improving Update Throughput

Adding nodes can improve update performance. The update time decreases non-linearly as the number of nodes increases.

Note: The cluster does not improve the update performance or throughput of an individual node.

Deploying Procedure

Objects

The cluster is designed for two types of projects:

1. FineBI 6.0 projects with data extraction enabled in which a cluster is newly deployed.

2. Hot standby projects in FineBI 5.x that are upgraded to FineBI 6.0. After the upgrade, cancel hot standby and then switch to the extract cluster.

Deploying Process

Deploying the extract cluster involves four steps:

	Procedure	Explanation
1	Contact Fanruan	If you plan to deploy the extract cluster in FineBI 6.0, contact Fanruan technical support (Online Support or call 400-811-8890) or your Fanruan sales representative for a preliminary evaluation and confirmation of environment information. You need to contact Fanruan technical support before proceeding with the following operations. If you deploy the cluster on your own without contacting Fanruan, unknown risks may not be dealt with in a timely manner.
2	Environment preparation	Extract Cluster Environment Preparation
3	Build a cluster	Build an Extract Cluster
4	Cluster management	Extract Cluster Management Interface

Brief Introduction of the Principle

1. Updating Scheduling

Data update requests can be forwarded to all nodes in a balanced manner.
All nodes can preprocess the update tasks and store the information of the update tasks in the task list.
All sync nodes can receive dataset update subtasks and perform actual data extraction.
After a dataset is extracted on one node, it is synchronized to the other sync nodes. The dataset update is complete only after all synchronization has finished.
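The scheduling flow above can be sketched in Python as follows. This is a simplified model, not FineBI's implementation: the `Node` class, the `update_datasets` function, the integer version numbers, and the round-robin assignment of subtasks to sync nodes are all assumptions made for illustration.

```python
from dataclasses import dataclass, field
from itertools import cycle


@dataclass
class Node:
    name: str
    mode: str  # "SYNC" or "ASYNC"
    versions: dict = field(default_factory=dict)  # dataset name -> data version


def update_datasets(nodes, datasets, new_version):
    """Extract each dataset on one sync node, then synchronize it to the others."""
    sync_nodes = [n for n in nodes if n.mode == "SYNC"]
    if not sync_nodes:
        raise RuntimeError("at least one SYNC node is required")

    # Preprocessing: record every update task in the task list.
    task_list = [{"dataset": d, "status": "pending"} for d in datasets]

    # Assumption: subtasks are handed to sync nodes in round-robin order.
    extractors = cycle(sync_nodes)
    for task in task_list:
        extractor = next(extractors)
        extractor.versions[task["dataset"]] = new_version  # actual extraction
        task["extracted_on"] = extractor.name

        # Synchronize the extracted dataset to every other sync node.
        for other in sync_nodes:
            if other is not extractor:
                other.versions[task["dataset"]] = new_version

        # The update counts as finished only after all synchronization is done.
        task["status"] = "finished"
    return task_list
```

With two sync nodes and one async node, two datasets would be extracted on different sync nodes, both sync nodes would end up holding both datasets, and the async node would receive nothing during the update.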

2. Synchronizing Data

SYNC node:

In this mode, the data files on the node are always the latest, which ensures data consistency. During an update, the node makes sure that the data is synchronized successfully.
SYNC nodes are the basic guarantee of high availability and the main force for high-concurrency queries and updates.

ASYNC node:

Supports high-concurrency queries.

3. Forwarding Query Request

Data query requests can be forwarded to all nodes in a balanced manner.
If the state server reports that the data on the node is the latest, that node processes the request.
If the data on the node is not the latest, FineBI forwards the request internally to obtain the latest data and triggers data synchronization on this node.
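The routing decision above can be sketched in Python as follows. The function `route_query` and its arguments are hypothetical: `latest_version` and `node_versions` stand in for whatever the state server reports, and forwarding to the first node that holds the latest data is an assumption, since the selection rule is not specified here.

```python
def route_query(target, latest_version, node_versions):
    """Decide which node serves a query that was balanced onto `target`.

    Returns (serving_node, node_to_synchronize); the second item is None
    when the target node is already up to date.
    """
    if node_versions[target] == latest_version:
        # The target node holds the latest data and processes the request.
        return target, None

    # Otherwise forward the request to a node that holds the latest data
    # and trigger synchronization on the stale target node.
    for name, version in node_versions.items():
        if version == latest_version:
            return name, target
    raise RuntimeError("no node holds the latest data")
```

For example, with `node_versions = {"A": 5, "B": 4}` and `latest_version = 5`, a query balanced onto `"B"` is served by `"A"` and synchronization is triggered on `"B"`.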

Performance (Informal Result)

1. Increasing High-Availability Time

Compared with a hot standby cluster, the extract cluster spends no time on node switching when a node goes down, which extends the time during which the service remains highly available.

In addition, the extract cluster in FineBI 6.0 also supports high availability of business updates and provides an exception recovery mechanism for update tasks.

2. Improving Update Performance

The update performance of multi-table update tasks changes with the number of nodes. The trend is non-linear: because of the data synchronization mechanism, performance first increases and then decreases as nodes are added.

Number of Nodes (Global Update)	Performance Improvement Compared with a Stand-alone Environment (Global Update)
One node	/
Two nodes	31%
Three nodes	39%
Four nodes	37%

Note: Due to the data synchronization mechanism, the update performance of a single table in multiple nodes will be lower than that of a single node.

3. Improving Performance Linearly in Query

Queries to the extract cluster are load-balanced, and concurrent performance increases approximately linearly with the number of nodes. (The improvement is a non-integer multiple of the single-node result, with some attenuation as nodes are added.)

Nodes - Concurrent Queries	Performance Improvement Multiple
1 node - 10 concurrent queries	/
4 nodes - 40 concurrent queries	3.6
7 nodes - 70 concurrent queries	6.7
1 node - 20 concurrent queries	/
4 nodes - 80 concurrent queries	3.6
7 nodes - 140 concurrent queries	6.1
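To make the attenuation concrete, the multiples in the table can be divided by the number of nodes to get a scaling efficiency, where 100% would be perfectly linear scaling. The short Python sketch below does this arithmetic. The function name `scaling_efficiency` is hypothetical, and the sketch assumes that each multiple is measured against the single-node row with the same per-node concurrency.

```python
# (nodes, concurrent queries per node, performance improvement multiple)
results = [
    (4, 10, 3.6),
    (7, 10, 6.7),
    (4, 20, 3.6),
    (7, 20, 6.1),
]


def scaling_efficiency(nodes, multiple):
    """Fraction of ideal linear scaling achieved (1.0 means perfectly linear)."""
    return multiple / nodes


for nodes, per_node, multiple in results:
    efficiency = scaling_efficiency(nodes, multiple)
    print(f"{nodes} nodes, {per_node} queries per node: {efficiency:.0%} of linear")

# Prints:
# 4 nodes, 10 queries per node: 90% of linear
# 7 nodes, 10 queries per node: 96% of linear
# 4 nodes, 20 queries per node: 90% of linear
# 7 nodes, 20 queries per node: 87% of linear
```

On these informal figures the cluster reaches roughly 87% to 96% of ideal linear scaling.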
