Overview
Version
FineBI Version | Function Adjustment |
---|---|
6.0 | / |
6.0.2 | Unified some of the calculation logic for direct-connected data and extracted data. For details, see section "Calculation Logic of Extracted Data and Direct-Connected Data in Components." |
Application Scenario
This document explains what extracted data and direct-connected data are, how the two differ, and when to use each.
Introduction to Extracted Data and Direct-Connected Data
Engine Overview
Engine Overview | |
---|---|
Extracted Data | If you use extracted data, data in the database is extracted to FineBI (similar to saving a copy of the data in FineBI). Data in the database and data in FineBI are therefore not synchronized continuously, and you need to update the data in FineBI regularly to keep it consistent with the database. Because the data is extracted and saved to the FineBI engine, the Extracted Data mode requires enough local disk space. By creating a data cache/data copy, this mode supports online analytical processing (OLAP) of large volumes of data, which accelerates queries, keeps the analysis experience smooth, and minimizes the impact on business databases. |
Direct-Connected Data | If you use a direct-connected dataset, data in your database is used directly in FineBI for calculation. Data in FineBI and data in the database are therefore kept synchronized. With the help of your big data platform/data warehouse, simple self-service analysis requirements can be met under high concurrency and large data volumes. |
Usage Requirement
Usage Requirement | |
---|---|
Extracted Data | For details about related server performance requirements for extraction, see Recommended Environment and Configuration for Project Deployment. |
Direct-Connected Data |
|
Usage Scenario
Application Scenario | |
---|---|
Extracted Data | (1) Complex self-service lineage analysis of 10 million data records
(2) Joint analysis of data from multiple databases |
Direct-Connected Data | (1) Simple self-service analysis of large volumes of data
(2) Relatively high user quantity and concurrency, requiring linear scalability
(3) Relatively high requirement on timeliness
(4) Relatively high requirement on data security
(5) Scenario where the data volume is not large and extraction is troublesome |
Comparison Between Direct-Connected Data and Extracted Data
Comparison | |
---|---|
Extracted Data | Due to the data volume limitation, extraction of over 100 million data records cannot be supported. |
Direct-Connected Data |
|
Customer Portrait
Customer Portrait | |
---|---|
Extracted Data | Generally, it is applicable to small enterprises that have small data volumes/low budgets and require self-service analysis. |
Direct-Connected Data | Generally, it is applicable to large enterprises that have fully built big data platforms and attach great importance to data security (meaning that they do not want to extract data again) and data timeliness. |
When to Use Extracted Data or Direct-Connected Data

Extracted Data Recommended If There Are Not Many Tables with 100 Million Data Records
If the result set contains a small volume of data (10 million or less), use extracted data.
If the result set contains a large volume of data (100 million or less), extracted data is still the preferred choice.
Direct-Connected Data Recommended If There Are Many Tables with 100 Million Data Records
If the result set contains 10 million or more data records and high timeliness (hour-level update) is required, you are advised to use direct-connected data.
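The recommendations above can be sketched as a small decision helper. This is only an illustration: the function name `recommend_mode` and its parameters are hypothetical, and the order in which the rules are checked is an assumption, since the document does not state which rule wins when they overlap. The final branch relies on the comparison table's statement that extraction of over 100 million data records is not supported.

```python
TEN_MILLION = 10_000_000
HUNDRED_MILLION = 100_000_000


def recommend_mode(result_rows: int, needs_hourly_freshness: bool) -> str:
    """Return 'extracted' or 'direct-connected' for a result set.

    Hypothetical helper; the rule order is an assumption, not FineBI behavior.
    """
    # 10 million or more records plus hour-level timeliness: direct-connected.
    if result_rows >= TEN_MILLION and needs_hourly_freshness:
        return "direct-connected"
    # Up to 100 million records: extracted data is used or preferred.
    if result_rows <= HUNDRED_MILLION:
        return "extracted"
    # Over 100 million records: extraction is not supported.
    return "direct-connected"


if __name__ == "__main__":
    for rows, fresh in [(5_000_000, False), (50_000_000, True), (200_000_000, False)]:
        print(rows, fresh, recommend_mode(rows, fresh))
```

The helper looks only at the size of the result set and the timeliness requirement. The number of large tables in the project, which the section headings also mention, would still need to be weighed separately.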
Notes for Direct-Connected Data
1. If the direct-connected database is a high-performance OLAP database (such as StarRocks, Doris, Hologres, Vertica, or GaussDB 200), simple self-service analysis is supported.
A simple self-service analysis scenario is one that meets all of the following conditions:
① The total number of complex calculation steps in the self-service dataset (namely the total number of Join, Column from Other Tables, Summary Column, Formula Column > DEF Function, Row to Column, and Column to Row steps) is less than or equal to 2.
② The maximum number of lineage levels of the direct-connected dataset is limited to 3.
③ If a subject model is used, no complex calculation steps can be used in the self-service dataset.
2. If the direct-connected database is of another type, the self-service dataset can contain at most 1 complex calculation step and at most 3 lineage levels.
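The limits in these notes can be expressed as a quick pre-check. The sketch below is a minimal illustration, not part of FineBI: `DatasetPlan`, `within_direct_connected_limits`, and the database set are hypothetical names, the set contains only the examples listed above (the source says "such as", so it is not exhaustive), and applying the subject model rule only to high-performance OLAP databases is an assumption.

```python
from dataclasses import dataclass

# Example databases named in the notes; assumed here to be the full list.
HIGH_PERFORMANCE_OLAP = {"StarRocks", "Doris", "Hologres", "Vertica", "GaussDB 200"}
MAX_LINEAGE_LEVELS = 3


@dataclass
class DatasetPlan:
    """Hypothetical description of a planned direct-connected self-service dataset."""
    database: str
    complex_steps: int      # Join, Column from Other Tables, Summary Column, DEF, pivots
    lineage_levels: int
    uses_subject_model: bool = False


def within_direct_connected_limits(plan: DatasetPlan) -> bool:
    """Check a plan against the limits listed in the notes."""
    # Both database categories allow at most 3 lineage levels.
    if plan.lineage_levels > MAX_LINEAGE_LEVELS:
        return False
    if plan.database in HIGH_PERFORMANCE_OLAP:
        # With a subject model, no complex calculation steps are allowed.
        if plan.uses_subject_model:
            return plan.complex_steps == 0
        return plan.complex_steps <= 2
    # Other database types: at most 1 complex calculation step.
    return plan.complex_steps <= 1


if __name__ == "__main__":
    print(within_direct_connected_limits(DatasetPlan("Doris", 2, 3)))
    print(within_direct_connected_limits(DatasetPlan("MySQL", 2, 2)))
```

A check like this could run before a dataset is built, so that plans exceeding the limits are redirected to extracted data.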
Calculation Logic of Extracted Data and Direct-Connected Data in Components
Calculation Logic in the Same Scenario
Calculation Logic | Extracted Data | Direct-Connected Data |
---|---|---|
Influence of the filtering and quick calculation on the summary value | No influence | No influence |
Influence of one indicator (for which the filtering and quick calculation are performed) on other indicators (for which the quick calculation is performed) | No influence | No influence |
Influence of one indicator (for which the filtering and quick calculation are performed) on other indicators summarized by the quick calculation | No influence | No influence |
Indicator-based dimension filtering/sorting | Filtering/Sorting based on the results of summary rows | Filtering/Sorting based on the results of summary rows |
Filtering logic of the cross table | Normal filtering according to the filtering conditions | Normal filtering according to the filtering conditions |
Filtering in Filter and header filtering at the same filtering level | Intersection of the filtering conditions in both | Intersection of the filtering conditions in both |
Different filtering logics for null and empty strings | Filtering out all null and empty strings (whether you select null or empty strings for filtering) | Filtering according to the database logic (If the database logic is to filter out null and empty strings separately, the result of direct-connected data will be different from that of extracted data.) |
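The last row of the table can be illustrated with a small sketch. It does not reproduce FineBI's engine: `extracted_style_filter` only mimics the extracted-data behavior described in the table, and SQLite stands in (as an assumption) for a database that treats null and empty strings as distinct values. The table name, column name, and sample values are hypothetical.

```python
import sqlite3

SAMPLE = ["Alice", "", None, "Bob"]


def extracted_style_filter(values):
    """Drop both None and empty strings, as the table describes for extracted data."""
    return [v for v in values if v is not None and v != ""]


def database_style_filter(values):
    """Filter with 'IS NOT NULL' in SQLite, which keeps empty strings."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE customers (name TEXT)")
    conn.executemany("INSERT INTO customers (name) VALUES (?)", [(v,) for v in values])
    cursor = conn.execute(
        "SELECT name FROM customers WHERE name IS NOT NULL ORDER BY rowid"
    )
    result = [row[0] for row in cursor]
    conn.close()
    return result


if __name__ == "__main__":
    print(extracted_style_filter(SAMPLE))
    print(database_style_filter(SAMPLE))
```

With the sample values, the first function returns `['Alice', 'Bob']` and the second returns `['Alice', '', 'Bob']`. This is the kind of difference to expect when the database filters null and empty strings separately.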