FineBI 6.1 Architecture Description

  • Last update: November 20, 2024
  • Overview

    This document introduces:

    (1) FineBI 6.1 architecture

    (2) FineBI 6.1 advantages

    FineBI 6.1 Project Architecture

    Architecture Overview

    [Figure 1: FineBI 6.1 architecture diagram]

    Component Overview


    Load balancing gateway

    Function:

    The load balancing gateway is located between the client and the bi-web component.

    It receives requests from the client and distributes these requests to the bi-web component.

    Therefore, the load balancing gateway must be able to reach each BI node.
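
    To make the distribution idea concrete, below is a minimal Python sketch; the node addresses and the round-robin policy are illustrative assumptions, not FanRuan's actual gateway implementation.

    ```python
    import itertools

    # Hypothetical BI node addresses reachable from the gateway.
    BI_NODES = ["http://bi-node-1:8080", "http://bi-node-2:8080", "http://bi-node-3:8080"]
    _next_node = itertools.cycle(BI_NODES)

    def pick_node() -> str:
        """Choose the BI node that should receive the next client request (round robin)."""
        return next(_next_node)

    # Example: ten incoming requests are spread evenly across the three nodes.
    for i in range(10):
        print(f"request {i} -> {pick_node()}")
    ```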

    Advantages (Why load balancing? Why not a direct connection from the client to BI?)

    Performance optimization: Requests are intelligently allocated to various BI nodes to prevent server overload and improve efficiency.

    Fault isolation: Faulty BI nodes (if any) are automatically detected and isolated so that the overall service is not affected.

    Security enhancement: As the first line of defense, the load balancing gateway filters and monitors traffic to prevent malicious attacks.

    Session persistence: The load balancing gateway allows you to configure session persistence policies so that requests from the same user continue to reach the same node, keeping sessions alive.

    Scalability: The client only needs the load balancing gateway address. Even when more BI nodes are added as the business grows, the user access address does not change.

    Type:

    The FanRuan intra-project gateway is customized for FanRuan's business to balance user request distribution and improve performance. Therefore, it cannot be replaced with a self-provided gateway.

    If you need to use another type of load balancing gateway such as F5, SLB, or ELB, you can configure forwarding yourself so that client requests are forwarded to your self-provided gateway first, then to the FanRuan intra-project gateway, and finally to the BI nodes.

    BI node

    BI nodes work at two layers: the network forwarding layer and the business processing layer.

    Network forwarding layer (How can one user's requests be continuously processed by the same node?)

    To ensure that one user's requests are continuously processed by the same BI node when multiple BI nodes exist, a unique session ID is generated when the client connects to a BI node.

    In this case, regardless of which BI node a subsequent request from this client is forwarded to, the request is re-forwarded, based on the session ID, to the BI node that first handled it, so that the session stays alive and the service stays consistent.
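
    A minimal sketch of that re-forwarding logic follows; the data structure and names are hypothetical, since the real session mechanism is internal to FineBI.

    ```python
    # Maps a session ID to the BI node that first handled it (hypothetical structure).
    session_owner: dict[str, str] = {}

    def route(session_id: str, receiving_node: str) -> str:
        """Return the node that should actually process the request.

        If the session is new, the receiving node becomes its owner; otherwise the
        request is re-forwarded to the node that first handled the session.
        """
        return session_owner.setdefault(session_id, receiving_node)

    # Example: the second request lands on node 2 but is re-forwarded to node 1.
    print(route("sess-42", "bi-node-1"))  # bi-node-1
    print(route("sess-42", "bi-node-2"))  # bi-node-1
    ```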

    Business processing layer:

    BI nodes at this layer are used to process client requests, including platform function requests and frontend page rendering requests.

    During request processing, tasks that involve data update or calculation are intelligently assigned to the backend engine workers.

    Engine master

    The engine master has two functions: storing/providing worker information and storing metadata.

    Storing/Providing worker information (Q: Are BI node tasks assigned directly to workers? A: Yes, but not entirely.)

    When data in a BI node needs to be updated and calculated, these tasks will be intelligently assigned to the engine worker for execution.

    However, the engine worker provides stateless services, which means that a worker does not store any persistent data (much like a worker without a fixed ID card). Therefore, a BI node cannot directly identify or contact a specific engine worker.

    In this case, the BI node first communicates with the engine master. The engine master then dynamically provides the BI node with an available worker address (randomly assigned and used specifically to complete the current task).

    After obtaining the worker address from the engine master, the BI node can dispatch tasks to the corresponding engine worker, ensuring smooth data processing and calculation.
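
    The Python sketch below illustrates this handshake; the class names, method names, and addresses are assumptions for illustration only.

    ```python
    import random

    class EngineMaster:
        """Keeps track of the currently alive, stateless workers."""
        def __init__(self, workers: list[str]):
            self.workers = workers

        def get_worker_address(self) -> str:
            # Any available worker can take the task, so one is assigned dynamically.
            return random.choice(self.workers)

    class BINode:
        def __init__(self, master: EngineMaster):
            self.master = master

        def dispatch(self, task: str) -> str:
            # The BI node cannot address a worker directly; it asks the master first,
            # then sends the task to the address it is given.
            worker = self.master.get_worker_address()
            print(f"sending task '{task}' to {worker}")
            return worker

    node = BINode(EngineMaster(["worker-1:7001", "worker-2:7001"]))
    node.dispatch("update sales_fact")
    ```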

    Storing metadata (How does a worker retrieve data from the data storage component?)

    Metadata refers to the information about each BI data table (not the data in the table), such as the table name, field names, and the table's storage location in the data storage component.

    The metadata is stored in the directory mounted to the master component, namely on the server's disk.

    Before retrieving data from the data storage component for calculation/update, a worker first obtains the metadata of the required tables from the master, then locates those tables, and finally completes the task.
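
    A small sketch of that lookup order, with an assumed metadata layout (table name, fields, storage location); the real catalog format is internal to the engine.

    ```python
    # Assumed metadata catalog held by the master: table name -> description of the table.
    METADATA = {
        "sales_fact": {
            "fields": ["order_id", "amount", "order_date"],
            "location": "storage://bi-extract/sales_fact/",
        },
    }

    def read_from_storage(location: str) -> str:
        """Placeholder for the actual read from the data storage component."""
        return f"<rows loaded from {location}>"

    def load_table(table_name: str) -> str:
        meta = METADATA[table_name]                  # 1. obtain metadata from the master
        return read_from_storage(meta["location"])   # 2. locate the table and read its data

    print(load_table("sales_fact"))
    ```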

    Engine worker

    This component consists of two parts: the engine worker itself and a monitor for health checks.

    Engine worker:

    The engine worker executes the data update and calculation tasks dispatched by BI nodes.

    By default, each engine worker can execute both data update and data query/calculation tasks.

    What is read-write separation? When should it be configured?

    In a formal business system, business data is generally queried/calculated during working hours in the daytime, and updated during off-hours at night.

    In the daytime, data updates (if any) should not occupy the resources needed for queries. At night, data calculations (if any) should not occupy the resources needed for updates.

    In this case, you can configure attributes for the engine workers so that certain nodes focus on a specific type of task during a specified time period (namely, read-write separation).
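
    The sketch below shows one way such a schedule could look, with assumed worker names and time windows; the actual attribute configuration is done in FineBI itself, not in code.

    ```python
    from datetime import time

    # Assumed read-write separation schedule: each worker focuses on one task type
    # inside its dedicated time window.
    WORKER_ROLES = {
        "worker-1": {"task": "query",  "window": (time(8, 0), time(20, 0))},   # daytime queries
        "worker-2": {"task": "update", "window": (time(20, 0), time(8, 0))},   # nightly updates
    }

    def can_accept(worker: str, task_type: str, now: time) -> bool:
        """Inside its window a worker only accepts its own task type; outside it, anything."""
        cfg = WORKER_ROLES[worker]
        start, end = cfg["window"]
        in_window = (start <= now < end) if start < end else (now >= start or now < end)
        return cfg["task"] == task_type if in_window else True

    print(can_accept("worker-1", "update", time(10, 0)))  # False: daytime is reserved for queries
    print(can_accept("worker-2", "update", time(23, 0)))  # True: night window is for updates
    ```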

    Monitor (Why is there a requirement on the restart order of the master and worker?)

    An engine worker will regularly (every 3 seconds) send a heartbeat signal (indicating that it is still alive) to the engine master.

    If the heartbeat is abnormal, the master considers the worker faulty and in need of a restart.

    The master sends a task to the monitor, which executes the kill command to forcefully stop the worker and then stops itself.

    In this way, the worker's own mechanism is triggered to restart it automatically.

    Therefore, after the master is restarted as required, all engine workers must also be restarted. Otherwise, the master will not be able to check whether each worker is alive.
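
    A simplified sketch of the heartbeat bookkeeping on the master side; the 3-second interval is from the text above, while the timeout value, class, and function names are assumptions.

    ```python
    import time

    HEARTBEAT_INTERVAL = 3    # seconds, as stated above
    HEARTBEAT_TIMEOUT = 10    # assumed grace period before a worker is treated as faulty

    def send_kill_to_monitor(worker_id: str) -> None:
        """Placeholder for the task the master sends to the worker's monitor."""
        print(f"monitor of {worker_id}: kill worker, then exit; worker restarts itself")

    class Master:
        def __init__(self):
            self.last_seen: dict[str, float] = {}

        def receive_heartbeat(self, worker_id: str) -> None:
            # Each worker reports in every HEARTBEAT_INTERVAL seconds.
            self.last_seen[worker_id] = time.time()

        def check_workers(self) -> None:
            now = time.time()
            for worker_id, seen in self.last_seen.items():
                if now - seen > HEARTBEAT_TIMEOUT:
                    send_kill_to_monitor(worker_id)
    ```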

    Data storage

    This component stores the data extracted from base tables and self-service datasets in FineBI.

    Note that the following data is not included in the data storage component:

    Excel dataset: The related files are stored in the /WEB-INF/assets/temp_attach path on the file server.

    Direct-connected data: The cached data related to direct connection is stored jointly in the server's memory and on its disk.

    Configuration database

    The configuration database (known as FineDB in earlier versions) stores the project's configuration information, for example, which users and directories exist in the project, what permissions each user has, and when data update tasks run.

    The configuration information is stored in a separate database to which the project stays connected, ensuring long-term, stable project running.

    This is why each project must have its own configuration database rather than share one with other projects: sharing may cause configuration confusion.

    Example:

    For a cluster with multiple BI business nodes, the same platform style and the same directory are displayed whether you access the project through the load balancing entry or a specific node, because each node reads this information from the configuration database and displays it.

    State service

    The state service monitors the running state of every component and node in the BI project, records logs and errors, and coordinates inter-node communication and task allocation.

    Example:

    Assume that a user is logged in at node A and processing business normally. Because node A crashes at this moment, the business is transferred to node B. How does node B know that this user is already logged in? It relies on the state service to determine the login state; such information is stored in Redis.
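
    A minimal sketch of that shared login state, assuming the redis-py client and a hypothetical key layout (the real schema is managed by the state service):

    ```python
    import redis

    # Hypothetical connection; in a real project the state service owns this store.
    r = redis.Redis(host="state-service", port=6379, decode_responses=True)

    def mark_logged_in(session_id: str, username: str, ttl_seconds: int = 1800) -> None:
        # Node A records the login state when the user logs in.
        r.setex(f"bi:session:{session_id}", ttl_seconds, username)

    def logged_in_user(session_id: str):
        # Node B reads the same key after the business is transferred to it.
        return r.get(f"bi:session:{session_id}")
    ```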

    File service

    The file server is used to store and share the files and data resources required in the cluster to ensure that each BI node can access and use them.

    The following lists some files to help understand the role of the file server:

    FineReport template file

    FineReport template backup file

    FineBI's original Excel file information

    Driver uploaded by Driver Management

    Project resource file (map/image)

    Snapshot file generated by Task Schedule

    Data package generated by Cloud O&M

    Historical project backup file

    Log service

    The log service is used to record every operation of each user in the system, including login, data access, and modification.

    In FineBI 6.1, log services are provided by the Elasticsearch component, which replaces the original Swift engine.
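
    As a hedged illustration of what providing log services means, the sketch below writes and queries an operation log with the elasticsearch Python client; the index name and document fields are assumptions, not FineBI's actual log schema.

    ```python
    from datetime import datetime, timezone
    from elasticsearch import Elasticsearch

    es = Elasticsearch("http://log-service:9200")  # hypothetical address

    # Record one user operation.
    es.index(index="bi-operation-log", document={
        "user": "alice",
        "action": "login",
        "timestamp": datetime.now(timezone.utc).isoformat(),
    })

    # Trace what a user did, e.g. for an audit or behavior analysis.
    hits = es.search(index="bi-operation-log", query={"match": {"user": "alice"}})
    print(hits["hits"]["total"])
    ```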

    Why logs are required:

    Traceability: Logs can ensure the transparency and traceability of system operations, which is very important for audits and compliance checks.

    Performance analysis: The system's running and performance data are recorded to help analyze and evaluate performance bottlenecks.

    Behavior analysis: User behavior habits and usage patterns are learned to help improve system functionality and user experience.

    Advantage Comparison Between FineBI 6.1 and Earlier Versions

    Each item below compares FineBI 6.0/5.x with the storage and calculation separation architecture in 6.1.

    Higher stability

    6.0/5.x: The calculation engine and the BI business run in the same process, so user access and query/update tasks affect each other. If a crash occurs, the entire service crashes, which is perceived by users.

    6.1: The engine service is separated from the BI business service.

    The engine is separated into an independent process. In this case, engine crashes (if any) will not be perceived by users.

    The engine can be restarted automatically within minutes.

    This can meet the high availability requirements when multiple engine nodes exist.

    Higher scalability

    5.x:

    For extracted data, primary and backup clusters can be achieved through plugins, which cannot ensure high availability.

    Data can be stored only on local disks, which cannot meet users' personalized storage requirements.

    6.0.x:

    Up to five business nodes are supported, restricting cluster node expansion.

    Data can be stored only on local disks, which cannot meet users' personalized storage requirements.

    6.1: The unified data access layer is added to separate the calculation engine from data storage. This breaks through the historical bottleneck of persistently synchronizing local extracted data across a multi-node cluster.

    The calculation engine is no longer subject to a node quantity limit and can be scaled horizontally, improving data query concurrency linearly and meeting the needs of users with highly concurrent queries.

    Extracted data can be stored in OSS, achieving data persistence without local storage and meeting enterprise data security requirements.

    Higher performance

    6.0/5.x: Complex daily O&M

    • Operations, such as non-containerized deployment, project start/stop, log acquisition, and retrieval of memory and CPU usage in the monitoring system, all require command execution in the background.

    • The entire project lacks monitoring and alert measures for resource usage. (The memory and CPU usage can be displayed on the frontend only through plugin functions.)

    Tight disk usage

    • In 5.1.x, extracted data in the primary node will be continuously synchronized to the backup node server. In this case, a large amount of disk space must be available on both the primary and backup servers to store the extracted data.

    • In 6.0.x, extracted data is synchronized between nodes internally. In this case, each synchronization node server must be equipped with a large-capacity disk to store the extracted data.

    6.1: Simplified daily O&M

    Basic FineBI O&M operations, such as project start/stop, log download, and stack dump generation, can be completed through the frontend visualization on the O&M platform, effectively reducing partial O&M costs.

    Disk space release

    Extracted data is no longer synchronized and stored among multiple nodes. After data is persisted once in the data storage component, multiple engine nodes can invoke such data.

    Reasons for Containerized Deployment of FineBI 6.1


    Consistency

    • Environment consistency: The container packs applications and their dependencies to ensure application consistency in development, testing, and production environments, avoiding various difficult problems caused by environment factors.

    Isolation and security

    • Component isolation: Each container runs in an independent environment, with processes isolated from each other. This avoids interference between different applications, improving security.

    • Reasonable resource allocation: Resource limits (such as CPU and memory limits) can be configured for each container to ensure reasonable resource allocation.

    Simplified O&M

    • Simplified O&M: Container start/stop and management are automated through the O&M platform, simplifying O&M tasks.

    • Better observability: The O&M platform provides logging and monitoring solutions, helping you easily observe and analyze the container running state and identify performance and crash risks in advance.

    Fast fault recovery

    • Fault tolerance: Worker processes can be restarted automatically. Other services that encounter faults can also be quickly restarted for recovery through the O&M platform.

    Attachment: Containerization Technology Overview

    If Linux is compared to a kitchen, the Docker technology is like pre-cooked meals.

    The kitchen can store vacuum-packed food (image) ordered from the central kitchen (cloud image repository) in its own freezer (local image repository).

    Chefs (O&M personnel) can transform the pre-cooked vacuum-packed food (image) into various dishes (container) through simple operations, for customers to enjoy (service).

    Three major features of Docker

    (1) Image: Similar to an installation package, an image is used to create containers.

    (2) Container: The container technology can be used to run one application or a group of applications independently. Multiple containers can be created through the image. A container can be considered as a simplified Linux system.

    (3) Repository: As a place to store images, repositories can be divided into public repositories and private repositories, such as Docker Hub, Alibaba Cloud, and Harbor.

    Through Docker, enterprises can manage their systems more efficiently and flexibly, improving service stability and maintainability.
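
    For illustration, the sketch below pulls an image and runs a container with the Docker SDK for Python; the image name and resource limits are assumptions, since FineBI's real images and the O&M platform handle this for you.

    ```python
    import docker

    client = docker.from_env()                         # connect to the local Docker daemon

    # Image: the "vacuum-packed food" pulled from a repository (hypothetical name).
    client.images.pull("example/bi-worker", tag="6.1")

    # Container: one runnable "dish" created from the image, with its own resource limits.
    container = client.containers.run(
        "example/bi-worker:6.1",
        name="bi-worker-1",
        detach=True,
        mem_limit="8g",                                # reasonable resource allocation
        restart_policy={"Name": "always"},             # fast, automatic fault recovery
    )
    print(container.status)
    ```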

    [Figure 2: containerization technology overview]

     
