System Health Monitoring

  • Last update: 2025-11-24
  • Overview

    Version

    FineOps Version | Functional Change
    V2.3.0 | Optimized APDEX calculation logic with stricter performance thresholds and heightened sensitivity to performance fluctuations, enabling a more accurate reflection of user satisfaction in scenarios with issues such as authentication latency.
    V2.19.0 | Optimized APDEX calculation logic. Excluded export requests and background FineDataLink tasks from APDEX calculations in Health Monitoring.

    Function Description

    FineOps provides a Health Monitoring dashboard for intelligent monitoring of system health and operation status.

    The Health Monitoring dashboard provides a user-experience-oriented monitoring platform that tracks system stability, performance, and O&M efficiency. It also identifies problematic requests and pinpoints their source objects (such as dashboards and templates).


    Prerequisite

    The Health Monitoring function relies on the Tracing function. Ensure you have enabled Tracing and configured the corresponding global settings.

    Function Entry

    1. Log in to FineOps as the admin, select the O&M project, and choose Project Monitoring > Health Monitoring.

    2. Select the required request type from the drop-down list as needed for filtering.

    Note:
    The selected request type affects the data in Indicator Data, Health Status, User Usage Statistics, and Node Detail Table.

    Request Type | Description
    All | All following request types
    Configuration | Platform operation requests, such as opening directories, searches, and permission calculations. Any non-data/non-resource request falls under this category.
    Data | Requests for accessing reports and data tables and for viewing data results
    Resource | Requests for static resources, such as frontend JS, CSS, fonts, and icons

    3. Switch the monitoring time range as needed. To specify a past time range, click Historical Analysis.

    Time Range | Description
    Real-Time Monitoring

    1. Indicator Data and Resources with High Memory Usage display the monitoring data in the last 24 hours.

    2. Trend analysis panels (Health Status, User Usage Statistics, and Node Detail Table) and Exceptional Request List display the monitoring data in the selected time range. Time range options include the last 1/6/12/24/72 hours.

    Historical Analysis | You can set the monitoring time range to any past month.

    Indicator Data

    Panel Description

    It displays five key indicators. Any indicator with an abnormal value is displayed in red.

    Indicator Description

    Indicator | Description
    Comprehensive Health Index

    1. Indicator description: a comprehensive indicator indicating the system health

    2. Calculation logic:

    Comprehensive health index = (Satisfactory request count + Tolerable request count/2)/Total request count × 100%

    • Satisfactory requests: successful requests with a latency of less than 3 s

    • Tolerable requests: successful requests with a latency between 3 s (included) and 12 s (excluded)

    3. Calculation scope:

    • Real-Time Monitoring: data in the last 24 hours

    • Historical Analysis: data in the selected month

    4. Recommended value: above 95%

    Application Performance Index (Apdex)

    1. Indicator description: an industry standard used to evaluate application performance

    FanRuan has elevated the scoring standards of APDEX, a critical indicator in FineOps, based on over a year of continuous observation and user research.

    The APDEX calculation logic has been optimized in FineOps V2.3.0 with stricter performance thresholds and heightened sensitivity to performance fluctuations, enabling a more accurate reflection of user satisfaction in scenarios with issues such as authentication latency.

    FanRuan will continuously optimize and improve the product, providing users with better performance experience.

    2. Calculation logic:

    APDEX = (Satisfactory request count + Tolerable request count/2)/Total request count × 100%

    • Satisfactory requests: data requests with a latency of less than 3 s, resource requests with a latency of less than 0.5 s, and configuration requests with a latency of less than 0.5 s

    • Tolerable requests: data requests with a latency between 3 s (included) and 12 s (excluded), resource requests with a latency between 0.5 s (included) and 2 s (excluded), and configuration requests with a latency between 0.5 s (included) and 2 s (excluded)

    • Export requests and background FineDataLink tasks are excluded from APDEX calculations in Health Monitoring.

    3. Calculation scope:

    • Real-Time Monitoring: data in the last 24 hours

    • Historical Analysis: data in the selected month

    4. Recommended value: above 95%

    Request Success Rate

    1. Indicator description: the percentage of requests that complete successfully

    2. Calculation logic: Request success rate = (Successful request count/Total request count) × 100%

    3. Calculation scope:

    • Real-Time Monitoring: data in the last 24 hours

    • Historical Analysis: data in the selected month

    4. Recommended value: above 95%

    Maximum Number of Concurrent Requests (QPM)

    Indicator description:

    • Real-Time Monitoring: the peak number of per-minute concurrent requests in the last 24 hours, calculated via request slicing

    • Historical Analysis: the peak number of per-minute concurrent requests in the selected month, calculated via request slicing

    Maximum Number of Concurrent Users (QPM)

    Indicator description:

    • Real-Time Monitoring: the peak number of per-minute concurrent users in the last 24 hours, calculated via request slicing

    • Historical Analysis: the peak number of per-minute concurrent users in the selected month, calculated via request slicing
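
The Comprehensive Health Index, APDEX, and Request Success Rate formulas above can be sketched in a few lines of Python. This is only an illustration of the stated formulas, not FineOps code: the `Request` record, its field names, and the request-type strings are hypothetical. Two further assumptions are made because the text does not settle them: APDEX is computed from latency alone (the APDEX description does not mention request success), and any request type without APDEX thresholds, such as an export request, is left out of the APDEX total.

```python
from dataclasses import dataclass

# (satisfactory_below, tolerable_below) latency bounds in seconds per request type,
# taken from the APDEX description above. The key names are hypothetical.
APDEX_THRESHOLDS = {
    "data": (3.0, 12.0),
    "resource": (0.5, 2.0),
    "configuration": (0.5, 2.0),
}


@dataclass
class Request:
    """Hypothetical request record used only for this sketch."""
    kind: str
    latency_s: float
    success: bool


def health_index(requests):
    """Comprehensive Health Index: successful requests, fixed 3 s / 12 s bounds."""
    total = len(requests)
    if total == 0:
        return None
    satisfied = sum(1 for r in requests if r.success and r.latency_s < 3.0)
    tolerated = sum(1 for r in requests if r.success and 3.0 <= r.latency_s < 12.0)
    return 100 * (satisfied + tolerated / 2) / total


def apdex(requests):
    """APDEX with per-type bounds; types without thresholds are excluded (assumption)."""
    scored = [r for r in requests if r.kind in APDEX_THRESHOLDS]
    if not scored:
        return None
    satisfied = tolerated = 0
    for r in scored:
        satisfied_below, tolerable_below = APDEX_THRESHOLDS[r.kind]
        if r.latency_s < satisfied_below:
            satisfied += 1
        elif r.latency_s < tolerable_below:
            tolerated += 1
    return 100 * (satisfied + tolerated / 2) / len(scored)


def success_rate(requests):
    """Request success rate as a percentage of all requests."""
    if not requests:
        return None
    return 100 * sum(1 for r in requests if r.success) / len(requests)


sample = [
    Request("data", 1.0, True),            # satisfactory for both indices
    Request("data", 5.0, True),            # tolerable for both indices
    Request("resource", 0.7, True),        # tolerable for APDEX (0.5 s to 2 s)
    Request("configuration", 2.5, False),  # failed, and too slow for APDEX
    Request("export", 30.0, True),         # excluded from APDEX in this sketch
]

print(health_index(sample), apdex(sample), success_rate(sample))
```

The same resource request (0.7 s) counts as satisfactory for the Comprehensive Health Index but only tolerable for APDEX, which shows how the per-type thresholds make APDEX the stricter of the two.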

    Trend Analysis

    Health Status

    Panel Description

    It displays performance indices and request success rates within the specified time range.

    • In Real-Time Monitoring mode, if you click a point and then click View Trace Detail, you are redirected to the Trace tab page in Tracing. The view automatically displays spans occurring between one minute before and one minute after the selected time.

    • In Real-Time Monitoring mode, if you click a point and then click View Traffic Detail, you are redirected to the Traffic Monitoring tab page. The view automatically displays traffic occurring between one minute before and one minute after the selected time.

    • In Real-Time Monitoring mode, if you click a point and then click Locate Current Time, all trend analysis panels show the corresponding indicator data for the selected time.

    Indicator Description

    Indicator | Description
    Application Performance Index (Apdex)

    1. Indicator description: an industry standard used to evaluate application performance

    2. Calculation logic:

    APDEX = (Satisfactory request count + Tolerable request count/2)/Total request count × 100%

    • Satisfactory requests: data requests with a latency of less than 3 s, resource requests with a latency of less than 0.5 s, and configuration requests with a latency of less than 0.5 s

    • Tolerable requests: data requests with a latency between 3 s (included) and 12 s (excluded), resource requests with a latency between 0.5 s (included) and 2 s (excluded), and configuration requests with a latency between 0.5 s (included) and 2 s (excluded)

    • Export requests and background FineDataLink tasks are excluded from APDEX calculations in Health Monitoring.

    3. Recommended value: above 95%

    Request Success Rate

    1. Indicator description: the percentage of requests that complete successfully

    2. Calculation logic: Request success rate = (Successful request count/Total request count) × 100%

    3. Recommended value: above 95%

    User Usage Statistics

    Panel Description

    It displays the peak numbers of per-minute concurrent requests and users.

    • In Real-Time Monitoring mode, if you click a point and then click View Trace Detail, you are redirected to the Trace tab page in Tracing. The view automatically displays spans occurring between one minute before and one minute after the selected time.

    • In Real-Time Monitoring mode, if you click a point and then click View Traffic Detail, you are redirected to the Traffic Monitoring tab page. The view automatically displays traffic occurring between one minute before and one minute after the selected time.

    • In Real-Time Monitoring mode, if you click a point and then click Locate Current Time, all trend analysis panels show the corresponding indicator data for the selected time.

    Indicator Description

    Indicator | Description
    Maximum Number of Concurrent Requests (QPM)

    Indicator description:

    • Real-Time Monitoring: the peak number of per-minute concurrent requests in the last 24 hours, calculated via request slicing

    • Historical Analysis: the peak number of per-minute concurrent requests in the selected month, calculated via request slicing

    Maximum Number of Concurrent Users (QPM)

    Indicator description:

    • Real-Time Monitoring: the peak number of per-minute concurrent users in the last 24 hours, calculated via request slicing

    • Historical Analysis: the peak number of per-minute concurrent users in the selected month, calculated via request slicing
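
The text does not define "request slicing". One plausible reading, offered here purely as an assumption, is that each request is sliced into the calendar minutes it overlaps, requests and distinct users are counted per minute, and the indicator is the largest per-minute count in the monitored range. The sketch below follows that reading; the `(user, start_s, end_s)` tuple layout and the function name are hypothetical.

```python
from collections import defaultdict


def peak_per_minute(requests):
    """Return (peak concurrent requests, peak concurrent users) per minute.

    requests: iterable of (user, start_s, end_s) tuples with times in seconds.
    Each request is counted in every minute bucket it overlaps (assumed meaning
    of "request slicing").
    """
    request_counts = defaultdict(int)
    users_by_minute = defaultdict(set)
    for user, start_s, end_s in requests:
        first_minute = int(start_s // 60)
        last_minute = int(end_s // 60)
        for minute in range(first_minute, last_minute + 1):
            request_counts[minute] += 1
            users_by_minute[minute].add(user)
    peak_requests = max(request_counts.values(), default=0)
    peak_users = max((len(u) for u in users_by_minute.values()), default=0)
    return peak_requests, peak_users


sample_requests = [
    ("alice", 0, 30),     # minute 0
    ("alice", 10, 20),    # minute 0
    ("alice", 50, 70),    # spans minutes 0 and 1
    ("bob", 65, 80),      # minute 1
    ("carol", 130, 140),  # minute 2
]

print(peak_per_minute(sample_requests))
```

In this sample the request peak and the user peak fall in different minutes, which is why the two indicators are tracked separately.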

    Node Detail Table

    Panel Description

    It displays values of system health indicators within the specified time range for each node.

    Indicator Description

    For details about the calculation logic of the indicators, see the indicator descriptions in the preceding sections.

    Performance Status

    Panel Description

    It shows dashboard performance within the specified time range.

    • The p90 curve indicates that 90% of requests complete faster than the value. The p95 and p99 curves follow the same logic.

    • In Real-Time Monitoring mode, if you click a point on the Service Response Time (ms) or Data Response Time (ms) chart and click View Trace Detail, you will be redirected to the Trace tab page in Tracing. The view automatically displays spans occurring between one minute before and one minute after the selected time.

    • In Real-Time Monitoring mode, if you click a point on the Blank Page Duration of User Application (ms) or Loading Duration of User Application (ms) chart and click View Trace Detail, you will be redirected to the First-Screen Trace tab page in Tracing. The view automatically displays time consumption details of traces occurring between one minute before and one minute after the selected time.

    • In Real-Time Monitoring mode, if you click a point and then click Locate Current Time, all trend analysis panels show the corresponding indicator data for the selected time.
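
To make the p90/p95/p99 curves above concrete, the following sketch computes percentiles with the nearest-rank method. The method FineOps actually uses is not stated here, so the function and the sample latencies are illustrative assumptions only.

```python
import math


def percentile(values, p):
    """Nearest-rank percentile: the smallest sample with at least p% of samples <= it."""
    if not values:
        raise ValueError("percentile() requires at least one value")
    ordered = sorted(values)
    rank = math.ceil(p * len(ordered) / 100)  # 1-based rank in the sorted list
    return ordered[max(rank, 1) - 1]


# Hypothetical response times in milliseconds: 10, 20, ..., 1000.
latencies_ms = [10 * i for i in range(1, 101)]

p90 = percentile(latencies_ms, 90)
p95 = percentile(latencies_ms, 95)
p99 = percentile(latencies_ms, 99)
print(p90, p95, p99)
```

A wide gap between the p90 and p99 curves usually means a small share of requests is much slower than the rest, which an average alone would hide.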

    Indicator Description

    Indicator | Description
    Service Response Time | It displays the average latency of valid requests. It measures the duration from request receipt to response completion at the server, indicating overall service health.
    Data Response Time | It displays the average latency of valid requests. It measures the time taken by data engines/databases to process requests, reflecting data computation performance.
    First Contentful Paint | It measures the average duration of completely blank template pages after templates are opened (before content rendering).
    First Meaningful Paint | It measures the average duration from template opening to the completion of first-screen loading (all content rendered).

    Exception Identification

    Exceptional Request List

    Panel Description

    • It displays all error requests and slow requests (with a latency exceeding 10 seconds) within the specified time range.

    • The exceptional requests are classified by resource type. Error information of each exceptional request, including the exception type, the query count, and the affected user count, is displayed.

    • In Real-Time Monitoring mode, you can click View Trace Detail to jump to the Trace tab page in Tracing. The view automatically filters spans by session ID and displays spans occurring within the last three days. The filtering condition can be modified in Filter.
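
The selection and grouping rules described above (error requests plus requests slower than 10 seconds, classified by resource type, with a query count and an affected user count) can be sketched as follows. The dictionary keys and the function name are hypothetical, and the sketch omits the exception type for brevity.

```python
from collections import defaultdict

SLOW_THRESHOLD_S = 10.0  # slow-request latency bound stated above


def exceptional_summary(requests):
    """Group error and slow requests by resource type.

    requests: iterable of dicts with the hypothetical keys
    "resource_type", "user", "latency_s", and "success".
    """
    grouped = defaultdict(lambda: {"query_count": 0, "users": set()})
    for r in requests:
        is_error = not r["success"]
        is_slow = r["latency_s"] > SLOW_THRESHOLD_S
        if not (is_error or is_slow):
            continue  # normal request, not listed
        entry = grouped[r["resource_type"]]
        entry["query_count"] += 1
        entry["users"].add(r["user"])
    return {
        resource_type: {
            "query_count": entry["query_count"],
            "affected_users": len(entry["users"]),
        }
        for resource_type, entry in grouped.items()
    }


sample_log = [
    {"resource_type": "dashboard", "user": "alice", "latency_s": 12.5, "success": True},
    {"resource_type": "dashboard", "user": "alice", "latency_s": 0.8, "success": False},
    {"resource_type": "dashboard", "user": "bob", "latency_s": 1.2, "success": True},
    {"resource_type": "template", "user": "carol", "latency_s": 10.0, "success": True},
    {"resource_type": "template", "user": "dave", "latency_s": 15.0, "success": True},
]

print(exceptional_summary(sample_log))
```

Note that a request taking exactly 10 seconds is not treated as slow here, because the description says "exceeding 10 seconds".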

    Resources with High Memory Usage

    Panel Description

    • It displays resources with high memory usage identified within the last 24 hours or the selected month.

    • The resources are classified by resource type. Resource information, including the resource type, the resource name, the resource creator, the used memory, the access user, and the identification time, is displayed.

    Subsequent Operation

    Operation | Description
    Health Inspection

    Administrators should conduct regular health inspections for applications to ensure proper configuration of the application environment and in-app items for normal application operation.

    You are advised to inspect the system immediately and configure regular automatic inspections when the prompt appears: "The health inspection has not been conducted for a month. Start immediately for troubleshooting."

    Tracing

    FineOps provides a Tracing function to help you collect and analyze requests.

    You can click any abnormal point on charts in Health Monitoring and navigate to the corresponding traces to locate system performance issues.

    In Real-Time Monitoring mode, you can click the View Trace Detail button in Exceptional Request List to view corresponding traces and locate system performance issues.

