After being deployed on a server, a FineDataLink project may crash for various reasons. If you restart the project directly without capturing the dump files after a crash, the root cause is hard to locate, so the issue cannot be resolved quickly and recurrence cannot be prevented. The time and effort lost in this way poses a significant challenge for server O&M.
FineDataLink provides you with the Downtime Handling function. With this function, the FineDataLink project can automatically generate dump files and restart the system.
After a crash, FineDataLink can quickly pinpoint the cause by analyzing the automatically generated dump files and provide targeted handling suggestions, enabling prompt resolution to enhance system stability.
Log in to FineDataLink as the super admin, and choose System Management > Intelligent O&M > Downtime Handling to access the corresponding function module on the platform, as shown in the following figure.
The module consists of five sub-functions: Downtime Self-Service Wizard, Running Check, Downtime Handling, Memory Stack Export Record, and Server Restart Record.
Downtime Self-Service Wizard records the crash time and cause of the project and provides recommended solutions, as shown in the following figure.
The following table describes the common crash causes and recommended solutions.
| Crash Cause | Recommended Solution |
| --- | --- |
| Overflow errors caused by insufficient memory | If the on-heap memory size of the current system is less than the recommended value, you are advised to use the Health Inspection function to diagnose the system configuration and adjust the on-heap memory size to the recommended value. For details about the Health Inspection function, see Health Inspection. |
| JDK version issue | You are advised to use JDK 8, version 1.8.0_181 or later. |
| Unreasonable memory configuration in the system | You are advised to use the Health Inspection function to diagnose the configuration and adjust the memory size to the recommended value. |
| Insufficient disk space | You are advised to check the disk space and clean up unnecessary files. For details about disk cleanup, see Disk Cleanup. |
| Insufficient memory-mapped file quota | You are advised to use the Health Inspection function to diagnose the configuration and adjust the maximum number of memory-mapped files to the recommended value. |
| Product version issue | You are advised to upgrade the project to the latest minor version. For details about upgrading independently deployed FineDataLink, see Upgrading Independently Deployed FineDataLink. |
| Unknown reason | You are advised to contact the technical support personnel. |
| Application termination upon SSH exit | Starting FineDataLink via SSH will cause it to exit when the SSH session exits. You are advised to use alternative command-line remote tools such as SecureCRT or configure FineDataLink to start automatically on the server. |
| Thread blocking due to log output | You are advised to adjust the log output level to reduce the volume of log output, or to check that the disk space is sufficient. For details about log levels, see Log Introduction. |
| Prolonged system memory release time | If 64 GB or more of on-heap memory is used, you are advised to adjust the on-heap memory size to a value less than 64 GB. If less than 64 GB of on-heap memory is used, you are advised to use a higher-performance CPU. |
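The JDK recommendation in the table can be verified with a small script. The following is a minimal sketch, assuming the version string has the plain legacy form `1.8.0_NNN` as reported by `java -version`. The function name `is_recommended_jdk8_build` is hypothetical and is not part of FineDataLink. Strings for other JDK major versions are reported as not matching, because the table only covers JDK 8.

```python
import re

# Minimum JDK 8 update, taken from the recommendation in the table (1.8.0_181).
MIN_JDK8_UPDATE = 181

# Legacy JDK 8 version strings look like "1.8.0_181".
_JDK8_PATTERN = re.compile(r"1\.8\.(\d+)(?:_(\d+))?")


def is_recommended_jdk8_build(version: str) -> bool:
    """Return True if `version` is a JDK 8 build at 1.8.0_181 or later."""
    match = _JDK8_PATTERN.fullmatch(version.strip())
    if match is None:
        # Not a plain JDK 8 string; this sketch does not judge other versions.
        return False
    patch = int(match.group(1))
    update = int(match.group(2) or 0)  # a missing "_NNN" suffix counts as update 0
    return (patch, update) >= (0, MIN_JDK8_UPDATE)
```

For example, `is_recommended_jdk8_build("1.8.0_151")` returns `False`, while `is_recommended_jdk8_build("1.8.0_181")` returns `True`.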
To use the Downtime Handling function in the "Downtime Handling" section, you need to ensure that every item in Running Check meets the requirements.
The system must meet certain conditions for the automatic crash handling tool to function properly. Therefore, after the FineDataLink project starts, it checks the operating system, port status, JDK environment, off-heap memory, and deployment method.
If an issue is detected, the Downtime Handling function becomes unavailable and prompts for configuration adjustments are displayed. Otherwise, the check shows that FineDataLink is operating stably, as shown in the following figure.
The following table describes the issue detected by each check item and the recommended action.
| No. | Check Item | Issue Detected | Recommended Action |
| --- | --- | --- | --- |
| 1 | Operating System | The operating system of the FineDataLink server is not Windows or Linux. | You are advised to use Linux systems to ensure the stable operation of the automatic crash handling tool. If an operating system issue is detected, the following four items will not be checked. |
| 2 | Port | Port 12100 is unavailable. (Port 12100 is not open or is occupied.) | You are advised to open port 12100 or configure another port to ensure normal system operation. For details about port configuration, see the "Port Setting" section. |
| 3 | JDK | 1. The project contains a tools.jar file from a non-Oracle JRE, and no JDK is configured for the project (with issues in the system JDK configuration). 2. The project lacks tools.jar, and no JDK is configured for the project (with issues in the system JDK configuration). | You are advised to configure the JDK for the system or add a tools.jar file to ensure normal system operation. For details about JDK and tools.jar configuration, see Independent Deployment of FineDataLink on Tomcat. |
| 4 | Off-heap Memory | Off-heap memory is insufficient. | You are strongly advised to ensure at least 10 GB of available memory on the host after deducting memory used by the container where FineDataLink is located. |
| 5 | Deployment Method | FineDataLink is deployed in a non-Tomcat container. | You are advised to deploy FineDataLink using a Tomcat container. For details about FineDataLink deployment using a Tomcat container, see Independent Deployment of FineDataLink on Tomcat. |
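The first two check items can be approximated from outside the platform before the project is started. The following is a minimal sketch, not FineDataLink's actual check: the names `port_is_free` and `run_basic_checks` are hypothetical, and the port test only detects whether the port is already occupied on the local machine (it cannot tell whether a firewall blocks the port). As in the table, the port is not checked when the operating system check fails.

```python
import platform
import socket
from typing import Dict, Optional

SUPPORTED_SYSTEMS = {"Windows", "Linux"}  # per the Operating System check item
DEFAULT_PORT = 12100                      # default port of the downtime handling tool


def port_is_free(port: int, host: str = "127.0.0.1") -> bool:
    """Return True if a TCP socket can be bound to (host, port)."""
    with socket.socket(socket.AF_INET, socket.SOCK_STREAM) as sock:
        try:
            sock.bind((host, port))
        except OSError:
            # Binding failed, typically because another process holds the port.
            return False
    return True


def run_basic_checks(os_name: Optional[str] = None,
                     port: int = DEFAULT_PORT) -> Dict[str, bool]:
    """Check the operating system first; skip the port check if it fails."""
    system = os_name if os_name is not None else platform.system()
    results = {"operating_system": system in SUPPORTED_SYSTEMS}
    if not results["operating_system"]:
        return results  # remaining items are not checked
    results["port"] = port_is_free(port)
    return results
```

Calling `run_basic_checks()` with no arguments uses the name reported by `platform.system()` and the default port 12100.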
When issues are detected in the configuration of the operating system, port, JDK, or off-heap memory, you will be notified via platform messages and the message window popping up in the lower-right corner of the platform.
You can click Process to enter the Downtime Handling configuration page, where you can handle the issues, as shown in the following figure.
The notification content is "Some system configurations are found to be missing or unreasonable, and the auto downtime handling tool is found to be unavailable. To ensure the proper operation of the function, you are advised to handle this problem in time."
To use the Downtime Handling function, you need to ensure that every item in the "Running Check" section meets the requirements.
You can configure the following setting items in Downtime Handling: Auto Resolve Downtime, Auto Memory Stack Export, Auto Restart Upon Downtime, Process Auto Recovery, Downtime Notification, and Port Setting, as shown in the following figure.
Modifications to the settings will take effect only after you click Save.
Usage Instruction:
Only when you enable Auto Resolve Downtime can you configure the functions below. Otherwise, all functions below will be unavailable.
This switch is enabled by default.
Prerequisite:
During working hours (6:00 AM-11:00 PM), if the main process of FineDataLink shuts down, Auto Resolve Downtime will also be disabled five minutes later.
During non-working hours (12:00 AM-6:00 AM and 11:00 PM-12:00 AM), if the main process of FineDataLink shuts down, Auto Resolve Downtime will not be disabled.
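The time-based rule above can be expressed as a short function. This is an illustrative sketch only, not FineDataLink's implementation: the name `auto_resolve_disable_time` is hypothetical, and treating 6:00 AM as inside and 11:00 PM as outside working hours is an assumption, because the boundary behavior is not specified here.

```python
from datetime import datetime, time, timedelta
from typing import Optional

WORK_START = time(6, 0)                 # 6:00 AM
WORK_END = time(23, 0)                  # 11:00 PM
DISABLE_DELAY = timedelta(minutes=5)    # "five minutes later"


def auto_resolve_disable_time(shutdown_at: datetime) -> Optional[datetime]:
    """Return when Auto Resolve Downtime would be disabled, or None.

    Assumption: working hours are the half-open interval [6:00 AM, 11:00 PM).
    """
    if WORK_START <= shutdown_at.time() < WORK_END:
        return shutdown_at + DISABLE_DELAY
    return None  # non-working hours: the switch stays enabled
```

Under these assumptions, a main-process shutdown at 10:00 AM yields a disable time of 10:05 AM, while a shutdown at 11:30 PM yields `None`.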
If you enable Auto Memory Stack Export, crash logs will be automatically exported upon a crash.
The crash logs that can be exported include Stack, histo, and dump.
The crash logs will be exported to the folder named by date under Tomcat directory\logs\FineLog\ on the server where the crash node resides.
If Auto Resolve Downtime is disabled, Auto Memory Stack Export is grayed out and cannot be edited.
If Auto Resolve Downtime is enabled, Auto Memory Stack Export is enabled by default.
When you enable Auto Restart Upon Downtime, the project will automatically restart if a crash occurs due to high load.
1. Enable Auto Resolve Downtime.
2. System Status
When Auto Restart Upon Downtime is enabled, the current system status will be checked. The function checks whether the operating system is Windows and whether the project runs as a Windows service.
If the operating system is Windows and the project runs as a Windows service, a prompt will pop up, as shown in the following figure: "This function is unavailable in the current system."
Click Complete or the closing icon in the upper right corner to dismiss the pop-up window. Auto Restart Upon Downtime will be disabled.
If the operating system is Windows and the project does not run as a Windows service, a prompt will pop up, as shown in the following figure: "The current system may fail to restart."
Click Complete or the closing icon in the upper right corner to dismiss the pop-up window. Auto Restart Upon Downtime will be enabled.
After enabling Process Auto Recovery, you (the admin) can set the time range during which the Process Auto Recovery function takes effect. During the specified time range after Process Auto Recovery is enabled, if the application process is terminated, the project will automatically restart.
When Process Auto Recovery is enabled, the current system status will be checked. The function checks whether the operating system is Windows and whether the project runs as a Windows service.
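If the operating system is Windows and the project runs as a Windows service, a prompt will pop up. Click Complete or the closing icon in the upper right corner to dismiss the pop-up window. Process Auto Recovery will be disabled.
If the operating system is Windows and the project does not run as a Windows service, a prompt will pop up. Click Complete or the closing icon in the upper right corner to dismiss the pop-up window. Process Auto Recovery will be enabled.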
After enabling Downtime Notification, you can configure SMS Reminder, Platform Message, and Email Notification. Users will be notified according to the configured notification methods upon a crash.
Port Setting specifies the port used by the downtime handling tool. The default port is 12100.
The port number must be within the range from 1024 to 65535. Otherwise, the downtime handling tool will fail to start, and the Downtime Handling page will be inaccessible.
If Auto Resolve Downtime is disabled, Port Setting is grayed out and cannot be edited.
If Auto Resolve Downtime is enabled, the port is set to 12100 by default.
After entering a new port number, if the new port number is inappropriate, a prompt will pop up, as shown in the following figure: "Please enter a number between 1024~65535, 12100 is recommended."
If the port setting is normal, click Test, and a prompt will pop up, as shown in the following figure: "The port is available. The auto downtime handling tool will restart on the new port after saving."
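The range rule for the port number can be checked before a value is entered on the page. The following is a minimal sketch, assuming both bounds (1024 and 65535) are inclusive. The function name `validate_port` and its error messages are hypothetical and do not reproduce the platform's own prompts.

```python
MIN_PORT = 1024
MAX_PORT = 65535
RECOMMENDED_PORT = 12100  # default port of the downtime handling tool


def validate_port(raw: str) -> int:
    """Parse `raw` and return the port if it lies in 1024..65535 (inclusive)."""
    try:
        port = int(raw.strip())
    except ValueError:
        raise ValueError(f"{raw!r} is not a whole number") from None
    if not MIN_PORT <= port <= MAX_PORT:
        raise ValueError(
            f"port {port} is outside {MIN_PORT}-{MAX_PORT}; "
            f"{RECOMMENDED_PORT} is recommended"
        )
    return port
```

A value such as `"12100"` is returned as the integer `12100`, whereas `"80"` or `"70000"` raises `ValueError`. This only covers the range rule; whether the port is actually available is what the Test button verifies.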
Memory Stack Export Record displays dump file generation records, including the exported content, export start time, export duration, export result (successful or failed), and the failure reason (if the export fails), as shown in the following figure.
Server Restart Record displays the server restart records, including the restart start time, restart duration, restart result (successful or failed), and the failure reason (if the restart fails), as shown in the following figure.