Overview
Version
FineBI Version | Functional Change |
6.0.7 | / |
Application Scenario
Scenario One: Handling Dirty Data
The Delete Duplicate Row function helps you clean dirty data by removing duplicate rows.
For example, dirty data appears when the same order is triggered twice and recorded as two rows in a table. In this case, you can perform Delete Duplicate Row to retain only one row for that order.
Scenario Two: Retaining Partial Data
Suppose you collect data on a machine's status. Because the data is collected at random intervals, it is unevenly distributed, with 10 to 20 rows collected per minute. In this case, you can perform Delete Duplicate Row to retain only one row of data per minute.
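The idea of retaining one row per minute can be sketched in Python as follows. This is only an illustration of the logic, not FineBI's implementation. The field names (time, status), the sample readings, and the choice of truncating the timestamp to the minute as the deduplication key are all assumptions made for this sketch.

```python
from datetime import datetime

# Hypothetical machine status readings, several per minute.
readings = [
    {"time": "2023-05-01 08:00:03", "status": "running"},
    {"time": "2023-05-01 08:00:41", "status": "running"},
    {"time": "2023-05-01 08:01:15", "status": "idle"},
    {"time": "2023-05-01 08:01:59", "status": "running"},
]

def one_row_per_minute(rows):
    """Keep the first row seen for each distinct minute."""
    seen = set()
    kept = []
    for row in rows:
        ts = datetime.strptime(row["time"], "%Y-%m-%d %H:%M:%S")
        minute = ts.replace(second=0)  # assumed key: timestamp truncated to the minute
        if minute not in seen:
            seen.add(minute)
            kept.append(row)
    return kept

per_minute = one_row_per_minute(readings)
```

Here the first reading of each minute is retained, which mirrors the rule that only the first of several duplicate rows is kept.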
Scenario Three: Deleting Duplicate Rows
For example, you need to analyze only the user data (the required data) in the following wide table.
In this case, first click Field Settings to remove the other fields. Then perform Delete Duplicate Row to deduplicate the user data.
Function Description
The system checks whether any rows contain the same values in the deduplication fields you selected. If you tick Select All from the drop-down list of Select Deduplication Field, the system compares all fields, so rows count as duplicates only when every field matches.
If duplicate rows are found, the system retains only the first one.
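The rule described above (compare rows on the selected fields and retain the first match) can be sketched in Python. This is a minimal illustration under stated assumptions, not FineBI's actual code. The function name delete_duplicate_rows and the sample rows are hypothetical.

```python
def delete_duplicate_rows(rows, dedup_fields=None):
    """Keep the first row for each distinct combination of dedup_fields.

    rows: list of dicts sharing the same keys.
    dedup_fields: list of field names, or None to compare all fields
                  (the equivalent of Select All in this sketch).
    """
    seen = set()
    kept = []
    for row in rows:
        fields = dedup_fields if dedup_fields is not None else sorted(row)
        key = tuple(row[f] for f in fields)
        if key not in seen:  # first occurrence of this combination
            seen.add(key)
            kept.append(row)
    return kept

# Hypothetical order data: IDs 1 and 2 describe the same order.
orders = [
    {"ID": 1, "Date": "2023-01-01", "Name": "Alice", "Volume": 10},
    {"ID": 2, "Date": "2023-01-01", "Name": "Alice", "Volume": 10},
    {"ID": 3, "Date": "2023-01-02", "Name": "Bob", "Volume": 5},
]

deduped = delete_duplicate_rows(orders, ["Date", "Name", "Volume"])
all_fields = delete_duplicate_rows(orders)  # no field list: compare every field
```

With Date, Name, and Volume as the deduplication fields, the row with ID 2 is dropped. When all fields are compared, nothing is dropped, because the ID values differ.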
Example
You can download the sample data: Order Information.xlsx.
1. Upload the sample data to an analysis subject, as shown in the following figure.
Some orders are recorded twice. The duplicated rows are identical except for the value in the ID field.
2. Click More and select Delete Duplicate Row from the drop-down list.
3. The system checks for duplicate rows based on the deduplication fields you select. If two rows have the same values in the Date, Name, and Volume fields, you can infer that they come from the same order.
Therefore, select Date, Name, and Volume from the drop-down list of Select Deduplication Field as the basis for identifying duplicates.

4. Click Save and Update to obtain data without duplicate values.
The following table shows different results according to different deduplication fields.
Deduplication Field | Result |
Region only | Only one row of order data in each region will be retained. |
Name only | Only one row of order data for each user will be retained. |
Usage Recommendation
By default, once the system has identified duplicate rows, it retains the one listed first (topmost) and deletes the rest.
Because earlier steps such as sorting can change the row order, the retained row may differ depending on the step at which you perform Delete Duplicate Row. You are advised to perform Delete Duplicate Row in the last step of data analysis.
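To see why the position of the step matters, consider the following Python sketch. It is an illustration only. The helper keep_first, the sample rows, and the sort by Volume are assumptions, not part of FineBI.

```python
def keep_first(rows, field):
    """Keep the first row for each distinct value of field."""
    seen = set()
    kept = []
    for row in rows:
        if row[field] not in seen:
            seen.add(row[field])
            kept.append(row)
    return kept

def by_volume_desc(rows):
    """Sort rows by Volume, largest first."""
    return sorted(rows, key=lambda r: r["Volume"], reverse=True)

# Hypothetical orders: Alice appears twice with different volumes.
orders = [
    {"ID": 1, "Name": "Alice", "Volume": 10},
    {"ID": 2, "Name": "Alice", "Volume": 30},
    {"ID": 3, "Name": "Bob", "Volume": 5},
]

# Deduplicate on Name first, then sort: Alice's row with ID 1 is retained.
dedup_then_sort = by_volume_desc(keep_first(orders, "Name"))

# Sort first, then deduplicate on Name: Alice's row with ID 2 is retained.
sort_then_dedup = keep_first(by_volume_desc(orders), "Name")
```

The two pipelines retain different rows for Alice, because the row at the top at the time of deduplication is the one that is kept.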