FineBI Version | Functional Change
6.0.7 | /
Scenario One: Handling Dirty Data
The Delete Duplicate Row function helps you clean dirty data by removing duplicate rows.
For example, dirty data appears when the same order is triggered twice and therefore written to a table as two rows. In this case, you can perform Delete Duplicate Row to retain only one row of that order data.
Scenario Two: Retaining Partial Data
Suppose you collect data on a machine's status. Because collection is random, the data is unevenly distributed, with 10 to 20 rows collected per minute. In this case, you can perform Delete Duplicate Row to retain only one row of data per minute.
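The per-minute retention described above can be sketched in plain Python. This is only an illustration of the logic, not FineBI code: the `time` and `status` field names and the sample readings are hypothetical, and the sketch assumes the deduplication key is the timestamp truncated to the minute.

```python
from datetime import datetime

def keep_first_per_minute(readings):
    """Keep the first reading seen in each calendar minute."""
    seen_minutes = set()
    kept = []
    for reading in readings:
        # Truncate the timestamp to the minute; this is the deduplication key.
        minute = reading["time"].replace(second=0, microsecond=0)
        if minute not in seen_minutes:
            seen_minutes.add(minute)
            kept.append(reading)
    return kept

# Hypothetical machine-status readings (illustrative values only).
readings = [
    {"time": datetime(2024, 1, 1, 9, 0, 5), "status": "running"},
    {"time": datetime(2024, 1, 1, 9, 0, 41), "status": "running"},
    {"time": datetime(2024, 1, 1, 9, 1, 12), "status": "idle"},
    {"time": datetime(2024, 1, 1, 9, 1, 58), "status": "running"},
]

result = keep_first_per_minute(readings)
for reading in result:
    print(reading["time"], reading["status"])
```

In a table, the equivalent would presumably be a field that holds the time at minute precision, selected as the deduplication field.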
Scenario Three: Deleting Duplicate Rows
For example, you need to analyze user data (required data) in the following wide table.
In this case, you can first click Field Settings to delete other fields. After that, you can perform Delete Duplicate Row to deduplicate user data.
The system checks whether any rows contain duplicate data in the deduplication fields you selected. If you tick Select All in the drop-down list of Select Deduplication Field, the system checks for rows that are duplicated across all fields.
If there are rows of duplicate data, the system will only retain the first one.
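The rule above (compare rows on the selected fields, keep the first of each duplicate group) can be sketched as follows. This is a minimal illustration in Python, not FineBI's implementation: the function name `delete_duplicate_rows`, the `fields=None` convention standing in for Select All, and the sample rows are all assumptions made for the example.

```python
def delete_duplicate_rows(rows, fields=None):
    """Keep the first row for each distinct combination of `fields`.

    fields=None stands in for Select All: every field is compared.
    """
    kept = []
    seen = set()
    for row in rows:
        names = fields if fields is not None else sorted(row)
        key = tuple(row[name] for name in names)
        if key not in seen:  # first time this combination appears
            seen.add(key)
            kept.append(row)
    return kept

# Hypothetical rows (illustrative values only).
rows = [
    {"user": "Ann", "city": "Oslo"},
    {"user": "Ann", "city": "Rome"},
    {"user": "Ann", "city": "Oslo"},
    {"user": "Bob", "city": "Oslo"},
]

by_user = delete_duplicate_rows(rows, ["user"])  # one row per user
all_fields = delete_duplicate_rows(rows)         # only exact duplicates removed
print(by_user)
print(all_fields)
```

Selecting a single field removes more rows than Select All, because rows only need to match on that one field to count as duplicates.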
You can download the sample data: Order Information.xlsx.
1. Upload the sample data to an analysis subject, as shown in the following figure.
Some orders are recorded twice. The duplicated rows are identical except for the values in the ID field.
2. Click More and select Delete Duplicate Row from the drop-down list.
3. Choose the fields that the system uses to decide whether rows are duplicates. In this example, if two rows have the same values in the Date, Name, and Volume fields, you can infer that they come from the same order.
Select Date, Name, and Volume from the drop-down list of Select Deduplication Field as the basis for identifying duplicates.
4. Click Save and Update to obtain data without duplicate values.
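The effect of the steps above can be sketched in Python. The rows below are hypothetical stand-ins for the sample file (its actual contents are not reproduced here); only the field names ID, Date, Name, and Volume come from the example.

```python
# Hypothetical order rows; row 2 is a double-recorded copy of row 1.
orders = [
    {"ID": 1, "Date": "2024-03-01", "Name": "Alice", "Volume": 20},
    {"ID": 2, "Date": "2024-03-01", "Name": "Alice", "Volume": 20},
    {"ID": 3, "Date": "2024-03-01", "Name": "Bob", "Volume": 15},
    {"ID": 4, "Date": "2024-03-02", "Name": "Alice", "Volume": 20},
]

dedup_fields = ("Date", "Name", "Volume")

first_by_key = {}
for order in orders:
    key = tuple(order[field] for field in dedup_fields)
    # setdefault stores a row only if the key is new, so the first row wins.
    first_by_key.setdefault(key, order)

clean_orders = list(first_by_key.values())
kept_ids = [order["ID"] for order in clean_orders]
print(kept_ids)
```

Because ID differs on every row, including it among the deduplication fields would make every row unique and nothing would be removed, which is why it is left out of the selection.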
The following table shows different results according to different deduplication fields.
Deduplication Field | Result
Region only | Only one row of order data in each region will be retained.
Name only | Only one row of order data for each user will be retained.
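To make the comparison concrete, the following Python sketch applies both choices to the same rows. The data and the helper name `first_row_per` are hypothetical and serve only to illustrate the behavior described in the table.

```python
def first_row_per(rows, field):
    """Return the first row for each distinct value of `field`."""
    seen = set()
    kept = []
    for row in rows:
        if row[field] not in seen:
            seen.add(row[field])
            kept.append(row)
    return kept

# Hypothetical order rows (illustrative values only).
orders = [
    {"Region": "East", "Name": "Alice", "Volume": 20},
    {"Region": "East", "Name": "Bob", "Volume": 15},
    {"Region": "West", "Name": "Alice", "Volume": 30},
    {"Region": "West", "Name": "Carol", "Volume": 10},
]

per_region = first_row_per(orders, "Region")  # one order per region
per_name = first_row_per(orders, "Name")      # one order per user
print(len(per_region), len(per_name))
```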
After the system identifies duplicated rows, it retains by default the row listed first (topmost) among them.
Therefore, the retained row may differ depending on the step at which you perform Delete Duplicate Row, because the row order can differ from step to step. You are advised to perform Delete Duplicate Row as the last step of data analysis.
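A short Python sketch shows why the position of the step matters. The rows and the sorting step are hypothetical; the point is only that the same deduplication field yields a different retained row once an earlier step has changed the row order.

```python
def first_row_per_name(rows):
    """Keep the first row for each Name, in the order the rows arrive."""
    seen = set()
    kept = []
    for row in rows:
        if row["Name"] not in seen:
            seen.add(row["Name"])
            kept.append(row)
    return kept

# Hypothetical rows (illustrative values only).
rows = [
    {"Name": "Alice", "Volume": 20},
    {"Name": "Alice", "Volume": 35},
    {"Name": "Bob", "Volume": 15},
]

# Deduplicate first: Alice's topmost row (Volume 20) is retained.
dedup_then_nothing = first_row_per_name(rows)

# Sort by Volume descending first, then deduplicate: Alice's Volume 35 row is now on top.
sorted_rows = sorted(rows, key=lambda row: row["Volume"], reverse=True)
sort_then_dedup = first_row_per_name(sorted_rows)

print(dedup_then_nothing)
print(sort_then_dedup)
```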