Deleting Duplicate Data

  • Last update: April 11, 2024
  • Overview

    Version

    FineBI Version: 6.0.7

    Functional Change: /

    Application Scenario

    Scenario One: Handling Dirty Data

    The Delete Duplicate Row function helps you clean dirty data by removing duplicate rows.

    For example, dirty data appears when the same order is recorded twice in a table. In this case, you can perform Delete Duplicate Row to retain only one row for that order.

    a136f6512b5781868ad2a028027bf9d.jpg

    Scenario Two: Retaining Partial Data

    Suppose you collect data on a machine's status. Because the data is collected at random intervals, it is unevenly distributed, with 10 to 20 rows collected per minute. In this case, you can perform Delete Duplicate Row to retain only one row of data per minute.
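
    As a rough illustration of this scenario outside FineBI, the following Python sketch keeps the first row seen in each minute. The `time` and `status` field names, the sample readings, and the `keep_first_per_minute` helper are all hypothetical; it assumes the deduplication key is the timestamp truncated to the minute.

```python
from datetime import datetime


def keep_first_per_minute(rows):
    """Keep the first row seen for each minute, preserving input order."""
    seen_minutes = set()
    kept = []
    for row in rows:
        # Truncate the timestamp to the minute to build the deduplication key.
        minute = row["time"].replace(second=0, microsecond=0)
        if minute not in seen_minutes:
            seen_minutes.add(minute)
            kept.append(row)
    return kept


# Hypothetical machine-status readings (not taken from the sample data).
readings = [
    {"time": datetime(2024, 4, 11, 9, 0, 5), "status": "running"},
    {"time": datetime(2024, 4, 11, 9, 0, 40), "status": "running"},
    {"time": datetime(2024, 4, 11, 9, 1, 12), "status": "idle"},
    {"time": datetime(2024, 4, 11, 9, 1, 58), "status": "running"},
]

per_minute = keep_first_per_minute(readings)
```

    Here `per_minute` holds one reading for 09:00 and one for 09:01, in each case the earliest one in the input.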

    Scenario Three: Deleting Duplicate Rows

    For example, you need to analyze user data (required data) in the following wide table.

    75e3cca3aae1a86fb1063c156bc63a4.png

    In this case, you can first click Field Settings to delete other fields. After that, you can perform Delete Duplicate Row to deduplicate user data.

    b4b750c5becfe4df1506de6dcd4d896.jpg

    Function Description

    The system checks whether any rows have identical values in the deduplication fields you select. If you tick Select All in the Select Deduplication Field drop-down list, the system compares rows across all fields.

    If duplicate rows exist, the system retains only the first one.

    5585b396b081cfac438766ffb03ff2d.png
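
    The behavior described above can be approximated with a minimal Python sketch. This is not FineBI's implementation; the `delete_duplicate_rows` helper and the sample rows are hypothetical, and passing `fields=None` is assumed to stand in for Select All.

```python
def delete_duplicate_rows(rows, fields=None):
    """Return rows with duplicates removed, keeping the first occurrence.

    fields: names of the deduplication fields; None compares all fields
    (assumed here to correspond to Select All).
    """
    seen = set()
    kept = []
    for row in rows:
        names = sorted(row) if fields is None else fields
        # Rows are duplicates when their values match in every chosen field.
        key = tuple(row[name] for name in names)
        if key not in seen:
            seen.add(key)
            kept.append(row)
    return kept


# Hypothetical rows: the first two differ only in ID.
rows = [
    {"ID": "A1", "Name": "Ann", "Volume": 3},
    {"ID": "A2", "Name": "Ann", "Volume": 3},
    {"ID": "A3", "Name": "Bob", "Volume": 5},
]

by_name_volume = delete_duplicate_rows(rows, ["Name", "Volume"])
by_all_fields = delete_duplicate_rows(rows)
```

    With Name and Volume as the deduplication fields, the A2 row is dropped. With all fields compared, nothing is dropped because the ID values differ.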

    Example

    You can download the sample data: Order Information.xlsx.

    1. Upload the sample data to an analysis subject, as shown in the following figure.

    56d9832ec2e776e4b74dd09c4cf203c.png

    Some orders are recorded twice. The duplicate rows differ only in the ID field.

    2. Click More and select Delete Duplicate Row from the drop-down list.

    ee1b2917aff6143e5714c12ab50d7ff.png

    3. The system identifies duplicate rows based on the deduplication fields you select. If two rows have the same values in the Date, Name, and Volume fields, you can infer that they record the same order.

    Select Date, Name, and Volume from the drop-down list of Select Deduplication Field as the judgment basis of duplication.

    Note:
    By default, the system retains only the first row among duplicate rows. For example, if rows A1000005 and A1000006 are duplicates, only row A1000005 is retained.

    b81ebf9d5f4a9ff31c75787420e700a.png

    4. Click Save and Update to obtain data without duplicate rows.

    The following table shows the results for different deduplication fields.

    Deduplication Field: Region only
    Result: Only one row of order data in each region will be retained.

    4128427b858e3c248a8e1b51f25a97a.png

    Deduplication Field: Name only
    Result: Only one row of order data for each user will be retained.

    31ac4b160afedf138f6decf413b8adc.png
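
    To see why the choice of deduplication field changes the result, consider the following hedged Python sketch. The `keep_first_by` helper and the four sample orders are hypothetical and are not taken from Order Information.xlsx.

```python
def keep_first_by(rows, field):
    """Keep the first row for each distinct value of a single field."""
    first_seen = {}
    for row in rows:
        # setdefault stores a row only the first time its value appears;
        # dicts preserve insertion order, so the output order is stable.
        first_seen.setdefault(row[field], row)
    return list(first_seen.values())


# Hypothetical orders used only for illustration.
orders = [
    {"ID": "A1", "Region": "East", "Name": "Ann"},
    {"ID": "A2", "Region": "East", "Name": "Bob"},
    {"ID": "A3", "Region": "West", "Name": "Ann"},
    {"ID": "A4", "Region": "West", "Name": "Cara"},
]

by_region = keep_first_by(orders, "Region")
by_name = keep_first_by(orders, "Name")
```

    Deduplicating on Region leaves one order per region (A1 and A3), while deduplicating on Name leaves one order per user (A1, A2, and A4).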

    Usage Recommendation

    Among duplicate rows, the system retains the one listed first (topmost) by default.

    Because other steps can change the row order, the retained row may differ depending on the step at which you perform Delete Duplicate Row. You are advised to perform Delete Duplicate Row as the last step of data analysis.
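
    The effect of step order can be shown with a small Python sketch. It assumes a hypothetical sort step on Volume and a hypothetical `keep_first` helper; it illustrates keep-first deduplication in general rather than FineBI's internals.

```python
def keep_first(rows, field):
    """Keep the first row for each distinct value of the given field."""
    seen = set()
    kept = []
    for row in rows:
        if row[field] not in seen:
            seen.add(row[field])
            kept.append(row)
    return kept


def by_volume_desc(rows):
    """Hypothetical sort step: order rows by Volume, largest first."""
    return sorted(rows, key=lambda r: r["Volume"], reverse=True)


# Hypothetical orders: Ann appears twice with different volumes.
orders = [
    {"ID": "A1", "Name": "Ann", "Volume": 3},
    {"ID": "A2", "Name": "Ann", "Volume": 9},
    {"ID": "A3", "Name": "Bob", "Volume": 5},
]

# Deduplicate on Name first, then sort.
dedup_then_sort = by_volume_desc(keep_first(orders, "Name"))
# Sort first, then deduplicate on Name.
sort_then_dedup = keep_first(by_volume_desc(orders), "Name")
```

    Deduplicating before the sort retains row A1 for Ann, while deduplicating after the sort retains row A2, because the sort moved A2 above A1.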
