Statistical Scenarios for Distinct Count

  • Last update:April 22, 2025
  • Overview

    Application Scenario

    In some cases, you need to count the distinct count of a dimension field under certain conditions after removing duplicate data.

    For example, each brand contains various types of products in a shopping mall. Therefore, multiple sales records may exist under the same brand every day. You can calculate the count of product brands sold every day, as shown in the following figure.

    Implementation Method

    This document provides you with four implementation methods for distinct count along with their advantages and disadvantages, as shown in the following table.

    NumberImplementation MethodAdvantageDisadvantage
    1You can use the Delete Duplicate Row function to remove duplicate data with one click in a self-service dataset.Recommended solution/
    2You can choose Group Summary > Summary and select Distinct Count from the drop-down list of the selected field to achieve the expected effect in datasets.High display speed, optimal performance, and high flexibility

    1. You cannot use the function for real-time data.

    2. The summary value does not represent the distinct count of all the data. Instead, the summary value represents the sum of distinct counts obtained based solely on dimensions.

    3You can use Count Dependence (a field that the Count field depends on) in the component.Relatively fast display speed, good performance, and capable of displaying real-time data (Recommended solution)Low flexibility
    4You can use the aggregation function COUNTD_AGG in the component.High flexibility and distinct count results usable for calculationLikely to cause performance issues
    5You can convert the dimension field into an indicator field (distinct count) in the component.Distinct count results usable for calculation and detail filterLikely to cause performance issues

    iconNote:
    Since data (from the dataset) obtained after processing is used for dashboard display directly, this method has the best performance. Since the summary method of the summary row is set to Sum by default, which simplifies the calculation, most performance problems can be avoided when you use the Count field to achieve the expected effect. However, when you use the function or convert the dimension field into an indicator field, the summary method of the summary row is set to Auto by default, which is likely to cause performance problems, for example, data loading failure.

    Method One: Dataset

    You can count the number of brand types sold every day based on the Store Sales Data Statistics table.

    You can download the sample data Store Sales Data Statistics.xlsx.

    Analysis Subject Creation

    You can process the data in the dataset to achieve the distinct count of values.

    Download the sample data Store Sales Data Statistics.xlsx, create an analysis subject, and add the dataset, as shown in the following figure.

    Field Selection

    Select Field Settings, click Deselect All, and select the Sales Date and Brand Description fields, as shown in the following figure.

    Group Summary

    You can select Group Summary, group the data of Brand Description by Sales Date, and select Distinct Count as the summary method to obtain the number of brands sold every day.

    1. After adding the fields, select Group Summary, and drag the Sales Date and Brand Description fields into the Group and Summary bars, respectively. After completing the setting, click Update and Save, as shown in the following figure.

    Group Table Creation

    You can use a group table to display the analysis result of the dataset.

    1. After updating the data, click Component to create a component based on the dataset obtained after editing, as shown in the following figure.

    2. Select Group Table in Chart Type to display the data. Drag the Sales Date field into the Dimension bar and drag the Brand Description field (which displays the number of brand types sold every day) into the Indicator bar, as shown in the following figure.

    Method Two: Count

    You can count the number of brand types sold every day based on the Store Sales Data Statistics table.

    You can use the Count Dependence function for the Count field to obtain the distinct count in the component. To display the data clearly, the chart type is set to Group Table in the example.

    You can download the sample data Store Sales Data Statistics.xlsx.

    Analysis Subject Creation

    Download the sample data Store Sales Data Statistics.xlsx, create an analysis subject, and add the dataset, as shown in the following figure.

    Component Adding

    Click Component in the lower left corner to add a component, as shown in the following figure.

    Group Table Creation

    After you add the component, the Count field appears in the to-be-analyzed area on the left. Drag the Sales Date and Count fields into the Dimension and Indicator bars, respectively, as shown in the following figure.

    iconNote:
    You cannot use the Count function in detail tables.

    Count Dependence Field Adding

    Click the drop-down icon of the Count field, select Count Dependence, and select Brand Description from the drop-down list, as shown in the following figure. In this case, a distinct count is performed on the Count field based on the corresponding number of brands under the same date dimension, thereby obtaining the number of brand types sold each day.

    Method Three: Aggregation Function

    This section takes the Store_Sales_Statistics table in the Retail Industry business package as an example. You can use the COUNTD_AGG function in the component to calculate the number of brand types sold every day.

    Analysis Subject Creation

    Create an analysis subject with the Store_Sales_Statistics table in the Retail Industry business package of the public dataset. The steps are the same as those shown in the above section. For details, see section "Analysis Subject Creation" (in chapter "Method Two: Count").

    Component Adding

    Add a component with the Store_Sales_Statistics table. The steps are the same as those shown in the above section. For details, see section "Component Adding" (in chapter "Method Two: Count").

    Calculation Indicator Adding

    You can use the COUNTD_AGG function to obtain the distinct count result of the Brand Description field.

    1. Click the icon in the to-be-analyzed area of indicators, and select Add Calculation Field, as shown in the following figure.

    2. Enter the formula COUNTD_AGG(Brand Description) to obtain the distinct count value of the Brand Description field in the corresponding dimension. Set a name for the indicator and click OK, as shown in the following figure.

    COUNTD_AGG(Array): According to the current analysis dimension, this function returns the distinct count of a field dynamically. The generated results are listed in a dynamic data column. The number of rows in the data column is consistent with that in the current analysis dimension. For details, see Aggregation Functions Overview.

    Group Table Creation

    Drag the Sales Date field into the Dimension bar and drag the obtained Brand Number field into the Indicator bar in the analysis area. In this case, the obtained Brand Number field displays the distinct count value of the number of brands sold each day, as shown in the following figure.

    Method Four: Dimension-to-Indicator Conversion

    This section takes the Store_Sales_Statistics table in the Retail Industry business package as an example. You can convert the dimension field to an indicator field to obtain the number of brand types sold every day.

    Analysis Subject Creation

    Create an analysis subject with the Store_Sales_Statistics table in the Retail Industry business package of the public dataset. The steps are the same as those shown in the above section. For details, see section "Analysis Subject Creation" (in chapter "Method Two: Count").

    Component Adding

    Add a component with the Store_Sales_Statistics table. The steps are the same as those shown in the above section. For details, see section "Component Adding" (in chapter "Method Two: Count").

    Converting the Dimension Field to an Indicator Field

    Click the drop-down icon of the Brand Description field and select Convert to Indicator to obtain the distinct count result, as shown in the following figure.

    Group Table Creation

    Select Group Table in Chart Type to display the data. Drag the corresponding fields into the Dimension and Indicator bars in the analysis area. The Brand Description field displays the number of brands sold each day, as shown in the following figure.

    Effect Display

    PC

    For details, see section "Application Scenario."

    Mobile Terminal

    The following figure shows the preview effect on the DataAnalyst app and the HTML5 terminal.

    附件列表


    主题: Advanced Data Analysis
    • Helpful
    • Not helpful
    • Only read

    滑鼠選中內容,快速回饋問題

    滑鼠選中存在疑惑的內容,即可快速回饋問題,我們將會跟進處理。

    不再提示

    9s后關閉

    Get
    Help
    Online Support
    Professional technical support is provided to quickly help you solve problems.
    Online support is available from 9:00-12:00 and 13:30-17:30 on weekdays.
    Page Feedback
    You can provide suggestions and feedback for the current web page.
    Pre-Sales Consultation
    Business Consultation
    Business: international@fanruan.com
    Support: support@fanruan.com
    Page Feedback
    *Problem Type
    Cannot be empty
    Problem Description
    0/1000
    Cannot be empty

    Submitted successfully

    Network busy