In some cases, you need to count the distinct count of a dimension field under certain conditions after removing duplicate data.
For example, each brand contains various types of products in a shopping mall. Therefore, multiple sales records may exist under the same brand every day. You can calculate the count of product brands sold every day, as shown in the following figure.
This document provides you with four implementation methods for distinct count along with their advantages and disadvantages, as shown in the following table.
1. You cannot use the function for real-time data.
2. The summary value does not represent the distinct count of all the data. Instead, the summary value represents the sum of distinct counts obtained based solely on dimensions.
You can count the number of brand types sold every day based on the Store Sales Data Statistics table.
You can download the sample data Store Sales Data Statistics.xlsx.
You can process the data in the dataset to achieve the distinct count of values.
Download the sample data Store Sales Data Statistics.xlsx, create an analysis subject, and add the dataset, as shown in the following figure.
Select Field Settings, click Deselect All, and select the Sales Date and Brand Description fields, as shown in the following figure.
You can select Group Summary, group the data of Brand Description by Sales Date, and select Distinct Count as the summary method to obtain the number of brands sold every day.
1. After adding the fields, select Group Summary, and drag the Sales Date and Brand Description fields into the Group and Summary bars, respectively. After completing the setting, click Update and Save, as shown in the following figure.
You can use a group table to display the analysis result of the dataset.
1. After updating the data, click Component to create a component based on the dataset obtained after editing, as shown in the following figure.
2. Select Group Table in Chart Type to display the data. Drag the Sales Date field into the Dimension bar and drag the Brand Description field (which displays the number of brand types sold every day) into the Indicator bar, as shown in the following figure.
You can use the Count Dependence function for the Count field to obtain the distinct count in the component. To display the data clearly, the chart type is set to Group Table in the example.
Click Component in the lower left corner to add a component, as shown in the following figure.
After you add the component, the Count field appears in the to-be-analyzed area on the left. Drag the Sales Date and Count fields into the Dimension and Indicator bars, respectively, as shown in the following figure.
Click the drop-down icon of the Count field, select Count Dependence, and select Brand Description from the drop-down list, as shown in the following figure. In this case, a distinct count is performed on the Count field based on the corresponding number of brands under the same date dimension, thereby obtaining the number of brand types sold each day.
This section takes the Store_Sales_Statistics table in the Retail Industry business package as an example. You can use the COUNTD_AGG function in the component to calculate the number of brand types sold every day.
Create an analysis subject with the Store_Sales_Statistics table in the Retail Industry business package of the public dataset. The steps are the same as those shown in the above section. For details, see section "Analysis Subject Creation" (in chapter "Method Two: Count").
Add a component with the Store_Sales_Statistics table. The steps are the same as those shown in the above section. For details, see section "Component Adding" (in chapter "Method Two: Count").
You can use the COUNTD_AGG function to obtain the distinct count result of the Brand Description field.
1. Click the icon in the to-be-analyzed area of indicators, and select Add Calculation Field, as shown in the following figure.
2. Enter the formula COUNTD_AGG(Brand Description) to obtain the distinct count value of the Brand Description field in the corresponding dimension. Set a name for the indicator and click OK, as shown in the following figure.
COUNTD_AGG(Array): According to the current analysis dimension, this function returns the distinct count of a field dynamically. The generated results are listed in a dynamic data column. The number of rows in the data column is consistent with that in the current analysis dimension. For details, see Aggregation Functions Overview.
Drag the Sales Date field into the Dimension bar and drag the obtained Brand Number field into the Indicator bar in the analysis area. In this case, the obtained Brand Number field displays the distinct count value of the number of brands sold each day, as shown in the following figure.
This section takes the Store_Sales_Statistics table in the Retail Industry business package as an example. You can convert the dimension field to an indicator field to obtain the number of brand types sold every day.
Click the drop-down icon of the Brand Description field and select Convert to Indicator to obtain the distinct count result, as shown in the following figure.
Select Group Table in Chart Type to display the data. Drag the corresponding fields into the Dimension and Indicator bars in the analysis area. The Brand Description field displays the number of brands sold each day, as shown in the following figure.
For details, see section "Application Scenario."
The following figure shows the preview effect on the DataAnalyst app and the HTML5 terminal.
滑鼠選中內容,快速回饋問題
滑鼠選中存在疑惑的內容,即可快速回饋問題,我們將會跟進處理。
不再提示
10s後關閉
Submitted successfully
Network busy