By: Christian Ragland
April 14th, 2023
In data science, aggregate data combines multiple data sources into one set to give a broader picture of a particular issue. Conversely, disaggregate data isolates one or more variables within a data set, highlighting specific features or populations and leading to different insights. Any data-driven organization will need both aggregate and disaggregate data to get the most out of its data collection process. Clear Impact Suite allows users to organize data with ease, and this article will walk you through the process.
Table of Contents:
- Why Should I Bother With Aggregate and Disaggregate Data?
- Understand Scorecard Hierarchy and Organization
- Quickly Viewing Data by Disaggregated Values
- Controlling Data in Compyle
- Using Calculations to Aggregate More Efficiently
- Final Thoughts
Why Should I Bother With Aggregate and Disaggregate Data?
Before we talk about how to control these features in our software, let’s briefly go over why all organizations should use aggregate measures and disaggregate their data.
When it comes to aggregate data, you can think of aggregation as a way to save time and simplify your work. Imagine an organization tracking high school graduation rates across a single county. Using common data collection methods, you’d probably end up with a different Scorecard for each individual school’s graduation numbers. These Scorecards will be useful to you in different ways, but you’ll also want to look at numbers for the county as a whole. This is where an aggregate measure comes in. This feature allows you to combine the individual schools and view the county statistics as a whole.
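To make the idea concrete, here is a minimal Python sketch of what rolling school-level numbers up into a county-level figure involves. It is not a description of how Clear Impact Suite works internally. The school names, the counts, and the function names are all made up for illustration.

```python
# Hypothetical per-school counts for one school year (illustrative numbers only).
school_counts = {
    "North High": {"graduates": 180, "seniors": 200},
    "South High": {"graduates": 150, "seniors": 200},
    "East High": {"graduates": 90, "seniors": 100},
}


def aggregate_counts(counts_by_school):
    """Sum graduates and seniors across all schools to get county totals."""
    total_graduates = sum(c["graduates"] for c in counts_by_school.values())
    total_seniors = sum(c["seniors"] for c in counts_by_school.values())
    return {"graduates": total_graduates, "seniors": total_seniors}


def graduation_rate(counts):
    """Return the graduation rate as a percentage."""
    return 100 * counts["graduates"] / counts["seniors"]


county = aggregate_counts(school_counts)
print(county)  # {'graduates': 420, 'seniors': 500}
print(f"County graduation rate: {graduation_rate(county):.1f}%")  # 84.0%
```

Note that the sketch sums the counts before computing the rate. Averaging the three school rates (90%, 75%, and 90%) would give 85%, which overstates the county figure of 84% because it ignores the size of each school.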
Once you have created your aggregate measure Scorecard, you can disaggregate by any chosen fields or demographics to get a closer look at relevant features of your data. Using our previous example, you could compare any number of individual schools on the same graph by disaggregating those schools and hiding the others. At Clear Impact, we encourage all organizations to regularly and routinely disaggregate their data by race. Key information about the populations you are serving is often hidden within large, unspecific data sets. Without disaggregating, you are almost certainly missing something important.
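The following Python sketch shows, under made-up assumptions, why disaggregation can surface what an aggregate hides. The records, the `demographic_group` field, and the `disaggregate` function are hypothetical and are not part of Clear Impact Suite.

```python
from collections import defaultdict

# Hypothetical student-level records (illustrative only).
records = [
    {"school": "North High", "demographic_group": "Group A", "graduated": True},
    {"school": "North High", "demographic_group": "Group A", "graduated": True},
    {"school": "North High", "demographic_group": "Group B", "graduated": False},
    {"school": "South High", "demographic_group": "Group A", "graduated": True},
    {"school": "South High", "demographic_group": "Group B", "graduated": True},
    {"school": "South High", "demographic_group": "Group B", "graduated": False},
]


def disaggregate(rows, field):
    """Group rows by `field` and return the graduation rate (%) per group."""
    totals = defaultdict(lambda: {"graduates": 0, "students": 0})
    for row in rows:
        bucket = totals[row[field]]
        bucket["students"] += 1
        bucket["graduates"] += int(row["graduated"])
    return {
        key: 100 * t["graduates"] / t["students"]
        for key, t in totals.items()
    }


by_school = disaggregate(records, "school")
by_group = disaggregate(records, "demographic_group")
print(by_school)  # both schools graduate 2 of 3 students
print(by_group)   # Group A graduates 3 of 3, Group B only 1 of 3
```

In this toy data set the two schools look identical, and the gap only appears once the same records are split by demographic group.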
Understand Scorecard Hierarchy and Organization
For the sake of organization and ease, measures that are part of a larger hierarchy, or measures that are fed into another measure to make an aggregate graph, can be rolled into one place. We call these “parent” and “child” scorecards. Below is a list of the types of relationships that Scorecards can have:
- Parent Scorecard – A Parent scorecard is the primary scorecard in the hierarchy, which houses the others. Every scorecard can have a parent added, and can also be a parent itself. Note: There can only be one Parent per scorecard. If it is to be the top-level Parent scorecard, it cannot have a parent of its own.
- Child Scorecard – Child scorecards are at the secondary level of the hierarchy. These scorecards have a Parent and can be parents as well.
- Grandchild Scorecard – Grandchild scorecards are at the tertiary level of the hierarchy. These scorecards have a Child scorecard as a Parent. If it had a primary parent, it would be a Child instead of a grandchild. These scorecards can be parents as well.
- Great Grandchild Scorecard – Great Grandchild scorecards are at the quaternary level of the hierarchy. These scorecards have a Grandchild scorecard as a parent. If it had a secondary parent, it would be a Grandchild scorecard. These too have the ability to be parents.
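One way to picture the levels listed above is as a tree in which each scorecard points to at most one parent, and its level follows from how many ancestors it has. The Python sketch below is only a mental model, not the software's data model. The `Scorecard` class, its methods, and the scorecard names are hypothetical.

```python
from dataclasses import dataclass
from typing import Optional

# Level names taken from the list above, ordered from the root downward.
LEVEL_NAMES = ["Parent", "Child", "Grandchild", "Great Grandchild"]


@dataclass
class Scorecard:
    name: str
    parent: Optional["Scorecard"] = None  # at most one parent per scorecard

    def depth(self) -> int:
        """Count the ancestors above this scorecard (0 for the root)."""
        depth = 0
        node = self.parent
        while node is not None:
            depth += 1
            node = node.parent
        return depth

    def level(self) -> str:
        """Name the hierarchy level implied by the depth."""
        d = self.depth()
        # Levels deeper than Great Grandchild are not described in this
        # article, so the fallback label is an assumption.
        return LEVEL_NAMES[d] if d < len(LEVEL_NAMES) else f"Level {d + 1}"


# Hypothetical hierarchy for the graduation example.
county = Scorecard("County Graduation")
district = Scorecard("District A", parent=county)
school = Scorecard("North High", parent=district)
program = Scorecard("North High Tutoring", parent=school)

for card in (county, district, school, program):
    print(f"{card.name}: {card.level()}")
```

Because the level is derived from the parent chain, moving a scorecard under a different parent changes its level automatically, which is why it helps to map out the flow before building it.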
Before building your hierarchy in the software, a best practice is to think through and map out how you would like your scorecard hierarchy to flow. Decide which scorecard will be your Parent, then move on to which will be your secondary, tertiary, etc.
Remember, because the top-level Parent is the root, it does not require a parent of its own. To assign parents for your secondary level and onward, find your desired scorecard in the scorecard library and Edit it.
Once in the Edit view, locate the Parent Scorecard option underneath the description box. Click the downward tab, find and select your parent scorecard, and Save.
Quickly Viewing Data by Disaggregated Values
When viewing a measure whose data is aggregated from different sources, you can quickly and easily view the data from each source. Directly below your Scorecard graph, you will see a column titled “relationships.” Under relationships, you will see all containers, aggregate measures, and disaggregated measures for that particular measure. Clicking on these will give you quick access to the data for the individual data streams. Below is what you should be seeing if you have set up your relationships correctly.
Controlling Data in Compyle
Compyle can automatically aggregate the data it collects through surveys into Compylations, or visualizations of data. Compylations pull survey answers from each participant and present the aggregate data in meaningful ways in the form of calculations.
Compylation data includes survey answers entered online by participants, as well as any answers entered manually by staff.
You can also access analytics features by opening the sidebar and opening the Analytics menu.
The filters you apply to your Compylations are simply another way of disaggregating your survey data, ensuring that your analysis includes the relevant observations for the populations and demographics you care about. With direct Scorecard feeds, you can make your data analysis, calculations, and visualizations even more powerful.
Using Calculations to Aggregate More Efficiently
Users can aggregate data in the way that we’ve already discussed: by combining two or more measures into one Scorecard that represents a combination of the measures that make it up. But users who want more can also create more complex relationships between existing data sets via calculated measures.
Calculated measures use mathematical formulas to calculate values based on data from other measures. To demonstrate how this works, we will use the same example as before: percent of students who graduate high school on time.
This calculation will be based on two other measures: the total number of on-time graduates, and the total number of high school seniors. These two measures must already exist as indicators in the system, even if they are not included in any scorecard container.
To create a calculated measure out of two existing measures, follow these steps*:
- Create a new blank measure with the name of your final calculation (in this case, “% of students who graduate high school on time”).
- Click Edit.
- Open Data Properties and choose the Calculation Type (in our example, the calculation type is Ratio x 100).
- Click Save. The calculation type is listed in the indicator description.
- Click Edit Data.
- Click Add Existing Measure.
- Choose the relevant measures for your calculation.
- Click the green check at the top right to continue.
*Keep in mind that reporting frequency for all selected measures must match.
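Conceptually, a ratio-times-100 calculation divides one measure by another and multiplies the result by 100 for each reporting period. The Python sketch below assumes that interpretation of the Ratio x 100 calculation type; it is not the software's actual implementation. The function name, the period labels, and the numbers are hypothetical.

```python
def ratio_x_100(numerator, denominator):
    """Compute numerator / denominator * 100 for each reporting period.

    Both arguments map a period label to a value. The periods must match,
    mirroring the requirement that reporting frequencies match.
    """
    if numerator.keys() != denominator.keys():
        raise ValueError("Both measures must cover the same reporting periods")
    return {
        period: 100 * numerator[period] / denominator[period]
        for period in numerator
    }


# Hypothetical annual values for the two existing measures.
on_time_graduates = {"2021": 420, "2022": 450}
seniors = {"2021": 500, "2022": 500}

percent_on_time = ratio_x_100(on_time_graduates, seniors)
print(percent_on_time)  # {'2021': 84.0, '2022': 90.0}
```

The period check illustrates why mismatched frequencies are a problem: if one measure were reported quarterly and the other annually, there would be no single denominator to pair with each numerator.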
Final Thoughts
Disaggregating data is one of the most important things you can do to get more out of your data and understand the populations you serve. Disaggregation is a central idea within RBA Implementation and essential for racial equity initiatives. If you are not already organizing your data this way, now is as good a time as any to start!