Posted 20 hours ago

Abido PureStar Herbal skin heal Ex. 2021 Original

£57.415 £114.83 Clearance
Shared by ZTS2023

About this deal

This blog will consist of screenshots of my queries and outputs, along with descriptions of what the queries are doing. As you can see in the output above, the Joining Date field is probably the easiest field to make because it doesn’t require transforming the unioned output. This is because none of the necessary elements for the field are contained within the Demographic-Value field pair. However, in the case of the TestScoreInteger field, we can just round the test_score field normally.

Ultimately, we just have to be careful about the changes we’re making to our data. Are those changes actually necessary or correct? If not, they could cause a lot of confusion for the end users of the datasets that we prepare, for example by populating a C-suite facing dashboard with misleading numbers. Imagine showing a totally wrong KPI to the CEO because of a rounding error or some faulty aggregation logic. What a disaster!
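To make the rounding risk concrete, here is a minimal sketch in Python using the standard-library sqlite3 module rather than Snowflake. It assumes attendance is stored as a fraction between 0 and 1. The table, the sample values, the 0.90 and 0.70 thresholds and the Medium and Low labels are all hypothetical; only the wrong_ap name and the High Attendance label come from this post.

```python
import sqlite3

# In-memory database with made-up attendance fractions (assumption: values lie in 0..1).
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE attendance (student TEXT, attendance REAL)")
conn.executemany(
    "INSERT INTO attendance VALUES (?, ?)",
    [("s1", 0.9312), ("s2", 0.8467), ("s3", 0.6123)],
)

query = """
    SELECT
        student,
        ROUND(attendance)    AS wrong_ap,               -- rounds to 0 decimals
        ROUND(attendance, 2) AS attendance_percentage,  -- keeps 2 decimals
        CASE
            WHEN ROUND(attendance) >= 0.90 THEN 'High Attendance'
            WHEN ROUND(attendance) >= 0.70 THEN 'Medium Attendance'
            ELSE 'Low Attendance'
        END AS wrong_flag,
        CASE
            WHEN ROUND(attendance, 2) >= 0.90 THEN 'High Attendance'
            WHEN ROUND(attendance, 2) >= 0.70 THEN 'Medium Attendance'
            ELSE 'Low Attendance'
        END AS attendance_flag
    FROM attendance
    ORDER BY student
"""
rows = conn.execute(query).fetchall()

wrong_flags = [row[3] for row in rows]  # flags built on the over-rounded value
flags = [row[4] for row in rows]        # flags built on the 2-decimal value

for row in rows:
    print(row)
```

With these sample values the over-rounded column collapses every student into the same category, while the two-decimal version keeps them apart. The exact behaviour of ROUND in Snowflake should be checked against its own documentation.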

Now we can build out the second CTE, averages, which calculates the average score across the English, Psychology and Economics fields and groups this result by the Class field. In case the text in the above image is too small, it shows an average score of 73.8, 62.3 and 55.5 for the subjects of English, Psychology and Economics. These values are not the ones we’re looking for. Instead, we are looking for the lowest average score. So, we need the Class field so that the result for the averages CTE is six rows instead of one (this will also be relevant later on).

At least for me, the biggest difference between the two tables is that the original input has Demographic and Value fields whereas the desired output does not. When I alluded to the need to transform the shape of the data, this discrepancy was what I was referring to. In other words, the only way for us to get the Account Type, Date of Birth and Ethnicity fields is by transforming the unioned output (the data CTE).
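One way to picture that reshaping is the sketch below, written in Python with the standard-library sqlite3 module. It is not the query used for the challenge: it swaps in conditional aggregation (MAX over CASE) because SQLite has no PIVOT, and the id column, the lower-case column names and every sample value are hypothetical.

```python
import sqlite3

# Hypothetical long-format data: one row per (id, demographic) pair.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE data (id INTEGER, demographic TEXT, value TEXT);
    INSERT INTO data VALUES
        (1, 'Account Type',  'Basic'),
        (1, 'Date of Birth', '1990-01-15'),
        (1, 'Ethnicity',     'A'),
        (2, 'Account Type',  'Premium'),
        (2, 'Date of Birth', '1985-07-02'),
        (2, 'Ethnicity',     'B');
""")

# Each CASE picks out one demographic; MAX collapses the group to a single row per id.
pivot_sql = """
    SELECT
        id,
        MAX(CASE WHEN demographic = 'Account Type'  THEN value END) AS account_type,
        MAX(CASE WHEN demographic = 'Date of Birth' THEN value END) AS date_of_birth,
        MAX(CASE WHEN demographic = 'Ethnicity'     THEN value END) AS ethnicity
    FROM data
    GROUP BY id
    ORDER BY id
"""
pivoted = conn.execute(pivot_sql).fetchall()
print(pivoted)
```

The Demographic and Value fields disappear from the result and each distinct Demographic value becomes its own column, which is the shape change described above.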

In other words, it’s crucial to be careful with how thresholds are defined if we want to convey our data accurately and meaningfully. We can get totally different results if we get confused about how we use comparison operators! Moving on, the query and output below show us the min_averages CTE, which results in only one row with the correct values that we’re looking for.

The next two fields, Attendance Percentage and TestScoreInteger, involved using the ROUND function. However, as you can see in the desired output, these two fields are two different types of numbers. Attendance Percentage is a decimal while TestScoreInteger is an integer. When using the ROUND function, it matters what the original number format is for the affected field. To see why, take a look at the supplemental query and output below. It showcases why configuring the ROUND function in an appropriate way matters.

To make my life easier later down the line, I rounded each of the average test scores using the ROUND function. One thing to note for this CTE is that it’s crucial that we also bring in the Class field and group by it. If not, we get the following supplemental output that is too aggregated.

As you can see, if we were to just ROUND the Attendance Percentage field normally, every student’s attendance percentage would be 1. If we used the wrong_ap field in the query that is responsible for the Attendance Flag field, then the logic would totally break and every student would be categorized as having High Attendance. Not only would this differ from the desired output, it would also completely undermine any attempt to determine if there’s “a correlation between attendance and test scores” (which the challenge post gives as the reason for this work).

Moving on, let’s look at how the Student Name column was handled to be able to create two new fields that represent First Name and Surname respectively. This can be seen in the SQL query below.
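The original query is not reproduced in this text, so as a stand-in here is a minimal sketch of the name split in Python with the standard-library sqlite3 module. It assumes each name is stored as "First Surname" with exactly one space; the table, the column names and the sample names are hypothetical. In Snowflake, a function such as SPLIT_PART would likely be the more natural choice.

```python
import sqlite3

# Hypothetical student names, assumed to follow the pattern "First Surname".
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE students (student_name TEXT)")
conn.executemany(
    "INSERT INTO students VALUES (?)",
    [("Ada Lovelace",), ("Alan Turing",)],
)

# INSTR finds the position of the space; SUBSTR takes the text on either side of it.
split_sql = """
    SELECT
        student_name,
        SUBSTR(student_name, 1, INSTR(student_name, ' ') - 1) AS first_name,
        SUBSTR(student_name, INSTR(student_name, ' ') + 1)    AS surname
    FROM students
    ORDER BY student_name
"""
names = conn.execute(split_sql).fetchall()
print(names)
```

Names with middle names or no space at all would need extra handling, so it is worth checking the input before relying on a split like this.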
In this blog, I’ll be working through the Preppin’ Data challenge called “Is it the teacher or the student? Part 1”. Try it out here! I’ll be using the incredible SQL to solve this challenge!

This challenge was difficult for me because, at one point, I tried nesting two aggregate functions. It looked something like this: MIN(AVG(english)) AS “min_avg_english_score”. This is completely wrong and would return an error.

What the UNPIVOT function does is transform the shape of the table. Like I mentioned earlier, the min_averages CTE has three fields and one row. The fields represent each of the subjects and the single row represents the score for each subject. However, through the UNPIVOT function, the unpivoted_data CTE now has two fields and three rows. The fields are Subject and Grade while the three rows represent each of the scores for the aforementioned subjects. In other words, we took the three fields from the min_averages CTE and turned them into rows in the unpivoted_data CTE.

Finally, we can bring everything together with the code between lines 36 and 45 shown in the query below. Here’s the query and output that represents the join between the two input tables. This is our first CTE, which is called data.
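The nested-aggregate problem and the two-step fix can be sketched as follows, in Python with the standard-library sqlite3 module rather than Snowflake. The scores table and its values are hypothetical, and only three classes are used to keep the sample short. The CTE names averages and min_averages follow this post; the column aliases are assumptions.

```python
import sqlite3

# Hypothetical scores: two students in each of three classes.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE scores (class TEXT, english REAL, psychology REAL, economics REAL)"
)
conn.executemany(
    "INSERT INTO scores VALUES (?, ?, ?, ?)",
    [
        ("A", 70, 50, 40), ("A", 80, 60, 50),
        ("B", 60, 70, 55), ("B", 64, 80, 65),
        ("C", 90, 65, 30), ("C", 70, 65, 50),
    ],
)

# Nesting one aggregate inside another is rejected by SQLite as well.
nested_failed = False
try:
    conn.execute("SELECT MIN(AVG(english)) FROM scores")
except sqlite3.Error:
    nested_failed = True

# Step 1 averages per class; step 2 takes the minimum of those per-class averages.
two_step_sql = """
    WITH averages AS (
        SELECT
            class,
            ROUND(AVG(english), 1)    AS avg_english_score,
            ROUND(AVG(psychology), 1) AS avg_psychology_score,
            ROUND(AVG(economics), 1)  AS avg_economics_score
        FROM scores
        GROUP BY class
    ),
    min_averages AS (
        SELECT
            MIN(avg_english_score)    AS lowest_average_english_score,
            MIN(avg_psychology_score) AS lowest_average_psychology_score,
            MIN(avg_economics_score)  AS lowest_average_economics_score
        FROM averages
    )
    SELECT * FROM min_averages
"""
min_averages = conn.execute(two_step_sql).fetchall()
print(nested_failed, min_averages)
```

Splitting the work across two CTEs means each SELECT contains a single level of aggregation. Dropping the GROUP BY in the first step would leave one row of overall averages, so the MIN in the second step would have nothing to compare.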

“Subject” is the name of the field that contains aliased values for each of the scores. So, while the actual value of MIN(“avg_english_score”) is 68.2, its aliased value is “lowest_average_english score”.

The data of the Original_Joining Date represents the first clause of the REGEXP_REPLACE function. In other words, that field is the original concatenation that needs to be modified further. This is what I was referring to earlier about how the Joining Day field has to be zero-padded. The second clause, which is the RegEx string pattern of ‘

With that, we can now make the unpivoted_data CTE, which is just the min_averages CTE but unpivoted. In our case, since we’re working with Snowflake, our tables aren’t expressed in the form of Excel sheets. No problem! To consolidate these tables, we will need to use the UNION operator (and a lot of them!)
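The effect of unpivoting min_averages into Subject and Grade can be sketched in Python with the standard-library sqlite3 module. SQLite has no UNPIVOT, so this sketch swaps in UNION ALL to produce the same shape; it is not the Snowflake query from the challenge. The 68.2 figure is the English value quoted above, while the other two values and the exact column names are hypothetical.

```python
import sqlite3

# One-row table standing in for the min_averages CTE.
conn = sqlite3.connect(":memory:")
conn.execute("""
    CREATE TABLE min_averages (
        lowest_average_english_score    REAL,
        lowest_average_psychology_score REAL,
        lowest_average_economics_score  REAL
    )
""")
conn.execute("INSERT INTO min_averages VALUES (?, ?, ?)", (68.2, 59.5, 51.0))

# Each SELECT turns one column into a (subject, grade) row; UNION ALL stacks them.
unpivot_sql = """
    SELECT 'lowest_average_english_score' AS subject,
           lowest_average_english_score   AS grade
    FROM min_averages
    UNION ALL
    SELECT 'lowest_average_psychology_score', lowest_average_psychology_score
    FROM min_averages
    UNION ALL
    SELECT 'lowest_average_economics_score', lowest_average_economics_score
    FROM min_averages
    ORDER BY subject
"""
unpivoted = conn.execute(unpivot_sql).fetchall()
for row in unpivoted:
    print(row)
```

Three fields and one row go in, two fields and three rows come out, with the old column names carried along as the values of the subject field.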

You may also like…

In this blog, I’ll be working through the Preppin’ Data challenge called Student Attendance vs Test Scores. You can find it here. This was a pleasantly straightforward challenge where, among other functions, I got to use the ROUND function. Give the challenge a try!

Asda Great Deal

Free UK shipping. 15 day free returns.