5 SQL Queries That Will Help You With Data Analysis

5 SQL Queries That Will Help You With Data Analysis

SQL is certainly one of job hunters’ most valuable skills. For roles like data analyst or data engineer, SQL is your daily bread and butter. In this article, we will cover the basic SQL queries and constructs that will get you started with your data analysis needs.

1.   SELECT and FROM

The SELECT keyword is used to retrieve data from the database. You can specify the names of the columns (i.e. the type of data you want to see, like customer name or age), or you can see all the data in the table.

The FROM keyword specifies the table(s) from which the data is to be retrieved. For instance, say you want to get a list of all the customers in your database with their names and email information. This information is in the "Customers" table shown below:

No alt text provided for this image

Here’s the query we’d use to get a list of all the customer names with their email addresses:

No alt text provided for this image

And the output:

No alt text provided for this image

You can learn more about SELECT in the SQL Basics course and in the article How Do You Write a SELECT Statement in SQL, which includes tons of examples.

2.  WHERE

Suppose you run an e-commerce website that has a promotion for US customers. In this case, you want to send out the email only to US customers; the list you want to select should only have US customers. This is when you can use the WHERE clause.

So, if you use this query …

No alt text provided for this image

… you’ll get these results:

No alt text provided for this image

The WHERE clause is used to filter specific rows based on the value of one or more columns (in this case, it’s the "Country" column, which is not shown in the results).

Now say you only want to list females from the US. You can filter for multiple conditions in WHERE if you use the AND keyword:

No alt text provided for this image

Here’s the result:

No alt text provided for this image

In the above query, the value for "Country" and "Sex" for each row is evaluated against the conditions in the WHERE clause. Only those rows where both conditions are true will be returned. If you want to return the results when either (or any) of the WHERE conditions is true, use the OR keyword:

No alt text provided for this image

This selects all female customers as well as all customers who live in the US. Notice the difference in the results:

No alt text provided for this image

Need more resources on the WHERE clause? Check out these articles:

3.  ORDER BY

Let's say you have another table in your database called the "Orders" table.

No alt text provided for this image

You want to see the top orders and their details. One way to do this is to list orders in descending order (10–1, Z–A) by value. You can use something like this query:

No alt text provided for this image

And you’d get:

No alt text provided for this image

Using a ‘*’ after SELECT tells the SQL engine to retrieve all the columns for a table. The ORDER BY clause directs that the results be shown in a particular order. By default, the results are returned in ascending order (1–10, A–Z); however, since we explicitly mention the DESC keyword, our results are returned in descending order.

If you'd like to know more, check out the excellent Detailed Guide to SQL ORDER BY.

4.  GROUP BY

Oftentimes, to generate insights from large data sets, we may want to group similar items together. This grouping is called aggregation; it can be achieved with the SQL GROUP BY clause.

You generally use a function to calculate the group metric. Commonly used aggregate functions are SUM(), COUNT(), AVG(), MAX(), and MIN(). With these, you can do many things, e.g. calculating the total expenses for a department or counting the number of employees in each location.

Say you run an offer in your business where you send a free gift to the customer who spends the most on your website. You can use SQL GROUP BY clause here to retrieve the top customer in terms of total order value. Here’s the query:

No alt text provided for this image

And here's the result of it:

No alt text provided for this image

Clearly, your top customer is "Customer_Id" 1213.

The query works by first selecting all the unique "Customer_Id" values and then using GROUP BY to find the aggregate total sum for each (i.e. the total value of all their orders). The ORDER BY 2 DESC clause displays the results in decreasing order by SUM(Total_Value).

For more information on GROUP BY in data analysis, check out What Is GROUP BY in SQL?

5.  JOIN

In the previous example, we found the "Customer_id" of the highest-spending customer. But the "Orders" table doesn’t store any other customer information, such as name or communication details. You’d need to write another query to the "Customers" table to retrieve the required data – or you could just use JOIN.

The SQL JOIN feature lets you select information from multiple tables in a single query. Typically, you will connect two tables based on one or more column values common to both tables.

In our case, we can use "Customer_Id" as the common field. It will join the "Customers" and "Orders" tables:

No alt text provided for this image

This is the output:

No alt text provided for this image

Now the query also returns the customer details you needed. It works by comparing the "Customer_Id" fields in both tables and then retrieving only those rows where it finds a match for the values. Notice that the customer IDs 1211, 1214, and 1215 are not in the results; there were no matching rows in the "Orders" table for these values.

If you want to display all the values – even those which do not have a corresponding match in the other table – you can use something called an OUTER JOIN. Joins can be somewhat tricky to understand, but they are one of the most important concepts in SQL. Feel free to check out this course on SQL JOINs. It offers 93 exercises covering all types of JOINs (there are several).

Continue Learning SQL for Data Analysis

The examples we have shown you are of course very simple. We just wanted you to understand the main concepts and see how easy it is to create SQL queries. 

Being data-driven is a trait you can inculcate with practice. So, If you want to learn the queries that I’ve shared with you in practice - take our SQL Basics course, and if you want to excel beyond the basics - check out what our All Forever Package has to offer.

This article is based on the work of Himanshu Kathuria, "These SQL Queries Will Help You With Data Analysis".

To view or add a comment, sign in

More articles by LearnSQL.com

Insights from the community

Others also viewed

Explore topics