#2 SQL tips for everyone
Partition Large Tables:
For very large tables, consider partitioning to improve performance and manageability.
CREATE TABLE orders
(
order_id INT,
order_date DATE,
-- other columns
)
PARTITION BY RANGE (order_date);
Example:
Creating a partitioned table
1. Declare the column and the method of partitioning during table creation:
CREATE TABLE sales (
order_id INT PRIMARY KEY,
customer_id INT,
order_date DATE NOT NULL,
sales_amount DECIMAL(10,2)
)
PARTITION BY RANGE (order_date);
2. Add partitions:
CREATE TABLE sales_2023q1 PARTITION OF sales
FOR VALUES FROM ('2023–01–01') TO ('2023–04–01');
CREATE TABLE sales_2023q2 PARTITION OF sales
FOR VALUES FROM ('2023–04–01') TO ('2023–07–01');
CREATE TABLE sales_2023q3 PARTITION OF sales
FOR VALUES FROM ('2023–07–01') TO ('2023–10–01');
CREATE TABLE sales_2023q4 PARTITION OF sales
FOR VALUES FROM ('2023–10–01') TO ('2024–01–01');
3. Select queries work as normal but we want to leverage the partitioning column.
-- normal query
SELECT * FROM sales WHERE customer_id = 12;
-- leveraging partitioning column
SELECT * FROM sales WHERE order_date = '2023–10–01' and customer_id = 12;
4. Attaching & Detaching partitions:
-- detaching partition
ALTER TABLE sales DETACH PARTITION sales_2023q1;
-- attaching partition
ALTER TABLE ONLY sales ATTACH PARTITION sales_2023q1
FOR VALUES FROM ('2024–10–01') TO ('2024–04–01');
5. List & Hash Partitioning
-- Table with List Partitioning
CREATE TABLE public.products (
product_id serial,
product_name character varying(100),
category character varying(50) NOT NULL,
price numeric
)
PARTITION BY LIST (category);
-- add list partitions
CREATE TABLE products_foods PARTITION OF products
FOR VALUES IN ('Food', 'Kitchen');
CREATE TABLE products_clothing PARTITION OF products
FOR VALUES IN ('Shirts', 'Dresses');
-- Table With Hash Partitioning
CREATE TABLE orders (
order_id SERIAL,
customer_id INT,
product_id INT,
quantity INT,
total_amount DECIMAL(10, 2)
) PARTITION BY HASH(order_id);
-- Adding partitions
CREATE TABLE orders_partition_1 PARTITION OF orders
FOR VALUES WITH (MODULUS 4, REMAINDER 0);
CREATE TABLE orders_partition_2 PARTITION OF orders
FOR VALUES WITH (MODULUS 4, REMAINDER 1);
CREATE TABLE orders_partition_3 PARTITION OF orders
FOR VALUES WITH (MODULUS 4, REMAINDER 2);
CREATE TABLE orders_partition_4 PARTITION OF orders
FOR VALUES WITH (MODULUS 4, REMAINDER 3);
-- NOTE: The MODULUS 4 part indicates that we're dividing the hash space into 4
-- partitions. The REMAINDER x part specifies which portion of the hash space each
-- partition will cover. Each partition covers a range of hash values based on
-- their remainders when divided by 4. For example, orders_partition_1 covers hash
-- values where hash(order_id) % 4 = 0, orders_partition_2 covers hash values
-- where hash(order_id) % 4 = 1, and so on. By using hash partitioning, we
-- distribute the data across multiple partitions based on the hash value of the
-- order_id column. This can help in evenly distributing the data and reducing
-- contention on specific partitions.
Partitioning an existing table
There is no way in PostgreSQL to add declarative partitioning to an already existing unpartitioned table.
We have to create a new partitioned table and import the data from the unpartitioned table to the new partitioned one.
1. Back up your data: Safety first!
2. Create a new, partitioned table: Define the desired scheme.
3. Transfer data: Move data from the old table to the new one, following your partition scheme.
4. Switch over: Make the new table the active one.
Bonus Tip: Regularly monitor and maintain your partitions. Archive, drop, or reorganize them as needed to keep your data kingdom optimized and efficient. Remember, the best partitioning strategy depends on your specific data and needs. Experiment, explore, and conquer those large tables with confidence!
Recommended by LinkedIn
Use Common Table Expressions (CTEs)
CTEs can make complex queries easier to read and maintain. They’re especially useful for recursive queries.
WITH CTE AS
(
SELECT column1, column2
FROM table_name
WHERE condition
)
SELECT * FROM CTE;
Example: “what is the average monthly cost per campaign for the company’s marketing efforts?”
This could accomplished with a derived query (shown below); however, using a CTE improves readability
-- define CTE:
WITH Cost_by_Month AS
(SELECT campaign_id AS campaign,
TO_CHAR(created_date, 'YYYY-MM') AS month,
SUM(cost) AS monthly_cost
FROM marketing
WHERE created_date BETWEEN NOW() - INTERVAL '3 MONTH' AND NOW()
GROUP BY 1, 2
ORDER BY 1, 2)
-- use CTE in subsequent query:
SELECT campaign, avg(monthly_cost) as "Avg Monthly Cost"
FROM Cost_by_Month
GROUP BY campaign
ORDER BY campaign
Using a derived query:
SELECT campaign, avg(monthly_cost) as "Avg Monthly Cost"
FROM
-- this is where the derived query is used
(SELECT campaign_id AS campaign,
TO_CHAR(created_date, 'YYYY-MM') AS month,
SUM(cost) AS monthly_cost
FROM marketing
WHERE created_date BETWEEN NOW() - INTERVAL '3 MONTH' AND NOW()
GROUP BY 1, 2
ORDER BY 1, 2) as Cost_By_Month
GROUP BY campaign
ORDER BY campaign
Optimize Data Retrieval with LIMIT and OFFSET:
When working with large datasets, use LIMIT and OFFSET to retrieve only a subset of results.
SELECT column1
FROM table_name
LIMIT 10 OFFSET 5;
The limit option allows you to limit the number of rows returned from a query,
while offset allows you to omit a specified number of rows before the beginning of the result set.
Using both limit and offset skips both rows as well as limit the rows returned.
Example:
Say you want to get 5 artists, but not the first five. You want to get rows 3 through 8. You’ll want to add an OFFSET of 2 to skip the first two rows:
SELECT * FROM artists LIMIT 5 OFFSET 2;
⚡Database Administrator @Spiraledge Viet Nam | SQL Server, MySQL, Google Cloud | Database Management, Design, and Tuning | Let's connect⚡
4moUseful tips
SQL SERVER DATABASE ADMINISTRATOR
5movery helpful tips 🕺