CodeGym /Courses /SQL SELF /Building Analytical Reports with PL/pgSQL

Building Analytical Reports with PL/pgSQL

SQL SELF
Level 59 , Lesson 4
Available

Building Analytical Reports with PL/pgSQL

Analytical reports are systematic representations of data that help you make decisions. For example:

  • Managers want to see what the revenue was for the past month.
  • Analysts are looking for trends in the market.
  • Developers monitor app performance.

Imagine you're a head chef running a huge restaurant. To figure out which dishes are ordered most often, you need a report. PostgreSQL here is your database of recipes and orders, and PL/pgSQL (procedures) is your kitchen assistant who automates the order analysis process.

Basics of Building Analytical Reports

An analytical report is a tool for aggregating, filtering, sorting, and organizing data to get useful info. Usually, the structure of a report includes these steps:

  1. Data prep: selecting info from tables, filtering, and pre-processing.
  2. Data aggregation: calculating metrics (average check, total sales, etc.).
  3. Formatting: organizing data in a way that's easy to read.
  4. Output: showing the report to users or saving it to a table for storage.

Each of these steps can be done using PL/pgSQL procedures.

Creating a Procedure for an Analytical Report

Let's break down a basic example of creating an analytical report. Suppose we have an orders table that stores order data:

CREATE TABLE orders (
    id SERIAL PRIMARY KEY,
    customer_id INT,
    order_date DATE,
    total_amount NUMERIC(10, 2)
);

Our task: create a report of total sales for a given month. So, we want to see:

  • Month.
  • Total sales for that month.

Procedure Structure

Here's the plan for our little procedure (don't worry, PL/pgSQL programming doesn't bite):

  1. Take an input parameter — the month.
  2. Select data for that month from the orders table.
  3. Calculate the total sales.
  4. Return the result.

Procedure Implementation

Example code:

CREATE OR REPLACE FUNCTION monthly_sales_report(p_month DATE)
RETURNS TABLE (
    month DATE,
    total_sales NUMERIC(10, 2)
) AS $$
BEGIN
    -- Select data for the given month and aggregate it
    RETURN QUERY
    SELECT 
        DATE_TRUNC('month', o.order_date) AS month,
        SUM(o.total_amount) AS total_sales
    FROM orders o
    WHERE DATE_TRUNC('month', o.order_date) = DATE_TRUNC('month', p_month)
    GROUP BY 1;
END;
$$ LANGUAGE plpgsql;
  1. Input parameter: p_month — a date. We'll use it to filter data by month.
  2. RETURN QUERY: this is the magic thing that lets you return data right from the procedure.
  3. DATE_TRUNC: used to round order_date to the start of the month.
  4. SUM: aggregate function to sum up all orders.
  5. GROUP BY: we group data by month, since reports are built by month.

Now we can call our function:

SELECT * FROM monthly_sales_report('2023-08-01');

And we'll get something like:

month total_sales
2023-08-01 50000.00

This function is the base. Let's make it more interesting!

Creating a More Complex Report

Now imagine we want to break down sales by customer. So, our report should show:

  • Customer
  • Month
  • Total orders for that customer in the month

Let's change the procedure

CREATE OR REPLACE FUNCTION customer_monthly_report(p_month DATE)
RETURNS TABLE (
    customer_id INT,
    month DATE,
    total_sales NUMERIC(10, 2)
) AS $$
BEGIN
    RETURN QUERY
    SELECT 
        o.customer_id,
        DATE_TRUNC('month', o.order_date) AS month,
        SUM(o.total_amount) AS total_sales
    FROM orders o
    WHERE DATE_TRUNC('month', o.order_date) = DATE_TRUNC('month', p_month)
    GROUP BY o.customer_id, DATE_TRUNC('month', o.order_date);
END;
$$ LANGUAGE plpgsql;

Now call the procedure:

SELECT * FROM customer_monthly_report('2023-08-01');

And the result might look like this:

customer_id month total_sales
101 2023-08-01 20000.00
102 2023-08-01 30000.00

Using Temporary Tables

Sometimes when building complex reports, it's handy to use temporary tables. For example, if you need to process intermediate data.

CREATE OR REPLACE FUNCTION temp_table_example(p_month DATE)
RETURNS VOID AS $$
BEGIN
    -- Create a temporary table
    CREATE TEMP TABLE temp_sales AS
    SELECT
        customer_id,
        DATE_TRUNC('month', order_date) AS month,
        SUM(total_amount) AS total_sales
    FROM orders
    WHERE DATE_TRUNC('month', order_date) = DATE_TRUNC('month', p_month)
    GROUP BY customer_id, DATE_TRUNC('month', order_date);

    -- Do extra calculations or manipulations with this table
    -- For example, show top-3 customers by order sum
    RAISE NOTICE 'Top-3 customers for month %:', p_month;
    FOR record IN
        SELECT customer_id, total_sales
        FROM temp_sales
        ORDER BY total_sales DESC
        LIMIT 3
    LOOP
        RAISE NOTICE 'Customer: %, Amount: %', record.customer_id, record.total_sales;
    END LOOP;
END;
$$ LANGUAGE plpgsql;

In this case, the temp table temp_sales is used to store intermediate results.

Handy Tips

  1. Optimization: use indexes to speed up data selection.
  2. Divide by zero errors: always check the divisor so you don't "kill" your report.
  3. Date formatting: use functions like TO_CHAR for nice output.

Hope you didn't have to yawn too much! Even cooler and more interesting tasks are coming up, so don't get too comfy!

2
Task
SQL SELF, level 59, lesson 4
Locked
Finding the Top-3 Best-Selling Products for a Month
Finding the Top-3 Best-Selling Products for a Month
1
Survey/quiz
Procedures for Analytics, level 59, lesson 4
Unavailable
Procedures for Analytics
Procedures for Analytics
Comments
TO VIEW ALL COMMENTS OR TO MAKE A COMMENT,
GO TO FULL VERSION