When you’re dealing with small databases, it’s no big deal to run queries or procedures by hand to build reports. But in the real world, databases grow to a size where any repetitive task needs to be automated. Imagine someone asks you every day to prep a sales report. Even if the query takes just two minutes, over a year you’ll spend more than 12 hours running it. You’d probably rather spend that time sipping coffee while an automatic procedure does all the work for you.
Automation helps you:
- Cut down on manual work.
- Make sure reports are always on time (like daily or weekly reports).
- Minimize the chance of mistakes caused by human error.
- Boost trust in your reports: they’re always created with the right parameters.
Main Steps for Automatic Report Generation
Automatically running reports involves these steps:
- Create a procedure in PL/pgSQL that generates the report.
- Set up logging for the results (if you need it).
- Use a task scheduler to run the procedure on a schedule.
Let’s walk through this step by step!
Creating a Procedure to Generate a Report
First, let’s make a simple procedure that calculates the total revenue from all orders for the current day and saves the result in a log table. We already have a table for logging (let’s call it sales_report_log):
CREATE TABLE sales_report_log (
report_date DATE NOT NULL,
total_sales NUMERIC NOT NULL,
created_at TIMESTAMP DEFAULT CURRENT_TIMESTAMP
);
Now let’s create a procedure in PL/pgSQL:
CREATE OR REPLACE FUNCTION generate_daily_sales_report()
RETURNS VOID AS $$
BEGIN
-- Calculate total revenue for the current day
INSERT INTO sales_report_log (report_date, total_sales)
SELECT CURRENT_DATE, SUM(order_total)
FROM orders
WHERE order_date = CURRENT_DATE;
RAISE NOTICE 'Report for % created successfully', CURRENT_DATE;
END;
$$ LANGUAGE plpgsql;
Here’s what’s going on:
- We use the aggregate function
SUM()to calculate total revenue from theorderstable. - The report date (
report_date) is always set to today (CURRENT_DATE). - The result gets saved to the
sales_report_logtable. - The
RAISE NOTICEmessage is there for debugging: it tells you the report was created successfully.
Testing the Procedure
Before you automate running this procedure, it’s always a good idea to test it manually. Let’s run the function:
SELECT generate_daily_sales_report();
Now let’s check what’s in the sales_report_log table:
SELECT * FROM sales_report_log;
If you see a row with today’s date and the correct total revenue — congrats, your function works!
Automating Tasks in PostgreSQL
Sometimes it’s handy for the database to do stuff on its own: run reports, clean up old records, or update aggregates on a schedule. PostgreSQL lets you do this with the pg_cron extension or an external task scheduler — like the system cron or Task Scheduler.
If you’re on Linux, your best bet is pg_cron. This extension runs SQL right inside PostgreSQL, so you don’t need to mess with shell or scripts.
You can install pg_cron like this (don’t forget to replace XX with your PostgreSQL version):
sudo apt install postgresql-XX-cron
After installing, you need to hook it up in your config. Open postgresql.conf and add this line:
shared_preload_libraries = 'pg_cron'
Then restart PostgreSQL and activate the extension in your database:
CREATE EXTENSION pg_cron;
Now you can schedule a job. For example, to run generate_daily_sales_report() every day at midnight:
SELECT cron.schedule(
'daily_sales_report',
'0 0 * * *',
$$ SELECT generate_daily_sales_report(); $$
);
Here:
'daily_sales_report'— the job name;'0 0 * * *'— cron-style schedule (in this case — every day at 00:00);- The SQL between
$$— the code that’ll run.
To see all scheduled jobs, use:
SELECT * FROM cron.job;
If you’re on Windows or macOS, pg_cron either isn’t supported at all (on Windows) or you have to build it from source (on macOS). That’s a pain, and in most cases it’s easier to use the system scheduler.
Here’s how you do it:
- Create a SQL file with the command you want:
echo "SELECT generate_daily_sales_report();" > /path/to/script.sql
- Use
psqlto run the file:
psql -h localhost -U postgres -d your_database -f /path/to/script.sql
Add this command to your task scheduler:
On Linux/macOS: via
crontab -e:0 0 * * * psql -h localhost -U postgres -d your_database -f /path/to/script.sqlOn Windows: use Task Scheduler to create a job that runs
psql.exewith the right parameters.
- If you’re on Linux, use
pg_cron— it’s convenient and built into PostgreSQL. - If you’re on Windows or Mac, it’s smarter to rely on the system scheduler (
cronor Task Scheduler) and run SQL throughpsql.
This way you can automate any tasks in PostgreSQL without breaking a sweat.
Examples of Automatic Reporting
- Daily Regional Report
Let’s say you want to automatically create a report for revenue in each region. You can expand our function like this:
CREATE OR REPLACE FUNCTION generate_regional_sales_report()
RETURNS VOID AS $$
BEGIN
INSERT INTO regional_sales_report_log (region, report_date, total_sales)
SELECT region, CURRENT_DATE, SUM(order_total)
FROM orders
WHERE order_date = CURRENT_DATE
GROUP BY region;
RAISE NOTICE 'Regional report for % created successfully', CURRENT_DATE;
END;
$$ LANGUAGE plpgsql;
- Monthly Report
You can make a procedure for a monthly report in a similar way. Just change the filter in the query:
WHERE order_date BETWEEN date_trunc('month', CURRENT_DATE)
AND date_trunc('month', CURRENT_DATE) + interval '1 month - 1 day';
Common Mistakes and How to Avoid Them
When you’re automating report generation, you might run into some issues:
- Syntax error in the function: always test your functions by hand before automating.
- Task runs too often: if a job runs too frequently, it can overload your database. Set up your schedule wisely.
- Duplicate data: if the report runs more than once a day, you might get duplicates. Use unique keys to prevent repeats.
This lecture showed you how to set up automatic report generation in PostgreSQL. Now you can optimize your analytics workflow and free up more time for important stuff... like bug hunting, writing code, or dreaming about making your SQL queries perfect.
GO TO FULL VERSION