Imagine you’re building a weekly sales report. The calculations are done, your clients are happy. But a month later, someone asks: "Hey, can you show what was in that report?" If you didn’t save the data ahead of time, you’ll either have to rebuild it by hand or just say "I can’t." That’s not just inconvenient—it can mess with your reputation.
Logging analytical data solves a bunch of important problems:
- History preservation: you lock in key metrics (like revenue, order count) for specific periods.
- Audit and diagnostics: if something goes wrong, you can always check what data was logged.
- Data comparison: by adding timestamps, you can analyze how your metrics change over time.
- Reusing data: saved metrics can be used in other analytics tasks.
Main Idea: The log_analytics Table
To log analytics data, we create a special table that stores all the key metrics. Every new result is a new row in the table. To get a better feel for how this works, let’s start with a basic scenario.
Sample Table Structure
In the log_analytics table, we’ll store report data. Here’s the structure (DDL — Data Definition Language):
CREATE TABLE log_analytics (
    log_id SERIAL PRIMARY KEY,                -- Unique row identifier
    report_name TEXT NOT NULL,                -- Report or metric name
    report_date DATE DEFAULT CURRENT_DATE,    -- Date the report refers to
    category TEXT,                            -- Data category (like region, product)
    metric_value NUMERIC NOT NULL,            -- Metric value
    created_at TIMESTAMP DEFAULT NOW()        -- Logging timestamp
);
- log_id: the main row identifier.
- report_name: the name of the report or metric, like "Weekly Sales".
- report_date: the date the metric refers to. For example, if it’s sales for October 1st, this would be 2023-10-01.
- category: helps you group data, for example, by region.
- metric_value: the numeric value for the report metric.
- created_at: the logging timestamp.
Example of Inserting Data into log_analytics
Let’s say we calculated revenue for October in the "North" region. How do we save that value?
INSERT INTO log_analytics (report_name, report_date, category, metric_value)
VALUES ('Monthly Revenue', '2023-10-01', 'North', 15000.75);
Result:
| log_id | report_name | report_date | category | metric_value | created_at |
|---|---|---|---|---|---|
| 1 | Monthly Revenue | 2023-10-01 | North | 15000.75 | 2023-10-10 14:35:50 |
Creating a Logging Procedure
Obviously, we can’t insert data by hand every week or month. So let’s automate the process.
Let’s write a simple routine for logging revenue data. In PostgreSQL we implement it as a function that returns VOID, so it behaves like a procedure:
CREATE OR REPLACE FUNCTION log_monthly_revenue(category TEXT, revenue NUMERIC)
RETURNS VOID AS $$
BEGIN
    INSERT INTO log_analytics (report_name, report_date, category, metric_value)
    VALUES ('Monthly Revenue', CURRENT_DATE, category, revenue);
END;
$$ LANGUAGE plpgsql;
The log_monthly_revenue function takes two parameters:
- category: the data category, like region.
- revenue: the revenue value.
Here’s how you call this function to log revenue:
SELECT log_monthly_revenue('North', 15000.75);
The result will be the same as if you used INSERT directly.
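If you want to confirm the call actually landed in the log, you can query the table right away. A minimal sketch, assuming the log_analytics table and log_monthly_revenue function defined above:

SELECT log_id, report_name, report_date, category, metric_value
FROM log_analytics
ORDER BY created_at DESC  -- newest entries first
LIMIT 5;

This is handy during development: call the function, then run this query to check that the new row appeared.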
Extra Ideas for Log Structure
Sometimes your key metric isn’t just one thing, but a few at once. Let’s see how to handle extra indicators, like order count or average check.
Let’s update the table structure:
CREATE TABLE log_analytics_extended (
    log_id SERIAL PRIMARY KEY,
    report_name TEXT NOT NULL,
    report_date DATE DEFAULT CURRENT_DATE,
    category TEXT,
    metric_values JSONB NOT NULL,         -- Store metrics as JSONB
    created_at TIMESTAMP DEFAULT NOW()
);
The big new thing here is using JSONB to store several metrics in one field.
Example of Inserting into the Extended Table
Let’s say you want to save three metrics at once: revenue, order count, and average check. Here’s an example query:
INSERT INTO log_analytics_extended (report_name, category, metric_values)
VALUES (
'Monthly Revenue',
'North',
'{"revenue": 15000.75, "orders": 45, "avg_check": 333.35}'::jsonb
);
Result:
| log_id | report_name | category | metric_values | created_at |
|---|---|---|---|---|
| 1 | Monthly Revenue | North | {"revenue": 15000.75, "orders": 45, "avg_check": 333.35} | 2023-10-10 14:35:50 |
Log Usage Examples: Revenue Analysis
Suppose we want to know the total revenue for all regions in October. Here’s the query:
SELECT SUM((metric_values->>'revenue')::NUMERIC) AS total_revenue
FROM log_analytics_extended
WHERE report_date BETWEEN '2023-10-01' AND '2023-10-31';
Log Usage Examples: Trends by Region
Let’s analyze revenue changes by region:
SELECT category, report_date, (metric_values->>'revenue')::NUMERIC AS revenue
FROM log_analytics_extended
ORDER BY category, report_date;
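If by "changes" you want the actual difference from one period to the next, a window function can compute it. A sketch, assuming the same extended table with one row per region per report_date:

SELECT
    category,
    report_date,
    (metric_values->>'revenue')::NUMERIC AS revenue,
    (metric_values->>'revenue')::NUMERIC
        - LAG((metric_values->>'revenue')::NUMERIC)
          OVER (PARTITION BY category ORDER BY report_date) AS revenue_change  -- delta vs. previous period
FROM log_analytics_extended
ORDER BY category, report_date;

The first row of each region has no previous period, so revenue_change is NULL there.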
Handling Common Mistakes
When logging analytics data, it’s easy to make a few mistakes. Let’s talk about them and how to avoid them.
- Mistake: forgot to specify category or date. It’s a good idea to set default values in the table, like DEFAULT CURRENT_DATE.
- Mistake: duplicate entries. To avoid duplicates, you can add a unique index (see the sketch right after this list): CREATE UNIQUE INDEX unique_log_entry ON log_analytics (report_name, report_date, category);
- Mistake: calculating metrics with division by zero. Always check your divisor! Use NULLIF: SELECT revenue / NULLIF(order_count, 0) AS avg_check FROM orders;
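With that unique index in place, an accidental repeat of the same logging query will raise an error instead of quietly creating a second row. If you’d rather skip duplicates silently, PostgreSQL’s ON CONFLICT clause can lean on the index. A minimal sketch, assuming the unique_log_entry index from the list above:

INSERT INTO log_analytics (report_name, report_date, category, metric_value)
VALUES ('Monthly Revenue', '2023-10-01', 'North', 15000.75)
ON CONFLICT (report_name, report_date, category) DO NOTHING;  -- duplicate entries are skipped

Depending on what your report needs, DO UPDATE can overwrite the old value with the new one instead.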
Real-World Use Cases
Logging analytics data is useful in all sorts of areas:
- Retail: tracking revenue and sales by product category.
- Services: analyzing server or app load.
- Finance: monitoring transactions and expenses.
This data helps you not just explain what happened, but actually make decisions based on what you see in the logs. Now you know how to keep a history of your analytics data in PostgreSQL. Awesome—there’s even more cool stuff ahead!