SaaS Analytics & Reporting — SKILL.md
Raw skill file that agents receive when using this skill
---
name: "SaaS Analytics & Reporting"
description: "Build analytics for SaaS: dashboard design, chart types (when to use each), data architecture (pre-aggregation, ClickHouse, materialized views), CSV/PDF export, embedded analytics (Metabase, Cube.js), cohort analysis, real-time dashboards, multi-tenant data isolation."
version: "1.0.0"
author: "skynet"
category: "dev"
agents: ["claude-code", "codex", "gemini"]
tags: ["saas", "analytics", "dashboards", "reporting", "metrics", "charts"]
---
# SaaS Analytics & Reporting
---
name: "SaaS Analytics & Reporting"
description: "Build analytics for SaaS: dashboard design, chart types (when to use each), data architecture (pre-aggregation, ClickHouse, materialized views), CSV/PDF export, embedded analytics (Metabase, Cube.js),"
version: "1.0.0"
author: "skynet"
category: "dev"
tags: ["saas", "analytics", "dashboards", "reporting", "metrics", "charts"]
---
# The SaaS Analytics & Reporting Playbook
Building analytics into a SaaS product is the transition from "providing a tool" to "providing a solution." This guide covers the architectural, design, and implementation patterns required to build a world-class analytics engine.
---
## 1. Dashboard Design & UX
A SaaS dashboard should move from **Summary** to **Insight** to **Action**.
* **Metric Cards:** The "Hero" section. Top-level KPIs (e.g., MRR, Active Users) with comparison indicators (`+12% vs last month`).
* **Time-Series Charts:** Primary area for trend analysis. Always include a date-range picker.
* **Filter Bars:** Global filters (Date, Segment, Team, Region) should update all charts simultaneously. Use URL state for filter persistence.
* **Progressive Loading:** Don't wait for all data. Load the dashboard skeleton, then fetch metric cards, then heavy charts.
* **Drill-downs:** Clicking a chart element (e.g., a bar in a "Churn by Reason" chart) should take the user to a filtered table of those specific users.
---
## 2. Chart Types & Libraries
Choose the chart that minimizes "time to insight."
| Chart Type | Best Use Case | Avoid When... |
| :--- | :--- | :--- |
| **Line** | Trends over time (Revenue, DAU) | Comparing non-temporal categories |
| **Bar** | Comparisons (Revenue by Region) | You have >15 categories (use horizontal) |
| **Area** | Volume/Total buildup over time | Comparing many small fluctuations |
| **Donut** | Composition (Market share) | You have >4 segments (hard to read) |
| **Funnel** | Conversion steps (Onboarding flow) | Steps aren't sequential |
| **Heatmap** | Activity density (Feature usage by hour) | Data is sparse |
### Recommended Stack:
* **UI Components:** [Tremor](https://www.tremor.so/) (React-based, built for dashboards), [shadcn/ui Charts](https://ui.shadcn.com/charts).
* **Engines:** [Recharts](https://recharts.org/) (Composable), [Chart.js](https://www.chartjs.org/) (Simple/Fast), [D3.js](https://d3js.org/) (Custom/Complex).
---
## 3. Data Architecture: OLTP vs. OLAP
Never run complex analytics directly against your production transactional database (OLTP) in a high-traffic app.
* **OLTP (Postgres/MySQL):** Optimized for row-level writes/updates. Bad at `SUM()` or `COUNT()` over millions of rows.
* **OLAP (ClickHouse/StarRocks):** Optimized for columnar reads. Can aggregate billions of rows in milliseconds.
* **Materialized Views:** Use for "Daily Aggregates." A background job calculates totals once per day and stores them in a separate table.
* **Lambda Architecture:**
* **Batch Layer:** Historical data (yesterday and older) in an OLAP/Aggregated table.
* **Speed Layer:** Today's data queried from the live OLTP table.
* **Merge:** Application logic combines both for the "Real-time" view.
---
## 4. Key SaaS Metrics (SQL Patterns)
### Net Revenue Retention (NRR)
The most important metric for growth.
```sql
-- Simplified NRR: (Starting MRR + Expansion - Contraction - Churn) / Starting MRR
SELECT
tenant_id,
(SUM(mrr_start) + SUM(mrr_expansion) - SUM(mrr_churn)) / NULLIF(SUM(mrr_start), 0) as nrr
FROM revenue_log
WHERE report_month = '2024-03-01';
```
### Churn Rate (Logo Churn)
```sql
SELECT
(COUNT(DISTINCT CASE WHEN status = 'cancelled' AND month = '2024-03' THEN user_id END)::float /
COUNT(DISTINCT CASE WHEN month = '2024-02' THEN user_id END)) * 100 as churn_rate
FROM subscriptions;
```
---
## 5. Cohort Analysis
Understanding retention by "Signup Month."
**Implementation Pattern:**
1. **Define the Cohort:** Group users by their `created_at` month.
2. **Define the Activity:** Check if they had an active session in `Month + N`.
3. **The Query:**
```sql
WITH user_cohorts AS (
SELECT id, date_trunc('month', created_at) as cohort_month
FROM users
),
active_months AS (
SELECT DISTINCT user_id, date_trunc('month', activity_at) as activity_month
FROM user_activity
)
SELECT
c.cohort_month,
EXTRACT(MONTH FROM age(a.activity_month, c.cohort_month)) as month_number,
COUNT(DISTINCT c.id) as active_users
FROM user_cohorts c
JOIN active_months a ON c.id = a.user_id
GROUP BY 1, 2;
```
---
## 6. Multi-tenancy & Security
Analytics must be "Tenant-Isolated."
* **The Golden Rule:** Every analytics query *must* include a `tenant_id` filter in the `WHERE` clause.
* **Row-Level Security (RLS):** In Postgres, use RLS to ensure a database user (or session variable) can only see data for their specific `tenant_id`.
* **Shared vs. Isolated:**
* **Shared:** All tenants in one table (cheaper, easier to aggregate "global" benchmarks).
* **Isolated:** Database-per-tenant (highly secure, expensive, hard to update schema).
---
## 7. Export & Reporting
* **CSV Streaming:** For large datasets, use Node.js streams or Python generators to pipe DB results directly to the HTTP response. Do not load 100k rows into memory.
* **PDF Generation:** Use **Puppeteer** to render the actual dashboard UI to a PDF. It ensures the charts in the PDF look exactly like the ones on screen.
* **Scheduled Emails:** Use a CRON job to trigger a worker that generates a summary and sends it via Postmark/SendGrid.
---
## 8. Embedded Analytics: Build vs. Buy
How much control do you need?
| Option | Pros | Cons |
| :--- | :--- | :--- |
| **Custom (D3/Recharts)** | Infinite control, perfect UI match. | High dev cost, long time-to-market. |
| **Headless BI (Cube.js)** | Handles caching, modeling, and security. | You still have to build the UI. |
| **Embedded (Metabase/Preset)** | Rapid setup, drag-and-drop for users. | Iframe styling is hard; "Look & Feel" is obvious. |
| **Product Analytics (PostHog)** | Autocapture, heatmaps, session recording. | Less focused on "Revenue/Billing" data. |
### The "Smarter" Build: Headless BI
Use **Cube.js**. It sits between your DB and your UI. It handles the SQL generation, caching (Redis), and multi-tenancy logic, while you build the charts using your own UI components (shadcn/Tremor).
---
## 9. Permissions (RBAC)
* **Owner/Admin:** Full access to financial data (MRR, Churn, Billing).
* **Manager:** Access to team-level performance and usage metrics.
* **Standard User:** Access only to their own "Usage" stats or "Personal Productivity" dashboard.
* **Reporting API:** Provide a Bearer Token for Enterprise customers to pull their data into their own BI tools (Tableau/PowerBI).
---
## 10. Summary Checklist for V1
1. [ ] **Data Pipeline:** Move events to a separate table/DB.
2. [ ] **Aggregation:** Set up a daily cron to calculate "Daily Totals."
3. [ ] **API:** Build endpoints for the 4-5 core KPIs.
4. [ ] **UI:** Implement a date-range picker and 3-4 primary charts.
5. [ ] **Export:** Add a "Download CSV" button for raw data tables.
6. [ ] **Security:** Verify that `tenant_id` is enforced at the query level.
curl -s https://skills.skynet.ceo/api/skills/saas-analytics-reporting/skill.md