Data Science | VoidX Academy

14. Data Products

Module 14: Data Products

Dashboards, KPI Frameworks, and Metrics that Drive Decisions

A data product is any system that delivers data-driven value to users on an ongoing basis. Dashboards are data products. Recommendation engines are data products. Automated reports, anomaly alerts, and KPI scorecards are all data products. The key distinction between a data product and a one-off data analysis is that a product serves recurring information needs with defined service-level agreements (SLAs), ownership, and governance.
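One way to make "defined SLAs" concrete is a freshness check that runs alongside the product. The following is a minimal sketch only: the function name `meets_freshness_sla`, the 24-hour window, and the timestamps are illustrative assumptions, not part of the module.

```python
from datetime import datetime, timedelta, timezone
from typing import Optional


def meets_freshness_sla(last_refreshed: datetime,
                        max_age: timedelta,
                        now: Optional[datetime] = None) -> bool:
    """Return True if the data was refreshed within the allowed window."""
    now = now or datetime.now(timezone.utc)
    return now - last_refreshed <= max_age


# Hypothetical example: a daily sales dashboard with a 24-hour freshness SLA
checked_at = datetime(2024, 6, 2, 9, 0, tzinfo=timezone.utc)
sla = timedelta(hours=24)

# Refreshed 21 hours before the check: within the SLA
fresh = meets_freshness_sla(datetime(2024, 6, 1, 12, 0, tzinfo=timezone.utc), sla, now=checked_at)
# Refreshed 51 hours before the check: SLA breached
stale = meets_freshness_sla(datetime(2024, 5, 31, 6, 0, tzinfo=timezone.utc), sla, now=checked_at)
```

A one-off analysis has no equivalent of this check, because nobody has promised its consumers that it will stay current.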

📊 The KPI Framework

Before building any dashboard, you need a KPI framework, that is, a structured hierarchy of metrics that connects business strategy to operational data:

North Star Metric: The single metric that best captures the value your product/business delivers. Commonly cited examples: Airbnb (nights booked), Spotify (time spent listening), Uber (completed trips). Every other metric exists in service of this one.
Level-1 KPIs (Outcome Metrics): Top-line business outcomes. Revenue, Customer Lifetime Value, Net Revenue Retention, Market Share. These are what the business ultimately cares about.
Level-2 KPIs (Driver Metrics): Metrics that drive Level-1 outcomes. Conversion rate, Average Order Value, Churn Rate, Customer Acquisition Cost. These are what operations teams can influence.
Level-3 KPIs (Diagnostic Metrics): Granular metrics that diagnose problems in Level-2 drivers. Page load time, checkout abandonment by step, support ticket resolution time. Actionable by individual teams.
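The hierarchy above can be sketched as a small tree, which makes it easy to trace any diagnostic metric back to the North Star. This is an illustrative sketch: the `Metric` class, the `lineage` helper, and the specific parent/child links between metrics are assumptions chosen for the example, not a prescribed model.

```python
from dataclasses import dataclass, field
from typing import Iterator, List, Tuple


@dataclass
class Metric:
    name: str
    level: str  # 'north_star', 'outcome', 'driver', or 'diagnostic'
    drivers: List["Metric"] = field(default_factory=list)

    def walk(self, depth: int = 0) -> Iterator[Tuple[int, "Metric"]]:
        """Yield (depth, metric) pairs in top-down, depth-first order."""
        yield depth, self
        for child in self.drivers:
            yield from child.walk(depth + 1)


def lineage(root: Metric, target: str) -> List[str]:
    """Return metric names from the root down to `target`, or [] if absent."""
    if root.name == target:
        return [root.name]
    for child in root.drivers:
        path = lineage(child, target)
        if path:
            return [root.name] + path
    return []


# Assumed links, for illustration only
checkout = Metric('Checkout abandonment by step', 'diagnostic')
page_load = Metric('Page load time', 'diagnostic')
conversion = Metric('Conversion rate', 'driver', [checkout, page_load])
aov = Metric('Average Order Value', 'driver')
revenue = Metric('Revenue', 'outcome', [conversion, aov])
north_star = Metric('Nights booked', 'north_star', [revenue])

for depth, metric in north_star.walk():
    print('  ' * depth + f'{metric.name} ({metric.level})')

path = lineage(north_star, 'Page load time')
```

Here `path` lists every metric between the North Star and page load time. A metric that cannot be placed in such a tree is a candidate for removal from the dashboard.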

🔧 Building Production Dashboards

import dash
from dash import dcc, html, Input, Output, callback
import plotly.express as px
import plotly.graph_objects as go
import pandas as pd
from datetime import datetime, timedelta

app = dash.Dash(__name__, title='Sales Intelligence Dashboard')

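# In production: replace with database queries or cached data
def load_data(start_date, end_date) -> pd.DataFrame:
    df = pd.read_csv('sales_data.csv', parse_dates=['date'])
    # A date-picker bound may arrive as None; treat a missing bound as open-ended
    start = pd.Timestamp(start_date) if start_date else df['date'].min()
    end = pd.Timestamp(end_date) if end_date else df['date'].max()
    return df[(df['date'] >= start) & (df['date'] <= end)]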

app.layout = html.Div([
    html.H1('Sales Intelligence Dashboard', style={'textAlign': 'center', 'color': '#2c3e50'}),
    
    # Filter controls
    html.Div([
        dcc.DatePickerRange(
            id='date-range',
            min_date_allowed=datetime(2023, 1, 1),
            max_date_allowed=datetime.today(),
            start_date=datetime.today() - timedelta(days=90),
            end_date=datetime.today()
        ),
        dcc.Dropdown(id='region-filter', multi=True,
                     placeholder='All Regions',
                     options=[{'label': r, 'value': r} for r in ['North', 'South', 'East', 'West']])
    ], style={'display': 'flex', 'gap': '20px', 'padding': '20px'}),
    
    # KPI Cards
    html.Div(id='kpi-cards', style={'display': 'flex', 'gap': '20px', 'padding': '0 20px'}),
    
    # Charts
    dcc.Graph(id='revenue-trend'),
    html.Div([
        dcc.Graph(id='region-chart', style={'width': '50%'}),
        dcc.Graph(id='customer-chart', style={'width': '50%'})
    ], style={'display': 'flex'})
])

@callback(
    Output('kpi-cards', 'children'),
    Output('revenue-trend', 'figure'),
    Output('region-chart', 'figure'),
    Output('customer-chart', 'figure'),
    Input('date-range', 'start_date'),
    Input('date-range', 'end_date'),
    Input('region-filter', 'value')
)
def update_dashboard(start_date, end_date, regions):
    df = load_data(start_date, end_date)
    if regions:
        df = df[df['region'].isin(regions)]

    # KPI Metrics
    total_rev = df['revenue'].sum()
    total_customers = df['customer_id'].nunique()
    avg_order = df['revenue'].mean()  # assumes one row per order

    # The change values below are hard-coded placeholders; in production,
    # compute them against a comparison period
    kpi_cards = [
        create_kpi_card('Total Revenue', f'${total_rev:,.0f}', '+12.4%', 'positive'),
        create_kpi_card('Unique Customers', f'{total_customers:,}', '+8.1%', 'positive'),
        create_kpi_card('Avg Order Value', f'${avg_order:.2f}', '-2.3%', 'negative'),
    ]

    # Trend chart ('ME' = month end; requires pandas >= 2.2, use 'M' on older versions)
    monthly = df.resample('ME', on='date')['revenue'].sum().reset_index()
    trend_fig = px.area(monthly, x='date', y='revenue', title='Monthly Revenue Trend',
                        template='plotly_white')

    # Region breakdown
    region_data = df.groupby('region')['revenue'].sum().reset_index()
    region_fig = px.bar(region_data, x='region', y='revenue', color='region',
                        title='Revenue by Region', template='plotly_white')

    # Customer chart (assumed metric: unique customers per month)
    customers = df.resample('ME', on='date')['customer_id'].nunique().reset_index()
    customer_fig = px.line(customers, x='date', y='customer_id',
                           title='Monthly Unique Customers', template='plotly_white')

    return kpi_cards, trend_fig, region_fig, customer_fig

def create_kpi_card(title, value, change, direction):
    color = '#27ae60' if direction == 'positive' else '#e74c3c'
    return html.Div([
        html.H4(title, style={'margin': '0', 'color': '#7f8c8d', 'fontSize': '14px'}),
        html.H2(value, style={'margin': '5px 0', 'color': '#2c3e50', 'fontSize': '28px'}),
        html.Span(change, style={'color': color, 'fontWeight': 'bold'})
    ], style={'background': 'white', 'padding': '20px', 'borderRadius': '8px',
              'boxShadow': '0 2px 8px rgba(0,0,0,0.1)', 'flex': '1'})

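if __name__ == '__main__':
    # Built-in development server, listening on all interfaces.
    # For production, serve app.server (the underlying Flask app) with a WSGI server such as gunicorn.
    app.run(debug=False, host='0.0.0.0', port=8050)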
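The KPI cards in the dashboard code pass fixed change strings such as '+12.4%'. A possible way to derive them from data is to compare the selected range with the preceding range of equal length, as sketched below. The helper name `period_change` and the sample rows are assumptions for illustration. Note that it needs a frame that still contains the earlier period, so it would have to be fed unfiltered data rather than the output of `load_data`.

```python
import pandas as pd


def period_change(df: pd.DataFrame, start, end, column: str = 'revenue') -> str:
    """Format the % change of sum(column) vs. the preceding period of equal length."""
    start, end = pd.Timestamp(start), pd.Timestamp(end)
    span = end - start
    # The previous period ends the day before `start` and covers the same number of days
    prev_end = start - pd.Timedelta(days=1)
    prev_start = prev_end - span
    current = df.loc[df['date'].between(start, end), column].sum()
    previous = df.loc[df['date'].between(prev_start, prev_end), column].sum()
    if previous == 0:
        return 'n/a'  # no baseline to compare against
    return f'{(current - previous) / previous:+.1%}'


# Hypothetical sample data: two orders in each 10-day period
sales = pd.DataFrame({
    'date': pd.to_datetime(['2024-01-25', '2024-01-30', '2024-02-03', '2024-02-08']),
    'revenue': [100.0, 100.0, 150.0, 100.0],
})

change = period_change(sales, '2024-02-01', '2024-02-10')
direction = 'positive' if change.startswith('+') else 'negative'
print(change, direction)
```

The resulting `change` and `direction` values have the same shape as the third and fourth arguments of `create_kpi_card`. Whether an increase counts as 'positive' depends on the metric; for churn or cost metrics the mapping would be reversed.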

⚠️ Metrics Anti-Patterns to Avoid

Vanity Metrics: Numbers that look impressive but don't connect to business outcomes. Total registered users (vs. active users), total page views (vs. meaningful engagement), total downloads (vs. actual usage). Always ask: 'Could this metric increase while the business gets worse?'
Metric Overfitting: Teams optimize so aggressively for the measured metric that they break the underlying goal (an instance of Goodhart's law). Using call handle time as a proxy for customer satisfaction can lead reps to hang up on customers to hit targets.
Leading vs Lagging Confusion: Lagging indicators (revenue, churn) tell you what has already happened. Leading indicators (trial sign-ups, feature usage) suggest what is likely to happen. Build dashboards with both, but act on leading indicators, because they still leave time to change the outcome.
Dashboard Proliferation: When everyone builds their own dashboard, there's no single source of truth. Standardize on a small number of authoritative dashboards rather than allowing hundreds of variations.
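The vanity-metric test above can be illustrated by reporting the active share next to the registered total. This is a minimal sketch using only the standard library; the function name `active_share`, the 30-day window, and the sample event log are assumptions made for the example.

```python
from datetime import date, timedelta
from typing import Iterable, Set, Tuple


def active_share(registered: Set[int],
                 events: Iterable[Tuple[int, date]],
                 as_of: date,
                 window_days: int = 30) -> float:
    """Share of registered users with at least one event in the trailing window."""
    cutoff = as_of - timedelta(days=window_days)
    active = {uid for uid, day in events
              if cutoff < day <= as_of and uid in registered}
    return len(active) / len(registered) if registered else 0.0


# Hypothetical data: four registered users, (user_id, event_date) pairs
registered_users = {1, 2, 3, 4}
event_log = [
    (1, date(2024, 6, 20)),
    (1, date(2024, 6, 25)),
    (2, date(2024, 6, 28)),
    (3, date(2024, 3, 1)),   # outside the 30-day window
]

share = active_share(registered_users, event_log, as_of=date(2024, 6, 30))
print(f'{len(registered_users)} registered, {share:.0%} active in the last 30 days')
```

The registered count can only grow, so on its own it can rise while the business gets worse. The active share can fall, which is what makes it informative.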

