Automation & Bots | VoidX Academy

15. Projects (Operator Level)

Module 15: Projects

Six Production-Grade Automation Projects

Theory without execution is incomplete. This module presents six real-world automation projects that each combine multiple techniques from previous modules. Each project includes a complete architecture, the full implementation pattern, deployment considerations, and extension ideas. Build at least two to have a portfolio that demonstrates genuine operator-level capability.

🏪 Project 1: Price Intelligence System

Objective: Monitor competitor prices across 10+ e-commerce sites daily and deliver a Telegram alert when significant price changes are detected.

Architecture:

Redis URL queue with priority scoring (high-volatility products checked more frequently)
Celery workers with Playwright for JavaScript-rendered sites
SQLite database with delta detection (only alert on actual price changes)
Telegram bot delivering formatted alerts with price history graphs
Cron-scheduled daily summary report
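
The priority-scoring idea in the first bullet can be sketched without Redis. The block below is a minimal in-memory illustration using `heapq`; the linear volatility-to-interval formula, the constants, and the names `check_interval_hours`, `ScheduledUrl`, and `build_queue` are all assumptions made for this example, not part of the project specification.

```python
import heapq
from dataclasses import dataclass, field
from typing import Dict, List

BASE_INTERVAL_HOURS = 24.0  # assumed default: one check per day
MIN_INTERVAL_HOURS = 1.0    # assumed floor so volatile products are not over-polled


def check_interval_hours(volatility: float) -> float:
    """Map a volatility score in [0, 1] to hours between checks (hypothetical formula)."""
    volatility = min(max(volatility, 0.0), 1.0)  # clamp out-of-range scores
    interval = BASE_INTERVAL_HOURS * (1.0 - volatility)
    return max(interval, MIN_INTERVAL_HOURS)


@dataclass(order=True)
class ScheduledUrl:
    next_check_in: float             # hours from now; the heap orders on this field only
    url: str = field(compare=False)


def build_queue(volatility_by_url: Dict[str, float]) -> List[ScheduledUrl]:
    """Build a min-heap so the URL that is due soonest is popped first."""
    heap: List[ScheduledUrl] = []
    for url, volatility in volatility_by_url.items():
        heapq.heappush(heap, ScheduledUrl(check_interval_hours(volatility), url))
    return heap
```

In a Redis-backed version, a sorted set whose score is the next-check timestamp would play the role of the heap; that mapping is a suggestion rather than something the architecture above prescribes.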

# Core price monitoring logic
from dataclasses import dataclass
from datetime import datetime
from typing import Optional

@dataclass
class PricePoint:
    product_id: str
    url: str
    price: float
    currency: str
    scraped_at: datetime
    previous_price: Optional[float] = None
    
    @property
    def change_percent(self) -> Optional[float]:
        if self.previous_price is None or self.previous_price == 0:
            return None
        return ((self.price - self.previous_price) / self.previous_price) * 100
    
    @property
    def is_significant_change(self) -> bool:
        pct = self.change_percent
        return pct is not None and abs(pct) >= 5.0  # 5% threshold
    
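    def format_alert(self) -> str:
        # An alert needs a usable previous price; without one the comparison
        # and the percentage formatting below would raise TypeError
        if self.change_percent is None:
            raise ValueError('format_alert() requires a non-zero previous_price')
        direction = '📉' if self.price < self.previous_price else '📈'
        return (
            f'{direction} Price Change Alert\n'
            f'Product: {self.product_id}\n'
            # Use the stored currency code instead of a hard-coded dollar sign
            f'Old: {self.previous_price:.2f} {self.currency} → New: {self.price:.2f} {self.currency}\n'
            f'Change: {self.change_percent:+.1f}%\n'
            f'View Product: {self.url}'
        )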
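
The delta detection mentioned in the architecture (only alert on actual price changes) could look roughly like the sketch below. It uses the standard-library `sqlite3` module; the `prices` table layout and the function names `init_db` and `record_price` are assumptions for illustration.

```python
import sqlite3
from typing import Optional


def init_db(conn: sqlite3.Connection) -> None:
    """Create the (hypothetical) price history table if it does not exist."""
    conn.execute(
        """CREATE TABLE IF NOT EXISTS prices (
               product_id TEXT NOT NULL,
               price      REAL NOT NULL,
               scraped_at TEXT NOT NULL DEFAULT CURRENT_TIMESTAMP
           )"""
    )


def record_price(conn: sqlite3.Connection, product_id: str, price: float) -> Optional[float]:
    """Store a price only when it differs from the last stored one.

    Returns the previous price when the price changed. Returns None both for
    the first observation and for an unchanged price, since neither case
    should trigger an alert.
    """
    row = conn.execute(
        "SELECT price FROM prices WHERE product_id = ? ORDER BY rowid DESC LIMIT 1",
        (product_id,),
    ).fetchone()
    previous = row[0] if row else None
    if previous is not None and previous == price:
        return None  # unchanged: nothing to store, nothing to alert on
    conn.execute(
        "INSERT INTO prices (product_id, price) VALUES (?, ?)", (product_id, price)
    )
    conn.commit()
    return previous
```

The returned value can feed `previous_price` on a `PricePoint`, leaving the 5% threshold to decide whether an alert goes out. Storing prices as integer cents instead of floats would make the equality check more robust; that is a design choice left open here.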

📊 Project 2: Lead Generation Pipeline

Objective: Scrape LinkedIn, company directories, and industry sites to build a qualified lead database with enriched contact data, then push qualified leads to a CRM via API.

Architecture:

Multi-source scraper: LinkedIn (via Sales Navigator scraping), Apollo.io, Crunchbase
Email verification API integration (Hunter.io, ZeroBounce)
Company enrichment via Clearbit API
Lead scoring model based on company size, industry, and technology signals
HubSpot/Salesforce CRM API integration for qualified leads
Deduplication across sources using fuzzy matching

# fuzzywuzzy is now maintained under the name thefuzz
# (from thefuzz import fuzz); fuzz.ratio works the same way
from fuzzywuzzy import fuzz
from typing import List, Dict

def deduplicate_leads(leads: List[Dict], threshold: int = 85) -> List[Dict]:
    '''Fuzzy deduplication based on name + company similarity'''
    unique = []
    for lead in leads:
        is_duplicate = False
        # Compare against every record kept so far: O(n²) in the number of leads
        for existing in unique:
            name_sim = fuzz.ratio(lead['name'].lower(), existing['name'].lower())
            company_sim = fuzz.ratio(lead['company'].lower(), existing['company'].lower())
            if name_sim > threshold and company_sim > threshold:
                is_duplicate = True
                # Merge: prefer the richer record
                if lead.get('email') and not existing.get('email'):
                    existing['email'] = lead['email']
                break
        if not is_duplicate:
            unique.append(dict(lead))  # copy so merging never mutates the caller's records
    return unique

def score_lead(lead: Dict) -> int:
    '''Score a lead from 0 to 85 on email validity, company size, and seniority'''
    score = 0
    if lead.get('email_valid'): score += 30
    employees = lead.get('company_employees') or 0  # treat a missing or None count as 0
    if employees > 100: score += 20
    if employees > 500: score += 10  # stacks with the >100 bonus
    # Exact match only: 'Director of Engineering' does not match 'director'
    if (lead.get('title') or '').lower() in ['cto', 'vp engineering', 'director']: score += 25
    return score
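
If adding a third-party dependency is not an option, the same name-and-company comparison can be approximated with the standard library. The sketch below swaps in `difflib.SequenceMatcher` for `fuzz.ratio`; scores may differ somewhat from the fuzzy-matching library's, and the helper names `similarity` and `is_same_lead` are hypothetical.

```python
from difflib import SequenceMatcher
from typing import Dict


def similarity(a: str, b: str) -> int:
    """Return a 0-100 similarity score, ignoring case and surrounding whitespace."""
    ratio = SequenceMatcher(None, a.lower().strip(), b.lower().strip()).ratio()
    return round(ratio * 100)


def is_same_lead(a: Dict, b: Dict, threshold: int = 85) -> bool:
    """Treat two leads as duplicates when both name and company are near-identical."""
    return (
        similarity(a['name'], b['name']) > threshold
        and similarity(a['company'], b['company']) > threshold
    )
```

Requiring both fields to clear the threshold keeps two different people at the same company, or two people who share a name at different companies, from being merged.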

🤖 Project 3: Multi-Platform Social Bot

Objective: Build a Telegram + Discord bot that serves a community — scheduling posts, answering FAQ, delivering daily briefings, and moderating content.

import asyncio
from telegram.ext import Application
import discord
from discord.ext import commands

class CrossPlatformBot:
    def __init__(self, telegram_token: str, discord_token: str):
        self.tg_app = Application.builder().token(telegram_token).build()
        
        intents = discord.Intents.default()
        intents.message_content = True
        self.dc_bot = commands.Bot(command_prefix='!', intents=intents)
        
        self.telegram_token = telegram_token
        self.discord_token = discord_token
    
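    async def broadcast_to_all(self, message: str, telegram_chat_ids: list, discord_channel_ids: list) -> list:
        '''Send the same message to all platforms concurrently; returns one result or exception per send'''
        tasks = []
        for chat_id in telegram_chat_ids:
            # parse_mode='HTML' applies to Telegram only; Discord renders Markdown, not HTML
            tasks.append(self.tg_app.bot.send_message(chat_id=chat_id, text=message, parse_mode='HTML'))
        for channel_id in discord_channel_ids:
            # get_channel() reads the local cache and returns None for unknown channels
            channel = self.dc_bot.get_channel(channel_id)
            if channel:
                tasks.append(channel.send(message))
        # return_exceptions=True collects failures as results instead of raising on the first one
        return await asyncio.gather(*tasks, return_exceptions=True)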
    
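    async def run(self):
        '''Run both bots concurrently in one event loop'''
        # Application.run_polling() is a blocking call that manages its own event
        # loop, so it cannot be awaited inside gather(). Start the Telegram
        # application manually and let the Discord client hold the loop open.
        async with self.tg_app:  # handles initialize() and shutdown()
            await self.tg_app.start()
            await self.tg_app.updater.start_polling()
            try:
                await self.dc_bot.start(self.discord_token)
            finally:
                await self.tg_app.updater.stop()
                await self.tg_app.stop()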
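
Because the broadcast uses `asyncio.gather(..., return_exceptions=True)`, a failed send does not raise; it comes back as an exception object in the results. The sketch below shows one way to surface those failures, using stand-in coroutines instead of real Telegram or Discord calls. The names `send_all`, `fake_send`, and `demo` and the label strings are hypothetical.

```python
import asyncio
from typing import Awaitable, Dict, List, Tuple


async def send_all(sends: List[Tuple[str, Awaitable]]) -> Dict[str, Exception]:
    """Await all sends concurrently and return a label -> exception map of failures."""
    labels = [label for label, _ in sends]
    results = await asyncio.gather(
        *(awaitable for _, awaitable in sends), return_exceptions=True
    )
    # gather() preserves input order, so results line up with labels
    return {
        label: result
        for label, result in zip(labels, results)
        if isinstance(result, Exception)
    }


async def fake_send(ok: bool) -> str:
    """Stand-in for a platform send call; fails when ok is False."""
    await asyncio.sleep(0)
    if not ok:
        raise RuntimeError('channel unreachable')
    return 'sent'


async def demo() -> Dict[str, Exception]:
    return await send_all([
        ('telegram:1', fake_send(True)),
        ('discord:2', fake_send(False)),
    ])
```

Logging or retrying the entries in the returned map is usually preferable to discarding the gather results, since a revoked token or deleted channel would otherwise go unnoticed.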

🗂️ Project 4: Document Intelligence System

Objective: Scrape documents (PDFs, web articles, reports) from specified sources, process them through an AI pipeline, and build a searchable knowledge base accessible via Telegram or Discord bot.

Stack: Playwright for scraping → LangChain for document processing → ChromaDB for vector storage → Claude API for QA → Telegram bot as the query interface.
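
Between scraping and vector storage, documents are normally split into chunks before they are embedded. The sketch below shows that step in plain Python rather than through any particular library's splitter; the function name `chunk_text` and the default sizes are assumptions for illustration.

```python
from typing import List


def chunk_text(text: str, chunk_size: int = 500, overlap: int = 50) -> List[str]:
    """Split text into fixed-size character chunks that overlap by `overlap` characters."""
    if chunk_size <= 0 or not 0 <= overlap < chunk_size:
        raise ValueError('require chunk_size > 0 and 0 <= overlap < chunk_size')
    step = chunk_size - overlap
    chunks: List[str] = []
    for start in range(0, len(text), step):
        chunks.append(text[start:start + chunk_size])
        if start + chunk_size >= len(text):
            break  # the last chunk already reaches the end of the text
    return chunks
```

The overlap keeps a sentence that straddles a boundary retrievable from at least one chunk. Splitting on paragraph or sentence boundaries instead of raw character counts generally gives cleaner retrieval results, at the cost of more code.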

📈 Project 5: E-Commerce Arbitrage Detector

Objective: Monitor wholesale supplier prices and retail marketplaces simultaneously. Alert when the margin exceeds a threshold, accounting for fees and shipping.

from dataclasses import dataclass

@dataclass
class ArbitrageOpportunity:
    product_name: str
    asin: str
    buy_price: float
    sell_price: float
    amazon_fees: float
    shipping_cost: float
    
    @property
    def net_profit(self) -> float:
        return self.sell_price - self.buy_price - self.amazon_fees - self.shipping_cost
    
    @property
    def roi_percent(self) -> float:
        total_cost = self.buy_price + self.shipping_cost
        # Return 0.0 (not int 0) so the result always matches the float annotation
        return (self.net_profit / total_cost) * 100 if total_cost > 0 else 0.0
    
    @property
    def is_viable(self) -> bool:
        return self.net_profit > 3.0 and self.roi_percent > 30
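
The two viability conditions can be inverted to answer a sourcing question: what is the most you can pay for a product and still clear both bars? With sell price s, fees f, shipping h, and buy price b, the profit condition s - b - f - h ≥ 3 gives b ≤ s - f - h - 3, and the ROI condition (s - b - f - h) / (b + h) ≥ 0.30 gives b ≤ (s - f - 1.3h) / 1.3. The sketch below takes the tighter of the two; the function name `max_buy_price` is hypothetical, and the defaults mirror the thresholds in `is_viable`.

```python
def max_buy_price(
    sell_price: float,
    fees: float,
    shipping_cost: float,
    min_profit: float = 3.0,
    min_roi_percent: float = 30.0,
) -> float:
    """Return the buy price at which the tighter viability bound is exactly met.

    ROI is measured against buy price plus shipping, matching roi_percent above.
    """
    r = min_roi_percent / 100
    # From: sell - buy - fees - shipping >= min_profit
    profit_bound = sell_price - fees - shipping_cost - min_profit
    # From: (sell - buy - fees - shipping) / (buy + shipping) >= r
    roi_bound = (sell_price - fees - (1 + r) * shipping_cost) / (1 + r)
    return min(profit_bound, roi_bound)
```

For a $30.00 sell price with $8.00 in fees and $2.00 shipping, the profit bound is $17.00 and the ROI bound is about $14.92, so ROI is the binding constraint. Because `is_viable` uses strict comparisons, the buy price has to be below this ceiling, not equal to it.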

🔔 Project 6: Automated Job Board Aggregator

Objective: Scrape job postings from 20+ job boards, deduplicate, normalize to a common schema, filter by criteria (remote, salary, tech stack), and deliver personalized daily digests via email and Telegram.

Key technical challenges:

20+ different HTML structures → LLM-powered extraction
Daily volume of ~5,000 new postings → Redis queue + Celery
Deduplication across boards → fuzzy matching on title + company + location
Personalization → user preference profiles stored in SQLite
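
The normalize, deduplicate, and filter steps can be sketched in a few lines once postings share a schema. Everything below is an assumption for illustration: the `JobPosting` and `UserPrefs` fields, the exact-key deduplication (a cheap first pass that would run before fuzzy matching), and the choice to keep postings whose salary is unknown.

```python
import re
from dataclasses import dataclass, field
from typing import List, Optional, Set, Tuple


@dataclass
class JobPosting:
    title: str
    company: str
    location: str
    remote: bool
    salary_min: Optional[int] = None            # None when the board lists no salary
    tech_stack: List[str] = field(default_factory=list)


@dataclass
class UserPrefs:
    remote_only: bool = False
    min_salary: int = 0
    required_tech: Set[str] = field(default_factory=set)  # lowercase names


def _norm(text: str) -> str:
    """Lowercase and collapse punctuation and whitespace so trivial variants compare equal."""
    return re.sub(r'[^a-z0-9]+', ' ', text.lower()).strip()


def dedup_key(job: JobPosting) -> Tuple[str, str, str]:
    return (_norm(job.title), _norm(job.company), _norm(job.location))


def matches(job: JobPosting, prefs: UserPrefs) -> bool:
    if prefs.remote_only and not job.remote:
        return False
    # Assumption: a posting with unknown salary is kept rather than dropped
    if job.salary_min is not None and job.salary_min < prefs.min_salary:
        return False
    if prefs.required_tech:
        stack = {tech.lower() for tech in job.tech_stack}
        if not prefs.required_tech & stack:
            return False
    return True


def build_digest(jobs: List[JobPosting], prefs: UserPrefs) -> List[JobPosting]:
    """Drop exact-key duplicates, then keep postings matching the user's preferences."""
    seen: Set[Tuple[str, str, str]] = set()
    digest: List[JobPosting] = []
    for job in jobs:
        key = dedup_key(job)
        if key in seen:
            continue
        seen.add(key)
        if matches(job, prefs):
            digest.append(job)
    return digest
```

Running the exact-key pass first shrinks the set that the more expensive fuzzy comparison has to examine.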

