11. Cryptography Fundamentals
The Math Behind the Magic
In a world where packets are constantly intercepted and databases are regularly stolen, cryptography is the last line of defense between user privacy and total exposure. It is the science of transforming data so only authorized parties can understand it—and the engineering discipline of deploying that science correctly.
Understanding cryptography is non-negotiable for security professionals. Misimplemented cryptography is often worse than no cryptography at all—it creates a false sense of security. This module covers the three pillars of applied cryptography: Symmetric Encryption, Asymmetric Encryption, and Hashing.
🔐 Symmetric Encryption — One Key, Both Ways
Symmetric encryption uses a single secret key to both encrypt (scramble) and decrypt (unscramble) data. The same key that locks the box unlocks it. It is computationally efficient and extremely fast—capable of encrypting gigabytes of data per second on modern hardware. This makes it the right choice for bulk data encryption.
How It Works: The plaintext message plus the key are fed through the encryption algorithm. The output is ciphertext—random-looking data that cannot be read without the key. Feed the ciphertext plus the same key through the decryption function and you recover the original plaintext.
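The encrypt/decrypt symmetry described above can be sketched with a toy cipher. This is not AES: the keystream construction (HMAC-SHA256 in counter mode), the names `toy_encrypt` and `toy_decrypt`, and the 16-byte nonce are assumptions chosen so the example runs on Python's standard library alone. Real systems should use a vetted AES implementation.

```python
import hashlib
import hmac
import secrets

def _keystream(key: bytes, nonce: bytes, length: int) -> bytes:
    """Derive `length` pseudo-random bytes from key + nonce (toy construction)."""
    out = b""
    counter = 0
    while len(out) < length:
        block_input = nonce + counter.to_bytes(8, "big")
        out += hmac.new(key, block_input, hashlib.sha256).digest()
        counter += 1
    return out[:length]

def toy_encrypt(key: bytes, plaintext: bytes) -> bytes:
    """Return nonce || ciphertext; a fresh nonce keeps keystreams from repeating."""
    nonce = secrets.token_bytes(16)
    stream = _keystream(key, nonce, len(plaintext))
    return nonce + bytes(p ^ s for p, s in zip(plaintext, stream))

def toy_decrypt(key: bytes, blob: bytes) -> bytes:
    """Split off the nonce, regenerate the same keystream, and XOR it away."""
    nonce, ciphertext = blob[:16], blob[16:]
    stream = _keystream(key, nonce, len(ciphertext))
    return bytes(c ^ s for c, s in zip(ciphertext, stream))

key = secrets.token_bytes(32)            # the single shared secret
blob = toy_encrypt(key, b"attack at dawn")
recovered = toy_decrypt(key, blob)       # the same key reverses the transformation
```

Decrypting `blob` with any other key yields unrelated bytes, which is the whole point: possession of the key is the only thing separating ciphertext from plaintext.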
Current Standard Algorithm — AES (Advanced Encryption Standard):
- AES is the Rijndael cipher, chosen by NIST in 2000 after a global competition and standardized as FIPS 197 in 2001. It is the universal symmetric encryption standard.
- AES-128: 128-bit key. Considered secure for most purposes. Brute-forcing 128-bit key space with all computing power on Earth would take longer than the age of the universe.
- AES-256: 256-bit key. Considered resistant to known quantum attacks: Grover's algorithm would at best halve the effective key strength, leaving roughly 128 bits of security. Used for highly sensitive data and government classified information.
- AES is a block cipher—it encrypts fixed-size blocks (128 bits) of data at a time. For larger data, it requires a mode of operation.
AES Modes of Operation — Why This Matters:
- ECB (Electronic Codebook): Each block encrypted independently with the same key. Identical plaintext blocks produce identical ciphertext blocks. This leaks patterns. The famous ECB-mode penguin image perfectly demonstrates this—encrypt a bitmap with ECB and the penguin's shape is still clearly visible in the ciphertext. Never use ECB.
- CBC (Cipher Block Chaining): Each block XORed with the previous ciphertext block before encryption. Identical plaintext produces different ciphertext depending on position. Requires a random IV (Initialization Vector) for each encryption. Vulnerable to padding oracle attacks if not implemented carefully.
- GCM (Galois/Counter Mode): The modern standard. Provides both encryption AND authentication (AEAD: Authenticated Encryption with Associated Data), so any tampering with the ciphertext is detected on decryption. Used in TLS 1.3 as the AES-128-GCM and AES-256-GCM cipher suites. Each encryption under a given key must use a unique nonce; reusing one breaks GCM's security. This is what you should use.
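The ECB pattern leak is easy to reproduce. The sketch below uses a toy 4-round Feistel network as a stand-in for AES (an assumption made so the example needs only the standard library; it is not secure), and the helper names `ecb_encrypt`, `cbc_encrypt`, and `split_blocks` are hypothetical. Padding is omitted, so the plaintext length must be a multiple of the block size.

```python
import hashlib
import hmac
import os

BLOCK = 16  # bytes, the same block size as AES

def toy_block_encrypt(key: bytes, block: bytes) -> bytes:
    """Toy 4-round Feistel permutation standing in for AES (NOT secure)."""
    left, right = block[:8], block[8:]
    for rnd in range(4):
        f = hmac.new(key, bytes([rnd]) + right, hashlib.sha256).digest()[:8]
        left, right = right, bytes(a ^ b for a, b in zip(left, f))
    return left + right

def split_blocks(data: bytes) -> list[bytes]:
    return [data[i:i + BLOCK] for i in range(0, len(data), BLOCK)]

def ecb_encrypt(key: bytes, plaintext: bytes) -> bytes:
    # Every block is encrypted independently with the same key.
    return b"".join(toy_block_encrypt(key, b) for b in split_blocks(plaintext))

def cbc_encrypt(key: bytes, iv: bytes, plaintext: bytes) -> bytes:
    # Each block is XORed with the previous ciphertext block (or the IV) first.
    out, prev = [], iv
    for block in split_blocks(plaintext):
        mixed = bytes(a ^ b for a, b in zip(block, prev))
        prev = toy_block_encrypt(key, mixed)
        out.append(prev)
    return b"".join(out)

key = os.urandom(32)
iv = os.urandom(BLOCK)
plaintext = b"SAME 16B BLOCK!!" * 3          # three identical plaintext blocks

ecb_blocks = split_blocks(ecb_encrypt(key, plaintext))
cbc_blocks = split_blocks(cbc_encrypt(key, iv, plaintext))
```

With ECB all three ciphertext blocks come out identical, so an observer learns that the plaintext repeats. With CBC the chaining makes each block differ even though the input blocks are the same.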
The Fatal Flaw: The Key Distribution Problem
If Alice wants to send an encrypted message to Bob using symmetric encryption, she must somehow get the shared key to Bob securely. But if she could send the key securely, why not just send the message securely? This circular problem was the fundamental unsolved challenge in cryptography for centuries, until Diffie and Hellman published the first public-key scheme in 1976.
Dead Algorithms — Know These to Avoid Them:
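- DES (Data Encryption Standard, 56-bit): Completely broken. EFF's Deep Crack machine brute-forced a 56-bit DES key in 56 hours in 1998, and in 1999 Deep Crack and distributed.net together did it in about 22 hours. Do not use.
- 3DES (Triple-DES): Applies DES three times to extend the effective key length. Still weak: its 64-bit block size exposes it to birthday-bound attacks such as Sweet32. NIST announced its deprecation in 2017. Do not use.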
- RC4: Stream cipher used in old WEP WiFi and early TLS. Multiple cryptographic weaknesses discovered. Broken. Prohibited in TLS by RFC 7465. Do not use.
🔑 Asymmetric Encryption — The Padlock and Key
Asymmetric encryption (Public Key Cryptography) solves the key distribution problem through mathematical elegance. It uses two mathematically linked keys: a Public Key and a Private Key. What one key encrypts, only the other can decrypt. The public key can be shared openly with the world. The private key is never, ever shared.
The Mental Model — The Padlock Analogy:
Bob wants to receive encrypted messages from anyone. He manufactures thousands of open padlocks (public keys) and distributes them worldwide—anyone can get one. He keeps the single key that opens all those padlocks (private key) exclusively to himself. When Alice wants to send Bob a secret, she puts her message in a box, snaps one of Bob's open padlocks closed, and sends it. Only Bob's key opens it. Eve can see the padlock (public key) and the locked box (ciphertext) but cannot open it without Bob's private key.
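The padlock model maps directly onto textbook RSA. The sketch below uses deliberately tiny primes (61 and 53) and no padding, both assumptions made for readability; real RSA uses primes of 1024 bits or more and a padding scheme such as OAEP. The function names are illustrative.

```python
# Bob's key generation (toy sizes, NOT secure).
p, q = 61, 53
n = p * q                    # 3233: public modulus, part of the "padlock"
phi = (p - 1) * (q - 1)      # 3120: kept secret
e = 17                       # public exponent
d = pow(e, -1, phi)          # private exponent: modular inverse of e (Python 3.8+)

def encrypt(message: int, public_key: tuple[int, int]) -> int:
    """Anyone holding Bob's public key (n, e) can lock a message."""
    modulus, exponent = public_key
    return pow(message, exponent, modulus)

def decrypt(ciphertext: int, private_key: tuple[int, int]) -> int:
    """Only the holder of the private exponent d can unlock it."""
    modulus, exponent = private_key
    return pow(ciphertext, exponent, modulus)

message = 65                               # Alice's secret, encoded as a number < n
ciphertext = encrypt(message, (n, e))      # 2790
recovered = decrypt(ciphertext, (n, d))    # 65
```

Eve sees `n`, `e`, and `ciphertext`, but recovering `d` requires factoring `n` into `p` and `q`. That is trivial for 3233 and believed infeasible for properly sized moduli.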
Standard Algorithms:
- RSA (Rivest-Shamir-Adleman): The original and most widely used asymmetric algorithm. Security based on the computational difficulty of factoring the product of two very large prime numbers. A 2048-bit RSA key is currently secure; 4096-bit is recommended for long-term security. Used in TLS handshakes, SSH, S/MIME email encryption, and PGP.
- ECC (Elliptic Curve Cryptography): Based on the mathematics of elliptic curves over finite fields. Achieves equivalent security to RSA with dramatically smaller key sizes—a 256-bit ECC key is equivalent to a 3072-bit RSA key. Smaller keys mean faster operations and less bandwidth. ECC-256 is the modern standard (ECDH for key exchange, ECDSA for signatures). Used in TLS 1.3, Bitcoin, and modern SSH.
- Diffie-Hellman Key Exchange (DHE/ECDHE): Allows two parties to establish a shared secret over an insecure channel without ever transmitting the secret. Used in TLS to achieve Perfect Forward Secrecy—if the server's private key is later compromised, past session recordings cannot be decrypted because each session used a unique ephemeral key.
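A worked Diffie-Hellman exchange shows how both sides arrive at the same secret without ever sending it. The numbers below (p = 23, g = 5, private values 6 and 15) are a classic toy example and are assumptions for illustration; real deployments use groups of 2048 bits or more, or elliptic curves.

```python
def dh_public(g: int, private: int, p: int) -> int:
    """Value sent over the insecure channel: g^private mod p."""
    return pow(g, private, p)

def dh_shared(their_public: int, my_private: int, p: int) -> int:
    """Secret computed locally: (their public value)^my_private mod p."""
    return pow(their_public, my_private, p)

p, g = 23, 5                          # public parameters, known to everyone
alice_private, bob_private = 6, 15    # never transmitted

alice_public = dh_public(g, alice_private, p)   # 8, sent to Bob
bob_public = dh_public(g, bob_private, p)       # 19, sent to Alice

alice_secret = dh_shared(bob_public, alice_private, p)   # 2
bob_secret = dh_shared(alice_public, bob_private, p)     # 2
```

Both sides compute g^(6·15) mod 23 by different routes. An eavesdropper sees only 8 and 19 and would have to solve a discrete logarithm to get either private value. Using fresh private values per session is what gives the forward secrecy described above.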
Digital Signatures — Authenticity and Non-Repudiation:
Asymmetric keys can also be used in reverse for signatures. Bob hashes a document and transforms the hash with his private key (in RSA, this private-key operation is often described as "encrypting" the hash). Anyone can check the result with Bob's public key, confirming: (1) The document came from Bob, because only his private key could have produced this signature. (2) The document hasn't been modified, because the recomputed hash matches. This is called a Digital Signature and provides Non-Repudiation: Bob cannot later deny signing the document because only he possesses the private key used to create the signature.
Digital signatures power: HTTPS (the server proves it's the real site), software updates (the OS verifies the update came from the vendor), code signing (the OS verifies an application came from a trusted developer), and email authentication (DKIM shows an email was authorized by the claimed sending domain).
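The hash-then-sign flow can be sketched with textbook RSA. The key below is built from two Mersenne primes purely so the example is self-contained; the key size, the absence of a padding scheme such as PSS, and the function names are all assumptions for illustration, not a secure design.

```python
import hashlib

# Toy key material (NOT secure): two known Mersenne primes.
P, Q = 2**61 - 1, 2**89 - 1
N = P * Q                               # public modulus
E = 65537                               # public exponent
D = pow(E, -1, (P - 1) * (Q - 1))       # private exponent, known only to Bob

def digest_int(document: bytes) -> int:
    """Hash the document and reduce it into the range of the toy modulus."""
    return int.from_bytes(hashlib.sha256(document).digest(), "big") % N

def sign(document: bytes) -> int:
    """Bob applies his private key to the document's hash."""
    return pow(digest_int(document), D, N)

def verify(document: bytes, signature: int) -> bool:
    """Anyone applies Bob's public key and compares against a fresh hash."""
    return pow(signature, E, N) == digest_int(document)

contract = b"Bob agrees to pay Alice 100 dollars."
signature = sign(contract)

genuine = verify(contract, signature)
tampered = verify(b"Bob agrees to pay Alice 900 dollars.", signature)
```

`genuine` comes out true and `tampered` false: changing even one character changes the hash, so the old signature no longer matches.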
The Hybrid Encryption Model — How TLS Actually Works:
Pure asymmetric encryption is 1,000–10,000x slower than symmetric encryption. Encrypting large amounts of data with RSA or ECC directly is impractical. The solution: hybrid encryption, used by HTTPS (TLS) and every other practical encrypted protocol:
- Use asymmetric encryption (RSA or ECDH) to securely establish a shared secret between client and server—solving the key distribution problem.
- Use that shared secret as the key for symmetric encryption (AES-GCM) for all actual data transfer—gaining maximum performance.
Every HTTPS connection in the world uses this pattern. The asymmetric handshake takes milliseconds. The symmetric bulk data transfer runs at gigabytes per second.
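The hand-off from the asymmetric step to the symmetric step is a key derivation. The sketch below turns a toy Diffie-Hellman secret into a 32-byte key using HKDF (RFC 5869), the same family of derivation TLS 1.3 builds on. The tiny group, the salt and info labels, and the `session_key` helper are assumptions for illustration; the resulting key would then be handed to a cipher such as AES-256-GCM.

```python
import hashlib
import hmac

def hkdf_sha256(secret: bytes, salt: bytes, info: bytes, length: int = 32) -> bytes:
    """HKDF extract-then-expand (RFC 5869) using HMAC-SHA256."""
    prk = hmac.new(salt, secret, hashlib.sha256).digest()          # extract
    okm, block, counter = b"", b"", 1
    while len(okm) < length:                                        # expand
        block = hmac.new(prk, block + info + bytes([counter]), hashlib.sha256).digest()
        okm += block
        counter += 1
    return okm[:length]

# Step 1: asymmetric handshake (toy Diffie-Hellman, NOT secure sizes).
p, g = 23, 5
client_private, server_private = 6, 15
client_public = pow(g, client_private, p)
server_public = pow(g, server_private, p)
client_secret = pow(server_public, client_private, p)
server_secret = pow(client_public, server_private, p)

# Step 2: both sides derive the same symmetric key from the shared secret.
def session_key(shared_secret: int) -> bytes:
    secret_bytes = shared_secret.to_bytes(32, "big")
    return hkdf_sha256(secret_bytes, salt=b"toy-handshake", info=b"toy session key")

client_key = session_key(client_secret)
server_key = session_key(server_secret)
```

The two keys are identical even though no key material crossed the wire, and every later byte of the session can use fast symmetric encryption under that key.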
#️⃣ Hashing — The One-Way Street
Unlike encryption, hashing is designed to be a one-way function. A hash function takes an input of any size and produces a fixed-length output (the hash, digest, or checksum). The same input always produces the same hash. But given only the hash, recovering the original input is computationally infeasible. This is not a proven mathematical guarantee; it rests on the fact that no efficient inversion attack is known against well-designed cryptographic hash functions.
Critical Properties of Cryptographic Hash Functions:
- Deterministic: The same input always produces the same hash. SHA-256("hello") is always 2cf24dba5fb0a30e26e83b2ac5b9e29e1b161e5c1fa7425e73043362938b9824.
- Pre-image Resistance: Given a hash, you cannot find the original input. This is the one-way property, fundamental to password storage security.
- Second Pre-image Resistance: Given an input, you cannot find a different input that produces the same hash.
- Collision Resistance: You cannot find any two inputs that produce the same hash. MD5 and SHA-1 have been broken—collisions have been found. This is why they are deprecated.
- Avalanche Effect: Change one bit of the input and approximately 50% of the output bits change. SHA-256("hello") vs SHA-256("Hello") produces completely different hashes with no similarity.
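Determinism and the avalanche effect can be checked directly with Python's `hashlib`. The helper names below are illustrative; the exact number of differing bits is left uncomputed here, but it should land near half of the 256 output bits.

```python
import hashlib

def sha256_hex(data: bytes) -> str:
    return hashlib.sha256(data).hexdigest()

def differing_bits(a: bytes, b: bytes) -> int:
    """Count output bits that differ between SHA-256(a) and SHA-256(b)."""
    ha = int.from_bytes(hashlib.sha256(a).digest(), "big")
    hb = int.from_bytes(hashlib.sha256(b).digest(), "big")
    return bin(ha ^ hb).count("1")

first = sha256_hex(b"hello")
second = sha256_hex(b"hello")          # deterministic: identical to `first`
capitalized = sha256_hex(b"Hello")     # one input bit changed ('h' -> 'H')

changed = differing_bits(b"hello", b"Hello")
```

`first` and `second` are the same 64-character string, while `capitalized` shares no visible structure with them even though the inputs differ by a single bit.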
Standard Algorithms:
- SHA-256 (Secure Hash Algorithm 256-bit): The current standard. Produces a 256-bit (64 hex character) hash. Part of the SHA-2 family. Secure. Used for file integrity verification, digital signatures, Bitcoin mining, and TLS certificates. It is too fast to protect passwords on its own; for those it appears only inside deliberately slow constructions such as PBKDF2.
- SHA-3 (Keccak): An alternative to SHA-2 with different mathematical underpinnings. Equally secure but designed by a completely independent team, providing algorithmic diversity if SHA-2 is ever weakened.
- MD5 (Message Digest 5): Produces a 128-bit hash. Cryptographically broken—do not use for security purposes. Collisions can be computed in seconds on modern hardware. The 2012 Flame malware exploited an MD5 collision to forge a Microsoft code-signing certificate. Acceptable only for non-security checksums (file corruption detection).
- SHA-1: Produces a 160-bit hash. Broken in 2017 by the SHAttered attack, in which Google and CWI Amsterdam demonstrated a practical collision. Not acceptable for digital signatures or certificates. Public CAs stopped issuing SHA-1 certificates in 2016, and major browsers stopped trusting them in 2017.
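The output sizes listed above, and the file integrity use case, can be reproduced with `hashlib`. This is a minimal sketch: `matches_published_digest` is a hypothetical helper name, and it assumes the publisher distributes a SHA-256 hex digest alongside the file.

```python
import hashlib
import hmac

# Digest size in bits for each algorithm discussed above.
digest_bits = {
    name: hashlib.new(name).digest_size * 8
    for name in ("md5", "sha1", "sha256", "sha3_256")
}

def matches_published_digest(data: bytes, expected_hex: str) -> bool:
    """Integrity check: recompute SHA-256 and compare with the published value."""
    actual = hashlib.sha256(data).hexdigest()
    # compare_digest runs in constant time; harmless here, a good habit for secrets.
    return hmac.compare_digest(actual, expected_hex.lower())

published = "2cf24dba5fb0a30e26e83b2ac5b9e29e1b161e5c1fa7425e73043362938b9824"
intact = matches_published_digest(b"hello", published)
corrupted = matches_published_digest(b"hellp", published)
```

`intact` is true and `corrupted` is false. An integrity check like this only helps if the published digest itself arrives over a trusted channel.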
Password Storage — The Correct Way:
Websites must never store plaintext passwords. Even hashing is insufficient if done naively:
- Naive Hashing (Wrong): Store SHA-256(password). If two users share the same password, their hashes are identical, so an attacker who cracks one cracks both. More critically, Rainbow Tables (pre-computed hash dictionaries) allow instant lookup of any hash that appears in the table.
- Salted Hashing (Better): Generate a unique random value (salt) for each user. Store SHA-256(salt + password) along with the salt. Identical passwords produce different hashes. Rainbow tables are useless, because they'd need to be computed for every possible salt value.
- Proper Password Hashing (Best, What You Should Use): Use purpose-built password hashing algorithms that are intentionally slow and computationally expensive: bcrypt, Argon2, or PBKDF2. These apply the hash function thousands of times (work factor/iterations) making offline cracking 10,000x slower. Argon2id is the current recommended choice, with configurable memory and time costs that scale with hardware improvements.
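A salted, slow password hash can be sketched with PBKDF2 from the standard library. PBKDF2 is used here only because Argon2id requires a third-party package; the 600,000-iteration default, the 16-byte salt, and the function names are assumptions to tune for your own hardware and policy.

```python
import hashlib
import hmac
import secrets

DEFAULT_ITERATIONS = 600_000   # assumed work factor; raise it as hardware improves

def hash_password(password: str, iterations: int = DEFAULT_ITERATIONS) -> tuple[bytes, bytes, int]:
    """Return (salt, derived_key, iterations); store all three per user."""
    salt = secrets.token_bytes(16)            # unique random salt for this user
    derived = hashlib.pbkdf2_hmac("sha256", password.encode("utf-8"), salt, iterations)
    return salt, derived, iterations

def verify_password(password: str, salt: bytes, expected: bytes, iterations: int) -> bool:
    """Recompute with the stored salt and work factor, then compare in constant time."""
    candidate = hashlib.pbkdf2_hmac("sha256", password.encode("utf-8"), salt, iterations)
    return hmac.compare_digest(candidate, expected)
```

A typical flow calls `hash_password` at registration and `verify_password` at login. Two users with the same password get different salts and therefore different stored values, and storing the iteration count lets the work factor be raised later without invalidating old records.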
With fast hashing: An attacker who steals your database of SHA-256 hashed passwords can crack them with Hashcat at 10+ billion hashes per second on a modern GPU. A salt defeats precomputed tables but does not slow down each guess, so a simple 8-character password still cracks in seconds.
With Argon2id (properly configured): The attacker can test perhaps 100–1000 candidates per second against each hash. An 8-character complex password goes from "cracks in seconds" to "takes years". Salting + work factor together make offline cracking computationally infeasible for strong passwords.
The lesson: SHA-256 was designed for speed (verifying millions of file downloads quickly). Speed is exactly wrong for passwords. Use algorithms designed for slowness.
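The gap between "seconds" and "years" is plain arithmetic: keyspace divided by guess rate. The sketch below assumes a simple 8-character password drawn from 26 lowercase letters and uses the guess rates quoted above; the function name is illustrative, and the figures are worst-case exhaustive search times.

```python
SECONDS_PER_YEAR = 365 * 24 * 3600

def worst_case_seconds(alphabet_size: int, length: int, guesses_per_second: float) -> float:
    """Time to try every possible password of the given shape."""
    return alphabet_size ** length / guesses_per_second

keyspace = 26 ** 8                                       # 208,827,064,576 candidates

# Plain SHA-256 on a GPU at 10 billion guesses per second.
fast_seconds = worst_case_seconds(26, 8, 10e9)           # about 21 seconds

# Argon2id at 1,000 guesses per second.
slow_years = worst_case_seconds(26, 8, 1_000) / SECONDS_PER_YEAR   # about 6.6 years
```

The password and the attacker's hardware are identical in both cases; only the cost per guess changed.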
🔗 PKI — Public Key Infrastructure
Asymmetric cryptography solves key distribution, but creates a new problem: How do you verify that a public key actually belongs to who you think it does? If you download "google.com's public key" from the internet, how do you know it's not a fake key posted by an attacker claiming to be Google?
PKI (Public Key Infrastructure) solves this through a chain of trust:
- Certificate Authority (CA): A trusted third party (DigiCert, Let's Encrypt, Sectigo, formerly Comodo CA) that digitally signs certificates after verifying the owner's identity. Your OS and browser ship with a list of ~150 trusted root CAs.
- Digital Certificate (X.509): Contains the domain name, the domain's public key, the CA's digital signature, and validity dates. When your browser connects to google.com, it receives Google's certificate. Your browser verifies the CA's signature using the CA's public key (which it already has). If valid, you know the public key genuinely belongs to google.com.
- Chain of Trust: Root CA → Intermediate CA → End-Entity Certificate. Most certificates are signed by intermediate CAs, whose own certificates are signed by a root CA. This limits the exposure of the root CA's private key, which can stay offline and be used only to sign intermediates.
- Certificate Transparency (CT) Logs: Public, append-only logs where all issued certificates must be recorded. Allows detection of rogue certificates issued for domains the real owner didn't request—helping catch CA compromises and fraudulent certificate issuance.
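The chain-of-trust walk can be sketched with toy certificates. Everything here is a simplification and an assumption for illustration: the `Cert` structure is not X.509, the keys are textbook RSA built from small Mersenne primes, and real validation also checks validity dates, revocation, name constraints, and key usage.

```python
import hashlib
from dataclasses import dataclass

E = 65537  # public exponent shared by all toy keys

def make_keypair(p: int, q: int) -> tuple[int, int]:
    """Return (public modulus, private exponent) for a toy RSA key."""
    return p * q, pow(E, -1, (p - 1) * (q - 1))

def _digest(data: bytes, n: int) -> int:
    return int.from_bytes(hashlib.sha256(data).digest(), "big") % n

@dataclass
class Cert:
    subject: str
    issuer: str
    public_n: int        # the subject's public modulus
    signature: int = 0   # issuer's signature over the fields above

    def to_be_signed(self) -> bytes:
        return f"{self.subject}|{self.issuer}|{self.public_n}".encode()

def issue(subject: str, subject_n: int, issuer: str, issuer_n: int, issuer_d: int) -> Cert:
    cert = Cert(subject, issuer, subject_n)
    cert.signature = pow(_digest(cert.to_be_signed(), issuer_n), issuer_d, issuer_n)
    return cert

def signed_by(cert: Cert, issuer_n: int) -> bool:
    return pow(cert.signature, E, issuer_n) == _digest(cert.to_be_signed(), issuer_n)

def verify_chain(chain: list[Cert], trusted_roots: dict[str, int]) -> bool:
    """chain is ordered leaf first, root last."""
    for cert, parent in zip(chain, chain[1:]):
        if cert.issuer != parent.subject or not signed_by(cert, parent.public_n):
            return False
    root = chain[-1]
    # The root must already be in the local trust store with the same key.
    return trusted_roots.get(root.subject) == root.public_n and signed_by(root, root.public_n)

root_n, root_d = make_keypair(2**107 - 1, 2**127 - 1)
inter_n, inter_d = make_keypair(2**61 - 1, 2**89 - 1)
leaf_n, _ = make_keypair(2**19 - 1, 2**31 - 1)

root = issue("Toy Root CA", root_n, "Toy Root CA", root_n, root_d)        # self-signed
inter = issue("Toy Intermediate CA", inter_n, "Toy Root CA", root_n, root_d)
leaf = issue("example.com", leaf_n, "Toy Intermediate CA", inter_n, inter_d)

trust_store = {"Toy Root CA": root_n}     # what the OS or browser ships with
chain_ok = verify_chain([leaf, inter, root], trust_store)
```

The verifier trusts the leaf only because each signature checks out against the key one level up, and the top of the chain matches a key it already held before the connection began.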
✅ Module 11 Summary
- Symmetric encryption (AES) uses one key for both encrypt/decrypt. Fast, ideal for bulk data. Problem: key distribution. Use AES-256-GCM for new implementations.
- Asymmetric encryption (RSA, ECC) uses a public/private keypair. Solves key distribution. Slow—used only for key exchange and signatures. The hybrid model (TLS) combines both.
- Hashing is one-way. SHA-256 is the standard. MD5 and SHA-1 are broken—never use them for security.
- Passwords must be stored using purpose-built slow hashing algorithms: Argon2id > bcrypt > PBKDF2. Never SHA-256 alone.
- PKI provides verified ownership of public keys through certificate authorities. This is the trust foundation of HTTPS.