Data Protection & Secure Communication
In a world of constant data breaches and strict regulations (GDPR, HIPAA), protecting data at every stage of its lifecycle is not optional; it's a core requirement of modern system design.
This content is adapted from Mastering System Design from Basics to Cracking Interviews (Udemy). It has been curated and organized for educational purposes on this portfolio. No copyright infringement is intended.
Why Data Protection Matters
Data is the most valuable asset of a modern company, and its loss can lead to:
- Major Legal Fines: Regulations like GDPR can fine companies up to €20 million or 4% of their global annual turnover, whichever is higher.
- Loss of Reputation: Users won't trust a system that leaks their private information.
- Operational Disruption: Ransomware and data corruption can halt business entirely.
Understanding Encryption
Encryption is the process of scrambling information so that only authorized parties can understand it.
- Plaintext: The original, readable message (e.g., "Hello").
- Ciphertext: The encrypted, unreadable result (e.g., "SNIfgNI+k0").
- Key: The "secret" used to lock (encrypt) and unlock (decrypt) the data.
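The three terms above can be illustrated with a deliberately tiny cipher. The sketch below uses a one-time pad (XOR with a random key as long as the message), chosen only because it fits in a few lines of standard-library Python; it is not AES and not something to deploy. The helper name `xor_bytes` is made up for this example.

```python
import secrets

def xor_bytes(data: bytes, key: bytes) -> bytes:
    """XOR each data byte with the matching key byte (one-time pad)."""
    if len(key) != len(data):
        raise ValueError("a one-time pad key must be as long as the data")
    return bytes(d ^ k for d, k in zip(data, key))

plaintext = b"Hello"                         # readable message
key = secrets.token_bytes(len(plaintext))    # the shared secret
ciphertext = xor_bytes(plaintext, key)       # encrypt: unreadable without the key
recovered = xor_bytes(ciphertext, key)       # decrypt: XOR is its own inverse

print(ciphertext.hex(), recovered)
```

The same key both locks and unlocks the data, which is exactly what makes this a symmetric scheme. Real systems would use an authenticated cipher such as AES-GCM from a vetted library instead.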
Symmetric vs. Asymmetric Encryption
| Feature | Symmetric Encryption | Asymmetric Encryption |
|---|---|---|
| Keys | A single shared secret key for both encryption and decryption. | A Public/Private key pair. |
| Speed | Very fast (used for bulk data). | Slower (used for key exchange). |
| Use Case | AES-256 for database storage. | RSA/ECC for the start of a TLS connection. |
Encryption at Rest and in Transit
1. Encryption at Rest
Protects data while it is stored on physical media (disks, databases, cloud storage).
- Techniques: Full-disk encryption, Database-level encryption (TDE), Field-level encryption.
- Goal: If an attacker steals the hard drive or gains access to the database files, they still can't read the data.
2. Encryption in Transit
Secures data while it is moving across networks (browser to server, service to service).
- Protocols: TLS (Transport Layer Security). Its predecessor, SSL, is deprecated and insecure, although the name is still used loosely for TLS.
- Goal: Prevents Man-in-the-Middle (MITM) attacks and eavesdropping.
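As a rough sketch of what "encryption in transit" looks like on the client side, the snippet below builds a TLS context with Python's standard `ssl` module. The function name `make_client_context` and the choice of TLS 1.2 as the minimum version are assumptions for illustration, not requirements stated above.

```python
import ssl

def make_client_context() -> ssl.SSLContext:
    """Build a client-side TLS context with certificate checks enabled."""
    # create_default_context() turns on certificate and hostname verification
    # and loads the system's trusted CA certificates.
    ctx = ssl.create_default_context()
    # Refuse legacy protocol versions (SSLv3, TLS 1.0, TLS 1.1).
    ctx.minimum_version = ssl.TLSVersion.TLSv1_2
    return ctx

ctx = make_client_context()
print(ctx.verify_mode, ctx.check_hostname, ctx.minimum_version)
```

A socket would then be wrapped with `ctx.wrap_socket(sock, server_hostname=...)`. Verifying the certificate and hostname is what defeats a MITM attacker; disabling those checks leaves the traffic encrypted but unauthenticated.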
TLS/SSL and HTTPS
HTTPS is simply HTTP over a secure TLS connection. It ensures Confidentiality, Integrity, and Authenticity.
The TLS Handshake (Simplified):
- Hello: Client and Server agree on encryption algorithms.
- Identity: Server sends its Certificate (containing its Public Key).
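- Key Exchange: Client and Server derive a shared symmetric session key using asymmetric cryptography (in modern TLS, an ephemeral Diffie-Hellman exchange authenticated by the server's certificate).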
- Finished: All future data is sent using the fast Symmetric Key.
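The key-exchange step can be sketched with a toy Diffie-Hellman exchange. This is a simplification under stated assumptions: the prime `P`, the generator `G`, and the helper names are chosen for illustration, the group is far too small for real use, and the certificate-based authentication that TLS adds on top is omitted.

```python
import hashlib
import secrets

# Toy parameters for illustration only. Real TLS uses standardized
# groups or elliptic curves with much stronger security margins.
P = 2**127 - 1   # a Mersenne prime, far too small for production
G = 3

def keypair() -> tuple[int, int]:
    """Return (private, public) where public = G^private mod P."""
    private = secrets.randbelow(P - 3) + 2    # random value in [2, P-2]
    public = pow(G, private, P)
    return private, public

def session_key(own_private: int, peer_public: int) -> bytes:
    """Derive a 32-byte symmetric key from the shared DH secret."""
    shared = pow(peer_public, own_private, P)  # same value on both sides
    return hashlib.sha256(shared.to_bytes(16, "big")).digest()

client_priv, client_pub = keypair()
server_priv, server_pub = keypair()

# Only the public values cross the network; each side combines the
# peer's public value with its own private value.
client_key = session_key(client_priv, server_pub)
server_key = session_key(server_priv, client_pub)
print(client_key == server_key)
```

Both sides end up with the same key without ever transmitting it, which is why the bulk traffic can then switch to fast symmetric encryption.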
Hashing and Salting Passwords
Stored passwords should never be reversibly encrypted; they should be hashed.
- Hashing: A one-way transformation. You can't turn a hash back into the original password.
- Salting: Adding random data (a "salt") to the password before hashing. This ensures that even if two users have the same password, their hashes will look completely different.
- Goal: Protects against Rainbow Table attacks (pre-computed lists of common password hashes).
[!IMPORTANT] Always use "slow" hashing functions designed for passwords, such as Argon2, BCrypt, or SCrypt.
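A minimal sketch of salted, slow password hashing using scrypt from Python's standard `hashlib`. The cost parameters in `SCRYPT_PARAMS`, the 16-byte salt, and the function names are illustrative assumptions; tune the parameters for your own hardware and follow current guidance.

```python
import hashlib
import hmac
import secrets

# Illustrative cost parameters (assumptions, not recommendations).
SCRYPT_PARAMS = dict(n=2**14, r=8, p=1, maxmem=64 * 1024 * 1024, dklen=32)

def hash_password(password: str) -> tuple[bytes, bytes]:
    """Return (salt, digest); store both alongside the user record."""
    salt = secrets.token_bytes(16)            # fresh random salt per password
    digest = hashlib.scrypt(password.encode("utf-8"), salt=salt, **SCRYPT_PARAMS)
    return salt, digest

def verify_password(password: str, salt: bytes, expected: bytes) -> bool:
    """Re-hash the candidate with the stored salt and compare in constant time."""
    candidate = hashlib.scrypt(password.encode("utf-8"), salt=salt, **SCRYPT_PARAMS)
    return hmac.compare_digest(candidate, expected)

# Two users with the same password get different salts and different hashes.
salt_a, hash_a = hash_password("hunter2")
salt_b, hash_b = hash_password("hunter2")
print(hash_a != hash_b, verify_password("hunter2", salt_a, hash_a))
```

The salt is not secret; its job is to make precomputed tables useless and to force an attacker to attack each hash separately.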
Public Key Infrastructure (PKI)
PKI is the system of hardware, software, and policies used to manage digital certificates.
- Certificate Authority (CA): A trusted third party (e.g., Let's Encrypt, DigiCert) that verifies the identity of a website and issues a certificate.
- Digital Signatures: Used to prove that a piece of code or a document hasn't been altered since it was signed.
Secure API Communication
- Use HTTPS for all traffic: No exceptions.
- Mutual TLS (mTLS): Both the client and server must present certificates to one another. Widely used in Zero-Trust microservice architectures.
- Authentication Tokens: Use OAuth 2.0 access tokens or signed JWTs so the server can verify who issued each request and that it was not altered.
- Rate Limiting & Whitelisting: Limit how many requests can be made and restrict access to known IP ranges where possible.
Interview Questions - Data Protection and Secure Communication
1. What is the difference between hashing and encryption?
Answer:
- Encryption: Purpose is Confidentiality. It transforms data into ciphertext that is reversible with a key (Symmetric like AES, or Asymmetric like RSA).
- Hashing: Purpose is Integrity and Identity. It generates a fixed-size, irreversible digest. The same input always results in the same output, and small changes produce vastly different hashes.
- Comparison: Encryption secures data in transit and at rest, where the original must be recoverable; hashing is for password storage and integrity checks, where it must not be.
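The hashing properties in this answer (fixed size, determinism, large change from a small edit) can be checked quickly with SHA-256 from the standard library. The helper name `sha256_hex` is just for this sketch, and note that a fast hash like SHA-256 suits integrity checks, not password storage.

```python
import hashlib

def sha256_hex(text: str) -> str:
    """Return the SHA-256 digest of text as 64 hex characters."""
    return hashlib.sha256(text.encode("utf-8")).hexdigest()

a = sha256_hex("Hello")
b = sha256_hex("Hello")          # same input, same digest
c = sha256_hex("hello")          # one character changed
long_digest = sha256_hex("x" * 10_000)

# Count how many of the 256 output bits differ between the two digests.
diff_bits = bin(int(a, 16) ^ int(c, 16)).count("1")

print(a == b, a != c, len(a), len(long_digest), diff_bits)
```

There is no `unhash` counterpart to write: unlike decryption, no key exists that turns the digest back into the input.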
2. Why is asymmetric encryption slower than symmetric?
Answer:
- Asymmetric: Uses mathematically expensive operations (modular exponentiation on very large integers for RSA, point multiplication for ECC) with large keys (2048 bits or more for RSA). It's CPU-intensive.
- Symmetric: Uses lightweight bitwise operations and blocks. It's much faster and often hardware-accelerated (AES-NI).
- Real-World: In TLS, asymmetric is used for the key exchange, then the session switches to faster symmetric encryption for bulk data.
3. How does PKI build trust online?
Answer:
- Digital Certificates: Issued by Trusted CAs, binding a public key to an identity.
- Chain of Trust: Browsers trust Root CAs, which sign Intermediate CAs, which sign your site's certificate.
- Validation: Browsers check validity, expiration, and issuer. Revocation checks (OCSP/CRL) ensure compromised certs are blocked.
4. What is the concept behind securing Data at Rest and Data in Motion?
Answer:
- Data at Rest: Data stored on disk/DB. Protected via Full-disk encryption (BitLocker), Database encryption (TDE), and IAM roles to prevent physical or unauthorized local theft.
- Data in Motion: Data moving across the network. Protected via TLS/SSL (HTTPS), VPNs, and mTLS to prevent eavesdropping and MITM attacks.
Next up? Integrating security into the lifecycle: Security in the SDLC