Mar 2025 · 10 min read

Object storage is a storage architecture that manages data as discrete units called "objects," rather than as files in a hierarchy or blocks on a disk. It underpins much of the modern internet, powering everything from Netflix streaming to astronomical data research.

Object Storage: Scalable Data for the Modern Web 📦

Driptanil Datta, Software Developer
References & Disclaimer

This content is adapted from Mastering System Design from Basics to Cracking Interviews (Udemy). It has been curated and organized for educational purposes on this portfolio. No copyright infringement is intended.


Key Concepts in Object Storage

Unlike traditional storage, object storage is flat. There are no nested folders, only "Buckets" containing "Objects."

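  • Object: A self-contained unit made up of the data itself (the file's bytes), its metadata, and a unique key that identifies it within the bucket.
  • Bucket: A logical container for storing objects. Think of it as a top-level cloud folder. In services such as Amazon S3 and Google Cloud Storage, the bucket name must be globally unique.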
  • Metadata: Custom information that describes the object (e.g., content-type: image/png, user-id: 123).
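To make the flat model concrete, here is a minimal in-memory sketch. The `Bucket` and `StoredObject` classes are hypothetical names invented for illustration, not any provider's API. The point is that a bucket behaves like one flat key-to-object map, and that "folders" are only a key-naming convention filtered by prefix.

```python
from dataclasses import dataclass, field
from typing import Optional


@dataclass
class StoredObject:
    """An object: raw bytes plus descriptive metadata."""
    data: bytes
    metadata: dict = field(default_factory=dict)


class Bucket:
    """A toy bucket: one flat dictionary from key to object, no directories."""

    def __init__(self, name: str) -> None:
        self.name = name
        self._objects: dict = {}

    def put(self, key: str, data: bytes, metadata: Optional[dict] = None) -> None:
        # Writing to an existing key replaces the whole object.
        self._objects[key] = StoredObject(data, dict(metadata or {}))

    def get(self, key: str) -> StoredObject:
        return self._objects[key]

    def list_keys(self, prefix: str = "") -> list:
        # The "/" in a key has no special meaning; we simply filter by prefix.
        return sorted(k for k in self._objects if k.startswith(prefix))


bucket = Bucket("example-media-bucket")  # hypothetical bucket name
png = {"content-type": "image/png"}
bucket.put("users/123/avatar.png", b"...", {**png, "user-id": "123"})
bucket.put("users/123/banner.png", b"...", {**png, "user-id": "123"})
bucket.put("users/456/avatar.png", b"...", {**png, "user-id": "456"})

print(bucket.list_keys("users/123/"))
# ['users/123/avatar.png', 'users/123/banner.png']
```

Real services add versioning, access control, and distribution across many machines, but the addressing model is essentially this: a bucket name plus a key.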

Popular Platforms

All major cloud providers offer a highly durable object storage service, often advertised as "11 nines" (99.999999999%) of annual durability:

  • Amazon S3: The industry standard with a massive ecosystem.
  • Google Cloud Storage (GCS): Integrated deeply with GCP's ML and BigQuery tools.
  • Azure Blob Storage: Best for systems heavily integrated with the Microsoft stack.
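A quick back-of-the-envelope calculation shows what "11 nines" means in practice. The sketch below treats the durability figure as an independent annual loss probability per object, which is a simplifying assumption, and the figure of 10 million stored objects is purely illustrative.

```python
from fractions import Fraction


def expected_annual_losses(num_objects: int, nines: int = 11) -> Fraction:
    """Expected number of objects lost per year at the given durability."""
    # 11 nines of durability leaves a 1-in-10^11 chance of loss per object per year.
    annual_loss_probability = Fraction(1, 10 ** nines)
    return num_objects * annual_loss_probability


losses_per_year = expected_annual_losses(10_000_000)
years_per_loss = 1 / losses_per_year
print(years_per_loss)  # 10000
```

Under these assumptions, storing 10 million objects means expecting to lose a single object roughly once every 10,000 years.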

Performance & Cost Considerations

Object storage is designed for massive scale and high throughput, but it comes with trade-offs.

1. Performance

  • Latency: Generally higher latency than Block or File storage. Not ideal for small, random I/O.
  • Throughput: Excellent for massive parallel access (e.g., streaming).
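  • Consistency: Many platforms now provide strong read-after-write consistency (Amazon S3 has done so since December 2020), so a successful write is immediately visible to subsequent reads.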
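One common way to exploit that throughput is to split a large object into byte ranges and fetch the parts concurrently. The sketch below illustrates the idea with an in-memory byte string standing in for the remote object. `fetch_range` is a hypothetical stand-in for an HTTP GET carrying a `Range: bytes=start-end` header, and the part size and worker count are arbitrary.

```python
from concurrent.futures import ThreadPoolExecutor


def part_ranges(object_size: int, part_size: int) -> list:
    """Split [0, object_size) into inclusive (start, end) byte ranges."""
    if part_size <= 0:
        raise ValueError("part_size must be positive")
    ranges = []
    for start in range(0, object_size, part_size):
        end = min(start + part_size, object_size) - 1  # HTTP ranges are inclusive
        ranges.append((start, end))
    return ranges


def fetch_range(blob: bytes, start: int, end: int) -> bytes:
    """Stand-in for a ranged GET against the object store."""
    return blob[start:end + 1]


def parallel_download(blob: bytes, part_size: int, workers: int = 4) -> bytes:
    ranges = part_ranges(len(blob), part_size)
    with ThreadPoolExecutor(max_workers=workers) as pool:
        # map() yields results in input order, so the parts join correctly.
        parts = pool.map(lambda r: fetch_range(blob, *r), ranges)
        return b"".join(parts)


print(part_ranges(10, 4))  # [(0, 3), (4, 7), (8, 9)]
```

Because each part is an independent request, total throughput can scale with the number of parallel connections, even though the latency of any single request stays relatively high.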

2. Cost & Tiers

Providers offer Storage Classes based on access frequency:

  • Standard: Frequently accessed data (e.g., live website images).
  • Infrequent Access (IA): Data accessed occasionally (e.g., monthly reports). Storage is cheaper, but retrieval typically costs more.
  • Archive / Glacier: Rarely accessed data. Lowest storage cost, but retrieval can take minutes to hours.

Common Use Cases ✅

  • Media Storage: Storing trillions of photos and videos.
  • Backups & Archives: Durable, long-term data retention.
  • Data Lakes: Storing raw data for massive-scale analytics.
  • Static Website Hosting: Serving HTML/CSS directly via the bucket URL.

Interview Questions & Answers 💡

1. What is object storage and how is it different from file or block storage?

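Object storage manages data as discrete units (objects) with rich metadata in a flat address space, accessed over an HTTP API. File storage organizes data in a hierarchy of folders and is accessed through a file system. Block storage exposes raw, fixed-size blocks to the operating system and suits low-latency workloads such as databases.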

2. Design a media platform (YouTube). How would you use object storage?

  • Upload: Store raw video uploads in an S3 bucket.
  • Trigger: Use an S3 event notification to invoke a Lambda function that kicks off transcoding into different resolutions.
  • Processing: Store each processed resolution as a separate object.
  • Global Delivery: Integrate with a CDN (CloudFront) to cache videos close to viewers.
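The trigger step of this pipeline can be sketched as a handler that reads the upload notification and plans one output object per resolution. The event layout (`Records[].s3.bucket.name` and `Records[].s3.object.key`, with URL-encoded keys) follows the S3 event notification format. The resolution list, the `processed/` key scheme, and the job dictionaries are assumptions made for illustration, and the transcoding itself is left out.

```python
from urllib.parse import unquote_plus

RESOLUTIONS = ("1080p", "720p", "480p")  # hypothetical target renditions


def output_keys(raw_key: str) -> dict:
    """Map a raw upload key to one output key per resolution."""
    filename = raw_key.rsplit("/", 1)[-1]   # "uploads/cat.mp4" -> "cat.mp4"
    stem = filename.rsplit(".", 1)[0]       # "cat.mp4" -> "cat"
    return {res: f"processed/{stem}/{res}.mp4" for res in RESOLUTIONS}


def handle_upload_event(event: dict) -> list:
    """Turn an S3-style upload notification into a list of transcode jobs."""
    jobs = []
    for record in event["Records"]:
        bucket = record["s3"]["bucket"]["name"]
        # Object keys arrive URL-encoded in event notifications.
        key = unquote_plus(record["s3"]["object"]["key"])
        for resolution, target in output_keys(key).items():
            jobs.append({
                "bucket": bucket,
                "source": key,
                "resolution": resolution,
                "target": target,
            })
    return jobs


sample_event = {
    "Records": [
        {"s3": {"bucket": {"name": "raw-videos"},
                "object": {"key": "uploads/my+cat.mp4"}}}
    ]
}
for job in handle_upload_event(sample_event):
    print(job["resolution"], "->", job["target"])
```

Keeping each rendition under its own key lets the CDN cache and serve resolutions independently, and lets lifecycle rules treat raw uploads differently from processed output.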

3. How would you architect a cost-efficient backup system for petabytes of logs?

Implement S3 Lifecycle Policies. Automatically transition logs from Standard → Infrequent Access after 30 days, and then to Glacier Deep Archive after 90 days. This can reduce storage costs by up to 90%, at the price of slower and more expensive retrieval for archived logs.
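As a sketch, the policy described above could be expressed as the dictionary below, which follows the shape that boto3's `put_bucket_lifecycle_configuration` accepts as its `LifecycleConfiguration` argument. The rule ID and the `logs/` prefix are hypothetical. The helper `storage_class_at` is not part of any SDK; it is a simplified local model of which class an object of a given age would be in, ignoring the delay with which transitions are actually applied.

```python
LIFECYCLE_CONFIGURATION = {
    "Rules": [
        {
            "ID": "archive-old-logs",        # hypothetical rule name
            "Filter": {"Prefix": "logs/"},   # hypothetical key prefix
            "Status": "Enabled",
            "Transitions": [
                {"Days": 30, "StorageClass": "STANDARD_IA"},
                {"Days": 90, "StorageClass": "DEEP_ARCHIVE"},
            ],
        }
    ]
}


def storage_class_at(age_days: int, rule: dict) -> str:
    """Return the storage class an object of this age would have under the rule."""
    current = "STANDARD"
    for transition in sorted(rule["Transitions"], key=lambda t: t["Days"]):
        if age_days >= transition["Days"]:
            current = transition["StorageClass"]
    return current


rule = LIFECYCLE_CONFIGURATION["Rules"][0]
for age in (0, 30, 90):
    print(age, storage_class_at(age, rule))
```

With boto3 installed and credentials configured, the same dictionary could be applied with `s3.put_bucket_lifecycle_configuration(Bucket=..., LifecycleConfiguration=LIFECYCLE_CONFIGURATION)`, where the bucket name is yours to supply.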

4. How do services securely share large files in a microservices environment?

Instead of passing the file through APIs, Service A uploads the file to object storage and sends a presigned URL to Service B. Service B can then download the file directly using that time-limited link, without needing storage credentials of its own.
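The idea behind a presigned URL can be illustrated with a simplified HMAC scheme. This is not the signing algorithm real providers use (S3, for example, uses Signature Version 4). The secret, the endpoint, and the query parameter names are all hypothetical. The sketch only shows the principle: the URL carries an expiry time and a signature that the storage service can verify without the caller holding credentials.

```python
import hashlib
import hmac
import time
from typing import Optional
from urllib.parse import parse_qs, urlencode, urlparse

SECRET = b"demo-signing-secret"            # hypothetical; known only to the signer
BASE_URL = "https://storage.example.com"   # hypothetical endpoint


def _sign(key: str, expires: int) -> str:
    message = f"{key}\n{expires}".encode()
    return hmac.new(SECRET, message, hashlib.sha256).hexdigest()


def presign(key: str, ttl_seconds: int, now: Optional[float] = None) -> str:
    """Build a time-limited URL for one object key."""
    now = time.time() if now is None else now
    expires = int(now) + ttl_seconds
    query = urlencode({"expires": expires, "signature": _sign(key, expires)})
    return f"{BASE_URL}/{key}?{query}"


def verify(url: str, now: Optional[float] = None) -> bool:
    """Check that the signature matches the key and the link has not expired."""
    now = time.time() if now is None else now
    parsed = urlparse(url)
    params = parse_qs(parsed.query)
    try:
        expires = int(params["expires"][0])
        signature = params["signature"][0]
    except (KeyError, ValueError):
        return False
    key = parsed.path.lstrip("/")
    # compare_digest avoids leaking information through timing differences.
    return hmac.compare_digest(signature, _sign(key, expires)) and now < expires


url = presign("reports/2025-03.csv", ttl_seconds=300, now=1000)
print(verify(url, now=1200))   # True: within the 300-second window
print(verify(url, now=1301))   # False: the link has expired
```

In practice Service A would not implement this itself. With boto3, for instance, the equivalent call is `generate_presigned_url("get_object", Params={"Bucket": ..., "Key": ...}, ExpiresIn=300)`.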


Final Thoughts 🎯

Object storage scales further than file or block storage for unstructured data, and its rich metadata is the key differentiator that enables automation and search. By leveraging Lifecycle Rules and Storage Tiers, architects can optimize for both performance and cost.
