Designing Data-Intensive Applications
An interactive learning platform for mastering the big ideas behind reliable, scalable, and maintainable systems.
Part I: Foundations of Data Systems
Reliable, Scalable, and Maintainable Applications
Understanding the core requirements of data systems: Reliability, Scalability, and Maintainability.
Data Models and Query Languages
How data is structured and accessed.
Storage and Retrieval
How databases store data on disk and find it again.
Data Structures That Power Your Database (20 min)
Transaction Processing or Analytics? (10 min)
Column-Oriented Storage (15 min)
Encoding and Evolution
Formats for encoding data and handling schema changes.
Part II: Distributed Data
Replication
Keeping a copy of the same data on several nodes.
Leaders and Followers (15 min)
Problems with Replication Lag (15 min)
Multi-Leader Replication (15 min)
Leaderless Replication (15 min)
Partitioning
Breaking a large dataset into smaller subsets.
Partitioning Key-Value Data (15 min)
Partitioning and Secondary Indexes (15 min)
Rebalancing Partitions (10 min)
Transactions
Grouping several reads and writes into a logical unit.
The Trouble with Distributed Systems
Everything that can go wrong in a distributed system.
Faults and Partial Failures (10 min)
Unreliable Networks (15 min)
Unreliable Clocks (15 min)
Knowledge, Truth, and Lies (15 min)
Consistency and Consensus
Getting all nodes to agree on something.
Part III: Derived Data
Batch Processing
Processing large amounts of data offline.
Batch Processing with Unix Tools (10 min)
MapReduce and Distributed Filesystems (20 min)
Beyond MapReduce (15 min)
Stream Processing
Processing data as it arrives.
The Future of Data Systems
Where are we going from here?