<aside> 🐨
Assignment 02 - January 21, 2026 https://classroom.github.com/a/fgf2ebz9
</aside>
MemoryError
Dataset vs laptop memory
Chart shows estimated in-memory size; raw on-disk sizes are in the table below.
Health datasets outgrow laptop RAM quickly: a handful of CSVs with vitals, labs, and encounters can exceed 16 GB once loaded. Attempting to “just read the file” leads to thrashing, heavy swap usage, and eventually a Python MemoryError that interrupts the workflow.
| Dataset | Typical raw size | In-memory pandas size | Fits on 16 GB laptop? |
|---|---|---|---|
| Intake forms (CSV) | 250 MB | ~1.2 GB (due to dtype inflation) | ✅ |
| Longitudinal vitals (CSV) | 6 GB | ~14 GB | ⚠️ borderline |
| EHR encounter log (CSV) | 18 GB | ~42 GB | ❌ |
| Imaging metadata (Parquet) | 9 GB | ~9 GB | ⚠️ if other apps closed |
| Claims archive (partitioned Parquet) | 120 GB | streamed | ✅ (with streaming) |
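The gap between the raw and in-memory columns can be checked before committing to a full load. The sketch below is one possible approach, not a prescribed method: it reads only the first rows of a CSV, measures them with `memory_usage(deep=True)`, and extrapolates to the full row count. The helper name `estimate_in_memory_bytes`, the column names, and the tiny synthetic extract are all hypothetical, and the estimate assumes the sample rows are representative of the rest of the file.

```python
import io

import pandas as pd


def estimate_in_memory_bytes(csv_source, total_rows, sample_rows=1_000):
    """Extrapolate the full DataFrame size from the first `sample_rows` rows."""
    sample = pd.read_csv(csv_source, nrows=sample_rows)
    # deep=True also counts the string data behind text columns,
    # which is where most of the inflation over the raw file comes from.
    sample_bytes = int(sample.memory_usage(deep=True).sum())
    per_row = sample_bytes / max(len(sample), 1)
    return int(per_row * total_rows)


# Hypothetical miniature vitals extract, built in memory for illustration.
csv_text = "patient_id,unit,heart_rate\n" + "".join(
    f"P{i:06d},ICU,{60 + i % 40}\n" for i in range(2_000)
)
raw_bytes = len(csv_text.encode("utf-8"))
estimate = estimate_in_memory_bytes(io.StringIO(csv_text), total_rows=2_000)

print(f"raw: {raw_bytes:,} B, estimated in memory: {estimate:,} B")
```

For a real file, pass the path instead of the `StringIO` object and supply a row count from a cheap pass over the file (for example `wc -l`). If the estimate approaches total RAM, the file belongs in the borderline or ❌ rows above.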
- top or Activity Monitor shows Python ballooning toward total RAM
- MemoryError or Killed: 9 messages
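The same growth can be observed from inside Python with the standard-library `tracemalloc` module. This is a minimal sketch under stated assumptions: `peak_python_memory` and `load_rows` are hypothetical helpers, and the list of strings stands in for a real load step. `tracemalloc` only sees allocations made through Python's memory allocators, so the figure reported by top or Activity Monitor can be higher.

```python
import tracemalloc


def peak_python_memory(load):
    """Run `load()` and return (result, peak bytes traced by tracemalloc)."""
    tracemalloc.start()
    try:
        result = load()
        # get_traced_memory() returns (current, peak) in bytes.
        _, peak = tracemalloc.get_traced_memory()
    finally:
        tracemalloc.stop()
    return result, peak


def load_rows(n=50_000):
    # Hypothetical stand-in for reading a file fully into memory.
    return [f"encounter-{i}" for i in range(n)]


rows, peak = peak_python_memory(load_rows)
print(f"{len(rows):,} rows, peak about {peak / 1e6:.1f} MB")
```

Running the helper on a small slice of a dataset first gives a rough sense of how quickly memory climbs, before a full load triggers swapping or a kill signal.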