Cancer Genomics

A live mutation atlas across cancers.

Built for a bioinformatics course, this interactive atlas maps mutation frequency across major cancer types using the NCI Genomic Data Commons (GDC) API. It queries the GDC live from the browser for five key oncogenes and tumor suppressors (TP53, KRAS, EGFR, PIK3CA, and PTEN) across TCGA cohorts, so the numbers reflect current GDC data rather than a static snapshot.
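As a rough illustration of the kind of request such an atlas might send, the sketch below builds a GDC-style filter for one gene in one project and encodes it into a query URL. It is written in Python for brevity, although the atlas itself runs in the browser, and it makes no network call. The endpoint name `ssm_occurrences` and the two field names are assumptions that should be checked against the GDC API documentation; the helper names are hypothetical, and none of this is the project's actual code.

```python
import json
from urllib.parse import urlencode

GDC_BASE = "https://api.gdc.cancer.gov"

# The five genes named in the project description.
GENES = ["TP53", "KRAS", "EGFR", "PIK3CA", "PTEN"]

# ASSUMPTION: endpoint and field names are illustrative guesses;
# verify them against the GDC API documentation before use.
ENDPOINT = "ssm_occurrences"
GENE_FIELD = "ssm.consequence.transcript.gene.symbol"
PROJECT_FIELD = "case.project.project_id"


def build_filter(gene: str, project_id: str) -> dict:
    """Return a nested filter: gene matches AND project matches."""
    return {
        "op": "and",
        "content": [
            {"op": "in", "content": {"field": GENE_FIELD, "value": [gene]}},
            {"op": "in", "content": {"field": PROJECT_FIELD, "value": [project_id]}},
        ],
    }


def build_query_url(gene: str, project_id: str) -> str:
    """Encode the filter as a JSON string in the `filters` query parameter."""
    params = {
        "filters": json.dumps(build_filter(gene, project_id)),
        "size": "0",  # request no hit records, only response metadata
    }
    return f"{GDC_BASE}/{ENDPOINT}?{urlencode(params)}"


if __name__ == "__main__":
    # TCGA-PAAD is the TCGA pancreatic adenocarcinoma project.
    for gene in GENES:
        print(build_query_url(gene, "TCGA-PAAD"))
```

Keeping the filter construction separate from the URL encoding makes it easy to reuse the same filter for a different endpoint or to send it in a POST body instead.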

Two comparison modes, raw counts and normalized rates, address a real limitation of raw mutation counts: large cohorts tend to show more mutated cases simply because they contain more patients. Toggling to the normalized rate view divides the number of mutated cases by cohort size, letting you compare TP53 prevalence in a 500-patient cohort against a 50-patient cohort on equal footing. A pinned second-gene comparison adds another layer: you can hold a cohort fixed and see how two genes’ mutation patterns differ within the same project.
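The normalization itself is a single division. A minimal sketch, assuming each cohort is summarized by a count of mutated cases and a total cohort size; the function name and the example counts are made up for illustration and are not real GDC figures:

```python
def mutation_rate(mutated_cases: int, cohort_size: int) -> float:
    """Fraction of cases in a cohort carrying a mutation in the gene."""
    if cohort_size <= 0:
        raise ValueError("cohort_size must be positive")
    if not 0 <= mutated_cases <= cohort_size:
        raise ValueError("mutated_cases must be between 0 and cohort_size")
    return mutated_cases / cohort_size


# Hypothetical counts: the larger cohort has more mutated cases in
# absolute terms, but the smaller cohort has the higher rate.
large = mutation_rate(mutated_cases=150, cohort_size=500)
small = mutation_rate(mutated_cases=30, cohort_size=50)

print(f"large cohort: {large:.0%}")
print(f"small cohort: {small:.0%}")
```

With these made-up numbers, raw counts would rank the 500-patient cohort first (150 versus 30 mutated cases), while the rate view ranks the 50-patient cohort first.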

The project also includes a pancreatic cancer case study that walks from cohort selection through mutation-rate interpretation to a peptide-window comparison — narrowing from “which genes are mutated here” down to a specific region of a protein sequence, the kind of drill-down that an antigen or therapeutic-target analysis actually starts from.
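One way to picture the peptide-window step is to cut a fixed number of residues on each side of a mutated position and compare the reference window with the mutant one. The sketch below assumes a single amino-acid substitution and 1-based protein positions; the function names are hypothetical, the sequence is a toy string rather than a real protein, and the case study's actual windowing rules may differ.

```python
def peptide_window(sequence: str, position: int, flank: int) -> str:
    """Return residues within `flank` of a 1-based position, clipped at the ends."""
    if not 1 <= position <= len(sequence):
        raise ValueError("position is outside the sequence")
    if flank < 0:
        raise ValueError("flank must be non-negative")
    start = max(0, position - 1 - flank)
    end = min(len(sequence), position + flank)
    return sequence[start:end]


def apply_substitution(sequence: str, position: int, ref: str, alt: str) -> str:
    """Replace the residue at a 1-based position, checking the reference residue."""
    if not 1 <= position <= len(sequence):
        raise ValueError("position is outside the sequence")
    if sequence[position - 1] != ref:
        raise ValueError("reference residue does not match the sequence")
    return sequence[: position - 1] + alt + sequence[position:]


# Toy sequence for illustration only; not a real protein.
reference = "ACDEFGHIKLMNPQRSTVWY"
mutant = apply_substitution(reference, position=10, ref="L", alt="R")

print(peptide_window(reference, position=10, flank=3))  # HIKLMNP
print(peptide_window(mutant, position=10, flank=3))     # HIKRMNP
```

Checking the reference residue before substituting catches off-by-one errors between 0-based string indices and 1-based protein coordinates.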