Preamble
Cancer data science is a vast field with a considerable history and impact. The materials assembled in this book aim to give a flavor of the questions asked, and the methods used, by researchers in cancer epidemiology, biology, and treatment.
Fundamental questions arise whenever we contemplate sources of cancer risk and public health responses to cancer.
How should we count cancers and communicate this information to the public and to decision-makers?
What features of environment and lifestyle affect cancer risk? Are geographic variations in the distribution of cancer types related to pollution or to lifestyle patterns that vary by region?
How does progress in methods of cell biology help us to pinpoint molecular processes underlying cancer and its spread?
What kinds of experiments are needed to create and improve effective treatments for cancer?
In this book we take advantage of the R language, the CRAN repository of R packages for general data science, and the Bioconductor project that provides many R packages and data resources for genomic data science. Relevant texts are:
Peter Dalgaard, Introductory Statistics with R
Gareth James, Daniela Witten et al., An Introduction to Statistical Learning
The Bioconductor website and support site may also be helpful.