Statistical Learning for Biomedical Data, 1st Edition, by James D. Malley
Product details
ISBN 10: 0521875803
ISBN 13: 9780521875806
Author: James D. Malley
This book is for anyone who has biomedical data and needs to identify the variables that predict a two-group outcome, such as tumor/not-tumor, survival/death, or response to treatment. Statistical learning machines are ideally suited to these kinds of prediction problems, especially when the variables being studied may not meet the assumptions of traditional techniques. Learning machines come from the world of probability and computer science but are not yet widely used in biomedical research. This introduction brings learning machine techniques to the biomedical world in an accessible way, explaining the underlying principles in nontechnical language and using extensive examples and figures. The authors connect these new methods to familiar techniques by showing how to use the learning machine models to generate smaller, more easily interpretable traditional models. Coverage includes single decision trees, multiple-tree techniques such as Random Forests™, neural nets, support vector machines, nearest neighbors, and boosting.
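As a concrete illustration of the workflow described above (a minimal sketch, not taken from the book), one can fit a learning machine to a two-group outcome, rank the candidate variables by importance, and then refit a smaller, more interpretable traditional model on the top-ranked variables. The sketch assumes Python with scikit-learn and a synthetic dataset; all function choices and parameter values are illustrative only.

import numpy as np
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import cross_val_score

# Synthetic two-group outcome (e.g. tumor / not-tumor) with 20 candidate
# predictors, only a few of which carry signal; purely illustrative data.
X, y = make_classification(n_samples=300, n_features=20, n_informative=4,
                           random_state=0)

# A learning machine: a random forest, with the out-of-bag error used as an
# internal estimate of prediction accuracy.
forest = RandomForestClassifier(n_estimators=500, oob_score=True, random_state=0)
forest.fit(X, y)
print("out-of-bag accuracy:", round(forest.oob_score_, 3))

# Rank the candidate variables by the forest's importance measure
# and keep the four top-ranked ones.
top = np.argsort(forest.feature_importances_)[::-1][:4]

# Refit a smaller, more easily interpretable traditional model
# (logistic regression) using only the top-ranked variables.
logit = LogisticRegression(max_iter=1000)
cv_acc = cross_val_score(logit, X[:, top], y, cv=5).mean()
print("cross-validated accuracy, top variables only:", round(cv_acc, 3))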
Statistical Learning for Biomedical Data, 1st Edition: Table of Contents
Part I Introduction
1 Prologue
1.1 Machines that learn – some recent history
1.2 Twenty canonical questions
1.3 Outline of the book
Part I Introduction
Part II A machine toolkit
Part III Analysis fundamentals
Part IV Machine strategies
1.4 A comment about example datasets
1.5 Software
Note
2 The landscape of learning machines
2.1 Introduction
2.2 Types of data for learning machines
2.3 Will that be supervised or unsupervised?
2.4 An unsupervised example
2.5 More lack of supervision – where are the parents?
2.6 Engines, complex and primitive
2.7 Model richness means what, exactly?
2.8 Membership or probability of membership?
2.9 A taxonomy of machines?
2.10 A note of caution – one of many
2.11 Highlights from the theory
Notes
3 A mangle of machines
3.1 Introduction
3.2 Linear regression
3.3 Logistic regression
3.4 Linear discriminant
3.5 Bayes classifiers – regular and naïve
3.6 Logic regression
3.7 k-Nearest neighbors
3.8 Support vector machines
3.9 Neural networks
3.10 Boosting
3.11 Evolutionary and genetic algorithms
Notes
4 Three examples and several machines
4.1 Introduction
4.2 Simulated cholesterol data
4.3 Lupus data
4.4 Stroke data
4.5 Biomedical means unbalanced
4.6 Measures of machine performance
4.7 Linear analysis of cholesterol data
4.8 Nonlinear analysis of cholesterol data
4.9 Analysis of the lupus data
4.10 Analysis of the stroke data
4.11 Further analysis of the lupus and stroke data
Notes
Part II A machine toolkit
5 Logistic regression
5.1 Introduction
5.2 Inside and around the model
5.3 Interpreting the coefficients
5.4 Using logistic regression as a decision rule
5.5 Logistic regression applied to the cholesterol data
5.6 A cautionary note
5.7 Another cautionary note
5.8 Probability estimates and decision rules
5.9 Evaluating the goodness-of-fit of a logistic regression model
5.10 Calibrating a logistic regression
5.11 Beyond calibration
5.12 Logistic regression and reference models
Notes
6 A single decision tree
6.1 Introduction
6.2 Dropping down trees
6.3 Growing a tree
6.4 Selecting features, making splits
6.5 Good split, bad split
6.6 Finding good features for making splits
6.7 Misreading trees
6.8 Stopping and pruning rules
6.9 Using functions of the features
6.10 Unstable trees?
6.11 Variable importance – growing on trees?
6.12 Permuting for importance
6.13 The continuing mystery of trees
7 Random Forests – trees everywhere
7.1 Random Forests in less than five minutes
7.2 Random treks through the data
7.3 Random treks through the features
7.4 Walking through the forest
7.5 Weighted and unweighted voting
7.6 Finding subsets in the data using proximities
7.7 Applying Random Forests to the stroke data
Definitions
Training results
7.8 Random Forests in the universe of machines
Notes
Part III Analysis fundamentals
8 Merely two variables
8.1 Introduction
8.2 Understanding correlations
8.3 Hazards of correlations
8.4 Correlations big and small
Notes
9 More than two variables
9.1 Introduction
9.2 Tiny problems, large consequences
9.3 Mathematics to the rescue?
9.4 Good models need not be unique
9.5 Contexts and coefficients
9.6 Interpreting and testing coefficients in models
9.7 Merging models, pooling lists, ranking features
Notes
10 Resampling methods
10.1 Introduction
10.2 The bootstrap
10.3 When the bootstrap works
10.4 When the bootstrap doesn’t work
10.5 Resampling from a single group in different ways
10.6 Resampling from groups with unequal sizes
10.7 Resampling from small datasets
10.8 Permutation methods
10.9 Still more on permutation methods
Note
11 Error analysis and model validation
11.1 Introduction
11.2 Errors? What errors?
11.3 Unbalanced data, unbalanced errors
11.4 Error analysis for a single machine
11.5 Cross-validation error estimation
11.6 Cross-validation or cross-training?
11.7 The leave-one-out method
11.8 The out-of-bag method
11.9 Intervals for error estimates for a single machine
11.10 Tossing random coins into the abyss
11.11 Error estimates for unbalanced data
11.12 Confidence intervals for comparing error values
11.13 Other measures of machine accuracy
11.14 Benchmarking and winning the lottery
11.15 Error analysis for predicting continuous outcomes
Notes
Part IV Machine strategies
12 Ensemble methods – let’s take a vote
12.1 Pools of machines
12.2 Weak correlation with outcome can be good enough
12.3 Model averaging
Notes
13 Summary and conclusions
13.1 Where have we been?
13.2 So many machines
13.3 Binary decision or probability estimate?
13.4 Survival machines? Risk machines?
13.5 And where are we going?
Appendix
A1: Software used in this book
Classification and Regression Trees (CART)
k-Nearest Neighbors (k-NN)
Support Vector Machines (SVM)
Fisher Linear Discriminant Analysis (LDA)
Logistic Regression