Exercise 6: Fitting models of discrete character evolution
For this exercise we will fit models of discrete character evolution. We can use a dataset from a paper by Brandley et al. (2008).
First, load the data files into R. You can download the files here:
library(ape)
library(phytools)
sqData<-read.csv("brandley_table.csv")
sqTree<-read.nexus("squamate.tre")
plotTree(sqTree,type="fan",lwd=1,fsize=0.5)
The data set that we have is composed of measurements for continuous & discrete characters. E.g.:
head(sqData)
## Species Family SVL TL SE FLL HLL Fingers Toes N
## 1 Agamodon anguliceps Amphisbaenia 74.7 8.3 3.1 0.0 0 0 0 1
## 2 Amphisbaena alba Amphisbaenia 474.0 40.1 12.6 0.0 0 0 0 6
## 3 Bipes biporus Amphisbaenia 167.2 16.0 3.6 6.1 0 5 0 6
## 4 Bipes caniculatus Amphisbaenia 182.5 29.6 4.5 6.5 0 4 0 3
## 5 Bipes tridactylus Amphisbaenia 127.5 36.0 3.7 6.0 0 3 0 2
## 6 Blanus cinereus Amphisbaenia 174.9 21.8 4.1 0.0 0 0 0 5
## Ecology Source PC1 PC2 PC3 Morph Ecology.1
## 1 burrowing Zug et al. (2001) -0.880 -0.410 -2.726 1 1
## 2 burrowing Zug et al. (2001) -1.078 2.057 -1.055 1 1
## 3 burrowing Zug et al. (2001) -0.021 0.566 -2.780 1 1
## 4 burrowing Zug et al. (2001) -0.220 0.634 -1.932 1 1
## 5 burrowing Zug et al. (2001) -0.418 0.028 -1.522 1 1
## 6 burrowing Zug et al. (2001) -1.132 0.296 -1.673 1 1
Now we're ready to fit our models. We will do this using the function
fitDiscrete
in the R package 'geiger'.
library(geiger)
We will model the number of digits in the hindfoot. In class, we can try to use
fitMk
, just because it runs a little faster (though it may be less accurate as
a consequence):
species<-sqData$Species
species<-gsub(" ","_",species)
toes<-setNames(as.character(round(sqData$Toes)),
species)
fitER<-fitDiscrete(sqTree,toes,model="ER")
## Warning in treedata(phy, dat): The following tips were not found in 'phy' and were dropped from 'data':
## Gonatodes_albogularis
## Lepidophyma_flavimaculatum
## Trachyboa_boulengeri
print(fitER,digits=3)
## GEIGER-fitted comparative model of discrete data
## fitted Q matrix:
## 0 1 2 3 4 5
## 0 -0.004580 0.000916 0.000916 0.000916 0.000916 0.000916
## 1 0.000916 -0.004580 0.000916 0.000916 0.000916 0.000916
## 2 0.000916 0.000916 -0.004580 0.000916 0.000916 0.000916
## 3 0.000916 0.000916 0.000916 -0.004580 0.000916 0.000916
## 4 0.000916 0.000916 0.000916 0.000916 -0.004580 0.000916
## 5 0.000916 0.000916 0.000916 0.000916 0.000916 -0.004580
##
## model summary:
## log-likelihood = -235.614300
## AIC = 473.228600
## AICc = 473.244225
## free parameters = 1
##
## Convergence diagnostics:
## optimization iterations = 100
## failed iterations = 0
## frequency of best fit = 1.00
##
## object summary:
## 'lik' -- likelihood function
## 'bnd' -- bounds for likelihood search
## 'res' -- optimization iteration summary
## 'opt' -- maximum likelihood parameter estimates
plot(fitER)
title(main="Fitted 'ER' model\nfor number of digits in hindfoot")
Or, we can try an ordered model:
model<-matrix(c(0,1,0,0,0,0,
2,0,3,0,0,0,
0,4,0,5,0,0,
0,0,6,0,7,0,
0,0,0,8,0,9,
0,0,0,0,10,0),6,6)
rownames(model)<-colnames(model)<-0:5
fitOrdered<-fitDiscrete(sqTree,toes,model=model)
## Warning in treedata(phy, dat): The following tips were not found in 'phy' and were dropped from 'data':
## Gonatodes_albogularis
## Lepidophyma_flavimaculatum
## Trachyboa_boulengeri
## Warning in fitDiscrete(sqTree, toes, model = model): Parameter estimates appear at bounds:
## q13
## q14
## q15
## q16
## q24
## q25
## q26
## q31
## q35
## q36
## q41
## q42
## q46
## q51
## q52
## q53
## q61
## q62
## q63
## q64
print(fitOrdered,digits=3)
## GEIGER-fitted comparative model of discrete data
## fitted Q matrix:
## 0 1 2 3 4 5
## 0 -5.34e-17 5.34e-17 0.00e+00 0.0000 0.0000 0.000
## 1 2.11e-02 -2.11e-02 1.09e-15 0.0000 0.0000 0.000
## 2 0.00e+00 8.02e-02 -3.06e+00 2.9843 0.0000 0.000
## 3 0.00e+00 0.00e+00 4.22e+00 -4.2527 0.0295 0.000
## 4 0.00e+00 0.00e+00 0.00e+00 0.0649 -4.2881 4.223
## 5 0.00e+00 0.00e+00 0.00e+00 0.0000 0.2411 -0.241
##
## model summary:
## log-likelihood = -203.914844
## AIC = 427.829687
## AICc = 428.720376
## free parameters = 10
##
## Convergence diagnostics:
## optimization iterations = 100
## failed iterations = 0
## frequency of best fit = 0.01
##
## object summary:
## 'lik' -- likelihood function
## 'bnd' -- bounds for likelihood search
## 'res' -- optimization iteration summary
## 'opt' -- maximum likelihood parameter estimates
plot(fitOrdered,show.zeros=FALSE,signif=4)
title(main="Fitted ordered model\nfor number of digits in hindfoot")
Finally, the most parameter rich model:
fitARD<-fitDiscrete(sqTree,toes,model="ARD")
## Warning in treedata(phy, dat): The following tips were not found in 'phy' and were dropped from 'data':
## Gonatodes_albogularis
## Lepidophyma_flavimaculatum
## Trachyboa_boulengeri
print(fitARD,digits=3)
## GEIGER-fitted comparative model of discrete data
## fitted Q matrix:
## 0 1 2 3 4 5
## 0 -8.23e-07 7.47e-09 2.68e-07 2.20e-07 1.30e-07 1.98e-07
## 1 1.51e-02 -1.51e-02 7.90e-07 4.34e-07 2.84e-07 5.33e-07
## 2 9.13e-04 3.14e-02 -3.26e-02 1.39e-06 1.29e-05 2.95e-04
## 3 1.39e-02 5.81e-07 5.14e-03 -4.81e-02 1.38e-05 2.90e-02
## 4 2.23e-08 2.10e-02 1.35e-05 1.43e-02 -8.87e-02 5.35e-02
## 5 2.57e-13 2.97e-07 9.17e-04 4.13e-04 4.57e-03 -5.90e-03
##
## model summary:
## log-likelihood = -194.252862
## AIC = 448.505723
## AICc = 456.699556
## free parameters = 30
##
## Convergence diagnostics:
## optimization iterations = 100
## failed iterations = 0
## frequency of best fit = 0.01
##
## object summary:
## 'lik' -- likelihood function
## 'bnd' -- bounds for likelihood search
## 'res' -- optimization iteration summary
## 'opt' -- maximum likelihood parameter estimates
plot(fitARD,show.zeros=FALSE,signif=4)
title(main="Fitted 'ARD' model\nfor number of digits in hindfoot")
And, the AIC values (& weights) from our fitted models:
aicc<-setNames(
c(fitER$opt$aicc,fitOrdered$opt$aicc,
fitARD$opt$aicc),
c("ER","Ordered","ARD"))
aic.w(aicc)
## ER Ordered ARD
## 0.00000000 0.99999916 0.00000084
This result suggests that a model in which digits can be lost & regained, but in an ordered fashion, fits best by far.
Written by Liam J. Revell & Luke J. Harmon. Last updated 11 December 2017.