Skip to content

Commit

Permalink
version 2.0.0
Browse files Browse the repository at this point in the history
  • Loading branch information
Mathieu Gautier authored and cran-robot committed Aug 1, 2016
1 parent 8a3678f commit 97e6ea5
Show file tree
Hide file tree
Showing 57 changed files with 50,723 additions and 45,872 deletions.
8 changes: 8 additions & 0 deletions ChangeLog
Original file line number Diff line number Diff line change
@@ -1,3 +1,11 @@
Changes from Version 1.13 to 2.0.0 [July 30 2016]

* Major modification of the algorithm to explore haplotype variability (more than one order of magnitude faster), including parallelization
* Major modification of the data2haplohh function to improve reading efficiency and allele recoding. In addition, a new input data file format (haplotype.in.columns) is now available
* Computation of xp-EHH
* A vignette that details how to use the package is now included in the package
* Several other minor modifications in other functions

Changes from Version 1.12 to 1.13 [May 13 2015]

* Removing smartlegend() calls (deprecated in new version of gplot) in calc_ehh.R and fistribplot.R
Expand Down
12 changes: 7 additions & 5 deletions DESCRIPTION
Original file line number Diff line number Diff line change
@@ -1,18 +1,20 @@
Package: rehh
Maintainer: Mathieu Gautier <[email protected]>
Author: Mathieu Gautier and Renaud Vitalis
Version: 1.13
Author: Mathieu Gautier, Alexander Klassmann and Renaud Vitalis
Version: 2.0.0
License: GPL (>= 2)
Title: Searching for Footprints of Selection using Haplotype
Homozygosity Based Tests
Description: Functions for the detection of footprints of selection on
dense SNP data using Extended Homozygosity Haplotype (EHH)
based tests. The package includes computation of EHH, iHS
(within population) and Rsb (across pairs of populations)
(within population) and Rsb or XP-EHH (across pairs of populations)
statistics. Various plotting functions are also included to
facilitate visualization and interpretation of the results.
Depends: R (>= 2.10), gplots, methods
Suggests: knitr, rmarkdown
VignetteBuilder: knitr
NeedsCompilation: yes
Packaged: 2015-05-13 14:36:55 UTC; mathieu
Packaged: 2016-08-01 12:46:28 UTC; mathieu
Repository: CRAN
Date/Publication: 2015-05-13 19:54:58
Date/Publication: 2016-08-01 15:35:27
88 changes: 51 additions & 37 deletions MD5
Original file line number Diff line number Diff line change
@@ -1,42 +1,56 @@
6984428c4aa45239a5a0b97e84240d5b *ChangeLog
eb7e311e133b8ff8b8f1dee961730365 *DESCRIPTION
bcdae995c3c93898c7f6adc04767cc5c *NAMESPACE
0d7cb76701f9400146447856aa473fd8 *ChangeLog
71620313c7e3a4866190332e8d1d5883 *DESCRIPTION
05b360a82b2bd675ab11bd88a90b393c *NAMESPACE
66b19d927cd1910977ea9a424e808f09 *R/bifurcation.diagram.R
67d44ad2dea81d0183e1cab53a577563 *R/calc_ehh.R
f2ff441060f34c66e0ae54153d83468f *R/calc_ehhs.R
ae6d29a3c3fd66aba573924440851407 *R/data2haplohh.R
b4bc9634a424e3aa52346765577a964d *R/distribplot.R
85a4171cae44228896492839af39bb68 *R/ies2rsb.R
b76e82e3b88c8c45377c0cc854bff0d2 *R/ihh2ihs.R
662baf5629173558243cd7219a1926dd *R/ihsplot.R
dd71cccaae6eed93076e2483a8dfecfd *R/rsbplot.R
c5825d26fe79dba43c63b89b1a30fad0 *R/scan_hh.R
699746379ea262def40efbc56c679658 *R/calc_ehh.R
cbed731420d9fcd1e9bab0b57e799d37 *R/calc_ehhs.R
48ca2d501fa11855708c042a2a5f7adf *R/data2haplohh.R
8e7ac48d09bd3f9caaec529f51828418 *R/distribplot.R
87f5b93f5e1e93ef0acf6e1fdb80e052 *R/ies2rsb.R
67af403e866224c046dbb725f8c20053 *R/ies2xpehh.R
236c2ad942460aa02c508fb85b2d930d *R/ihh2ihs.R
2991884ead6039f9a7c5fb4fea3949a8 *R/ihsplot.R
0100f29f5ec45e2de523dea58eda3f83 *R/rsbplot.R
827eb56411c08141b2f3f9a86305202c *R/scan_hh.R
0a5b13c5112b4f13eb371ae329e82dd0 *R/xpehhplot.R
56ca3e289b27856b18d0d191721240e8 *build/vignette.rds
df37a969a6c3282c7fd1be1a2690cfc5 *data/datalist
4ff9b9f05e00e5a6555311a35f2ea999 *data/haplohh_cgu_bta12.RData
ddd497430735de0aebb28752e2e65b6c *data/wgscan.cgu.RData
9ad082e4c440f44cb88ed1be370cef9e *data/wgscan.eut.RData
47039ecd7d04ba08df80ffe3869b0309 *inst/CITATION
c7450d209e5e28fd010cbf819daff970 *inst/bta12_cgu.hap
967ad01ab376789623691df52f9da6c7 *inst/bta12_hapguess_switch.out
0f0a48a088e9396eb5f910071752404d *inst/map.inp
6da19214c71e247efd27581fa6333c42 *data/haplohh_cgu_bta12.RData
62937dadac4fb0efc170e763e074b8bd *data/wgscan.cgu.RData
a61b7ace8362a7a7d238b1a84da38c13 *data/wgscan.eut.RData
350e782fc3afbab45c52656e2e5294d0 *inst/CITATION
db3e62212c980a90891cac657b25826d *inst/bta12_cgu.hap
d8bb7a5c602165e4422327930f6bbdeb *inst/bta12_cgu.thap
5cf68acec6f7b58ba857662a3784fe60 *inst/bta12_hapguess_switch.out
c5f1e79e086517d85b4460dde32edba5 *inst/doc/rehh.R
904b52d7f42d303d82f763336fe02422 *inst/doc/rehh.Rmd
2ce34fb320d6da7868de232031337767 *inst/doc/rehh.pdf
e7104828e7c42393294e3c79dd6d5c95 *inst/map.inp
40e6f7e1e4b3c42113bd33fc7bf47e07 *man/bifurcation.diagram.Rd
77147d026fc337a849783cad1f2ee3d4 *man/calc_ehh.Rd
a6e674c9ce11f2f0a6d83cda2c072590 *man/calc_ehhs.Rd
986b3fc2f01e6eb2d4d4896154ff5ea5 *man/data2haplohh.Rd
ac1505aaaa9b9fbbc96f234216acdb13 *man/distribplot.Rd
44b27e0a791be5fafb566e8a4d73f5f6 *man/haplohh-class.Rd
1b224df9a4cf1f8c625858006a4fcbdc *man/calc_ehh.Rd
bf9d680f4af94b3870cb967aa585bed6 *man/calc_ehhs.Rd
38995eada0ef7f7e02f12ca2543ff1e3 *man/data2haplohh.Rd
c9828e94efc2e4ce70235e7adc2dede8 *man/distribplot.Rd
d6781ea56e68cec123287ef0ebdd18f3 *man/haplohh-class.Rd
b95c45a34dbd6b6480e768dd7844c404 *man/haplohh_cgu_bta12.Rd
02cb75fb3960c33d63f148d012bc54b2 *man/ies2rsb.Rd
fc13e25715051cf157c81858b5c73db2 *man/ihh2ihs.Rd
9104fbe8964e64d51965e7d646695bed *man/ihsplot.Rd
6ea84dda9606bc6a6faf06140d5fd815 *man/ies2rsb.Rd
549af2380abeca0f4a25ee66c4666a33 *man/ies2xpehh.Rd
116899717cb600e3c88b2a687f9311c6 *man/ihh2ihs.Rd
4d7e53156b8477334cb1e6b8c8c0d259 *man/ihsplot.Rd
d03d38fed02ff83e81198daa01c1f802 *man/make.example.files.Rd
916d9427d0fd783259f206c2a2fc4b0a *man/rehh-package.Rd
7049b50f122d7176097541aa0890dbb7 *man/rsbplot.Rd
129675738cf3ca416e8afba38115a760 *man/scan_hh.Rd
e1386dbc371f9e02682db399a757b92f *man/wgscan.cgu.Rd
9554d23c2d20e5e2ec02c6bdb5911c31 *man/wgscan.eut.Rd
24e42067c1759e98d35bf5d5f7830df6 *src/ehh_utils.c
8990d49dfe459f062f81582dbf258a4e *src/ehh_utils.h
f41f1d0cc587e35bfe4b3f8f5932df85 *src/r_ehh.c
bf70ef94c905a2a24fff8a51df7df98e *src/r_ehhs.c
f7dbd9d25463fdbbd04c356a75a5213d *src/r_scan_hh.c
93822329242be4b66b8275040740e582 *man/rehh-package.Rd
b9a5558a7be81080e0af61be87b00e8d *man/rsbplot.Rd
83db5b58f86b554d857b6cd60b016875 *man/scan_hh.Rd
b22f317e8431f2045fa5634da6c35ee0 *man/wgscan.cgu.Rd
abe08226b2cf658641fafc7cbe1bbb46 *man/wgscan.eut.Rd
44502b4653e8932a5bf74880ccd57331 *man/xpehhplot.Rd
8a5e5d619066ed0ed9cf639954215fc0 *src/CALL_EHH.c
b41d4ec2f8e6eb8ee0b212e20207ea10 *src/CALL_EHHS.c
dc32d80f034a1d49cbd3bef501d5f005 *src/CALL_SCAN_HH.c
95e3011e37d9dde0d75f3a3819b2acd3 *src/Makevars
361ae38b8fca7de155202143366bc241 *src/hh_utils.c
b223f181cf1e51b26fcce58242e451d2 *src/hh_utils.h
80ac2e78110aa3e54fb5ce824bd5574f *vignettes/genetics.csl
904b52d7f42d303d82f763336fe02422 *vignettes/rehh.Rmd
1e78f3b2a2af30127833e772ce889507 *vignettes/rehh.html
5eb01f115834ab23893ed2278d9e1b20 *vignettes/vignette.bib
7 changes: 7 additions & 0 deletions NAMESPACE
Original file line number Diff line number Diff line change
Expand Up @@ -7,11 +7,18 @@ export(calc_ehhs)
export(data2haplohh)
export(distribplot)
export(ies2rsb)
export(ies2xpehh)
export(ihh2ihs)
export(ihsplot)
export(make.example.files)
export(rsbplot)
export(scan_hh)
export(xpehhplot)

import(gplots)
import(methods)
importFrom("grDevices", "dev.new")
importFrom("graphics", "abline", "axis", "curve", "legend", "lines","matplot", "par", "plot", "plot.new")
importFrom("stats", "density", "dnorm", "median", "pnorm", "qqnorm","sd")
importFrom("utils", "read.table", "tail")

96 changes: 58 additions & 38 deletions R/calc_ehh.R
Original file line number Diff line number Diff line change
@@ -1,39 +1,59 @@
calc_ehh<-function(haplohh,mrk,limhaplo=2,limehh=0.05,plotehh=TRUE,main_leg="EHH plot"){
if(!(is.haplohh(haplohh))){stop("Data oject is not of valid haplohh object... (see data2haplohh() function)")}
if(mrk<1 | mrk>haplohh@nsnp){stop(paste("Focal snp index must be between",1,"and",haplohh@nsnp))}
if(limhaplo<2){stop("limhaplo must be >1")}
if(limehh<0 | limehh>1){stop("limehh must be between 0 and 1")}

ehh<-matrix(0,nrow=haplohh@nsnp,ncol=2) ; nhaplo_eval<-matrix(0,nrow=haplohh@nsnp,ncol=2) ; ihh<-rep(0,2)
res.ehh<-.C("r_ehh",
Rdata = as.integer(haplohh@haplo),
number_SNPs = as.integer(haplohh@nsnp),
number_chromosomes = as.integer(haplohh@nhap),
focal_SNP = as.integer(mrk),
map = as.double(haplohh@position),
number_haplotypes = as.integer(nhaplo_eval),
EHH = as.double(ehh),
IHH = as.double(ihh),
min_number_haplotypes = as.integer(limhaplo),
min_EHH = as.double(limehh)
)

ehh=matrix(res.ehh$EHH,2,haplohh@nsnp,byrow=T)
nhaplo_eval=matrix(res.ehh$number_haplotypes,2,haplohh@nsnp,byrow=T)
rownames(ehh)=rownames(nhaplo_eval)=c("Anc. Allele","Der. Allele")
colnames(ehh)=colnames(nhaplo_eval)=haplohh@snp.name
ihh=res.ehh$IHH ; names(ihh)=c("Anc. Allele","Der. Allele")

if(plotehh){
sel_reg<-(colSums(nhaplo_eval)>0)
if(sum(sel_reg)>0){
matplot(haplohh@position[sel_reg],t(ehh[,sel_reg]),col=c("red","blue"),lty=1,
type="l",main=main_leg,bty="n",xlab="Position",ylab="EHH")
abline(v=haplohh@position[mrk],lty=2)
legend("left","top",c("Anc. Allele","Der. Allele"),col=c("red","blue"),bty="n",lty=1)
}
}

return(list(ehh=ehh,nhaplo_eval=nhaplo_eval,freq_all1=nhaplo_eval[1,mrk]/sum(nhaplo_eval[,mrk]),ihh=ihh))

calc_ehh <- function(haplohh,mrk,limhaplo = 2,limehh = 0.05,plotehh = TRUE,lty = 1,lwd = 1.5,col = c("blue","red"),xlab = "Position",ylab = expression(Extended~haplotype~homozygosity~(italic(EHH))),cex.lab = 1.25,main = NA,cex.main = 1.5) {
if (!(is.haplohh(haplohh))) {stop("The data are not formatted as a valid haplohh object... (see the data2haplohh() function)")}
if (mrk < 1 | mrk > haplohh@nsnp) {stop(paste("The focal SNP index must lie between",1,"and",haplohh@nsnp))}
if (limhaplo < 2) {stop("limhaplo must be larger than 1")}
if (limehh < 0 | limehh > 1) {stop("limehh must lie between 0 and 1")}
ehh <- nhaplo_eval <- matrix(0,nrow = haplohh@nsnp,ncol = 2)
ihh <- rep(0,2)
res.ehh <- .C("CALL_EHH",
Rdata = as.integer(haplohh@haplo),
focal_SNP = as.integer(mrk),
number_SNPs = as.integer(haplohh@nsnp),
number_chromosomes = as.integer(haplohh@nhap),
number_haplotypes = as.integer(nhaplo_eval),
min_number_haplotypes = as.integer(limhaplo),
min_EHH = as.double(limehh),
map = as.double(haplohh@position),
EHH = as.double(ehh),
IHH = as.double(ihh)
)
nhaplo_eval = matrix(res.ehh$number_haplotypes,2,haplohh@nsnp,byrow = T)
ehh = matrix(res.ehh$EHH,2,haplohh@nsnp,byrow = T)
rownames(ehh) = rownames(nhaplo_eval) = c("Ancestral allele","Derived allele")
colnames(ehh) = colnames(nhaplo_eval) = haplohh@snp.name
ihh = res.ehh$IHH
names(ihh) = c("Ancestral allele","Derived allele")
if (plotehh) {
sel_reg <- (colSums(nhaplo_eval) > 0)
if (sum(sel_reg) > 0) {
if (max(haplohh@position[sel_reg]) < 1e3) {
scale <- 1
unit = "(bp)"
}
else if (max(haplohh@position[sel_reg]) < 1e6) {
scale <- 1e3
unit = "(kb)"
}
else if (max(haplohh@position[sel_reg]) < 1e9) {
scale <- 1e6
unit = "(Mb)"
}
else {
scale <- 1e9
unit = "(Gb)"
}
dev.new()
plot.new()
par(mar = c(5,5,4,2) + 0.1)
matplot(haplohh@position[sel_reg] / scale,t(ehh[,sel_reg]),col = col,type = "l",lty = lty,lwd = lwd,main = main,bty = "n",xlab = paste(xlab,unit),ylab = ylab,cex.lab = cex.lab, cex.main = cex.main)
abline(v = haplohh@position[mrk] / scale,lty = 2)
if (haplohh@position[mrk] > sum(range(haplohh@position[sel_reg])) / 2) {
legend("topleft",c("Ancestral allele","Derived allele"),col = col,bty = "n",lty = lty,lwd = lwd)
}
else {
legend("topright",c("Ancestral allele","Derived allele"),col = col,bty = "n",lty = lty,lwd = lwd)
}
}
}
return(list(ehh = ehh,nhaplo_eval = nhaplo_eval,freq_all1 = nhaplo_eval[1,mrk] / sum(nhaplo_eval[,mrk]),ihh = ihh))
}
97 changes: 65 additions & 32 deletions R/calc_ehhs.R
Original file line number Diff line number Diff line change
@@ -1,36 +1,69 @@
calc_ehhs<-function(haplohh,mrk,limhaplo=2,limehhs=0.05,plotehhs=TRUE,main_leg="EHHS plot"){
if(!(is.haplohh(haplohh))){stop("Data oject is not of valid haplohh object... (see data2haplohh() function)")}
if(mrk<1 | mrk>haplohh@nsnp){stop(paste("Focal snp index must be between",1,"and",haplohh@nsnp))}
if(limhaplo<2){stop("limhaplo must be >1")}
if(limehhs<0 | limehhs>1){stop("limehhs must be between 0 and 1")}

nhaplo_eval<-rep(0,haplohh@nsnp) ; ehhs<-rep(0,haplohh@nsnp) ; ies<-0
res.ehhs<-.C("r_ehhs",
Rdata = as.integer(haplohh@haplo),
number_SNPs = as.integer(haplohh@nsnp),
number_chromosomes = as.integer(haplohh@nhap),
focal_SNP = as.integer(mrk),
map = as.double(haplohh@position),
number_haplotypes = as.integer(nhaplo_eval),
EHHS = as.double(ehhs),
IES = as.double(ies),
min_number_haplotypes = as.integer(limhaplo),
min_EHH = as.double(limehhs)
)
calc_ehhs <- function(haplohh,mrk,limhaplo = 2,limehhs = 0.05,plotehhs = TRUE,lty = 1,lwd = 1.5,col = c("blue","red"),xlab = "Position",ylab = expression(Site~specific~italic(EHH)~(italic(EHHS))),cex.lab = 1.25,main = NA,cex.main = 1.5) {

ehhs=res.ehhs$EHHS ; nhaplo_eval=res.ehhs$number_haplotypes
names(ehhs)=names(nhaplo_eval)=haplohh@snp.name

if (!(is.haplohh(haplohh))) {stop("The data are not formatted as a valid haplohh object... (see the data2haplohh() function)")}
if (mrk < 1 | mrk > haplohh@nsnp) {stop(paste("The focal SNP index must lie between",1,"and",haplohh@nsnp))}
if (limhaplo < 2) {stop("limhaplo must be larger than 1")}
if (limehhs < 0 | limehhs > 1) {stop("limehhs must lie between 0 and 1")}

if(plotehhs){
sel_reg<-(nhaplo_eval>0)
if(sum(sel_reg)>0){
plot(haplohh@position[sel_reg],ehhs[sel_reg],col=c("red"),lty=1,
type="l",main=main_leg,bty="n",xlab="Position",ylab="EHHS")
abline(v=haplohh@position[mrk],lty=2)
}
}

return(list(ehhs=ehhs,nhaplo_eval=nhaplo_eval,ies=res.ehhs$IES))
nhaplo_eval <- ehhs_tang <- ehhs_sabeti <- rep(0,haplohh@nsnp)
ies_tang <- ies_sabeti <- 0
res.ehhs <- .C("CALL_EHHS",
Rdata = as.integer(haplohh@haplo),
focal_SNP = as.integer(mrk),
number_SNPs = as.integer(haplohh@nsnp),
number_chromosomes = as.integer(haplohh@nhap),
number_haplotypes = as.integer(nhaplo_eval),
min_number_haplotypes = as.integer(limhaplo),
min_EHHS = as.double(limehhs),
map = as.double(haplohh@position),
EHHS_TANG = as.double(ehhs_tang),
IES_TANG = as.double(ies_tang),
EHHS_SABETI = as.double(ehhs_sabeti),
IES_SABETI = as.double(ies_sabeti)
)

nhaplo_eval = res.ehhs$number_haplotypes
ehhs_tang = res.ehhs$EHHS_TANG
ies_tang = res.ehhs$IES_TANG
ehhs_sabeti = res.ehhs$EHHS_SABETI
ies_sabeti = res.ehhs$IES_SABETI
names(ehhs_tang) = names(ehhs_sabeti) = names(nhaplo_eval) = haplohh@snp.name

if (plotehhs) {
sel_reg <- (nhaplo_eval > 0)
if (sum(sel_reg) > 0) {
if (max(haplohh@position[sel_reg]) < 1e3) {
scale <- 1
unit = "(bp)"
}
else if (max(haplohh@position[sel_reg]) < 1e6) {
scale <- 1e3
unit = "(kb)"
}
else if (max(haplohh@position[sel_reg]) < 1e9) {
scale <- 1e6
unit = "(Mb)"
}
else {
scale <- 1e9
unit = "(Gb)"
}
dev.new()
plot.new()
par(mar = c(5,5,4,2) + 0.1)
plot(haplohh@position[sel_reg] / scale,ehhs_sabeti[sel_reg],col = col[1],type = "l",lty = lty,lwd = lwd,main = main,bty = "n",xlab = paste(xlab,unit),ylab = ylab,cex.lab = cex.lab, cex.main = cex.main)
lines(haplohh@position[sel_reg] / scale,ehhs_tang[sel_reg],col = col[2],lty = lty,lwd = lwd)
abline(v = haplohh@position[mrk] / scale,lty = 2)
if (haplohh@position[mrk] > sum(range(haplohh@position[sel_reg])) / 2) {
legend("topleft",c("Sabeti et al. (2007)","Tang et al. (2007)"),col = col,bty = "n",lty = lty,lwd = lwd)

}
else {
legend("topright",c("Sabeti et al. (2007)","Tang et al. (2007)"),col = col,bty = "n",lty = lty,lwd = lwd)
}
}
}

return(list(nhaplo_eval = nhaplo_eval,EHHS_Tang_et_al_2007 = ehhs_tang,IES_Tang_et_al_2007 = ies_tang,EHHS_Sabeti_et_al_2007 = ehhs_sabeti,IES_Sabeti_et_al_2007 = ies_sabeti))
}

Loading

0 comments on commit 97e6ea5

Please sign in to comment.