Adding covariates to methylated ChAMP packages

Recently, in the analysis of methylation data, it was found that the input end of ChAMP package was beta (methylation data) and pheno (phenotypic data), and the input end of correction covariates (such as age, gender, batch, etc.) could not be found.

The following code shows:

champ.DMP(beta = myNorm,
              pheno = myLoad$pd$Sample_Group,
              compare.group = NULL,
              adjPVal = 0.05,
              adjust.method = "BH",
              arraytype = "450K")

Check the code of the ChAMP package and find that the ChAMP package calls the limma function.

Therefore, this is a better solution.

You only need to modify the model from the original code.

Here's how to modify the model.

1 Create methylated and phenotypic, covariate datasets

beta <- matrix(runif(60,min=0,max=1),10,6)
rownames(beta) <- c("cg18478105","cg09835024","cg14361672","cg01763666","cg12950382","cg02115394","cg25813447","cg07779434","cg13417420","cg12480843")
colnames(beta) <-paste("sample",1:6)
da=data.frame(pheno=c("1","1","1","2","2","2"),SEX=c("1","2","1","1","1","2"),AGE=c(54,23,58,43,68,36))
rownames(da) <- paste("sample",1:6)

The methylation data are as follows:

Note: Bea data is generated randomly, so everyone produces different data.

The pheno type, covariate (SEX, AGE) data are as follows:

2 Modify model, adjust covariates

Here we assume that the covariates to be corrected are SEX and AGE.

I created it in Tian YuanChamp.DMPOn the basis of the function, two input parameters, cov1=da$cov1 and cov2=da$cov2, are added.
stayChamp.DMPFunction has been modified

design <- model.matrix( ~ 0+ factor(p)+cov1+cov2 )
colnames(design) <- c("control", "case","cov1","cov2")
contrast.matrix <- makeContrasts(contrasts=paste(colnames(design)[2:1],collapse="-"), levels = design)

The last modified function is as follows:

champ.DMP1 <- function(beta = myNorm,
                      pheno = da$Sample_Group,
                      cov1=da$cov1,
                      cov2=da$cov2,
                      adjPVal = 0.05,
                      adjust.method = "BH",
                      compare.group = NULL,
                      arraytype = "EPIC")
{
    message("[===========================]")
    message("[<<<<< ChAMP.DMP START >>>>>]")
    message("-----------------------------")

    if(is.null(pheno) | length(unique(pheno))<=1)
    {
        stop("pheno parameter is invalid. Please check the input, pheno MUST contain at least two phenotypes.")
    }else
    {
        message("<< Your pheno information contains following groups. >>")
        sapply(unique(pheno),function(x) message("<",x,">:",sum(pheno==x)," samples."))
        message("[The power of statistics analysis on groups contain very few samples may not strong.]")
    }
	
    if(is.null(compare.group))
    {
        message("You did not assign compare groups. The first two groups: <",unique(pheno)[1],"> and <",unique(pheno)[2],">, will be compared automatically.")
        compare.group <- unique(pheno)[1:2]
    }else if(sum(compare.group %in% unique(pheno))==2)
    {
        message("As you assigned, champ.DMP will compare ",compare.group[1]," and ",compare.group[2],".")
    }else
    {
        message("Seems you did not assign correst compare groups. The first two groups: <",unique(pheno)[1],"> and <",unique(pheno)[2],">, will be compared automatically.")
        compare.group <- unique(pheno)[1:2]
    }
    
    p <- pheno[which(pheno %in% compare.group)]
    beta <- beta[,which(pheno %in% compare.group)]
   design <- model.matrix( ~ 0+ factor(p)+cov1+cov2 )
	colnames(design) <- c("control", "case","cov1","cov2")
    contrast.matrix <- makeContrasts(contrasts=paste(colnames(design)[2:1],collapse="-"), levels = design)
    message("\n<< Contrast Matrix >>")
    print(contrast.matrix)

    message("\n<< All beta, pheno and model are prepared successfully. >>")
	
	fit <- lmFit(beta, design)
	fit2 <- contrasts.fit(fit,contrast.matrix)
	tryCatch(fit3 <- eBayes(fit2),
      warning=function(w) 
      {
      	stop("limma failed, No sample variance.\n")
      })
    DMP <- topTable(fit3,coef=1,number=nrow(beta),adjust.method=adjust.method)
    message("You have found ",sum(DMP$adj.P.Val <= adjPVal), " significant MVPs with a ",adjust.method," adjusted P-value below ", adjPVal,".")
    message("\n<< Calculate DMP successfully. >>")

    if(arraytype == "EPIC") data(probe.features.epic) else data(probe.features)
    com.idx <- intersect(rownames(DMP),rownames(probe.features))
    avg <-  cbind(rowMeans(beta[com.idx,which(p==compare.group[1])]),rowMeans(beta[com.idx,which(p==compare.group[2])]))
    avg <- cbind(avg,avg[,2]-avg[,1])
    colnames(avg) <- c(paste(compare.group,"AVG",sep="_"),"deltaBeta")
    DMP <- data.frame(DMP[com.idx,],avg,probe.features[com.idx,])

    message("[<<<<<< ChAMP.DMP END >>>>>>]")
    message("[===========================]")
    message("[You may want to process DMP.GUI() or champ.GSEA() next.]\n")
    return(DMP)
}

3. UseChamp.DMP1Analyzing methylation data

champ.DMP1 <- champ.DMP1(beta =beta,pheno = da$pheno, cov1=da$SEX,cov2=da$AGE, adjust.method = "BH", arraytype = "EPIC")

beta is the methylation data generated in the first step, and pheno, SEX, AGE on the da data frame are also generated in the first step.
arraytype = "EPIC" indicates that the methylation data type is 850K.If it is "450K" data, change "EPIC" to "450K";

4. Modification Skills

The example given above is to correct two covariates. If you want to correct three covariates, make the following three changes:

First change:

champ.DMP1 <- function(beta = myNorm,
                      pheno = da$Sample_Group,
                      cov1=da$cov1,
                      cov2=da$cov2,
                      cov3=da$cov3,
                      adjPVal = 0.05,
                      adjust.method = "BH",
                      compare.group = NULL,
                      arraytype = "EPIC")

Second change:

design <- model.matrix( ~ 0+ factor(p)+cov1+cov2+cov3 )
colnames(design) <- c("control", "case","cov1","cov2","cov3")
contrast.matrix <- makeContrasts(contrasts=paste(colnames(design)[2:1],collapse="-"), levels = design)

Third change:

champ.DMP1 <- champ.DMP1(beta =beta,pheno = da$pheno, cov1=da$SEX,cov2=da$AGE, cov3=da$cov3,adjust.method = "BH", arraytype = "EPIC")

5. Acknowledgements

Thanks to the original author tian yuan (this package is all-round!);

Thank you for sharing the introductory methylation analysis exercise: General analysis process of methylation chip

Suggest that you can see the video of Jianming in station B, which is very detailed for students who are just getting started with methylation.

Posted on Tue, 19 May 2020 16:54:03 -0700 by Bazzaah