主要的get.diff.meth参数
| |参数描述 | |------------|--------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------| | 数据| multiAssayExperiment的DNA甲基化和基因表达数据。参见' createMAE '函数。| | diff.dir |一个字符可以是“hypo”或“hyper”,表示不同的甲基化方向。它可以是“hypo”只选择低甲基化的探针;“hyper”只选择高甲基化探针| | minSubgroupFrac |一个从0到1的数字,指定来自第1组和第2组的极端样本的比例,用于识别差异DNA甲基化。默认值是0.2,因为我们通常希望能够检测出肿瘤中特定的(可能未知的)分子亚型;这些亚型通常只占样本的少数,为了统计能力的目的,选择20%作为下限。如果您使用预定义的组标签,例如处理过的复制与未处理的复制,请使用值1.0(***监督***模式)| | pvalue |一个数字指定显著P值(通过BH调整P值),用于选择显著低甲基化/超甲基化探针。默认为0.01 | |组。col |定义样本组的列。 You can view the available columns using: `colnames(MultiAssayExperiment::colData(data))`. | | group1 | A group from group.col. ELMER will run group1 vs group2. That means, if direction is hyper, get probes hypermethylated in group 1 compared to group 2. | | group2 | A group from group.col. ELMER will run group1 vs group2. That means, if direction is hyper, get probes hypermethylated in group 1 compared to group 2. | | sig.dif | A number specifies the smallest DNA methylation difference as a cutoff for selecting significant hypo/hyper-methylated probes. Default is 0.3. |
{r,eval=TRUE, message=FALSE, warning =FALSE, results = "hide"} sig.diff <- get.diff。冰毒(数据= mae,组。col = "definition", group1 = "原发实体瘤",group2 = "实体组织正常",minSubgroupFrac = 0.2, sig.dif = 0.3, diff.dir = "hypo", #搜索组1核的低甲基化探针= 1,dir。Out ="result", pvalue = 0.01)``` ```{r,eval=TRUE, message=FALSE, warning =FALSE} as.data.frame(sig.diff) %>% datatable(options = list(scrollX =TRUE)) # get.diff.meth自动保存输出文件。csv包含所有探测的统计信息。# getMethdiff.hypo.probes.significant.csv只包含有意义的探针,#与sig.diff相同dir(path = "result", pattern = "getMethdiff")识别假定的探针-基因对这一步是将甲基化改变的增强子探针与表达改变的靶基因联系起来,并报告所选探针的假定靶基因。这是由函数' get.pair '实现的。对于每个甲基化差异的远端探针,检测最近的10个上游基因和最近的10个下游基因,以确定探针甲基化与基因表达之间的相关性。因此,对每个探测执行了20个统计检验,如下所示。对于每个探针基因对,样本(所有实验样本和对照样本)被分为两组:M组,由上甲基化五分位数组成(增强子探针上甲基化最高的20%的样本),U组,由最低甲基化五分位数组成(甲基化最低的20%的样本)。 For each probe-gene pair tested, the raw p-value `Pr` was corrected for multiple hypothesis using a permutation approach. ![Source: Yao, Lijing, et al. "Inferring regulatory element landscapes and transcription factor networks from cancer methylomes." Genome biology 16.1 (2015): 105.](figures/paper_get.pairs.png)
主要得到的。对参数
| |参数描述 | |-------------------|---------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------| | 数据| multiAssayExperiment DNA甲基化和基因表达数据。参见createMAE函数。| | nearGenes |可以是包含GetNearGenes函数输出的列表,也可以是包含' GetNearGenes '函数输出的rda文件路径。| | minSubgroupFrac |一个从0到1.0的数字,指定用于创建U组(未甲基化)和M组(甲基化)的样本百分比,用于将探针连接到基因。默认值是0.4(最低的五分位数样本将在U组中,最高的五分位数样本将在M组中)。| | permu。size |排列次数。缺省值是10000。| | pvalue |用于定义有效值对的原始p值截断值。默认值是0.05。 It will select the significant P value cutoff before calculating the empirical p-values. | | Pe | A number specify the empirical p-value cutoff for defining signficant pairs. Default is 0.001. | | group.col | A column defining the groups of the sample. You can view the available columns using: `colnames(MultiAssayExperiment::colData(data))`. | | group1 | A group from group.col. | | group2 | A group from group.col. | | filter.probes | Should filter probes by selecting only probes that have at least a certain number of samples below and above a certain cut-off. See `preAssociationProbeFiltering` function. | | filter.portion | A number specify the cut point to define binary methlation level for probe loci. Default is 0.3. When beta value is above 0.3, the probe is methylated and vice versa. For one probe, the percentage of methylated and unmethylated samples should be above filter.percentage value. Only used if filter.probes is TRUE. See preAssociationProbeFiltering function. | | filter.percentage | Minimum percentage of samples to be considered in methylated and unmethylated for the filter.portion option. Default 5%. Only used if filter.probes is TRUE. See preAssociationProbeFiltering function. |
{r, eval = TRUE, message = FALSE, warning = FALSE, results = "hide"} nearGenes <- GetNearGenes(data = mae, probes = sig.diff$probe, numFlankingGenes = 20, # 10个上游和10个下游基因cores = 1)Pair <- get。配对(数据= mae,组。col = "definition", group1 = "原发实体瘤",group2 = "实体组织正常",nearGenes = nearGenes, minSubgroupFrac = 0.4, # %的样本用于创建组U/M permu。Dir = "result/permu", permu。pvalue = 0.05, Pe = 0.01, #请设置为0.001,以获得显著结果过滤。probes = TRUE, #见preAssociationProbeFiltering函数filter。百分比= 0.05,过滤器。部分= 0.3,dir。Out = "result", cores = 1, label = "hypo")``` ```{r, eval = TRUE, message = FALSE, warning = FALSE}假设。get %>% datatable(options = list(scrollX = TRUE)) #Pair自动保存输出文件。# getpair . sub .all.pairs.statistic.csv包含所有探针-基因对的统计信息。# getpair . hypos .pairs.significant.csv只包含与hypos .pair相同的重要探测。dir(path = "result", pattern = "getPair")这一步是通过函数get. enrichment . Motif来识别一组探针中的富集Motif。 This function uses a pre-calculated data `Probes.motif` which was generated using HOMER with a $p-value \le 10^{–4}$ to scan a $\pm250bp$ region around each probe using HOmo sapiens COmprehensive MOdel COllection [http://hocomoco.autosome.ru/](HOCOMOCO) v10 [@kulakovskiy2016hocomoco] position weight matrices (PWMs). For each probe set tested (i.e. the list of gene-linked hypomethylated probes in a given group), a motif enrichment Odds Ratio and a 95% confidence interval were calculated using following formulas: $$ p= \frac{a}{a+b} $$ $$ P= \frac{c}{c+d} $$ $$ Odds\quad Ratio = \frac{\frac{p}{1-p}}{\frac{P}{1-P}} $$ $$ SD = \sqrt{\frac{1}{a}+\frac{1}{b}+\frac{1}{c}+\frac{1}{d}} $$ $$ lower\quad boundary\quad of\quad 95\%\quad confidence\quad interval = \exp{(\ln{OR}-SD)} $$ where `a` is the number of probes within the selected probe set that contain one or more motif occurrences; `b` is the number of probes within the selected probe set that do not contain a motif occurrence; `c` and `d` are the same counts within the entire enhancer probe set. A probe set was considered significantly enriched for a particular motif if the 95% confidence interval of the Odds Ratio was greater than 1.1 (specified by option `lower.OR`, 1.1 is default), and the motif occurred at least 10 times (specified by option `min.incidence`. 10 is default) in the probe set. As described in the text, Odds Ratios were also used for ranking candidate motifs.
主要得到的。对参数
|参数|描述||-------------------|-------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------| | 数据从createMAE |多分析实验功能。如果设置并探测。缺少主题/背景探测,这将用于正确获取其他两个参数。此参数不是必需的,您可以设置探测。母题和背景。手动探针。| | probes |一个向量列出探针的名称,用于定义计算motif富集OR和置信区间的探针集合。| |低。OR |比值比95%置信区间的最小下界。优势比95%置信区间下界高于数量的基序为显著富集基序(详见文献)。| |最小发生率|非负整数指定给定探针集中motif的最小发生率。 10 is default. |
{r,eval=TRUE, message=FALSE, warning =FALSE} #识别显著低甲基化探针的富集motif,其中#具有假定的靶基因。充实。Motif <- get. enrichment。motif(数据= mae,探针= Hypo。两美元的探针,dir。Out = "result", label = "hypo", min.incidence = 10, lower。OR = 1.1) names(富贵motif) #富贵motif head(富贵motif["P73_HUMAN.H10MO.A"]) ##给定集合中有TP53 motif的探针。# get. enrichment .motif自动保存输出文件。# getmotif . sub . enrichment .motifs.rda包含了丰富的motif和带有motif的探针。# getmotif . sub .motif.enrichment.csv包含丰富的motif的摘要。dir(path = "result", pattern = "getMotif")enrichment <- read.csv("result/getMotif.hypo.motif.enrichment.csv") motif.enrichment %>% datatable(options = list(scrollX = TRUE)) # motif enrichment figure will be automatically generated. dir(path = "result", pattern = "motif.enrichment.pdf") ``` ### Figure: hypo.all.quality.motif.enrichment.pdf
主要得到的。对参数
| |参数描述 | |--------------------|----------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------| | 数据| multiAssayExperiment DNA甲基化和基因表达数据。参见' createMAE '函数。| |富集。motif |包含get. enrichment .motif函数输出或XX路径的列表。包含get. enrichment .motif函数输出的Rda文件。| |组。col |定义样本组的列。您可以使用:' colnames(MultiAssayExperiment::colData(data)) '查看可用的列。| group1 | group.col中的组。| | group2 | group.col中的组。| | minSubgroupFrac |一个从0到1的数字,指定用于创建U(未甲基化)和M(甲基化)组的样本百分比,用于将探针连接到TF表达。 Default is 0.4 (lowest quintile of all samples will be in the U group and the highest quintile of all samples in the M group). |
{r,eval=TRUE, message=FALSE, warning =FALSE, results = "hide"} ##识别富集motif TF <- get的调控TF。tf (data = mae, group。col = "definition", group1 = "原发实体瘤",group2 = "实体组织正常",minSubgroupFrac = 0.4,富贵。主题=丰富。主题,dir。Out = "result", cores = 1, label = "hypo")``` ```{r,eval=TRUE, message=FALSE, warning =FALSE} TF %>% datatable(options = list(scrollX =TRUE)) # get。tf会自动保存输出文件。# gettf . hypos . tfs .with.motif.pvalue.rda包含了在富基序位点具有平均# DNA甲基化的所有TF的统计数据。# gettf . sub .significant. tfs .with.motif.summary.csv只包含重要探测。dir(path = "result", pattern = "getTF") #基于统计数据的TF排名图将自动生成。dir(path = "result/TFrankPlot_family/", pattern = "pdf")图:FOXJ3_HUMAN.H10MO.S.TFrankPlot.pdf