The material in this course requires R version 3.2 and Bioconductor version 3.2.
stopifnot(
    getRversion() >= '3.2' && getRversion() < '3.3',
    BiocInstaller::biocVersion() == "3.2"
)
Bioconductor
The genomic packages
ScanBamParam()
    which: genomic ranges of interest
    what: 'columns' of the BAM file, e.g., 'seq', 'flag'
BamFile(..., yieldSize = 100000)
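As a minimal sketch of restriction, the following builds a ScanBamParam() that limits input to one genomic region and two BAM fields. The region `roi` and its coordinates are hypothetical, chosen only for illustration. No BAM file is read here; the accessors simply report what the parameter object would request.

```r
library(Rsamtools)
library(GenomicRanges)

## Hypothetical region of interest; the coordinates are illustrative only
roi <- GRanges("chr14", IRanges(19000000, 20000000))

## Restrict input to reads overlapping 'roi', and to the 'seq' and 'flag' fields
param <- ScanBamParam(which = roi, what = c("seq", "flag"))

bamWhich(param)    # ranges that would be queried
bamWhat(param)     # fields that would be read
```

An object like `param` would then be passed as the `param` argument of an input function such as readGAlignments().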
Iterative programming model
Use GenomicFiles::reduceByYield()
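library(GenomicFiles)

yield <- function(bfl) {
    ## input a chunk of alignments
    library(GenomicAlignments)
    readGAlignments(bfl, param = ScanBamParam(what = "seq"))
}

map <- function(aln) {
    ## count G or C nucleotides per read
    library(Biostrings)
    gc <- letterFrequency(mcols(aln)$seq, "GC")
    ## summarize number of reads with 0, 1, ... G or C nucleotides
    tabulate(1 + gc, 73)    # max. read length: 72
}

reduce <- `+`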
Example
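library(RNAseqData.HNRNPC.bam.chr14)
fls <- RNAseqData.HNRNPC.bam.chr14_BAMFILES

## iterate through the first file in chunks of 100000 records
bf <- BamFile(fls[1], yieldSize = 100000)
gc <- reduceByYield(bf, yield, map, reduce)
plot(gc, type = "h",
     xlab = "GC Content per Aligned Read", ylab = "Number of Reads")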
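The yield / map / reduce pattern can also be sketched in base R on toy data, which may help to show what each step contributes. The helper `reduce_by_chunk()`, the `reads` vector, and `map_gc()` are hypothetical names introduced for this illustration. This is a simplified stand-in for the idea, not the GenomicFiles implementation.

```r
## Simplified stand-in for the yield / map / reduce loop (assumes length(x) > 0)
reduce_by_chunk <- function(x, chunk_size, map, reduce) {
    starts <- seq(1, length(x), by = chunk_size)
    result <- NULL
    for (start in starts) {
        end <- min(start + chunk_size - 1, length(x))
        chunk <- x[start:end]        # "yield": take the next chunk
        value <- map(chunk)          # "map": summarize the chunk
        ## "reduce": combine the chunk summary with the running result
        result <- if (is.null(result)) value else reduce(result, value)
    }
    result
}

## Toy reads (hypothetical data), each of length 4
reads <- c("GGCA", "ATAT", "GCGC", "TTGA", "CCCC")

## Count G or C per read, then tabulate reads with 0, 1, ..., 4 G or C
map_gc <- function(chunk) {
    gc <- nchar(gsub("[^GC]", "", chunk))
    tabulate(1 + gc, 5)
}

counts <- reduce_by_chunk(reads, chunk_size = 2, map = map_gc, reduce = `+`)
counts
```

Because each chunk is reduced to a fixed-length vector of counts before the next chunk is taken, memory use depends on the chunk size rather than on the total number of reads.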
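Many problems are embarrassingly parallel, that is, lapply()-like. This is especially true in bioinformatics, where parallel evaluation is typically across files.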
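As a minimal illustration of the lapply()-like pattern, the sketch below runs the same function over the same inputs with lapply() and with BiocParallel::bplapply(). The function `square()` is a hypothetical placeholder for real per-file work, and the back-end is whatever BiocParallel has registered by default on the machine.

```r
library(BiocParallel)

## Hypothetical placeholder for an expensive, independent per-file computation
square <- function(i) i^2

serial <- lapply(1:4, square)        # one element at a time
parallel <- bplapply(1:4, square)    # same call shape, default registered back-end

identical(serial, parallel)
```

Each element is processed independently of the others, so moving from serial to parallel evaluation only requires changing the function that drives the iteration.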
Example: GC content in several BAM files
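library(BiocParallel)
gc <- bplapply(BamFileList(fls), reduceByYield, yield, map, reduce)

library(ggplot2)
## cumulative number of reads per file, in long format for plotting
df <- stack(as.data.frame(lapply(gc, cumsum)))
df$GC <- 0:72
ggplot(df, aes(x = GC, y = values)) +
    geom_line(aes(colour = ind)) +
    xlab("Number of GC Nucleotides per Read") +
    ylab("Number of Reads")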
Acknowledgements
Core (Seattle): Sonali Arora, Marc Carlson, Nate Hayden, Jim Hester, Valerie Obenchain, Hervé Pagès, Paul Shannon, Dan Tenenbaum.
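The research reported in this presentation was supported by the National Cancer Institute and the National Human Genome Research Institute of the National Institutes of Health under Award numbers U24CA180996 and U41HG004059, and by the National Science Foundation under Award number 1247813. The content is solely the responsibility of the authors and does not necessarily represent the official views of the National Institutes of Health or the National Science Foundation.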
sessionInfo()
## R version 3.2.2 (2015-08-14)
## Platform: x86_64-pc-linux-gnu (64-bit)
## Running under: Debian GNU/Linux stretch/sid
##
## locale:
##  [1] LC_CTYPE=en_US.UTF-8       LC_NUMERIC=C               LC_TIME=en_US.UTF-8
##  [4] LC_COLLATE=en_US.UTF-8     LC_MONETARY=en_US.UTF-8    LC_MESSAGES=en_US.UTF-8
##  [7] LC_PAPER=en_US.UTF-8       LC_NAME=C                  LC_ADDRESS=C
## [10] LC_TELEPHONE=C             LC_MEASUREMENT=en_US.UTF-8 LC_IDENTIFICATION=C
##
## attached base packages:
## [1] stats4 parallel stats graphics grDevices utils datasets methods base
##
## other attached packages:
##  [1] GenomicFiles_1.5.8                      BiocParallel_1.3.54
##  [3] Homo.sapiens_1.1                        GO.db_3.2.2
##  [5] OrganismDbi_1.11.43                     biomaRt_2.25.3
##  [7] AnnotationHub_2.1.45                    VariantAnnotation_1.15.34
##  [9] RNAseqData.HNRNPC.bam.chr14_0.7.0       GenomicAlignments_1.5.18
## [11] Rsamtools_1.21.21                       ALL_1.11.0
## [13] org.Hs.eg.db_3.2.3                      RSQLite_1.0.0
## [15] DBI_0.3.1                               ggplot2_1.0.1
## [17] airway_0.103.1                          limma_3.25.18
## [19] DESeq2_1.9.51                           RcppArmadillo_0.6.100.0.0
## [21] Rcpp_0.12.1                             BSgenome.Hsapiens.UCSC.hg19_1.4.0
## [23] BSgenome_1.37.6                         rtracklayer_1.29.28
## [25] TxDb.Hsapiens.UCSC.hg19.knownGene_3.2.2 GenomicFeatures_1.21.33
## [27] AnnotationDbi_1.31.19                   SummarizedExperiment_0.3.11
## [29] Biobase_2.29.1                          GenomicRanges_1.21.32
## [31] GenomeInfoDb_1.5.16                     microbenchmark_1.4-2
## [33] Biostrings_2.37.8                       XVector_0.9.4
## [35] IRanges_2.3.26                          S4Vectors_0.7.23
## [37] BiocGenerics_0.15.11                    BiocStyle_1.7.9
##
## loaded via a namespace (and not attached):
##  [1] bitops_1.0-6                 RColorBrewer_1.1-2           httr_1.0.0
##  [4] tools_3.2.2                  R6_2.1.1                     rpart_4.1-10
##  [7] Hmisc_3.17-0                 colorspace_1.2-6             nnet_7.3-11
## [10] gridExtra_2.0.0              graph_1.47.2                 formatR_1.2.1
## [13] sandwich_2.3-4               labeling_0.3                 scales_0.3.0
## [16] mvtnorm_1.0-3                genefilter_1.51.1            RBGL_1.45.1
## [19] stringr_1.0.0                digest_0.6.8                 foreign_0.8-66
## [22] rmarkdown_0.8.1              htmltools_0.2.6              BiocInstaller_1.19.14
## [25] shiny_0.12.2                 zoo_1.7-12                   acepack_1.3-3.3
## [28] RCurl_1.95-4.7               magrittr_1.5                 Formula_1.2-1
## [31] futile.logger_1.4.1          munsell_0.4.2                proto_0.3-10
## [34] stringi_0.5-5                multcomp_1.4-1               yaml_2.1.13
## [37] MASS_7.3-44                  zlibbioc_1.15.0              plyr_1.8.3
## [40] grid_3.2.2                   lattice_0.20-33              splines_3.2.2
## [43] annotate_1.47.4              locfit_1.5-9.1               knitr_1.11
## [46] geneplotter_1.47.0           reshape2_1.4.1               codetools_0.2-14
## [49] futile.options_1.0.0         XML_3.98-1.3                 evaluate_0.8
## [52] latticeExtra_0.6-26          lambda.r_1.1.7               httpuv_1.3.3
## [55] gtable_0.1.2                 mime_0.4                     xtable_1.7-4
## [58] survival_2.38-3              cluster_2.0.3                TH.data_1.0-6
## [61] interactiveDisplayBase_1.7.3