These notes were created during the course and serve as a transcript of the topics covered.
Work flow
GenomicAlignments::summarizeOverlaps()
View from the Linux command line...
zcat *fastq.gz | less
samtools view -h *bam
...or from within R / Bioconductor: FASTQ files
library(ShortRead)
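## Loading required package: BiocGenerics
## Loading required package: parallel
##
## Attaching package: 'BiocGenerics'
##
## The following objects are masked from 'package:parallel':
##
##     clusterApply, clusterApplyLB, clusterCall, clusterEvalQ,
##     clusterExport, clusterMap, parApply, parCapply, parLapply,
##     parLapplyLB, parRapply, parSapply, parSapplyLB
##
## The following objects are masked from 'package:base':
##
##     anyDuplicated, append, as.data.frame, as.vector, cbind,
##     colnames, do.call, duplicated, eval, evalq, Filter, Find, get,
##     grep, grepl, intersect, is.unsorted, lapply, lengths, Map,
##     mapply, match, mget, order, paste, pmax, pmax.int, pmin,
##     pmin.int, Position, rank, rbind, Reduce, rownames, sapply,
##     setdiff, sort, table, tapply, union, unique, unlist, unsplit
##
## Loading required package: BiocParallel
## Loading required package: Biostrings
## Loading required package: S4Vectors
## Loading required package: stats4
## Loading required package: IRanges
## Loading required package: XVector
## Loading required package: Rsamtools
## Loading required package: GenomeInfoDb
## Loading required package: GenomicRanges
## Loading required package: GenomicAlignments
## Loading required package: SummarizedExperiment
## Loading required package: Biobase
## Welcome to Bioconductor
##
##     Vignettes contain introductory material; view with
##     'browseVignettes()'. To cite Bioconductor, see
##     'citation("Biobase")', and for packages 'citation("pkgname")'.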
strm = FastqStreamer("bigdata/SRR1039508_1.fastq.gz", 100000)
fq = yield(strm)
fq
## class: ShortReadQ
## length: 100000 reads; width: 63 cycles
sread(fq)
##   A DNAStringSet instance of length 100000
##          width seq
##      [1]    63 CATTGCTGATACCAANNNNNNNNGCATTC...GTCTTCCTCCTTCCCTTACGGAATTACA
##      [2]    63 CCCTGGACTGCTTCTTGAAAAGTGCCATC...CTATCTTTGGGGAGAGTATGATAGAGAT
##      [3]    63 TCGATCCATCGATTGGAAGGCACTGATCT...TCAGGTTGGTGGTCTTATTTGCAAGTCC
##      [4]    63 GAAGAGTTAGCAGCGACCGTGACAGACCA...GCTCCCAACTCCAGGGTGCCAATCCGAT
##      [5]    63 CGTGCAGGAGATCATGATCCCCGCGGGCA...GCCTGGTCATTGGCAAGGGCGGGGAGAC
##      ...   ... ...
##  [99996]    63 GAGAGAAGCTTTGTATGGCTGTCATGCTT...TGATTCCTGCAACTTGACCTTCAGGCTG
##  [99997]    63 TTATGGTGCAGACATGGCCAAGTCCAAGA...CCACACACAACCAGTCCCGAAAATGGCA
##  [99998]    63 TTAAAGTAGAGCATCTAGTTTGAGAAATA...AATTATTAAAGATGTCTTTTTTCTACCC
##  [99999]    63 TCCCAACTGTAGGCTGAGTGACCTGAAGG...AGACTGCCGAAGTCCAAAAGCTTCAGCA
## [100000]    63 GTGTTTTCTGGTATCGTCCCTTCGTGGTT...AAAAAATGGTACTGGAAAGGGGTCCCAA
quality(fq)
## class: FastqQuality
## quality:
##   ...
##          ...JJJJJJJJJJJJGHHIDHIJJHHHHHHF
##      [3]    63 HJJJJJJJJJJJJJJJJJJJJJJJJJJJJ...GHIJJBGIJCGIAHIJHHHHHHHFFFFF
##      [4]    63 HIJJJJIIJJJJJJJJJJJIJJJJJJJJJ...IHHHHHHFFFFEEEEDC@DDDDDDDDDD
##      [5]    63 HIGGIIIIIIIGHIIIGIHIIIIJGIFAC...@@DDBDDCCDECCDDDBBBBBBD@B;<
##      ...   ... ...
##  [99996]    63 HJJJJJJJJJJJJGIJJJJJJGGIJJGHH...CHJJJGGHIJJJJJIJJJJJJJJIHHHH
##  [99997]    63 HJJJIJHHIIJJJJIJJJJJIJIJJIJJI...HHFFFFDDDDDDDDCDDDDD@DDDDDDD
##  [99998]    63 HJJJJJJHIJJJJJJJJJJJJJIJJJJJJ...JJJJJJJJJJJJJJJJJJJJJJJJJJIJ
##  [99999]    63 HJJJJJJJJHIJJJJJJJGHIJJJJJJJJ...JJJJJJJJJJJJJJJHHHHFFFFFFF
## [100000]    63 HAEFHIJJJJJHIJJJJJJJJJIHIJFH...IJJJJJIJHHHHHHFFFFFDD>BDDDD
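The quality strings returned by quality(fq) are per-base scores packed into ASCII characters. As a rough base-R sketch of how one such string could be decoded, assuming the common Phred+33 encoding (an assumption about these data, not something the notes state) and using a made-up string:

```r
# Decode one quality string into Phred scores.
# ASSUMPTION: Phred+33 ASCII encoding; 'qual' is a made-up example string.
qual <- "HJJIF@DB"

# ASCII code of each character, minus the offset of 33
phred <- utf8ToInt(qual) - 33L
phred

# Phred score Q corresponds to error probability P via Q = -10 * log10(P)
errProb <- 10^(-phred / 10)
signif(errProb, 2)
```

Under this assumption, later characters in the ASCII table (such as J) indicate more confident base calls than earlier ones (such as @).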
x = rnorm(1000)
y = x + rnorm(1000, sd=.5)
df = data.frame(x=x, y=y)
plot(y ~ x, df)
fit = lm(y ~ x, df)
class(fit)
## [1] "lm"
methods(class=class(fit))
## kappa
## vcov
## see '?methods' for accessing help and source code
methods("anova")
## [1] anova.glm*     anova.glmlist* anova.lm*      anova.lmlist*
## [5] anova.loess*   anova.mlm*     anova.nls*
## see '?methods' for accessing help and source code
Help!
?log
?plot        # generic 'plot'
?plot.lm     # 'plot' method for objects of class 'lm'
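The S3 pattern seen above, where a generic such as anova() has methods named anova.lm, anova.glm, and so on, can be sketched in base R. This is only an illustration: the generic describe() and the class 'reading' are hypothetical names that exist in neither base R nor Bioconductor.

```r
# A minimal S3 sketch; 'describe' and the class 'reading' are hypothetical.

# The generic: UseMethod() dispatches on class(x)
describe <- function(x, ...) UseMethod("describe")

# Fallback, used when no class-specific method exists
describe.default <- function(x, ...) "an R object"

# Method for objects of class 'reading', named <generic>.<class>
describe.reading <- function(x, ...) {
    paste("a reading with", length(x$values), "values")
}

# An S3 object is an ordinary R object with a class attribute
r <- structure(list(values = c(1.2, 3.4, 5.6)), class = "reading")

class(r)         # introspection, as with class(fit)
describe(r)      # dispatches to describe.reading()
describe(1:3)    # no method for 'integer', so describe.default()
```

The naming convention is what makes ?plot.lm the help page for the 'lm' method of the plot generic.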
Extensive use of 'S4' classes
fit (from lm()) is an example of an S3 class
sread(fq) returns a DNAStringSet, an example of an S4 class
library(ShortRead)
strm = FastqStreamer("bigdata/SRR1039508_1.fastq.gz", 100000)
fq = yield(strm)     # 'ShortReadQ' S4 class
class(fq)            # introspection
## [1] "ShortReadQ"
## attr(,"package")
## [1] "ShortRead"
methods(class=class(fq))
##  [1] [                  [<-                alphabetByCycle
##  [4] alphabetScore      append             clean
##  [7] coerce             detail             dustyScore
## [10] id                 length             narrow
## [13] pairwiseAlignment  qa                 renew
## [16] renewable          reverse            reverseComplement
## [19] show               srdistance         srduplicated
## [22] sread              srorder            srrank
## [25] srsort             tables             trimEnds
## [28] trimLRPatterns     trimTails          trimTailw
## [31] width              writeFasta         writeFastq
## see '?methods' for accessing help and source code
reads = sread(fq)    # accessor -- get the reads
reads                # 'DNAStringSet' S4 class
##   A DNAStringSet instance of length 100000
##          width seq
##      [1]    63 CATTGCTGATACCAANNNNNNNNGCATTC...GTCTTCCTCCTTCCCTTACGGAATTACA
##      [2]    63 CCCTGGACTGCTTCTTGAAAAGTGCCATC...CTATCTTTGGGGAGAGTATGATAGAGAT
##      [3]    63 TCGATCCATCGATTGGAAGGCACTGATCT...TCAGGTTGGTGGTCTTATTTGCAAGTCC
##      [4]    63 GAAGAGTTAGCAGCGACCGTGACAGACCA...GCTCCCAACTCCAGGGTGCCAATCCGAT
##      [5]    63 CGTGCAGGAGATCATGATCCCCGCGGGCA...GCCTGGTCATTGGCAAGGGCGGGGAGAC
##      ...   ... ...
##  [99996]    63 GAGAGAAGCTTTGTATGGCTGTCATGCTT...TGATTCCTGCAACTTGACCTTCAGGCTG
##  [99997]    63 TTATGGTGCAGACATGGCCAAGTCCAAGA...CCACACACAACCAGTCCCGAAAATGGCA
##  [99998]    63 TTAAAGTAGAGCATCTAGTTTGAGAAATA...AATTATTAAAGATGTCTTTTTTCTACCC
##  [99999]    63 TCCCAACTGTAGGCTGAGTGACCTGAAGG...AGACTGCCGAAGTCCAAAAGCTTCAGCA
## [100000]    63 GTGTTTTCTGGTATCGTCCCTTCGTGGTT...AAAAAATGGTACTGGAAAGGGGTCCCAA
methods(class=class(reads))
##   [1] !                  !=
##   [3] [                  [[
##   ...
##   [9] <=                 ==
##  [11] >                  >=
##   ...
##  [15] aggregate          alphabetFrequency
##  [17] anyNA              append
##  [19] as.character       as.complex
##  [21] as.data.frame      as.env
##  [23] as.integer         as.list
##  [25] as.logical         as.matrix
##  [27] as.numeric         as.raw
##  [29] as.vector          c
##  [31] chartr             clean
##  [33] coerce             compact
##  [35] compare            compareStrings
##  [37] complement         consensusMatrix
##   ...
##  [61] head               high2low
##  [63] ifelse             intersect
##  [65] is.na              is.unsorted
##  [67] isEmpty            isMatchingEndingAt
##   ...
##  [75] match              ...
##  [ reached getOption("max.print") -- omitted 102 entries ]
## see '?methods' for accessing help and source code
gc = letterFrequency(reads, "GC", as.prob=TRUE)
hist(gc)
Help!
?DNAStringSet        # class, and often frequently used methods
?letterFrequency     # generic
methods("letterFrequency")
?"letterFrequency,XStringSet-method"
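For comparison with S3, the S4 pattern (a formal class, a generic, and a method registered for a class signature) can be sketched with the 'methods' package that ships with R. The class 'SeqSet' and the generic gcContent() are hypothetical names chosen for illustration; this is not how Biostrings implements DNAStringSet or letterFrequency().

```r
# A minimal S4 sketch; 'SeqSet' and 'gcContent' are hypothetical names.
library(methods)

# Formal class definition with one typed slot
setClass("SeqSet", representation(seqs = "character"))

# A generic, then a method registered for the 'SeqSet' signature
setGeneric("gcContent", function(x, ...) standardGeneric("gcContent"))

setMethod("gcContent", "SeqSet", function(x, ...) {
    s <- x@seqs
    # count G and C by deleting every other character
    nGC <- nchar(gsub("[^GC]", "", s))
    nGC / nchar(s)
})

# Existing functions such as length() can also gain S4 methods
setMethod("length", "SeqSet", function(x) length(x@seqs))

x <- new("SeqSet", seqs = c("GGCC", "ATAT", "GATC"))
is(x, "SeqSet")                      # introspection
existsMethod("gcContent", "SeqSet")  # is a method registered?
gcFrac <- gcContent(x)               # proportion of G or C per sequence
gcFrac
```

The help alias in ?"letterFrequency,XStringSet-method" follows the same idea: the generic name, then the class signature the method is registered for.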
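Key packages...
import()
import BED, WIG, GFF, GTF, ... files
... and classes
assay()
rowRanges() for row annotation
colData() for column annotation
Annotation
org.* packages
TxDb.* packages
BSgenome.* packages
Strategies for working with large data
FastqStreamer(), Rsamtools::BamFile(..., yieldSize = 1000000); GenomicFiles::reduceByYield() (see the example in ?reduceByYield)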
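The functions listed for large data share one idea: read a chunk, reduce it to a small summary, discard it, and repeat. A base-R sketch of that loop follows, using an open connection in place of FastqStreamer() and yield(). The temporary file, its contents, and chunkSize are illustrative assumptions, and the loop is only analogous in spirit to what reduceByYield() automates.

```r
# Chunk-wise iteration in base R; file contents and 'chunkSize' are made up.
fl <- tempfile(fileext = ".txt")
writeLines(c("GGCC", "ATAT", "GATC", "CCCC", "AATT"), fl)

chunkSize <- 2           # lines per chunk; real data would use far more
con <- file(fl, "r")     # an open connection remembers its position
nSeq <- 0
nGC <- 0
repeat {
    chunk <- readLines(con, n = chunkSize)   # read only the next chunk
    if (length(chunk) == 0)                  # nothing left: end of file
        break
    # reduce the chunk to small summaries, then let it go out of scope
    nSeq <- nSeq + length(chunk)
    nGC <- nGC + sum(nchar(gsub("[^GC]", "", chunk)))
}
close(con)

nSeq    # total number of sequences seen
nGC     # total number of G or C letters
```

Memory use is bounded by the chunk size rather than the file size, which is the reason for arguments such as yieldSize.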
All material: course material page