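Assume that base errors in Illumina sequencing follow a binomial distribution with a per-base error rate of 1%. A gene 1000 bp long is sequenced. Using a simulation in R, find the number of erroneous bases that will not be exceeded with 95% probability.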
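# Probability mass function of the error count; counts above ~40 are practically impossible, so zoom in on 0..40
curve(expr=dbinom(x,1000,0.01),from=0,to=40,n=41,type="h")
# Cumulative distribution function over the same integer counts
curve(expr=pbinom(x,1000,0.01),from=0,to=40,n=41,type="s")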
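set.seed(1)  # fix the random seed so the simulation is reproducible
Observation_times<-100000000  # number of simulated sequencing runs (1e8 integers need about 400 MB of RAM)
length_of_gene<-1000  # gene length in bp, i.e. the number of trials per run
Probability_of_success<-0.01  # per-base error probability
# Each draw is the number of erroneous bases in one simulated sequencing of the gene
result<-rbinom(Observation_times,length_of_gene,Probability_of_success)
result<-sort(result)
# Value at the 95% position of the sorted results; ceiling() guards against a fractional index
result[ceiling(length(result)*0.95)]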
[1] 15
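The results of the 100 million repeated experiments are sorted in ascending order, and the value at the 95% position is taken.
With 95% probability, the number of bases with sequencing errors will not exceed 15.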
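
The simulated value can be cross-checked against the exact binomial quantile, which avoids drawing and sorting 100 million samples. The following is a minimal sketch under the same assumptions as the problem statement (1000 bases, 1% error rate); the variable names `n_bases`, `error_rate`, `k_exact`, `p_below` and `p_at` are illustrative and do not appear in the original code.

```r
n_bases <- 1000     # gene length in bp (from the problem statement)
error_rate <- 0.01  # per-base error probability (from the problem statement)

# Smallest k such that P(X <= k) >= 0.95 for X ~ Binomial(n_bases, error_rate)
k_exact <- qbinom(0.95, size = n_bases, prob = error_rate)

# Cumulative probabilities just below and at that threshold
p_below <- pbinom(k_exact - 1, size = n_bases, prob = error_rate)
p_at    <- pbinom(k_exact,     size = n_bases, prob = error_rate)

print(k_exact)           # expected to agree with the simulated value, 15
print(c(p_below, p_at))  # p_below < 0.95 <= p_at
```

Because the error count is discrete, no count has a cumulative probability of exactly 95%. `qbinom` returns the smallest count whose cumulative probability reaches at least 0.95, which is the same quantity the sorted simulation estimates.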