Basic Research on Key Techniques for Mixed DNA Analysis in Gang-Rape Cases and Development of Separation Software
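Abstract
Mixed DNA (DNA mixture) carries the DNA information of several contributors. How to genotype DNA from mixed-stain biological evidence correctly and how to interpret the results scientifically are pressing theoretical and technical problems in forensic DNA testing. This study built batch experimental models of mixed DNA from gang-rape cases and used the scientifically validated experimental data to evaluate mixed DNA parameters and to identify constraints. It then built mixed DNA separation models based on STR genotyping data and compared them with a mathematical model developed abroad (the mixsep software package). The separation models were converted into software, whose efficacy and applicability were verified with simulated mixed DNA genotyping data, giving DNA examiners a preliminary automated expert system for individual identification from mixed DNA in gang-rape cases. The methods and models built here are not restricted by the source type of the mixed sample (for example mixed semen stains, mixed bloodstains, or mixed shed epithelial cells), so they apply to mixed DNA from different case types.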
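     Part I: Constructing an experimental model of mixed DNA from gang-rape cases
     Objective: Two-male mixed DNA and three-person mixed DNA (one male + one male + one female) were used as study objects to simulate the mixed DNA of gang-rape cases. An ABI 7500 real-time fluorescence quantitative PCR instrument was used to construct the simulated mixed samples, covering different contributors and different mixture ratios. The scientifically validated experimental data were then used for parameter evaluation in mixed DNA analysis and for developing the separation models.
     Methods: DNA was extracted from 50 human whole blood samples from the Hebei Blood Center and quantified on the ABI 7500. Single-source DNA stock solutions were grouped on the criterion that their DNA concentrations were very close, and these served as source samples for the simulated two-male and three-person mixed DNA, so that different mixture ratios could be prepared by adjusting the volume of DNA solution. To keep the later separation models from overfitting because of overly uniform sample types and insufficient sample size, several types of simulated mixed DNA from different contributors were constructed, each at multiple mixture ratios, so that the effects of the mixed DNA profile and of the mixture proportion (Mx) on later analysis were reflected objectively. The concentration of the simulated mixed DNA stock was also adjusted to the ideal range of 0.5-1.25 ng/μl to meet the template requirement of the DNA typing kit. The deviation D between the measured and theoretical Mx, and the Mx value (alpha) estimated by the mixsep package, were then used as validation indices for the experimental model, and the data quality of the model was assessed through data mining and statistical analysis.
     Results: With a DNA concentration difference of no more than 0.5 ng/μl as the grouping criterion, 22 single-source DNA samples met the standard for two-male mixed DNA and formed 11 groups, and 12 single-source DNA samples met the standard for three-person mixed DNA and formed 4 groups. For two-male mixed DNA, the deviation D of Mx before and after PCR amplification was ≤0.1 within the 95% confidence interval, with little fluctuation. This indicates that the data behind the two-male experimental model are of good quality and provide a sound basis for the accurate separation of mixed DNA profiles. No general formula exists for the D value of three-person mixed DNA, so it was not evaluated.
     In the Identifiler (ID) profiles of two-male mixed DNA, data with a relatively large root mean square error (RMSE) of the measured alpha values were scattered across the 11 groups of samples. Except for ratio 1:1, where the RMSE was >0.02, the RMSEs of the other 8 ratios lay between 0.01 and 0.02. That is, for the ID profiles of the two-male mixed DNA constructed here, the RMSE between measured and theoretical Mx did not exceed 0.02 (except at ratio 1:1), so the experimental model provides a good data basis for sound mixed DNA analysis. Because the ID profiles of three-person mixed DNA showed many allelic drop-outs, mixsep could not be relied on to estimate alpha accurately, so they were not evaluated.
     In the Yfiler profiles of two-male mixed DNA, data with a relatively large RMSE of the measured alpha values were likewise scattered across the 11 groups of samples. Except for ratios 1:3 and 1:4, where the RMSE was >0.02 but <0.03, the RMSEs of the other ratios lay between 0.01 and 0.02. That is, for the Yfiler profiles of the two-male mixed DNA constructed here, the RMSE between measured and theoretical Mx did not exceed 0.03. The Yfiler profile data serve only as a supplementary basis.
     Conclusion: Combining ABI 7500 quantification with scientific validation of the experimental model, this part established 297 two-male mixed DNA samples and 264 three-person mixed DNA samples that simulate gang-rape cases (the three-person mixtures showed many allelic drop-outs). Besides supporting the construction of the separation models and the development of the analysis software, the ID profiles of the 297 simulated two-male mixtures also provide data for evaluating mixed DNA parameters (such as average allele peak height/area, mixture proportion, heterozygote balance, allelic drop-out, and inter-locus balance) and for mining their regularities.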
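     Part II: Parameter evaluation for two-male mixed DNA and verification of the mixsep software
     Objective: To evaluate and analyse the parameters of mixed DNA and the correlations among them, so as to clarify the constraints of mixed DNA analysis and uncover their regularities; and to verify the mixsep software with simulated mixed DNA profile data, identifying its strengths and weaknesses as a reference and performance benchmark for developing the mixed DNA separation model.
     Methods: The correlation between peak height (PH) and peak area (PA) was analysed by curve fitting with a generalized additive model and by least-squares regression to obtain regression coefficients, in order to see whether the two kinds of quantitative information differ in efficacy for mixed DNA analysis.
     The correlation between average peak height (APH) and heterozygote balance (Hb) was analysed with locally weighted regression and the Kruskal-Wallis rank-sum test. These methods suit non-normally distributed data and were used to examine the trend and regularity of the Hb distribution across the 16 STR loci and the 9 mixture ratios.
     Differences in fluorescence sensitivity among the STR loci of each channel were analysed through the effect of channel fluorescence sensitivity on APH and through the inter-locus balance (Ci) parameter, to test whether the STR loci differ in efficacy for mixed DNA analysis. Multiple comparisons used Tukey's Honestly Significant Difference method.
     All statistical figures in this part were drawn with the ggplot2 package (version 0.9.3) in R (version 3.0.1).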
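     Results:
     (1) Correlation between PH and PA: Of the 16 STR loci, D19S433, D3S1358, D5S818 and D8S1179 showed a good linear relationship between PH and PA, and the other 12 loci showed a highly linear relationship. This is largely consistent with the conclusions of Tvedebrink. That is, PH and PA are linearly related, both kinds of quantitative information can be used in mixed DNA analysis, and their analytical efficacy differs little.
     (2) Correlation between APH and Hb: In the Kruskal-Wallis rank-sum test, the p value for the Hb distribution across loci was 0.0063, below the 0.05 significance level, so the Hb distributions of the loci differ statistically. The p value for the Hb distribution across mixture ratios was 0.02257 (<0.05), so the Hb distributions of the mixture ratios also differ statistically. That is, Hb is affected jointly by STR locus and mixture ratio. When APH was <1250 rfu, Hb increased markedly; when APH was ≥1250 rfu, Hb was essentially stable, with a mean of 0.878. In the data of this experiment, 92.74% of the observations had APH <2500 rfu and Hb above the 0.6 threshold. Loci CSF1PO, D19S433, D21S11, D2S1338 and vWA had more observations with both high Hb and high APH than the other loci. As the mixture ratio became more imbalanced (from 1:5 to 1:9), observations with both low Hb and low APH became more frequent.
     (3) Correlation between APH and drop-out: At relatively balanced ratios (1:1 to 1:3) the number of allelic drop-outs (ADO) was small, whereas at very imbalanced ratios (1:7 to 1:9) the ADO count rose steeply, so the ADO count is related to the mixture ratio. As the ADO count increased, the APH of the corresponding samples gradually decreased.
     (4) Effect of fluorescence sensitivity on APH: To test whether the mean APH differs statistically among the four channels (blue, green, yellow and red), which differ in fluorescence sensitivity, multiple comparisons were made with Tukey's Honest Significant Difference method. The blue and green channels did not differ (p = 0.446), and neither did the yellow and red channels (p = 0.530). For the other four pairs (blue vs yellow, blue vs red, green vs yellow, green vs red) the p values were all 3.95E-08, far below 0.05. That is, the blue and green dyes differ significantly from the yellow and red dyes, and the ABI 3130xl Genetic Analyzer is indeed more sensitive to blue and green fluorescence than to the other two.
     In the blue channel, locus D8S1179 had the highest median APH. In the green channel, loci D3S1358, TH01 and D13S317 had higher median APH than the others. In the yellow and red channels, loci D18S51 and FGA had the lowest median APH. This matches the ordering of STR loci by fragment size in the ABI Identifiler kit: the smaller loci D8S1179, D21S11, D3S1358, TH01, D13S317, D19S433, vWA, AMEL- and D5S818 all had relatively high median APH. In other words, APH is affected jointly by the fluorescence sensitivity of the genetic analyzer and by the molecular size of the STR locus.
     (5) Inter-locus balance (Ci): The Pearson correlation coefficients (R) of the mean and median Ci with the ADO count were -0.7179 and -0.7065, with p values of 1.736E-3 and 2.215E-3 (both <0.05), so the mean and median Ci are significantly negatively correlated with the ADO count. The locus with the highest median Ci was D8S1179.
     The distribution of Ci across the 16 STR loci broadly followed the differences in fluorescence sensitivity among the four channels. The ABI 3130xl is more sensitive to the blue and green channels, and the corresponding 8 loci D8S1179, D21S11, CSF1PO, D3S1358, TH01, D13S317, D16S539 and D2S1338 (D7S820 being the exception) had generally higher Ci values. It is less sensitive to the yellow and red channels, and the corresponding 6 loci D19S433, vWA, TPOX, D18S51, AMEL- and FGA (D5S818 being the exception) had generally lower Ci values.
     (6) Horizontal analysis of mixsep: The correlation between mixture ratio and locus separation accuracy gave R = -0.7121 with p = 0.03139 (<0.05), a negative linear correlation. The correlation between mixture ratio and ADO count gave R = -0.4244 with p = 0.2549 (>0.05), indicating no clear correlation. Ratio 1:1 had the lowest accuracy. As the mixture ratio became more imbalanced, accuracy first rose and then fell: ratios 1:2, 1:3 and 1:4 had relatively high accuracy, while ratios 1:1 and 1:9 had low and widely fluctuating accuracy. Separation accuracy after removing ADO was slightly higher than before, so allelic drop-out reduces the analytical efficacy of mixsep.
     (7) Vertical analysis of mixsep: Loci D5S818, D8S1179 and FGA had relatively high accuracy (>88%), whereas loci D19S433, D2S1338 and D7S820 had relatively low accuracy (≤80%). Loci AMEL-, D5S818 and D8S1179 had the fewest ADOs, whereas loci D18S51, D19S433, FGA, TPOX and vWA had more (>15). These 5 loci all lie in the yellow and red channels and have low APH, consistent with the lower sensitivity of the ABI 3130xl to yellow and red fluorescence.
     At ratio 1:1, the accuracy of every locus except AMEL- and D3S1358 was ≤70%, and the outliers in the lower region of the box plot are the data of this ratio. At ratios 1:2, 1:3, 1:4 and 1:5 the accuracy of every locus was relatively high, with ratio 1:3 the highest (≥90% at every locus). At ratios 1:8 and 1:9, locus separation accuracy fluctuated widely and its average level was low.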
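     Conclusion: Drawing on the APH of the DNA profile, the STR locus and the mixture ratio, on correlation analyses of Hb, allelic drop-out, fluorescence sensitivity and Ci, and on the efficacy analysis of mixsep, this study concludes the following for the 16 STR loci of the ABI ID kit. When separating genotypes in a mixed DNA profile whose APH is above 1250 rfu and whose mixture ratio lies between 1:1 and 1:5 (excluding 1:1), we give priority to the separation results of loci D8S1179, D21S11 and CSF1PO in the blue channel, D3S1358, TH01 and D13S317 in the green channel, D19S433, vWA and TPOX in the yellow channel, and AMEL- and D5S818 in the red channel (11 loci in total). That is, the 16 STR loci differ in genotype separation efficacy in mixed DNA analysis, and the strength of the corresponding evidence differs as well. If the APH of the mixed DNA profile is low (below 1000 rfu) and the mixture ratio is extremely imbalanced (more imbalanced than 1:6), and allelic drop-out is unclear or no known reference sample is available, software analysis of the mixed DNA should not be attempted hastily, because misclassification is very likely. Moreover, a mixed DNA profile with a 1:1 mixture ratio cannot be separated into genotypes or used for individual identification. In other words, even with a complete separation model and analysis software, the manual judgement of DNA examiners is still needed before and after genotype separation, and an expert conclusion cannot rest on the mixed DNA analysis software alone.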
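     Part III: Construction and efficacy analysis of mixed DNA separation models for gang-rape cases
     Objective: To construct sound and conservative mixed DNA separation models from batches of simulated mixed DNA STR profiles, to verify their efficacy, and to compare them with the mixsep software, demonstrating the robustness and conservativeness of the models developed.
     Methods: (1) Naive Bayesian model: The allele peak height h_a is assumed to follow the normal distribution N(Bα + C, Hτ²). For convenience, the prior distribution of the mixture proportion α is also assumed to be normal, N(m, A). The variance parameter τ remains a free parameter and is independent of α, so the marginal distribution of the peak height is h_a | τ ~ N(Bm + C, Hτ² + B²A). For the variance hyperparameter A of the prior: when the experimental data are fairly precise, α differs by ≤0.05 between loci, and taking this as a range of three standard deviations gives a prior variance of about A = 0.0167² ≈ 0.00028 (from the empirical data of this laboratory). Because A is very small, B²A is negligible relative to the original variance, and the marginal distribution simplifies to h_a | τ ~ N(Bm + C, Hτ²). With m obtained from the prior distribution, all genotypes at each locus are traversed, and maximizing the marginal likelihood yields the best-matching genotype and the corresponding parameters. Second-best genotypes can be selected by manual judgement; in this laboratory's experience, candidates whose likelihood differs from that of the best match by a factor of 1.5 or more are generally not considered.
     (2) Constrained single locus analysis model: When the initial mixture proportion is known, its range of fluctuation is constrained empirically. All genotypes at each locus are then traversed, and the mixture proportion α and the variance parameter are solved by maximizing the likelihood function. The normality assumption is retained; the mean and variance of the allele peak height are then, respectively: [formulas not preserved in this text]. The mixture proportion is constrained to α ∈ [a, b], and the variance parameter to τ ∈ [ε, M], where ε is close to 0 and M is usually large. Under these constraints the genotypes are traversed and the likelihood function is maximized. If the solved α reaches or exceeds the upper or lower limit, the genotype is flagged or excluded even if its likelihood is the largest. In addition, the formula for τ² shows that the better the peak heights are fitted, the smaller the variance parameter; the fit is best when τ² approaches the lower limit ε. Based on this laboratory's large body of mixed DNA data, the estimated mixture proportion fluctuates around the prior mixture proportion obtained from the Naive Bayesian model within a range of ≤0.08, and within ≤0.05 for loci with two alleles. If the estimated mixture proportion is close to either limit, the locus is probably abnormal, and the best match can be rejected in favour of the second-best match on the basis of experience.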
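     Results: (1) Both separation models built in this part, the Naive Bayesian model and the constrained single locus analysis model, made a separation error at locus AMEL- of DNA profile NAN3-1-9-B (the same result as mixsep); the misjudged genotype combination was X,X and X,Y. In forensic DNA testing, locus AMEL- plays an important role in determining the sex of a suspect. Inferring directly from the peak heights at AMEL- alone, without considering the other factors that affect a mixed DNA profile, whether the mixture comes from several males or from males and a female is not conservative enough. It may cause the separation model to misjudge the sex of the suspects and so misdirect the investigation.
     (2) Among the factors behind peak height degradation, when molecular size is the main factor, peak height adjustment can correct a misjudged locus. For loci where molecular size is not the main factor (for example locus vWA of sample NAN3-1-5-B), the separation result is unchanged after peak height adjustment. That is, the peak height degradation coefficient estimated from the mixed effect model is relatively conservative, and in real mixed DNA profiles molecular size is sometimes not the main cause of degradation, so peak height adjustment is effective only for some STR loci.
     Conclusion: This part approached the problem from three angles: global consistency, naive Bayes, and single-locus solving. By building the Naive Bayesian model (Bayes for short) and the constrained single locus analysis model (Iter for short), genotypes were separated in 4 mixed DNA profiles for which mixsep had given unsatisfactory results. Setting aside separation errors caused by peak height degradation, the combined use of the Bayes and Iter models gives better results for the best-matching genotype. In addition, the mixed effect model handles peak height degradation conservatively: when molecular size is the main factor, peak height adjustment can correct a misjudged locus, whereas for loci where it is not, the separation result is unchanged. Peak height adjustment is therefore effective only for some STR loci and serves only as an optional correction.
     Part IV: Development and application verification of sepDNA, a separation software for mixed STR profiles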
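     Objective: To take STR profiles compatible with the Chinese forensic science DNA database as input, to combine the two separation models of Part III (the Bayes and Iter models) into the mixed DNA separation software sepDNA, and to verify the conservativeness and reliability of the software with experimental data.
     Methods: The mixed DNA separation models built in Part III were implemented as source code in the R language, the source code of the sepDNA user interface was added, and the whole was packaged as the sepDNA software. The analytical efficacy of the software was then verified and evaluated.
     Results: The sepDNA software comprises two separation models and several small modules. The Bayes model finds a prior mixture proportion, reduces the problem to a normal distribution whose mean peak height depends only on the genotype, and finds the best-matching genotype by maximizing the marginal likelihood function. The Iter model replaces joint analysis with separate analysis of each locus, constrains the fluctuation range of the mixture proportion empirically, solves the mixture proportion and variance parameter of each locus by maximizing the likelihood function, and traverses each single locus under the parameter constraints.
     The two models separate the genotypes of mixed DNA from two different modelling perspectives, global optimization and local optimization. Although the separation results at loci D3S1358 and D7S820 lowered the overall separation accuracy of the Iter model, the two models need to be used together for the sake of sex inference at locus AMEL- and of the conservativeness of the separation. Results produced by both models are taken as reliable, and results on which the two models disagree require further manual judgement, so that the mixed DNA separation results remain conservative and reliable.
     Conclusion: The Bayes and Iter models in the sepDNA software developed here must be used together, with results produced by both models taken as reliable, to keep the mixed DNA separation results conservative and reliable. The software has no module for handling drop-out. It reports the parameters "sample average peak height" and "mixture proportion"; if the average peak height is too low or the mixture proportion is extremely imbalanced, allelic drop-out may have occurred in the DNA profile, and the final conclusion of the separation report must then combine these parameters with the manual judgement of DNA examiners. In addition, the three-person mixed DNA separation module offers a "set fixed genotype" function, which can raise the separation accuracy for three-person mixed DNA to some extent; the efficacy of this module still needs to be verified with a large amount of three-person mixed DNA data.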
In this study, experimental models of mixed DNA from gang-rape cases were constructed, and the scientifically verified experimental data were used to evaluate the parameters of mixed DNA and to explore its constraints. Separation models for mixed DNA were then constructed from STR genotyping data and compared with the mixsep software package developed abroad. The models were turned into a software package whose efficacy and applicability were verified with genotyping data from simulated mixed DNA. This study provides a preliminary expert system for individual identification from mixed DNA.
     Part I: Constructing an Experimental Model of Mixed DNA
     Objective: Two-male mixed DNA and three-person mixed DNA (two males + one female) were used to simulate the mixed DNA samples of gang-rape cases. An ABI 7500 real-time PCR analyzer was used to construct the simulated mixed DNA, including sample preparation with different contributors and different mixture ratios. The scientifically verified experimental data were used to evaluate the parameters of mixed DNA and to develop the separation model. The deviation D between the measured Mx and the theoretical Mx, and the Mx value estimated by the mixsep software, were then taken as verification indices for the experimental model. The data quality of the experimental model was evaluated through data mining and statistical analysis.
     Method: DNA extracted from 50 whole blood samples was quantified on the ABI 7500 analyzer. To construct the simulated two-male and three-person mixed DNA, single-source DNA samples with very similar concentrations were grouped as contributors, so that different mixture ratios could be prepared by adjusting the volume of DNA solution. To avoid the overfitting that overly uniform sample types and an insufficient sample size could cause when constructing the separation model, multiple types of mixed DNA from different contributors were constructed, each at multiple mixture ratios, so that they objectively reflected the impact of mixed DNA profiles and mixture proportion (Mx) on the analysis. In addition, the concentration of the original mixed DNA solution was adjusted to the recommended range of 0.5-1.25 ng/μl to meet the DNA template requirement of the PCR amplification kit.
     Results: With a DNA concentration difference of no more than 0.5 ng/μl as the grouping criterion, 22 single-source DNA samples met the standard for two-male mixed DNA and formed 11 groups, and 12 single-source DNA samples met the standard for three-person mixed DNA and formed 4 groups. The deviation D of the Mx of two-male mixed DNA before and after PCR amplification was ≤0.1 within the 95% confidence interval, with relatively small fluctuation. This indicated that the data used to construct the experimental model for two-male mixed DNA were of good quality and provide a favorable basis for the accurate separation of mixed DNA genotypes.
     In the mixed Identifiler profiles, measured alpha values with relatively large root mean square errors (RMSEs) were scattered among the 11 groups of samples. Except for ratio 1:1, where the RMSE was >0.02, the RMSEs of the other 8 ratios were within the range 0.01-0.02. That is, the RMSE between the measured and theoretical Mx was no more than 0.02 in the simulated two-male mixed DNA profiles (except at ratio 1:1). This experimental model could provide a favorable data basis for the scientific analysis of mixed DNA.
     In the mixed Yfiler profiles, measured alpha values with relatively large RMSEs were likewise scattered among the 11 groups of samples. Except for ratios 1:3 and 1:4, where the RMSE was >0.02 but <0.03, the RMSEs of the other ratios were within the range 0.01-0.02. That is, in the Yfiler profiles of the two-male mixed DNA constructed in this experiment, the RMSE between the measured and theoretical Mx was no more than 0.03.
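The RMSE check described in the two paragraphs above can be sketched as follows. This is a minimal illustration, assuming that the theoretical Mx of a 1:k mixture is the minor contributor's share 1/(1+k); that definition, the function names, and the sample values are hypothetical and are not taken from the study's data.

```python
import math

def theoretical_mx(k):
    """Minor-contributor proportion of a 1:k mixture (assumed definition)."""
    return 1.0 / (1.0 + k)

def rmse(measured, expected):
    """Root mean square error of measured values against one expected value."""
    if not measured:
        raise ValueError("need at least one measured value")
    return math.sqrt(sum((m - expected) ** 2 for m in measured) / len(measured))

# Hypothetical alpha estimates for replicates of one 1:3 mixture
measured_alpha = [0.24, 0.26, 0.27]
error = rmse(measured_alpha, theoretical_mx(3))
print(f"RMSE at 1:3 = {error:.4f}")
```

A ratio would pass the acceptance level reported above when `error` is at most 0.02 (Identifiler) or 0.03 (Yfiler).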
     Conclusion: In this part, using the ABI 7500 analyzer and scientific verification of the experimental model, 297 simulated two-male mixed DNA samples and 264 simulated three-person mixed DNA samples were established to simulate the mixed DNA of gang-rape cases. Besides supporting the construction of the mixed DNA separation model and the development of the separation software, the Identifiler STR profiles of the 297 simulated two-male mixed DNA samples also provide data for evaluating the parameters of mixed DNA (such as the average allele peak height/area, mixture proportion, heterozygote balance, allelic drop-out, and inter-locus balance) and for mining their regularities.
     Part II: Parameter Estimation and Mixsep Software Verification for the Simulated Two-male Mixed DNA
     Objective: To clarify the constraints of mixed DNA analysis and find their regularities by evaluating the parameters of mixed DNA and analyzing the correlations among them; and, by applying the simulated mixed DNA profile data to the mixsep software, to establish its advantages and disadvantages, providing a reference and an efficacy benchmark for the development of our mixed DNA separation model.
     Methods: For the correlation between the peak height (PH) and peak area (PA) of mixed DNA profiles, a generalized additive model was used for curve fitting, and least-squares regression was used to compute the regression coefficients, in order to see whether the two kinds of quantitative information differ in efficacy in mixed DNA analysis.
     For the correlation between APH and Hb, locally weighted regression and the Kruskal-Wallis rank-sum test were adopted, as they suit non-normally distributed data, and the Hb distributions across the 16 STR loci and the 9 mixture ratios were analyzed separately.
     Differences in fluorescence sensitivity among the STR loci of each channel of the mixed DNA profiles were analyzed through the effect of each channel's fluorescence sensitivity on APH and through the inter-locus balance (Ci), to test whether the STR loci differ in efficacy in mixed DNA analysis; multiple comparisons were performed with Tukey's Honestly Significant Difference method.
     All statistical charts in this part were drawn with the ggplot2 package (version 0.9.3) of the R software (version 3.0.1).
     Results: (1) Correlation analysis of PH and PA: Across the 16 STR loci, loci D19S433, D3S1358, D5S818 and D8S1179 showed a good linear relationship between PH and PA, and the other 12 loci showed a highly linear relationship. This was largely consistent with the conclusions of Tvedebrink: PH and PA are linearly related, and both kinds of quantitative information can be used in mixed DNA analysis with little difference in analytical efficacy.
     (2) Correlation analysis of APH and Hb: In the Kruskal-Wallis rank-sum test, the p value for the Hb distribution across loci was 0.0063 (<0.05), indicating that the Hb distributions of the loci were statistically different. The p value for the Hb distribution across mixture ratios was 0.02257 (<0.05), indicating that the Hb distributions of the mixture ratios were also statistically different. That is, the Hb distribution is affected by both STR locus and mixture ratio. When APH was <1250 rfu, Hb increased markedly (from 0.75 to around 0.87); when APH was ≥1250 rfu, Hb was almost constant, with a mean of 0.878. In the experimental data, when APH was ≥1250 rfu, the data above the Hb >0.6 threshold accounted for 92.74%.
     Loci CSF1PO, D19S433, D21S11, D2S1338 and vWA had more data with both high Hb and high APH than the other loci. As the mixture ratio became more imbalanced (from 1:5 to 1:9), data with both lower Hb and lower APH became more frequent.
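The Hb threshold check in this analysis can be illustrated with a short sketch. It assumes the common definition of heterozygote balance as the smaller peak height of a heterozygous pair divided by the larger; the abstract does not state which definition was used, and the peak values below are hypothetical.

```python
def heterozygote_balance(peak_a, peak_b):
    """Hb as the smaller peak height over the larger (assumed definition)."""
    if peak_a <= 0 or peak_b <= 0:
        raise ValueError("peak heights must be positive")
    return min(peak_a, peak_b) / max(peak_a, peak_b)

def share_above_threshold(pairs, hb_threshold=0.6):
    """Fraction of heterozygous peak pairs whose Hb exceeds the threshold."""
    if not pairs:
        raise ValueError("need at least one peak pair")
    passing = sum(1 for a, b in pairs if heterozygote_balance(a, b) > hb_threshold)
    return passing / len(pairs)

# Hypothetical heterozygous peak pairs in rfu
pairs = [(1500, 1320), (900, 500), (2000, 1900)]
share = share_above_threshold(pairs)
print(f"{share:.1%} of pairs have Hb > 0.6")
```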
     (3) Correlation analysis of APH and drop-out: At relatively balanced ratios (1:1 to 1:3) there were fewer allelic drop-outs (ADO), but at very imbalanced ratios (1:7 to 1:9) the ADO count increased rapidly, suggesting that it is related to the mixture ratio. As the ADO count increased, the APH of the corresponding samples gradually decreased.
     (4) Impact of fluorescence sensitivity on APH: To test whether the mean APH differed statistically among the four channels, which differ in fluorescence sensitivity, multiple comparisons were performed with Tukey's Honest Significant Difference method. There was no difference between the blue and green channels (p = 0.446) or between the yellow and red channels (p = 0.530). For the other 4 pairs (blue vs yellow, blue vs red, green vs yellow, and green vs red), the p values were all 3.95E-08, far below 0.05. The sensitivity to blue and green fluorescence therefore differed significantly from that to yellow and red; that is, the ABI 3130xl Genetic Analyzer was indeed more sensitive to blue and green fluorescence than to the other two.
     The median APH of locus D8S1179 was the highest in the blue channel. The median APH of loci D3S1358, TH01 and D13S317 was higher than that of the other loci in the green channel, and the median APH of loci D18S51 and FGA was the lowest in the yellow and red channels. That is, the distribution of APH was generally consistent with the molecular size of the STR loci, and the median APH of loci with small molecular sizes (D8S1179, D21S11, D3S1358, TH01, D13S317, D19S433, vWA, AMEL- and D5S818) was relatively high.
     (5) Analysis of parameter Ci: The Pearson correlation coefficients (R) of the mean and median Ci with the ADO count were -0.7179 and -0.7065, respectively. The corresponding p values were 1.736E-3 and 2.215E-3, so the mean and median Ci had statistically significant negative correlations with the ADO count. The locus with the highest median Ci was D8S1179.
     The distribution of Ci values across the 16 STR loci was generally consistent with the fluorescence sensitivity of the ABI 3130xl Genetic Analyzer in the four channels. The ABI 3130xl was more sensitive to the blue and green channels, and the corresponding 8 loci D8S1179, D21S11, CSF1PO, D3S1358, TH01, D13S317, D16S539 and D2S1338 (with D7S820 as an exception) all had higher Ci values. It was less sensitive to the yellow and red channels, and the corresponding 6 loci D19S433, vWA, TPOX, D18S51, AMEL- and FGA (with D5S818 as an exception) all had lower Ci values.
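For readers who want to reproduce the kind of correlation reported for Ci and the ADO count, the Pearson coefficient can be computed as in the sketch below. The study's statistics were run in R; this standard-library Python version is only illustrative, and the data values are hypothetical.

```python
import math

def pearson_r(xs, ys):
    """Pearson correlation coefficient of two equally long sequences."""
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("need two sequences of equal length >= 2")
    mean_x = sum(xs) / len(xs)
    mean_y = sum(ys) / len(ys)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    if sxx == 0 or syy == 0:
        raise ValueError("a constant sequence has no defined correlation")
    return sxy / math.sqrt(sxx * syy)

# Hypothetical ADO counts and mean Ci values for four groups of samples
ado_counts = [0, 2, 5, 9]
mean_ci = [1.10, 1.02, 0.95, 0.80]
r = pearson_r(ado_counts, mean_ci)
print(f"r = {r:.4f}")
```

A negative `r` corresponds to the pattern described above, where Ci falls as drop-outs become more frequent.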
     (6) Horizontal analysis of mixsep: The correlation between mixture ratio and locus separation accuracy gave a correlation coefficient R = -0.7121 with p = 0.03139, a negative linear correlation. The correlation between mixture ratio and ADO count gave R = -0.4244 with p = 0.2549, suggesting no marked correlation. Ratio 1:1 had the lowest accuracy. As the mixture ratio became more imbalanced, the accuracy first rose and then fell: ratios 1:2, 1:3 and 1:4 had higher accuracy, while ratios 1:1 and 1:9 had lower accuracy and greater variation. The locus separation accuracy after removing ADO was higher than before, meaning that allelic drop-out impairs the analytical efficacy of mixsep.
     (7) Vertical analysis of mixsep: Loci D5S818, D8S1179 and FGA had higher accuracy (>88%), while loci D19S433, D2S1338 and D7S820 had lower accuracy (≤80%). Loci AMEL-, D5S818 and D8S1179 had the fewest drop-outs, while loci D18S51, D19S433, FGA, TPOX and vWA had more (>15). These 5 loci are all located in the yellow and red channels and have lower APH, which is consistent with the lower sensitivity of the ABI 3130xl Genetic Analyzer to yellow and red fluorescence.
     At ratio 1:1, the accuracy of all loci except AMEL- and D3S1358 was ≤70%, and the outliers in the lower area of the box plot were the data of this ratio. At ratios 1:2, 1:3, 1:4 and 1:5 the accuracy of every locus was higher, with ratio 1:3 the highest (≥90%). At ratios 1:8 and 1:9, the locus separation accuracy fluctuated more and had a lower mean.
     Conclusion: Combining the APH of DNA profiles, mixture ratios and STR loci, the correlation analyses of Hb, Ci and fluorescence sensitivity, and the efficacy analysis of the mixsep software, this study suggests the following. During genotype separation of mixed DNA profiles typed with ABI Identifiler, if the APH of the profile is greater than 1250 rfu and the mixture ratio lies between 1:1 and 1:5 (excluding 1:1), we prefer the genotype separation results of loci D8S1179, D21S11 and CSF1PO in the blue channel, loci D3S1358, TH01 and D13S317 in the green channel, loci D19S433, vWA and TPOX in the yellow channel, and loci AMEL- and D5S818 in the red channel (11 loci in total). That is, the 16 STR loci differ in separation efficacy in mixed DNA analysis, and in evidence strength. If the APH of the mixed DNA profile is less than 1000 rfu and the mixture ratio is extremely imbalanced (more imbalanced than 1:6), and allelic drop-out is unclear or no known reference sample is available, software analysis of the mixed DNA should not be performed hastily, as it easily leads to misclassification. Moreover, mixed DNA profiles with a mixture ratio close to 1:1 cannot undergo genotype separation and individual identification. That is, even with a complete mixed DNA separation model and analysis software, the manual judgment of forensic examiners is still needed before and after genotype separation, and an expert conclusion cannot be drawn from the software report alone.
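The screening advice in this conclusion can be expressed as a small decision helper. The sketch below is hypothetical: the function name, the encoding of a 1:k ratio by its integer k, the treatment of k = 1 as "close to 1:1", and the "manual review" fallback for cases the text does not cover are all assumptions made for illustration.

```python
# The 11 loci whose separation results the conclusion prefers
PREFERRED_LOCI = {
    "D8S1179", "D21S11", "CSF1PO",  # blue channel
    "D3S1358", "TH01", "D13S317",   # green channel
    "D19S433", "vWA", "TPOX",       # yellow channel
    "AMEL-", "D5S818",              # red channel
}

def triage(aph_rfu, k):
    """Suggest a handling route for a two-person profile mixed at 1:k."""
    if k <= 1:
        return "not separable"
    if aph_rfu < 1000 and k > 6:
        return "software analysis not recommended"
    if aph_rfu > 1250 and k <= 5:
        return "prefer listed loci"
    return "manual review"  # assumed fallback for cases the text leaves open

print(triage(1500, 3))
print(triage(800, 8))
```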
     Part III: Construction and Efficacy Analysis of the Mixed DNA Separation Model
     Objective: To construct scientific and conservative mixed DNA separation models based on a large number of simulated mixed DNA profiles, to verify the efficacy of the separation models, and to compare them with the mixsep software, so as to demonstrate the robustness of the models constructed.
     Method: (1) Naive Bayesian model: The allele peak height h_a was assumed to follow the normal distribution N(Bα + C, Hτ²). For convenience, the prior distribution of the mixture proportion α was also assumed to be normal, N(m, A). The variance parameter τ remained a free parameter and was independent of α, so the marginal distribution of h_a was h_a | τ ~ N(Bm + C, Hτ² + B²A). For the variance hyperparameter A of the prior: when the experimental data were relatively accurate, the difference in α between loci was ≤0.05, and taking this as a range of three standard deviations gave a prior variance of about A = 0.0167² ≈ 0.00028 (from the empirical data of this laboratory). Since A was very small, B²A could be ignored relative to the original variance, and the marginal distribution of h_a simplified to h_a | τ ~ N(Bm + C, Hτ²), in which m was obtained from the prior distribution. All genotypes at each locus were traversed, and the best match and its parameters were obtained by maximizing the marginal likelihood. Suboptimal genotype matches could be selected by manual judgment; in our experience, candidates whose likelihood differed from that of the best match by a factor of 1.5 or more were not considered.
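The genotype traversal described above can be illustrated with a deliberately simplified sketch. It is not the sepDNA implementation. It assumes two contributors, a known prior share m for contributor 1, an expected peak height proportional to the share-weighted allele count, and equal error variance for every peak, in which case maximizing a normal likelihood is equivalent to minimizing the residual sum of squares. All names and peak values are hypothetical.

```python
from itertools import combinations_with_replacement, product

def expected_heights(alleles, g1, g2, m, total):
    """Expected peak height per allele when contributor 1 has share m."""
    return [total * (m * g1.count(a) + (1 - m) * g2.count(a)) / 2 for a in alleles]

def rank_genotype_pairs(peaks, m):
    """Rank all genotype pairs by residual sum of squares (best first)."""
    alleles = sorted(peaks)
    total = sum(peaks.values())
    genotypes = list(combinations_with_replacement(alleles, 2))
    ranked = []
    for g1, g2 in product(genotypes, repeat=2):
        if set(g1) | set(g2) != set(alleles):
            continue  # every observed allele must be explained by a contributor
        mu = expected_heights(alleles, g1, g2, m, total)
        rss = sum((peaks[a] - e) ** 2 for a, e in zip(alleles, mu))
        ranked.append((rss, g1, g2))
    ranked.sort()
    return ranked

# Hypothetical three-allele locus with peak heights in rfu, prior share m = 0.25
peaks = {"14": 260, "15": 990, "16": 750}
ranked = rank_genotype_pairs(peaks, 0.25)
best_rss, minor_genotype, major_genotype = ranked[0]
print(minor_genotype, major_genotype, best_rss)
```

The later entries of `ranked` play the role of the suboptimal matches that the text leaves to manual judgment; the 1.5-fold likelihood rule is not implemented here because it needs an explicit variance estimate.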
     (2) Constrained single locus analysis model: Given the initial mixture proportion, an empirical constraint was placed on the fluctuation range of the mixture proportion α. All genotypes at each locus were then traversed, and α and the variance parameter were solved by maximizing the likelihood function. The normality assumption was retained, and the mean and variance of the allele peak heights were then, respectively: [formulas not preserved in this text]. The mixture proportion was constrained to α ∈ [a, b] and the variance parameter to τ ∈ [ε, M], with ε close to 0 and M usually large. Under these constraints, the genotypes were traversed and the likelihood function was maximized. If the solved α reached or exceeded the upper or lower limit of the constraint, the genotype was flagged or excluded even if its likelihood was the maximum. In addition, according to the formula for the variance parameter τ², the better the peak heights are fitted, the smaller the variance parameter; when τ² approaches the lower limit ε, the peak height fit is close to its best. In the experimental mixed DNA data generated in our laboratory, the estimated α fluctuated around the prior α obtained from the Naive Bayesian model within a range of ≤0.08, and within ≤0.05 for loci with two alleles. If the estimated α approached the upper or lower limit of the constraint, the locus was very likely abnormal, and its best match could be replaced with the second-best match on the basis of experience.
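The constraint check can be sketched for a single candidate genotype pair. This is an assumption-laden illustration, not the model's actual formulas (which are missing from this text). It reuses the simplified mean used for illustration elsewhere, total × (α·c1 + (1 − α)·c2) / 2 with c1 and c2 the allele counts of the two contributors, estimates α by least squares in closed form, and flags the locus when the estimate leaves a window of ±0.08 around the prior. The function names are hypothetical.

```python
def estimate_alpha(peaks, g1, g2):
    """Least-squares estimate of contributor 1's share for one genotype pair."""
    total = sum(peaks.values())
    num = den = 0.0
    for allele, height in peaks.items():
        base = total * g2.count(allele) / 2
        slope = total * (g1.count(allele) - g2.count(allele)) / 2
        num += slope * (height - base)
        den += slope * slope
    return num / den if den else None  # None when both genotypes are identical

def check_locus(peaks, g1, g2, prior_alpha, max_shift=0.08):
    """Return the estimate and a status relative to the empirical constraint."""
    alpha = estimate_alpha(peaks, g1, g2)
    if alpha is None:
        return None, "alpha not identifiable"
    low, high = prior_alpha - max_shift, prior_alpha + max_shift
    if alpha <= low or alpha >= high:
        return alpha, "warn: alpha at or beyond constraint"
    return alpha, "ok"

# Hypothetical locus and candidate genotypes, with a prior share of 0.25
peaks = {"14": 260, "15": 990, "16": 750}
alpha, status = check_locus(peaks, ("14", "15"), ("15", "16"), prior_alpha=0.25)
print(f"alpha = {alpha:.3f}, status = {status}")
```

A full traversal would run this check for every candidate pair at the locus and drop or flag those whose status is not "ok".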
     Results: The two separation models constructed in this study, the Naive Bayesian model (Bayes for short) and the constrained single locus analysis model (Iter for short), both produced separation errors at locus AMEL- of DNA profile NAN3-1-9-B, and the misjudged genotype combination was X,X and X,Y (the same result as mixsep). In forensic DNA testing, locus AMEL- plays an important role in inferring the sex of a suspect. When the other factors affecting a mixed DNA profile are not considered, inferring directly from the peak heights at this locus alone whether the mixed DNA comes from several males or from males and a female is not conservative enough. It could cause the separation model to misjudge the sex of the suspects and so misdirect the investigation.
     Among the factors behind peak height degradation, when molecular weight was the major factor, the misjudged locus could be corrected through peak height adjustment. For loci where molecular weight was not the major cause of peak height degradation (such as locus vWA of sample NAN3-1-5-B), the separation result remained the same after peak height adjustment. In other words, the peak height degradation coefficient estimated from the mixed effect model was relatively conservative. Therefore, peak height adjustment is effective only for some STR loci in mixed DNA profiles.
     Conclusion: This part approached the problem from three angles: global consistency, naive Bayes, and single-locus solving. By constructing the Bayes model and the Iter model, genotypes were separated in 5 mixed DNA profiles for which mixsep had not given ideal results. Setting aside separation errors caused by peak height degradation, the combined use of the Bayes and Iter models gave better results for the best-matching genotypes. In addition, the mixed effect model constructed could handle peak height degradation conservatively. When molecular weight was the major factor contributing to peak height degradation, the misjudged loci could be corrected through peak height adjustment; that is, peak height adjustment was effective only for some STR loci and was taken only as an optional correction.
     Part IV: Development of the sepDNA Software for Mixed DNA Analysis and Case Application
     Objective: To take STR markers compatible with the Chinese forensic science DNA database as input data, to use the two separation models developed in this study together to develop the mixed DNA separation software sepDNA, and to verify the robustness and reliability of sepDNA with experimental data.
     Methods: The sepDNA software package was developed in the R language by converting the mixed DNA separation models constructed in Part III into source code and adding the source code of the sepDNA user interface. The analytical efficacy of the software was then verified.
     Results: The sepDNA software contains two separation models and several small modules. The Bayes model finds a prior mixture proportion, reduces the problem to a normal distribution whose mean peak height depends only on the genotype, and finds the best-matching genotype by maximizing the marginal likelihood function. The Iter model replaces joint analysis with separate analysis of each locus and places an empirical constraint on the fluctuation range of the mixture proportion. The mixture proportion and variance parameter of each locus are solved by maximizing the likelihood function, and each single locus is traversed under the parameter constraints.
     The two models separated the genotypes of mixed DNA from two different modeling perspectives, global optimization and local optimization. Although the separation results at loci D3S1358 and D7S820 lowered the overall separation accuracy of the Iter model, the two models needed to be used in combination for the sake of sex inference at locus AMEL- and of the robustness of the separation. A separation result was taken as reliable when both models gave the same result; where the two models disagreed, further manual judgment was needed to ensure that the separation report of the mixed DNA was robust and reliable.
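The consensus rule for combining the two models is simple enough to sketch directly. The function name, the data layout (one genotype call per locus and model) and the example calls are hypothetical; only the rule itself (agreement means reliable, disagreement means manual review) comes from the text.

```python
def combine_calls(bayes_calls, iter_calls):
    """Split loci into calls both models agree on and loci needing review."""
    reliable, review = {}, []
    for locus in sorted(set(bayes_calls) | set(iter_calls)):
        a, b = bayes_calls.get(locus), iter_calls.get(locus)
        if a is not None and a == b:
            reliable[locus] = a
        else:
            review.append(locus)  # disagreement, or missing from one model
    return reliable, review

# Hypothetical per-locus calls: (minor genotype, major genotype)
bayes_calls = {
    "D8S1179": (("13", "14"), ("14", "15")),
    "AMEL-": (("X", "X"), ("X", "Y")),
}
iter_calls = {
    "D8S1179": (("13", "14"), ("14", "15")),
    "AMEL-": (("X", "Y"), ("X", "Y")),
}
reliable, review = combine_calls(bayes_calls, iter_calls)
print("reliable:", sorted(reliable))
print("manual review:", review)
```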
     Conclusion: The sepDNA software created in this study includes two separation models, the Bayes model and the Iter model. The two models should be used together, and a separation result that appears in both models is considered reliable, in order to ensure robustness and reliability. The software has no module for handling allelic drop-out. It reports the parameters "average peak height" and "mixture proportion"; if the average peak height is too low or the mixture ratio is extremely imbalanced, allelic drop-out may have occurred in the mixed DNA profile, and the sepDNA analysis report must then be drawn up by combining these parameters with manual judgment. The three-person mixed DNA separation module of the software has a function called "set up fixed genotype", which can raise the separation accuracy for three-person mixed DNA to some extent, but the efficacy of this module still needs to be verified with more three-person mixed DNA data.
References
    1. Hatsch, D., Amory, S., Keyser, C., Hienne, R. & Bertrand, L. A rape case solved by mitochondrial DNA mixture analysis. J Forensic Sci. 52, 891-894, 2007
    2. Mayntz-Press, K. A. & Ballantyne, J. Performance characteristics of commercial Y-STR multiplex systems. J Forensic Sci. 52, 1025-1034, 2007
    3. Hall, A. & Ballantyne, J. Novel Y-STR typing strategies reveal the genetic profile of the semen donor in extended interval post-coital cervicovaginal samples. Forensic Sci Int. 136, 58-72, 2003
    4. Mayntz-Press, K. A., Sims, L. M., Hall, A. & Ballantyne, J. Y-STR profiling in extended interval (> or = 3 days) postcoital cervicovaginal samples. J Forensic Sci. 53, 342-348, 2008
    5. Clayton, T. M., Whitaker, J. P., Sparkes, R. & Gill, P. Analysis and interpretation of mixed forensic stains using DNA STR profiling. Forensic Sci Int. 91, 55-70, 1998
    6. Vetten, L. & Haffejee, S. Gang rape: A study in inner-city Johannesburg. SA Crime Quarterly. 12, 31-36, 2005
    7. Gill, P., Brenner, C. H., Buckleton, J. S., et al. DNA commission of the International Society of Forensic Genetics: Recommendations on the interpretation of mixtures. Forensic Sci Int. 160, 90-101, 2006
    8. González-Andrade, F., Bolea, M., Martínez-Jarreta, B. & Sánchez, D. DNA mixtures in forensic casework resolved with autosomic STRs. In: Int Congr Ser: Elsevier: 2006
    9. Cowell, R. G. Validation of an STR peak area model. Forensic Sci Int Genet. 3, 193-199, 2009
    10. Hou, Y. P. Forensic DNA typing in China. Leg Med (Tokyo). 11 Suppl 1, S103-105, 2009
    11. Tvedebrink, T., Eriksen, P. S., Mogensen, H. S. & Morling, N. Estimating the probability of allelic drop-out of STR alleles in forensic genetics. Forensic Sci Int Genet. 3, 222-226, 2009
    12. Gill, P., Sparkes, R. & Kimpton, C. Development of guidelines to designate alleles using an STR multiplex system. Forensic Sci Int. 89, 185-197, 1997
    13. Urquhart, A., Kimpton, C. P., Downes, T. J. & Gill, P. Variation in short tandem repeat sequences--a survey of twelve microsatellite loci for use as forensic identification markers. Int J Legal Med. 107, 13-20, 1994
    14. Whitaker, J. P., Clayton, T. M., Urquhart, A. J., et al. Short tandem repeat typing of bodies from a mass disaster: high success rate and characteristic amplification patterns in highly degraded samples. BioTechniques. 18, 670-677, 1995
    1. Budowle B, Onorato AJ, Callaghan TF, et al: Mixture interpretation: defining the relevant features for guidelines for the assessment of mixed DNA profiles in forensic casework. J Forensic Sci 54: 810-821, 2009
    2. Gill P, Brenner CH, Buckleton JS, et al: DNA commission of the International Society of Forensic Genetics: Recommendations on the interpretation of mixtures. Forensic Sci Int 160: 90-101, 2006
    3. Li CX, Han JP, Ren WY, Ji AQ, Xu XL and Hu L: DNA profiling of spermatozoa by laser capture microdissection and low volume-PCR. PLoS One 6: e22316, 2011
    4. Finkelhor D, Ormrod R, Turner H and Hamby S: School, police, and medical authority involvement with children who have experienced victimization. Arch Pediatr Adolesc Med 165: 9-15, 2011
    5. Longombe AO, Claude KM, Ruminjo J: Fistula and traumatic genital injury from sexual violence in a conflict setting in Eastern Congo: case studies. Reprod Health Matters 16: 132-141, 2008
    6Tell K: African women struggling against female circumcision and sexualviolence. Reprod Freedom News8:4-5,1999
    7Greene D, Maas CS, Carvalho G and Raven R: Epidemiology of facialinjury in female blunt assault trauma cases. Arch Facial Plast Surg1:288-291,1999
    8Bright JA, Turkington J and Buckleton J: Examination of the variability inmixed DNA profile parameters for the Identifiler multiplex. Forensic SciInt Genet4:111-114,2010
    9Wetton JH, Lee-Edghill J, Archer E, et al: Analysis and interpretation ofmixed profiles generated by34cycle SGM Plus amplification. ForensicSci Int Genet5:376-380,2011
    10Bright JA, Huizing E, Melia L, Buckleton J: Determination of thevariables affecting mixed MiniFilerTMDNA profiles. Forensic Sci IntGenet5:381-385,2011
    11Hill CR, Duewer DL, Kline MC, et al: Concordance and populationstudies along with stutter and peak height ratio analysis for thePowerPlex ESX17and ESI17Systems. Forensic Sci Int Genet5:269-275,2011
    12Coletti A, Merigioli S, Severini S, et al: Statistical analysis of DNAmixtures using peak area information and allelic drop out. Forensic SciInt Genet Suppl Ser2:202-203,2009
    13Tvedebrink T, Eriksen PS, Mogensen HS and Morling N: Estimating theprobability of allelic drop-out of STR alleles in forensic genetics,Forensic Sci Int Genet3:222-226,2009
    14Van Nieuwerburgh F, Goetghebeur E, Vandewoestyne M and Deforce D:RMNE probability of forensic DNA profiles with allelic drop-out.Forensic Sci Int Genet Suppl Ser2:462-463,2009
    15Post WJ, Buijs C, Stolk RP, de Vries EG and le Cessie S: The analysis oflongitudinal quality of life measures with informative drop-out: a patternmixture approach. Qual Life Res19:137-148,2010
    16Haned H, Egeland T, Pontier D, Pène L and Gill P: Estimating drop-outprobabilities in forensic DNA samples: a simulation approach to evaluatedifferent models. Forensic Sci Int Genet5:525-531,2011
    17Tvedebrink T, Eriksen PS, Asplund M, Mogensen HS and Morling N:Allelic drop-out probabilities estimated by logistic regression--furtherconsiderations and practical implementation. Forensic Sci Int Genet6:263-267,2012
    18Tvedebrink T, Eriksen PS, Mogensen HS and Morling N: Statistical modelfor degraded DNA samples and adjusted probabilities for allelic drop-out.Forensic Sci Int Genet6:97-101,2012
    19Weiler NE, Matai AS and Sijen T: Extended PCR conditions to reducedrop-out frequencies in low template STR typing including unequalmixtures. Forensic Sci Int Genet6:102-107,2012
    20Kloosterman AD and Kersbergen P: Efficacy and limits of genotyping lowcopy number (LCN) DNA samples by multiplex PCR of STR loci. J SocBiol197:351-359,2003
    21Ballantyne KN, van Oorschot RA and Mitchell RJ: Comparison of twowhole genome amplification methods for STR genotyping of LCN anddegraded DNA samples. Forensic Sci Int166:35-41,2007
    22McCartney C: LCN DNA: proof beyond reasonable doubt? Nat Rev Genet9:325,2008
    23Gill P: LCN DNA: proof beyond reasonable doubt?-a response. Nat RevGenet9:726,2008
    24Gu LH, Dong Y, Zhang C, et al: Forensic analysis of LCN DNA usingsample concentration methods followed by miniSTR genotyping. JForensic Med26:361-363,2010
    25Petricevic S, Whitaker J, Buckleton J, et al: Validation and development ofinterpretation guidelines for low copy number (LCN) DNA profiling inNew Zealand using the AmpFlSTR SGM Plus multiplex. Forensic Sci IntGenet4:305-310,2010
    26Benschop CC, van der Beek CP, Meiland HC, van Gorp AG, Westen AAand Sijen T: Low template STR typing: effect of replicate number andconsensus method on genotyping reliability and DNA database searchresults. Forensic Sci Int Genet5:316-328,2011
    27Gilder JR, Inman K, Shields W and Krane DE: Magnitude-dependentvariation in peak height balance at heterozygous STR loci. Int J LegalMed125:87-94,2011
    28T Tvedebrink. mixsep: DNA mixture separation. R package version0.2.1-2. URLhttp://CRAN.R-project.org/package=mixsep,2013
    29T Tvedebrink. mixsep: An R-package for DNA mixture separation.Forensic Science International: Genetics Supplement Serires3:486-488,2011
    30T Tvedebrink, PS Eriksen, HS Mogensen and N Morling. Identifyingcontributors of DNA mixtures by means of quantitative information ofSTR typing.To Appear in Journal of Computational Biology,2011
    31T Tvedebrink, PS Eriksen, HS Mogensen and N Morling. mixsep-AnR-package for separating forensic DNA mixtures and performingstatistical analysis of EPGs.2011
    32T. Tvedebrink, P.S. Eriksen. Evaluating the weight of evidence by usingquantitative short tandem repeat data in DNA mixtures. Journal of theRoyal Statistical Society, Series C59:855-87,2010
    1Budowle B, Onorato AJ, Callaghan TF, et al: Mixture interpretation: definingthe relevant features for guidelines for the assessment of mixed DNAprofiles in forensic casework. J Forensic Sci54:810-821,2009
    2Torben Tvedebrink. mixsep: DNA mixture separation. R package version0.2.1-2. URL http://CRAN.R-project.org/package=mixsep,2013
    3Torben Tvedebrink. mixsep: An R-package for DNA mixture separation.Forensic Science International: Genetics Supplement Serires3:486-488,2011
    4T Tvedebrink, PS Eriksen, HS Mogensen and N Morling. Identifyingcontributors of DNA mixtures by means of quantitative information of STRtyping. Journal of Computational Biology,2011
    5R. G. Cowell, S. L. Lauritzen, J. Mortera. Probabilistic expert systems forhandling artifacts in complex DNA mixtures. Forensic ScienceInternational: Genetics5:202-209,2011
    6R. G. Cowell, S. L. Lauritzen, J. Mortera. MAIES: A tool for DNA mixtureanalysis. In Proceedings of the22nd Conference on Uncertainty inArtificial Intelligence. Morgan Kaufmann Publishers,2006
    7R. G. Cowell, S. L. Lauritzen, J. Mortera. A gamma model for DNA mixtureanalyses. Bayesian Analysis2:333-348,2007
    8T. Graversen, S.L. Lauritzen. Estimation of parameters in DNA mixtureanalysis. arXiv:1108.1884,2013
    9T. Graversen, S.L. Lauritzen. Computational aspects of DNA mixture analysis.arXiv:1307.4956,2013
    10DNAmixture: Statistical Inference for Mixed Traces of DNA. R packageversion0.1-0, dnamixtures.r-forge.r-project.org/
    11Hugin Expert A/S. Hugin API Reference Mannual, Version7.7. HuginExpert A/S, Aalborg, Denmark,2013
    12Konis, K. RHugin. R packages version7.7-5, rhugin.r-forge.r-project.org,2013
    13Aguilera, Marcos."Stumbling over Consensus Research: Mis-understandings and Issues". Lecture Notes in Computer Science59:59–72,2010
    14T Tvedebrink, PS Eriksen, HS Mogensen and N Morling. mixsep-AnR-package for separating forensic DNA mixtures and performing statisticalanalysis of EPGs. Manuscript in preparation,2011
    15T. Tvedebrink, P.S. Eriksen. Evaluating the weight of evidence by usingquantitative short tandem repeat data in DNA mixtures. Journal of theRoyal Statistical Society, Series C59:855-874,2010
    16T. Tvedebrink, P.S. Eriksen, H. S. Mogensen, N. Morling. Estimating theprobability of allelic drop-out of STR alleles in forensic genetics. ForensicScience International: Genetics3:222-226,2009
    17T. Tvedebrink, P.S. Eriksen, H. S. Mogensen, N. Morling. Statistical modelfor degraded DNA samples and adjusted probabilities for allelic drop-out.Forensic Science International: Genetics6:97-101,2012
    18T. Tvedebrink, P.S. Eriksen, H. S. Mogensen, N. Morling. Allelic drop-outprobabilities estimated by logistic regression-Further considerations andpractical implementation. Forensic Science International: Genetics6:263-267,2012
    1Team RDevelopment Core. R: A Lnguage And Environment For StatisticalComputing. Vienna, Austria: R Foundation for Statistical Computing,2013
    2RStudio. RStudio: Integrated development environment for R. Version0.96.122.[software],2012
    3Gentleman, R.C. et al. Bioconductor: open software development forcomputational biology and bioinformatics. Genome Biol.5, R80,2004
    4Smyth, G.K. Limma: linear models for microarray data. In: BioinformaticsAnd Computational Biology Solutions Using R And Bioconductor:Springer,397-420,2005
    5Gautier, L., Cope, L., Bolstad, B.M.&Irizarry, R.A. affy--analysis ofAffymetrix GeneChip data at the probe level. Bioinforma.20,307-15,2004
    6Ellis, B., Haaland, P., Hahne, F., Le Meur, N.&Gopalakrishnan, N.flowCore: basic structures for flow cytometry data. R package version1.24.2.[software],2009
    7Pages, H., Carlson, M., Falcon, S., Li, N.&Maintainer, M.B.P. Package‘AnnotationDbi’: Annotation Database Interface. R package version1.20.7.[software],2014
    8Carlson, M. hgu95av2.db: Affymetrix Human Genome U95Set annotationdata (chip hgu95av2). R package version2.8.0.[software],2012
    9Carlson, M. et al. A set of annotation maps describing the entire GeneOntology. R package version2.8.0.[software],2010
    10Shannon, P. MotifDb: An Annotated Collection of Protein-DNA BindingSequence Motifs.. R package version1.0.0.[software],2012
    11Pages, H., Aboyoun, P., Gentleman, R., DebRoy, S.&Alignments,P.R.S.P.S. String objects representing biological sequences, and matchingalgorithms. R package version2.26.3.[software],2009
    12Tvedebrink, T. mixsep: DNA mixture separation. R package version0.2.1-2.[software],2013
    13Tvedebrink, T. mixsep: An R-package for DNA mixture separation.Forensic Sci Int: Genet Supple Seri.3, e486-e8,2011
    14Tvedebrink, T., Eriksen, P.S., Mogensen, H.S.&Morling, N. Identifyingcontributors of DNA mixtures by means of quantitative information ofSTR typing. J. Comput. Biol.19,887-902,2012
    15Tvedebrink, T., Eriksen, P.S., Mogensen, H.S.&Morling, N. mixsep-AnR-package for separating forensic DNA mixtures and performingstatistical analysis of EPGs. Manuscript in preparation.2011
    16Tvedebrink, T., Eriksen, P.S., Mogensen, H.S.&Morling, N. Evaluatingthe weight of evidence by using quantitative short tandem repeat data inDNA mixtures. J R Stat Soc Ser C Appl Stat.59,855-74,2010
    1Jeffreys AJ, Wilson V and Thein SL: Hypervariable 'minisatellite' regionsin human DNA. Nature314:67-73,1985
    2Weir BS, Triggs CM, Starling L, Stowell LI, Walsh KA and Buckleton J:Interpreting DNA mixtures. J Forensic Sci42:213-222,1997
    3Saad R: Discovery, development, and current applications of DNA identitytesting. Proc (Bayl Univ Med Cent)18:130-133,2005
    4Wambaugh J: Blooding, The: The True Story of the Narborough VillageMurders. Morrow, New York, NY,1989
    5van Oorschot RA, Ballantyne KN and Mitchell RJ: Forensic trace DNA: areview. Investig Genet1:14,2010
    6Gill P, Brenner CH, Buckleton JS, et al: DNA commission of theInternational Society of Forensic Genetics: Recommendations on theinterpretation of mixtures. Forensic Sci Int160:90-101,2006
    7Morling N, Bastisch I, Gill P and Schneider PM: Interpretation of DNAmixtures--European consensus on principles. Forensic Sci Int Genet1:291-292,2007
    8Gill P, Brown RM, Fairley M, et al: National recommendations of theTechnical UK DNA working group on mixture interpretation for theNDNAD and for court going purposes. Forensic Sci Int Genet2:76-82,2008
    9Stringer P, Scheffer JW, Scott P, et al: Interpretation of DNAmixtures--Australian and New Zealand consensus on principles. ForensicSci Int Genet3:144-145,2009
    10Buckleton JS, Curran JM and Gill P: Towards understanding the effect ofuncertainty in the number of contributors to DNA stains. Forensic Sci IntGenet1:20-28,2007
    11Biedermann A, Bozza S, Konis K and Taroni F: Inference about thenumber of contributors to a DNA mixture: Comparative analyses of aBayesian network approach and the maximum allele count method.Forensic Sci Int Genet6:689-696,2012
    12Chung YK and Fung WK: Identifying contributors of two-person DNAmixtures by familial database search. Int J Legal Med127:25-33,2013
    13Clayton TM, Whitaker JP, Sparkes R and Gill P: Analysis andinterpretation of mixed forensic stains using DNA STR profiling.Forensic Sci Int91:55-70,1998
    14Murray C, McAlister C and Elliott K: Identification and isolation of malecells using fluorescence in situ hybridisation and laser microdissection,for use in the investigation of sexual assault. Forensic Sci Int Genet1:247-252,2007
    15Haned H, Pene L, Sauvage F and Pontier D: The predictive value of themaximum likelihood estimator of the number of contributors to a DNAmixture. Forensic Sci Int Genet5:281-284,2011
    16Gill P, Whitaker J, Flaxman C, Brown N and Buckleton J: An investigationof the rigor of interpretation rules for STRs derived from less than100pgof DNA. Forensic Sci Int112:17-40,2000
    17Howitt T: Ensuring the integrity of results: a continuing challenge inforensic DNA analysis. In: Promega Genetic Identity ConferenceProceedings Fourteenth International Symposium on HumanIdentification,2003. http://cn.promega.com/products/pm/genetic-identity/ishi-conference-proceedings/14th-ishi-oral-presentations/
    18Gill P and Kirkham A: Development of a simulation model to assess theimpact of contamination in casework using STRs. J Forensic Sci49:485-491,2004
    19Krenke BE, Nassif N, Sprecher CJ, Knox C, Schwandt M and Storts DR:Developmental validation of a real-time PCR assay for the simultaneousquantification of total human and male DNA. Forensic Sci Int Genet3:14-21,2008
    20LaSalle HE, Duncan G and McCord B: An analysis of single andmulti-copy methods for DNA quantitation by real-time polymerase chainreaction. Forensic Sci Int Genet5:185-193,2011
    21Hudlow WR and Buoncristiani MR: Development of a rapid,96-wellalkaline based differential DNA extraction method for sexual assaultevidence. Forensic Sci Int Genet6:1-16,2012
    22Cook O and Dixon L: The prevalence of mixed DNA profiles in fingernailsamples taken from individuals in the general population. Forensic SciInt Genet1:62-68,2007
    23Dowlman EA, Martin NC, Foy MJ, Lochner T and Neocleous T: Theprevalence of mixed DNA profiles on fingernail swabs. Sci Justice50:64-71,2010
    24Malsom S, Flanagan N, McAlister C and Dixon L: The prevalence ofmixed DNA profiles in fingernail samples taken from couples whoco-habit using autosomal and Y-STRs. Forensic Sci Int Genet3:57-62,2009
    25Kamodyova N, Durdiakova J, Celec P, et al: Prevalence and persistence ofmale DNA identified in mixed saliva samples after intense kissing.Forensic Sci Int Genet7:124-128,2013
    26Brown L, Brown G, Vacek P and Brown S: Aneuploidy detection in mixedDNA samples by methylation-sensitive amplification and microarrayanalysis. Clin Chem56:805-813,2010
    27McAlister C: The use of fluorescence in situ hybridisation and lasermicrodissection to identify and isolate male cells in an azoospermicsexual assault case. Forensic Sci Int Genet5:69-73,2011.
    28Meredith M, Bright JA, Cockerton S and Vintiner S: Development of aone-tube extraction and amplification method for DNA analysis of spermand epithelial cells recovered from forensic samples by lasermicrodissection. Forensic Sci Int Genet6:91-96,2012.
    29Rothe J, Roewer L and Nagy M: Individual specific extraction of DNAfrom male mixtures--First evaluation studies. Forensic Sci Int Genet5:117-121,2011
    30Egeland T, Fonnelop AE, Berg PR, Kent M and Lien S: Complex mixtures:a critical examination of a paper by Homer et al. Forensic Sci Int Genet6:64-69,2012
    31Voskoboinik L and Darvasi A: Forensic identification of an individual incomplex DNA mixtures. Forensic Sci Int Genet5:428-435,2011
    32Gill P, Sparkes R, Pinchin R, Clayton T, Whitaker J and Buckleton J:Interpreting simple STR mixtures using allele peak areas. Forensic SciInt91:41-53,1998
    33Slooten K: Validation of DNA-based identification software bycomputation of pedigree likelihood ratios. Forensic Sci Int Genet5:308-315,2011
    34Gill P, Kirkham A and Curran J: LoComatioN: a software tool for theanalysis of low copy number DNA profiles. Forensic Sci Int166:128-138,2007
    35GenoProof FG: Mixture-the complete solution for complex forensicDNA samples [Product information flyer]. Qualitype AG. Germany.http://genoproof-mixture-client.software.informer.com/
    36He H, Snyder-Leiby T, Qi R and Liu JC: GeneMarker HID.2009.http://www.softgenetics.com/MixtureAnalysis_AppNote.pdf
    37Hansson O and Gill P: Evaluation of GeneMapper ID-X MixtureAnalysis tool. Forensic Science International: Genetics SupplementSeries3: e11-e12,2011
    38Perlin MW, Legler MM, Spencer CE, et al: Validating TrueAllele DNAMixture Interpretation*,. Journal of forensic sciences56:1430-1447,2011
    39Haned H: Forensim: an open-source initiative for the evaluation ofstatistical methods in forensic genetics. Forensic Sci Int Genet5:265-268,2011
    40Haned H and Gill P: Analysis of complex DNA mixtures using theForensim package. Forensic Science International: Genetics SupplementSeries3: e79-e80,2011
    41McEwen JE: Forensic DNA data banking by state crime laboratories. Am JHum Genet56:1487-1492,1995
    42Daniel D: DNA Forensics. U.S. DOE Human Genome Program.2013
    43No l J, Lavergne L, Mailly F, Roberge D and Jolicoeur C: Searching aDNA databank with complex mixtures from two individuals. ForensicScience International: Genetics Supplement Series2:464-465,2009
    44Chung YK and Fung WK: The evidentiary values of "cold hits" in a DNAdatabase search on two-person mixture. Sci Justice51:10-15,2011
    45Evett IW and Weir BS: Interpreting DNA evidence: statistical genetics forforensic scientists. Sinauer Associates Sunderland, Mass,1998
    46California Department of Justice DoLE: DNA partial match (crime sceneDNA profile to offender) policy.2008. https://oag.ca.gov/
    47Moretti TR, Baumstark AL, Defenbaugh DA, Keys KM, Smerick JB andBudowle B: Validation of short tandem repeats (STRs) for forensic usage:performance testing of fluorescent multiplex STR systems and analysisof authentic and simulated forensic samples. J Forensic Sci46:647-660,2001
    48Gill P: Role of short tandem repeat DNA in forensic casework in theUK--past, present, and future perspectives. Biotechniques32:366-368,370,372, passim,2002
    49Butler JM: Genetics and genomics of core short tandem repeat loci used inhuman identity testing. J Forensic Sci51:253-265,2006
    50Bekaert B, Van Geystelen A, Vanderheyden N, Larmuseau MH andDecorte R: Automating a combined composite-consensus method togenerate DNA profiles from low and high template mixture samples.Forensic Sci Int Genet6:588-593,2012
    51Bieber FR, Brenner CH and Lazer D: Human genetics. Finding criminalsthrough DNA of their relatives. Science312:1315-1316,2006
    52Curran JM and Buckleton JS: Effectiveness of familial searches. SciJustice48:164-167,2008
    53Cowen S and Thomson J: A likelihood ratio approach to familial searchingof large DNA databases. Forensic Science International: GeneticsSupplement Series1:643-645,2008
    54Gill P, Curran J, Neumann C, et al: Interpretation of complex DNA profilesusing empirical models and a method to measure their robustness.Forensic Sci Int Genet2:91-103,2008
    55Biedermann A and Taroni F: Bayesian networks for evaluating forensicDNA profiling evidence: a review and guide to lIterature. Forensic Sci IntGenet6:147-157,2012
    56Cowell R, Lauritzen S and Mortera J: Object-oriented Bayesian networksfor DNA" mixture analyses.2006. http://citeseerx.ist.psu.edu/viewdoc/download?doi=10.1.1.102.1834&rep=rep1&type=pdf
    57Taroni F, Aitken C, Garbolino P and Biedermann A: Bayesian Networksfor Evaluating Scientific Evidence. Bayesian Networks and ProbabilisticInference in Forensic Science:97-129
    58Dawid AP, Mortera J and Vicard P: Object-oriented Bayesian networks forcomplex forensic DNA profiling problems. Forensic Sci Int169:195-205,2007
    59Hepler AB and Weir BS: Object-oriented Bayesian networks for paternitycases with allelic dependencies. Forensic Sci Int Genet2:166-175,2008
    60Pascali VL and Merigioli S: Joint Bayesian analysis of forensic mixtures.Forensic Sci Int Genet6:735-748,2012
    61Paoletti DR, Doom TE, Krane CM, Raymer ML and Krane DE: Empiricalanalysis of the STR profiles resulting from conceptual mixtures. JForensic Sci50:1361-1366,2005
    62Haned H, Pene L, Lobry JR, Dufour AB and Pontier D: Estimating thenumber of contributors to forensic DNA mixtures: does maximumlikelihood perform better than maximum allele count? J Forensic Sci56:23-28,2011
