C, although the Chi-squared measure is the smoothest for these simulations.

The preceding experiments yield understanding of the comparative behaviour of exploratory and confirmatory approaches to period estimation as a function of the degree of periodicity in the underlying sequence. As discussed in the introduction, an important characteristic of confirmatory approaches is the extent to which they are able to detect the presence of periodicity that may be degraded in one way or another. Since the problem of detection is generally dependent on the detection threshold chosen for a specific application, a conventional method of comparison is to calculate the receiver operating characteristic (ROC) curve (true positives vs. false positives) for each method. The results of the ROC simulation for eroded and approximately periodic sequence fragments can be seen in Figure 4. The results for approximately periodic sequences, shown for period-10 only, are slightly simplistic, since if it is known or suspected a priori that the period-10 component may be approximate rather than exact, a more reasonable approach may be to also consider the strength of components with periods 9, 11, etc. Under these circumstances, the order of preference among the various techniques compared herein may differ from that suggested by the results in Figure 4.

Interestingly, between the two panels in Figure 4, the order of effectiveness of the various methods is reversed. Since the difference between the significance measures is most marked for Figure 3, where the BWB performs close to the ideal behaviour for a period-10 detector, we selected the BWB for further experimental work. An explanation for the good performance of the CRB for approximate periodicity can be constructed along similar lines to that for Fourier-based methods in the periodicity degradation simulations above, since CRB is also Fourier-based.
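The ROC comparison described above can be illustrated with a minimal sketch. It is not the authors' simulation code: the erosion model (each periodic position is dropped with probability `erosion`), the background density, and the detector (plain DFT power at period 10, standing in for the g-statistic, BWB, Chi-squared and CRB measures) are all assumptions made for illustration, and the function names are hypothetical. Positives are eroded period-10 sequences of length N = 150, and negatives are random permutations of the same sequences.

```python
import cmath
import random

N, PERIOD = 150, 10  # sequence length and target period, as in the simulations


def eroded_sequence(erosion, rng, background=0.1):
    """Binary indicator sequence with a 1 every PERIOD positions.

    Assumed erosion model: each periodic 1 is dropped with probability
    `erosion`; all other positions are 1 with probability `background`.
    """
    seq = []
    for n in range(N):
        if n % PERIOD == 0:
            seq.append(0 if rng.random() < erosion else 1)
        else:
            seq.append(1 if rng.random() < background else 0)
    return seq


def period_power(seq, period=PERIOD):
    """DFT power at the given period (a stand-in detector, not the BWB)."""
    acc = sum(x * cmath.exp(-2j * cmath.pi * n / period)
              for n, x in enumerate(seq))
    return abs(acc) ** 2 / len(seq)


def roc_curve(pos_scores, neg_scores):
    """Return (fpr, tpr) lists, sweeping the threshold from high to low."""
    thresholds = sorted(set(pos_scores) | set(neg_scores), reverse=True)
    fpr, tpr = [0.0], [0.0]
    for t in thresholds:
        tpr.append(sum(s >= t for s in pos_scores) / len(pos_scores))
        fpr.append(sum(s >= t for s in neg_scores) / len(neg_scores))
    return fpr, tpr


def auc(fpr, tpr):
    """Area under the ROC curve by the trapezoidal rule."""
    return sum((fpr[i] - fpr[i - 1]) * (tpr[i] + tpr[i - 1]) / 2
               for i in range(1, len(fpr)))


def simulate(erosion, trials=200, seed=0):
    """ROC curve for eroded period-10 sequences vs. their permutations."""
    rng = random.Random(seed)
    pos, neg = [], []
    for _ in range(trials):
        seq = eroded_sequence(erosion, rng)
        pos.append(period_power(seq))
        shuffled = seq[:]
        rng.shuffle(shuffled)  # permutation destroys the periodicity
        neg.append(period_power(shuffled))
    return roc_curve(pos, neg)


if __name__ == "__main__":
    for erosion in (0.0, 0.4, 0.8):
        fpr, tpr = simulate(erosion)
        print(f"erosion={erosion:.1f}  AUC={auc(fpr, tpr):.3f}")
```

Sweeping the threshold over every observed score gives the full empirical curve without committing to a particular operating point, which is the reason ROC analysis suits a comparison where the detection threshold is application-dependent.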
Although the CRB is very effective for approximate periodicity, which occurs commonly in practice, the simulation constraint of an average period of 10 bp is artificial and probably overstates the practical utility of the measure somewhat.

Figure 3 Significance measures from the embedded IPDFT (g-statistic and BWB) for period-10 synthetic sequences of length N = 150. Sequences were either eroded versions of a perfect period-10 signal (top) or had periods Gaussian distributed about an expected value of period-10 (bottom). [Legend: g-statistic, BWB, Chi-sq, CRB. Axes: Probability and Cramer-Rao bound against Percent Erosion (top) and Standard Deviation (bottom).]

Epps et al. Biology Direct 2011, 6:21 http://www.biology-direct.com/content/6/1/

Figure 4 ROC curves for confirmatory period detection of eroded (left) and approximate (right) synthetic period-10 sequences and randomly permuted sequences for embedded IPDFT (top) and embedded Hybrid (bottom). [Legend: g-statistic, BWB, Chi-sq, CRB. Axes: True positive rate against False positive rate.]

Analysis of Yeast ChIP-chip data

The yeast ChIP-chip data identified 73327 DNA sequences that are associated with nucleosomes in vivo versus their linker sequences. Lee et al. [37] further utilised the variance in association to classify the sequences into 31557 well-positioned or 41770 fuzzy nucleosomes. Since the regions identified by Lee et al. differed in length, and statistical power of.