Left-handed Z-DNA arises from strands of alternating pyrimidine-purine bases, (YR YR)n, such as (CGCG) and (CATG)n (Figure 4 & Table 1). Flipping a base pair (bp) upside down by 180° rotates the purines from the anti to the syn conformation and consequently shifts the backbone turn at the B-Z transition to a left-handed orientation. The Z-Hunt-II algorithm, used for predicting Z-DNA in large genomes, was developed by extending the original thermodynamic search strategy of Z-Hunt (Ho et al., 1986). Briefly, the search strategy relies on the ability to predict the properties of the B- to Z-DNA transition induced by negative supercoiling in a closed circular DNA.
The search strategy within Z-Hunt-II (Schroth et al., 1992) works through the following steps: (1) a search window that is used to walk along a DNA sequence is defined; (2) each nucleotide of the sequence within the search window is assigned to its energetically most favored base conformation (either anti or syn) in the context of the entire; (3) the free energy associated with each nucleotide is assigned according to its base conformation; (4) the search window is theoretically placed within a 5000-bp closed circular plasmid (analogous to the actual experimental system used to measure the stability of Z-DNA); (5) the superhelical density at the midpoint of the supercoil-induced B- to Z-DNA transition is calculated for the sequence; (6) the “Z-Score” of the sequence is calculated, defined as the probability of finding a random sequence that is as good as or better at forming Z-DNA than the sequence in the search window; and finally, (7) these steps are repeated as the window walks along the entire sequence one nucleotide at a time.
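The window walk and the probability-style score described above can be illustrated with a minimal Python sketch. This is not the Z-Hunt-II implementation: the thermodynamic steps (syn/anti assignment, free energies, the plasmid model and the superhelical density) are replaced here by a crude, assumed penalty that simply counts breaks in pyrimidine-purine alternation, and the null model is assumed to be uniformly random sequence. The names `window_penalty`, `null_penalties` and `scan` are hypothetical and chosen only for illustration.

```python
import random

PURINES = frozenset("AG")  # any other character is treated as a pyrimidine in this toy


def window_penalty(window):
    """Toy stand-in for the free-energy steps: count adjacent bases that break
    pyrimidine/purine alternation (0 means perfectly alternating)."""
    return sum((a in PURINES) == (b in PURINES) for a, b in zip(window, window[1:]))


def null_penalties(window_size, n_random=2000, seed=0):
    """Penalties of uniformly random sequences of the same length (assumed null model)."""
    rng = random.Random(seed)
    return [
        window_penalty("".join(rng.choice("ACGT") for _ in range(window_size)))
        for _ in range(n_random)
    ]


def scan(sequence, window_size=8, n_random=2000, seed=0):
    """Walk the window one nucleotide at a time and return a list of
    (start, penalty, probability) tuples, where probability is the fraction of
    random sequences that are as good or better (penalty <= the window's)."""
    sequence = sequence.upper()
    null = null_penalties(window_size, n_random, seed)
    results = []
    for start in range(len(sequence) - window_size + 1):
        penalty = window_penalty(sequence[start:start + window_size])
        probability = sum(p <= penalty for p in null) / len(null)
        results.append((start, penalty, probability))
    return results


if __name__ == "__main__":
    for start, penalty, probability in scan("CGCGCGCGAAAAAAAA"):
        print(start, penalty, probability)
```

In this sketch a low probability marks a window that few random sequences can match, which mirrors the sense of the score defined in step (6); a real predictor would substitute measured energies for the alternation count.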
We have reviewed various approaches for the prediction of non-canonical motifs, which have been steadily improving over the last decades. Tremendous efforts have been made to identify potential non-canonical sequence motifs, but those efforts have fallen short in their predictive capacity. All of these studies rely on the primary structure of DNA to predict three-dimensionally oriented conformations. The main limitations of these search methods could be the following: (1) current folding rules for non-canonical motif prediction lack an algorithm to distinguish which combinations of bases actually adopt a quadruplex structure in nature, and (2) the structural diversity of non-canonical motifs follows template patterns that extend far beyond the common ones. To overcome these issues, the specificity and sensitivity of the search criteria must be improved through more insight into the structural information of non-canonical DNAs.
Despite all the above-mentioned limitations, bioinformatics studies have successfully predicted a large number of unique potential non-canonical sequence motifs. Whole-genome analyses of different organisms indicate that the distribution of potential non-canonical sequences is not random but enriched at specific locations, which suggests that non-canonical sequences could be correlated with key functionality. These predictions correlate well with accumulating evidence that non-canonical DNA in specific regions, e.g. promoter regions, is crucial for the control of gene expression. Knowledge gained by studying non-canonical sequences at the genome-wide level, together with the careful examination of individual non-canonical motifs, will further improve our understanding of their roles in gene regulation and control in organisms. Hopefully, present and future computational approaches to predicting non-canonical DNA can resolve the long-standing mysteries of genome regulation.