by Wing-Kin SUNG
Software makes the selection of probes for the DNA chip both faster and more accurate.
The microarray, or DNA chip, has had a significant impact on genomic research. Many areas, such as gene discovery, drug discovery, toxicological research, and disease diagnosis, benefit from this technology.
Although the microarray is improving and growing more robust, the concentration values it measures are still “noisy” or are rendered inaccurate by other factors. One main difficulty involves selecting a unique probe for each complementary DNA (cDNA). A probe can bind with more than one cDNA in an event called cross-hybridisation. Other factors such as the formation of secondary structure and the melting temperature of probes may also cause hybridisation error, which reduces experimental accuracy.
Investigators at the School of Computing at the National University of Singapore (NUS) are working with scientists at the Genome Institute of Singapore (GIS) to develop software for fast and accurate selection of the probes used in the DNA chip. Research on better algorithms for probe design has been going on for some time. In 1996, David Lockhart and his coauthors contributed the first probe-designing program, used by Affymetrix to design short probes 20–25 bases long. They stated the criteria for selecting good probes, namely:
- Homogeneity, to ensure that the probes can bind to target cDNAs at the temperature of the experiment.
- Sensitivity, to ensure that the probes will not form a secondary structure. (Such a structure will prevent the probes from binding to the targeted cDNAs.)
- Specificity, to ensure that the probes stay unique even after a few bases are changed.
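The first two criteria can be illustrated with simple sequence checks. The Python sketch below is not the NUS software: it swaps in the Wallace rule, a rough textbook estimate of melting temperature for short oligonucleotides, and a naive search for a self-complementary stem as a stand-in for real secondary-structure prediction. The function names, the temperature window, and the stem and loop lengths are all illustrative assumptions.

```python
# Illustrative checks only; names and thresholds are hypothetical.
COMPLEMENT = str.maketrans("ACGT", "TGCA")


def wallace_tm(probe: str) -> int:
    """Rough melting temperature (deg C) by the Wallace rule: 2*(A+T) + 4*(G+C)."""
    at = probe.count("A") + probe.count("T")
    gc = probe.count("G") + probe.count("C")
    return 2 * at + 4 * gc


def passes_homogeneity(probe: str, tm_min: float, tm_max: float) -> bool:
    """Keep probes whose estimated melting temperature lies in the given window."""
    return tm_min <= wallace_tm(probe) <= tm_max


def reverse_complement(seq: str) -> str:
    """Watson-Crick complement of seq, read in the opposite direction."""
    return seq.translate(COMPLEMENT)[::-1]


def has_hairpin_stem(probe: str, stem: int = 4, min_loop: int = 3) -> bool:
    """True if a stem-length window has its reverse complement further
    downstream, separated by at least min_loop bases (a possible hairpin)."""
    for i in range(len(probe) - 2 * stem - min_loop + 1):
        partner = reverse_complement(probe[i:i + stem])
        if probe.find(partner, i + stem + min_loop) != -1:
            return True
    return False


def passes_sensitivity(probe: str) -> bool:
    """Keep probes with no obvious self-complementary stem."""
    return not has_hairpin_stem(probe)
```

A real design tool would use a thermodynamic model for both checks; the point here is only that each criterion reduces to a yes-or-no test on the probe sequence.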
The NUS team has designed software (Figure 1) for selecting good probes, relying on Lockhart's proposed criteria. Unlike other approaches, which rely on trial and error rather than fixed rules, this method can arrive at an optimal solution without trading accuracy for speed. In fact, in the team's comparison the algorithm is also faster than existing algorithms (see Table).
The improvement stems from the use of smart-filtering techniques to avoid redundant computation. Three filters are applied: the homogeneity filter eliminates probes whose melting temperature falls outside the experiment's temperature range; the sensitivity filter removes probes that can form secondary structures; and the specificity filter sifts out probes that can cross-hybridise. The last is the most computation-intensive and time-consuming step. By employing the pigeonhole principle, the specificity filter finds and checks only those regions of the genome that could potentially cause cross-hybridisation. Since these regions are small in comparison with the entire genome, redundant checks are avoided.
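The article does not spell out how the pigeonhole principle is applied, so the following Python sketch shows one common reading of it, under stated assumptions. If a probe matches a genomic region with at most k mismatched bases, then cutting the probe into k + 1 non-overlapping pieces guarantees that at least one piece matches exactly. Indexing the genome by short exact seeds therefore yields the only regions worth checking in full. The function names, the seed length, and the restriction to substitutions on a single strand are assumptions made for illustration.

```python
from collections import defaultdict


def build_seed_index(genome: str, seed_len: int) -> dict:
    """Map every substring of length seed_len to its start positions."""
    index = defaultdict(list)
    for pos in range(len(genome) - seed_len + 1):
        index[genome[pos:pos + seed_len]].append(pos)
    return index


def hamming(a: str, b: str) -> int:
    """Number of positions at which two equal-length strings differ."""
    return sum(x != y for x, y in zip(a, b))


def near_matches(probe, genome, index, seed_len, max_mismatches):
    """Start positions where probe aligns with at most max_mismatches
    substitutions. Only regions sharing an exact seed are verified."""
    pieces = max_mismatches + 1
    if pieces * seed_len > len(probe):
        raise ValueError("probe too short for this many disjoint seeds")
    hits = set()
    for j in range(pieces):
        offset = j * seed_len
        seed = probe[offset:offset + seed_len]
        for pos in index.get(seed, ()):
            start = pos - offset
            if start < 0 or start + len(probe) > len(genome):
                continue  # candidate window would run off the genome
            window = genome[start:start + len(probe)]
            if hamming(probe, window) <= max_mismatches:
                hits.add(start)
    return sorted(hits)


def is_specific(probe, genome, index, seed_len, max_mismatches, own_start):
    """True if the only near match is the probe's own location."""
    hits = near_matches(probe, genome, index, seed_len, max_mismatches)
    return hits == [own_start]


# Hypothetical toy genome: the probe at position 10 has a one-mismatch
# look-alike at position 30.
genome = "A" * 10 + "ACGTGCATGC" + "T" * 10 + "ACGTGCTTGC" + "G" * 10
index = build_seed_index(genome, seed_len=5)
print(near_matches("ACGTGCATGC", genome, index, 5, 1))  # [10, 30]
```

The full comparison is run only at positions that share an exact seed with the probe, which is why the number of checks stays small relative to the size of the genome.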
The new algorithm can greatly reduce the time scientists spend in the laboratory testing probes for each cDNA and can increase microarray throughput, eventually leading to faster production of more accurate microarrays that will be invaluable in genomic research and in the fight against disease. Current knowledge and technology do not yet permit researchers to design probes for the whole human genome, and the team plans to improve its probe-selection algorithm further.
Table. Comparison of algorithm performance. Algorithms compared include Li and Stormo (2000) and Rouillard, Herbert and Zuker (2002).
How a Microarray Works
A DNA microarray consists of a glass or nylon slide containing a set of spots, each of which holds identical short DNA sequences known as probes. Each probe is a substring of a complementary DNA (cDNA) and acts as that cDNA's fingerprint. The microarray measures the concentration of the cDNAs in a sample solution. Researchers first label the cDNAs in the sample fluorescently and introduce them into the microarray. On the basis of Watson-Crick base pairing, a cDNA in the sample will hybridise (bind) to its corresponding probe. The cDNAs that fail to hybridise do not correspond to any probe and are washed out of the microarray. Finally, the concentration of each cDNA can be measured from the fluorescence level of the spot for the corresponding probe.
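As a toy model of the readout described above, the Python sketch below treats each probe as a substring fingerprint and adds a cDNA's labelled amount to every spot whose probe it contains. Strand orientation, binding chemistry and noise are ignored, and the names, sequences and quantities are invented for illustration.

```python
def read_microarray(probes: dict, sample: dict) -> dict:
    """Toy readout.

    probes: spot name -> probe sequence
    sample: cDNA sequence -> labelled amount (arbitrary units)
    Returns the signal collected at each spot.
    """
    signal = {spot: 0.0 for spot in probes}
    for cdna, amount in sample.items():
        for spot, probe in probes.items():
            if probe in cdna:  # the cDNA carries this probe's fingerprint
                signal[spot] += amount
        # a cDNA that matches no probe contributes nothing: it is "washed out"
    return signal


# Hypothetical two-spot chip and three-cDNA sample.
probes = {"geneA": "ACGTAC", "geneB": "TTGACC"}
sample = {"GGACGTACGG": 2.0, "CCTTGACCAA": 0.5, "GGGGGGGG": 1.0}
print(read_microarray(probes, sample))  # {'geneA': 2.0, 'geneB': 0.5}
```

In this model a probe contained in two different cDNAs would collect signal from both, which is the cross-hybridisation problem in miniature.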