Fuzzy Jaccard similarity approach in handling missing values for randomly amplified polymorphic DNA (RAPD) analysis / Apple Grace Otero Ubas
Material type: Text
Language: English
Publication details: 2008
Description: 82 leaves
Subject(s):
- Fuzzy sets
- Hierarchical clustering
- Clustering
- Jaccard similarity coefficients
- k-nearest neighbors
- Missing values
- Modified Jaccard similarity coefficients
- Zero replacements
- RAPD (Randomly Amplified Polymorphic DNA)
- UPGMA (Unweighted Pair Group Method with Arithmetic Mean)
- Clustering algorithms
- Undergraduate Thesis AMAT200
Item type | Current library | Collection | Call number | Status | Date due | Barcode |
---|---|---|---|---|---|---|
Thesis | University Library Theses | Room-Use Only | LG993.5 2008 A64 U23 | Not For Loan | | 3UPML00012182 |
Thesis | University Library Archives and Records | Preservation Copy | LG993.5 2008 A64 U23 | Not For Loan | | 3UPML00032447 |
Thesis (BS Applied Mathematics) -- University of the Philippines Mindanao, 2008
In RAPD analyses, ambiguous bands are discarded as missing values. Missing values in RAPD data add complexity to the clustering of organisms. One way of dealing with the missing values in RAPD analyses is to tolerate the ambiguity of the bands that are considered missing. In this study, this was done by introducing the concept of fuzziness. It was proposed that an analyst may opt to score bands with values within the interval [0,1]. The fuzzy interpretation of RAPD experiments requires the use of appropriate similarity measures. In this light, three fuzzy Jaccard similarity coefficients were presented and applied to three RAPD data sets with scores. The performance of the three fuzzy Jaccard similarity measures was evaluated and compared with that of zero replacement, KNN, and the modified Jaccard similarity approach in terms of their ability to recover the similarity matrices and dendrograms of the data sets used in the study. The Spearman rank correlation index was used to measure the performance of the methods at the similarity matrix level, while the Symmetric Difference was used at the clustering level. Results of the study showed that the fuzzy Jaccard similarity measures generally performed almost as well as the KNN method at almost all levels of missing value incidence at both the similarity matrix and dendrogram levels. Moreover, the fuzzy similarity measures outperformed the zero replacement and the modified Jaccard similarity approaches in handling missing values in RAPD data.
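The abstract does not define the three fuzzy Jaccard coefficients, so the following is only an illustrative sketch of one common fuzzy generalization of the Jaccard coefficient (sum of minima over sum of maxima). It is not necessarily any of the measures used in the thesis. The function name `fuzzy_jaccard`, the example score vectors, and the convention for two all-zero profiles are assumptions made for illustration.

```python
def fuzzy_jaccard(x, y):
    """Min/max fuzzy Jaccard similarity between two band-score vectors.

    Each entry is a band score in [0, 1]: 0 = absent, 1 = present,
    and intermediate values express an ambiguous band.
    Assumed form (not taken from the thesis): sum(min) / sum(max).
    """
    if len(x) != len(y):
        raise ValueError("score vectors must have the same length")
    if any(not 0.0 <= v <= 1.0 for v in list(x) + list(y)):
        raise ValueError("scores must lie in the interval [0, 1]")

    shared = sum(min(a, b) for a, b in zip(x, y))  # fuzzy intersection size
    total = sum(max(a, b) for a, b in zip(x, y))   # fuzzy union size

    # Assumed convention: two profiles with no bands at all are identical.
    return shared / total if total > 0 else 1.0


# Hypothetical scores for two samples over four band positions;
# the last band is ambiguous in both samples and scored 0.5.
sample_a = [1, 0, 1, 0.5]
sample_b = [1, 1, 0, 0.5]
similarity = fuzzy_jaccard(sample_a, sample_b)
```

With crisp 0/1 scores this form reduces to the classical Jaccard coefficient, so an ambiguous band can be kept with a partial score instead of being discarded as missing.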