Hierarchical clustering for mixed dataset based on variance and entropy / Luchie Marie A. Labayan.
Material type: Text
| Item type | Current library | Collection | Call number | Status | Date due | Barcode |
|---|---|---|---|---|---|---|
| | University Library Theses | Room-Use Only | LG993.5 2009 A64 L32 | Not For Loan | | 3UPML00012369 |
| | University Library Archives and Records | Preservation Copy | LG993.5 2009 A64 L32 | Not For Loan | | 3UPML00032663 |
Thesis (BS Applied Mathematics) -- University of the Philippines Mindanao, 2009
Hsu's coefficient for mixed data was modified in terms of the aggregation technique and the entropy-based distance function to efficiently cluster mixed datasets. Six dissimilarity coefficients are proposed; they use variance for numerical attributes and, for categorical attributes, entropy measures such as Shannon's weighted entropy, Havrda-Charvat's structural α-entropy, and Jensen-Shannon divergence. The type of data has a significant effect on the clustering produced, as do the aggregation function used and the entropy measure employed. For data whose categorical values have no level of similarity, the proposed dissimilarity coefficients that are closely related and generate similar dendrograms are those that share the same aggregation function. Based on performance, the proposed dissimilarity coefficients that used De Carvalho's dissimilarity measure as the aggregation function produced better clustering solutions. On the other hand, for data whose categorical values have different degrees of similarity, only the proposed dissimilarity coefficients that used Shannon's entropy weighted by the distance in the distance hierarchy deviated from the group. The proposed dissimilarity coefficients that used De Carvalho's extension of Ichino and Yaguchi's dissimilarity as the aggregation function worked well in clustering. The six proposed dissimilarity coefficients performed better on mixed data than the existing dissimilarity measures for mixed data.
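The general idea in the abstract, variance for numerical attributes and entropy for categorical ones, can be illustrated with a minimal Python sketch. This is not one of the thesis's six coefficients, whose exact formulas are not given in this record. The function names, the toy rows, the plain-average aggregation, and the normalized-entropy weighting are all assumptions made for illustration only.

```python
import math
from collections import Counter
from statistics import pvariance


def shannon_entropy(values):
    """Shannon entropy (bits) of the empirical distribution of a categorical column."""
    counts = Counter(values)
    n = len(values)
    return -sum((c / n) * math.log2(c / n) for c in counts.values())


def normalized_entropy(values):
    """Entropy divided by its maximum, log2(k); 0.0 when the column has one category."""
    k = len(set(values))
    return shannon_entropy(values) / math.log2(k) if k > 1 else 0.0


def jensen_shannon(p, q):
    """Jensen-Shannon divergence (base 2) between two distributions given as dicts."""
    keys = set(p) | set(q)
    m = {k: 0.5 * (p.get(k, 0.0) + q.get(k, 0.0)) for k in keys}

    def kl(a):
        # Kullback-Leibler divergence of a from the mixture m.
        return sum(a[k] * math.log2(a[k] / m[k]) for k in a if a[k] > 0)

    return 0.5 * kl(p) + 0.5 * kl(q)


def mixed_dissimilarity(x, y, num_var, cat_weight):
    """Hypothetical coefficient: variance-scaled squared differences for numeric
    attributes plus entropy-weighted mismatches for categorical ones, averaged
    over all attributes (a simple stand-in for an aggregation function)."""
    (x_num, x_cat), (y_num, y_cat) = x, y
    num_part = sum((a - b) ** 2 / v for a, b, v in zip(x_num, y_num, num_var))
    cat_part = sum(w * (a != b) for a, b, w in zip(x_cat, y_cat, cat_weight))
    return (num_part + cat_part) / (len(x_num) + len(x_cat))


# Toy mixed dataset (made up): each row is (numeric values, categorical values).
rows = [
    ((1.0, 10.0), ("red", "s")),
    ((1.2, 11.0), ("red", "s")),
    ((5.0, 30.0), ("blue", "l")),
    ((5.5, 29.0), ("blue", "s")),
]

# Per-attribute statistics: population variance and normalized entropy.
num_var = [pvariance(col) for col in zip(*(r[0] for r in rows))]
cat_weight = [normalized_entropy(col) for col in zip(*(r[1] for r in rows))]

# Full pairwise dissimilarity matrix.
n = len(rows)
D = [[mixed_dissimilarity(rows[i], rows[j], num_var, cat_weight) for j in range(n)]
     for i in range(n)]
```

A matrix like `D`, converted to condensed form, could then be passed to an agglomerative routine such as SciPy's `scipy.cluster.hierarchy.linkage` to obtain a dendrogram; which linkage or aggregation function suits a given dataset is the kind of question the thesis investigates.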