000 02129nam a22002893a 4500
001 UPMIN-00000518117
003 UPMIN
005 20230209165948.0
008 230209b |||||||| |||| 00| 0 eng d
040 _aDLC
_cUPMin
_dupmin
041 _aeng
090 0 _aLG993.5 2007
_bA64 S27
100 _aSarmiento, Jon Marx P.
_9561
245 _aD-neighborhood imputation method for ordinal data sets with missing values /
_cJon Marx P. Sarmiento
260 _c2007
300 _a111 leaves.
500 _aThesis (BS Applied Mathematics) -- University of the Philippines Mindanao, 2007
520 3 _aImputation is used to fill in missing values in survey data that are ordinal in form. Existing imputation techniques include Mean, Mode, Hot-deck, and KNN imputation, each of which has its own drawbacks. To address this issue, the proponent introduced a new imputation method called D-neighborhood imputation. It uses the concept of a neighborhood and a cut-off value to ensure high similarity with the reference, and the maximum penalty rule to compute distances involving unknown values. D-neighborhood was evaluated and compared with the existing techniques. The experiment was done using the Dermatology and Breast Cancer data sets. Incomplete data sets were generated under MCAR with 1%, 5%, 10%, 20%, and 30% levels of missing values and conditioned MCAR with 0.25, 0.5, 0.75 and 1 probability in no, 2, and 3 combinations. According to the results, D-neighborhood performed best under the MCAR condition in both data sets and resulted in the best clustering quality when applied to the Breast Cancer data set under the MAR condition. Using the Dermatology data set, D-neighborhood and KNN had competing results, while using the Breast Cancer data set, D-neighborhood performed best. In general, D-neighborhood imputation outperformed the rest of the algorithms when tested on both data sets.
650 1 7 _aClustering.
_9366
650 1 7 _aK-means algorithm.
_92351
650 1 7 _aImputation techniques.
_92352
650 1 7 _aOrdinal data sets.
_92353
658 _aUndergraduate Thesis
_cAMAT200,
_2BSAM
905 _aFi
905 _aUP
942 _2lcc
_cTHESIS
999 _c699
_d699