An imputation method for missing values in compositional data based on a modified Sigmoid kernel
Journal of Yunnan Minzu University (Natural Sciences Edition), 2016, 25(6): 531-535

Cheng Yuying (程誉莹)

Abstract


Most statistical analysis methods assume complete data sets and cannot be applied directly to data containing missing values. Moreover, because of the special properties of compositional data, applying traditional imputation methods directly to this type of data can give poor results. Imputing missing values is therefore of great importance for compositional data. To address this problem, this paper draws on the properties of kernel functions to propose a nonparametric imputation method for missing values in compositional data based on a modified Sigmoid kernel. The method is evaluated on both simulated and real data sets and compared with k-nearest-neighbor imputation and iterative least-squares regression imputation. The experimental results show that the new imputation method yields more accurate estimates.
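The abstract does not give the form of the modified Sigmoid kernel, so the following is only a minimal sketch of the general idea of kernel-weighted nonparametric imputation for compositional data, not the paper's method. It assumes a hypothetical sigmoid-shaped weight `2 / (1 + exp(d / scale))` applied to the Aitchison distance between the observed parts, a set of fully observed donor rows, and strictly positive parts. The names `impute_row`, `sigmoid_weight`, and `scale` are illustrative.

```python
import math


def closure(parts, total=1.0):
    """Rescale positive parts so that they sum to `total`."""
    s = sum(parts)
    return [total * p / s for p in parts]


def mean_log(values):
    """Mean of the logs, i.e. the log of the geometric mean."""
    return sum(math.log(v) for v in values) / len(values)


def aitchison_distance(x, y):
    """Euclidean distance between the centered log-ratio transforms of x and y."""
    mx, my = mean_log(x), mean_log(y)
    return math.sqrt(sum(((math.log(a) - mx) - (math.log(b) - my)) ** 2
                         for a, b in zip(x, y)))


def sigmoid_weight(distance, scale=1.0):
    """Hypothetical sigmoid-shaped weight: 1 at distance 0, decaying toward 0."""
    z = min(distance / scale, 700.0)  # cap the exponent to avoid overflow
    return 2.0 / (1.0 + math.exp(z))


def impute_row(row, donors, scale=1.0):
    """Fill None entries of `row` from complete `donors`, then re-close the row."""
    observed = [j for j, v in enumerate(row) if v is not None]
    missing = [j for j, v in enumerate(row) if v is None]
    if not missing:
        return closure(row)
    x_obs = [row[j] for j in observed]
    # Weight each donor by its closeness to the row on the observed parts.
    weights = [sigmoid_weight(aitchison_distance(x_obs, [d[j] for j in observed]), scale)
               for d in donors]
    total_weight = sum(weights)
    ref = mean_log(x_obs)
    filled = list(row)
    for j in missing:
        # Weighted mean of the donors' log-ratio between part j and the
        # geometric mean of their observed parts.
        log_ratio = sum(w * (math.log(d[j]) - mean_log([d[k] for k in observed]))
                        for w, d in zip(weights, donors)) / total_weight
        filled[j] = math.exp(ref + log_ratio)
    return closure(filled)


donors = [[0.2, 0.3, 0.5], [0.1, 0.4, 0.5], [0.25, 0.25, 0.5]]
print(impute_row([0.2, 0.3, None], donors))
```

Working with log-ratios rather than raw parts keeps the estimate independent of how the observed parts are scaled, which is the usual reason plain Euclidean imputation is avoided for compositions.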
