A new modified EM algorithm based on MCLasso for compositional data
Journal of Yunnan Minzu University (Natural Sciences Edition), 2017, 26(1): 50-54

TIAN Ying

Abstract


The log-ratio transformation is a common preprocessing step in compositional data analysis. In practice, however, data sets often contain rounded zeros (near-zero values recorded as zero), and applying the log-ratio transformation to them produces infinite values, which hinders further analysis of the data. Rounded zeros can be regarded as a special kind of missing value. In this paper, we present a new modified EM algorithm (MCLasso) for dealing with rounded zeros in compositional data. It addresses shortcomings of the EM algorithm in two ways: the E-step is improved with a Monte Carlo method, and the M-step is improved with the Lasso algorithm. We carry out an empirical analysis of the new method and compare it with the modified EM algorithm based on linear regression and the modified EM algorithm based on mean imputation and Bootstrap; the comparison verifies the effectiveness of the proposed method.
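The abstract does not give the algorithm's details, so the following is only a minimal sketch of the general idea, not the paper's implementation. It assumes an additive log-ratio (alr) transformation with the last part as denominator, zeros in a single part, a known detection limit, a normal model for the alr coordinate, and a hand-written coordinate-descent Lasso. All function names, parameters, and the toy data are hypothetical.

```python
import numpy as np

def lasso_cd(X, y, lam, n_iter=200):
    """Lasso by coordinate descent: min (1/2n)||y - b0 - Xb||^2 + lam*||b||_1."""
    n, p = X.shape
    x_mean, y_mean = X.mean(axis=0), y.mean()
    Xc, yc = X - x_mean, y - y_mean
    col_sq = (Xc ** 2).sum(axis=0) / n
    beta = np.zeros(p)
    for _ in range(n_iter):
        for j in range(p):
            if col_sq[j] == 0.0:
                continue
            # Partial residual with feature j's contribution added back.
            r = yc - Xc @ beta + Xc[:, j] * beta[j]
            rho = Xc[:, j] @ r / n
            # Soft-thresholding update.
            beta[j] = np.sign(rho) * max(abs(rho) - lam, 0.0) / col_sq[j]
    return y_mean - x_mean @ beta, beta

def mc_truncated_mean(mu, sigma, upper, rng, n_draws=2000):
    """Monte Carlo estimate of E[z | z < upper] for z ~ N(mu, sigma^2)."""
    draws = rng.normal(mu, sigma, size=n_draws)
    kept = draws[draws < upper]  # simple rejection sampling
    # Crude fallback (an assumption) when no draw falls below the bound.
    return kept.mean() if kept.size else upper - sigma

def impute_rounded_zeros(comp, col, detection_limit, lam=0.01, n_em=20, seed=0):
    """Impute zeros in part `col`, assuming all other parts are strictly positive."""
    rng = np.random.default_rng(seed)
    comp = np.asarray(comp, dtype=float)
    D = comp.shape[1]
    ref = D - 1  # assumption: the last part is the alr denominator
    others = [k for k in range(D - 1) if k != col]
    zero = comp[:, col] == 0
    Z = np.log(comp[:, others] / comp[:, [ref]])    # predictor alr coordinates
    upper = np.log(detection_limit / comp[:, ref])  # censoring bound on alr scale
    y = np.empty(comp.shape[0])
    y[~zero] = np.log(comp[~zero, col] / comp[~zero, ref])
    y[zero] = upper[zero] + np.log(0.65)  # arbitrary start below the bound
    for _ in range(n_em):
        # M-step: Lasso regression of the alr coordinate on the others.
        b0, beta = lasso_cd(Z, y, lam)
        mu = b0 + Z @ beta
        sigma = max(np.std(y[~zero] - mu[~zero]), 1e-6)
        # E-step: Monte Carlo conditional mean below the detection bound.
        for i in np.flatnonzero(zero):
            y[i] = mc_truncated_mean(mu[i], sigma, upper[i], rng)
    out = comp.copy()
    out[zero, col] = np.exp(y[zero]) * comp[zero, ref]  # back-transform
    return out / out.sum(axis=1, keepdims=True)         # re-close rows to sum 1

# Toy usage: 4-part compositions, zeros placed in part 0 for illustration.
rng = np.random.default_rng(1)
raw = np.exp(rng.normal(size=(60, 4)) * 0.5 + np.array([0.0, 1.0, 2.0, 3.0]))
comp = raw / raw.sum(axis=1, keepdims=True)
comp[:5, 0] = 0.0
filled = impute_rounded_zeros(comp, col=0, detection_limit=0.02)
```

In this sketch the imputed log-ratios always stay below the bound implied by the detection limit, and the ratios among the observed parts are left unchanged by the final closure.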
