Paper Title

Machine Learning Pipeline for Pulsar Star Dataset

Authors

Florez, Alexander Ylnner Choquenaira; Vinces, Braulio Valentin Sanchez; Arroyo, Diana Carolina Roca; Saire, Josimar Edinson Chire; Franco, Patrícia Batista

Abstract

This work brings together some of the most common machine learning (ML) algorithms in order to compare the results they obtain on an imbalanced dataset. The dataset (HTRU2) consists of almost 17,000 observations of astronomical objects, collected to identify pulsars. The methodological proposal is based on evaluating the accuracy of these models on the same dataset after it has been treated with two different strategies for handling class imbalance. The results show that, despite the noise and class imbalance present in this type of data, standard ML algorithms can be applied to it and achieve promising accuracy.
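The abstract does not name the two balancing strategies, so the following is only a minimal sketch under the assumption that they resemble random undersampling and random oversampling, two common choices for imbalanced data. The function names and the toy data (90 negatives, 10 positives) are hypothetical and are not taken from the paper or from HTRU2.

```python
import random
from collections import Counter


def _group_by_class(rows, labels):
    """Collect rows into a dict keyed by class label."""
    by_class = {}
    for row, label in zip(rows, labels):
        by_class.setdefault(label, []).append(row)
    return by_class


def random_undersample(rows, labels, seed=0):
    """Keep a random subset of each class, sized to the rarest class."""
    rng = random.Random(seed)
    by_class = _group_by_class(rows, labels)
    target = min(len(members) for members in by_class.values())
    pairs = []
    for label, members in by_class.items():
        pairs.extend((row, label) for row in rng.sample(members, target))
    rng.shuffle(pairs)
    return [r for r, _ in pairs], [l for _, l in pairs]


def random_oversample(rows, labels, seed=0):
    """Duplicate random rows of smaller classes up to the largest class size."""
    rng = random.Random(seed)
    by_class = _group_by_class(rows, labels)
    target = max(len(members) for members in by_class.values())
    pairs = []
    for label, members in by_class.items():
        extra = [rng.choice(members) for _ in range(target - len(members))]
        pairs.extend((row, label) for row in members + extra)
    rng.shuffle(pairs)
    return [r for r, _ in pairs], [l for _, l in pairs]


# Hypothetical toy data: 90 non-pulsar rows (label 0), 10 pulsar rows (label 1).
toy_rows = [[float(i)] for i in range(100)]
toy_labels = [0] * 90 + [1] * 10

under_rows, under_labels = random_undersample(toy_rows, toy_labels)
over_rows, over_labels = random_oversample(toy_rows, toy_labels)

under_counts = Counter(under_labels)
over_counts = Counter(over_labels)
```

In a comparison like the one the abstract describes, resampling would normally be applied to the training split only, so that the test split keeps the original class ratio. On imbalanced data, plain accuracy can look high for a model that always predicts the majority class, which is why per-class metrics are often reported alongside it.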
