极地研究 (Chinese Journal of Polar Research) ›› 2016, Vol. 28 ›› Issue (3): 353-360. DOI: 10.13679/j.jdyj.2016.3.353

• Research Article •

An Exploratory Study of Clustering-Based Classification of Auroral Morphology

Wang Qian1,2, Hu Zejun2, Qiu Qi2

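  1. School of Communication and Information Engineering, Xi'an University of Posts and Telecommunications, Xi'an, Shaanxi 710121
  2. Polar Research Institute of China, Shanghai 200136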
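  • Received: 2015-09-10 Revised: 2016-01-06 Online: 2016-09-30 Published: 2016-09-30
  • Corresponding author: Wang Qian
  • Funding:

    Supported by the National Natural Science Foundation of China (41504115), the State Oceanic Administration special project "Comprehensive Investigation and Assessment of the Arctic and Antarctic Environment" (CHINARE 2014-02-03), and the Natural Science Basic Research Plan of Shaanxi Province (2015JQ6223)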

A Study of Clustering All-sky Images Based on Auroral Appearance

Wang Qian1,2, Hu Zejun2, Qiu Qi2

  • Received: 2015-09-10 Revised: 2016-01-06 Online: 2016-09-30 Published: 2016-09-30

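Abstract: Auroral morphology provides prominent, intuitive, and recognizable features for studying solar-terrestrial physical processes. A sound classification is especially important for studying the relationship between the various auroral phenomena and magnetospheric dynamics. The choice of classification scheme is one of the main criticisms of supervised auroral classification research. Manual labeling in supervised experiments is extremely laborious, and the accuracy of the labels cannot be guaranteed; more importantly, a high classification accuracy only shows that the automatic classification agrees with human judgment, so supervised results cannot verify that the classification scheme itself is correct. Whether the existing schemes reflect the true partition of the auroral data space, and whether a more reasonable scheme exists, are questions that deserve investigation. To address them, we build on an existing representation of all-sky auroral images, introduce a clustering algorithm to explore the structure of the auroral feature space, and use nine cluster validity functions to select the number of clusters suited to the auroral data. Experiments on 6,000 auroral images, randomly selected from all-sky observations made at the Arctic Yellow River Station in 2003-2004, show that partitions into two and four classes are the most appropriate. The two-class partition can be regarded as isolating a well-separated auroral type; judging from the double-peaked pre-noon and post-noon occurrence distribution in the two-class result, this type is probably arc aurora. In the four-class case, although no single typical image can be found by eye to represent each class, the auroral types obtained by clustering each have their own temporal distribution. This result indicates, from an unsupervised perspective, that auroral types are separable by morphology.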

Keywords: aurora, clustering, morphological classification, all-sky image

Abstract: Auroral forms provide prominent, intuitive, and distinguishable features for the study of solar-terrestrial physics. A sound classification is important for studying the relationship between the various types of auroral phenomena and the dynamics of the magnetosphere. The choice of classification scheme is one of the main criticisms of supervised auroral classification: manual labeling requires a large amount of labor, and the accuracy of the labels cannot be guaranteed. More importantly, a high classification accuracy only shows that the automatic classification agrees with human judgment, so the results of supervised classification cannot be used to verify the correctness of the classification scheme. We should therefore investigate whether the existing schemes reflect the true partition of the auroral data space and whether better schemes exist. To address these issues, a clustering method was used to explore the structure of the auroral feature space, based on an available auroral image characterization method. Nine cluster validity indices were used to select the optimal number of clusters. Six thousand all-sky images, randomly selected from observations acquired at the Arctic Yellow River Station in 2003-2004, were clustered using the Ncut algorithm. The results showed that schemes consisting of two and four classes were the most appropriate. The two-class scheme can be regarded as isolating a well-separated auroral type, and the pre-noon and post-noon peaks in its occurrence distribution suggest that this type may be arc aurora. In the four-class scheme, although no single typical image could be found by eye to represent each class, the classes had distinct temporal distributions, which indicates from an unsupervised perspective that auroral types are separable by appearance.

Key words: Aurora, Clustering, Auroral appearance, All-sky image
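
The idea of choosing the number of clusters with a validity index can be illustrated with a minimal sketch. Everything in it is an assumption made for illustration and not the paper's pipeline: plain k-means stands in for the Ncut algorithm, a single index (the mean silhouette width) stands in for the nine validity indices, which the abstract does not name, and synthetic 2-D points stand in for the auroral image features. The helper names `kmeans`, `silhouette`, and `make_blobs` are hypothetical.

```python
import math
import random


def kmeans(points, k, iters=50, seed=0):
    """Plain Lloyd's k-means (a stand-in for Ncut); returns one label per point."""
    rng = random.Random(seed)
    # Farthest-point initialisation: start from a random point, then keep
    # adding the point that is farthest from the centres chosen so far.
    centres = [rng.choice(points)]
    while len(centres) < k:
        centres.append(
            max(points, key=lambda p: min(math.dist(p, c) for c in centres))
        )
    labels = [0] * len(points)
    for _ in range(iters):
        # Assignment step: each point goes to its nearest centre.
        labels = [min(range(k), key=lambda j: math.dist(p, centres[j]))
                  for p in points]
        # Update step: each centre moves to the mean of its members.
        new_centres = []
        for j in range(k):
            members = [p for p, lab in zip(points, labels) if lab == j]
            if members:
                new_centres.append(
                    tuple(sum(x) / len(members) for x in zip(*members))
                )
            else:
                new_centres.append(centres[j])  # keep an empty cluster's centre
        if new_centres == centres:
            break  # converged: assignments can no longer change
        centres = new_centres
    return labels


def silhouette(points, labels):
    """Mean silhouette width; values near 1 mean compact, well-separated clusters."""
    clusters = {}
    for p, lab in zip(points, labels):
        clusters.setdefault(lab, []).append(p)
    if len(clusters) < 2:
        return 0.0
    total = 0.0
    for p, lab in zip(points, labels):
        own = clusters[lab]
        if len(own) == 1:
            continue  # singleton clusters contribute 0 by convention
        # a: mean distance to the other members of the point's own cluster.
        a = sum(math.dist(p, q) for q in own) / (len(own) - 1)
        # b: smallest mean distance to the members of any other cluster.
        b = min(sum(math.dist(p, q) for q in other) / len(other)
                for m, other in clusters.items() if m != lab)
        total += (b - a) / max(a, b)
    return total / len(points)


def make_blobs(seed=1):
    """Synthetic data: three well-separated Gaussian blobs of 40 points each."""
    rng = random.Random(seed)
    centres = [(0.0, 0.0), (10.0, 0.0), (5.0, 9.0)]
    return [(rng.gauss(cx, 0.5), rng.gauss(cy, 0.5))
            for cx, cy in centres for _ in range(40)]


points = make_blobs()
# Score each candidate number of clusters and keep the best-scoring one.
scores = {k: silhouette(points, kmeans(points, k)) for k in range(2, 7)}
best_k = max(scores, key=scores.get)
print("silhouette by k:", {k: round(s, 3) for k, s in scores.items()})
print("selected number of clusters:", best_k)
```

A single index can favour a particular cluster shape, which is presumably one reason to consult several indices, as the study does, and to look for the cluster counts on which they agree.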