Merger identification

From cluster group wiki

Automatic Galaxy Classification

ABSTRACT

With the rapid growth in astronomical data produced by large sky surveys, traditional manual classification can no longer deliver the precision and efficiency that classification work requires. Machine learning approaches are urgently needed to automate galaxy classification. In this work, we look for a method that classifies galaxy pairs accurately enough to serve as a foundation for future statistical work.

Data

Training data

This section describes how we download the training data and how we process them.

Where the data come from

Galaxy Zoo is a public science project that classifies galaxy images on a large scale and with high accuracy. Unlike GZ1/2 (based on SDSS), Galaxy Zoo DECaLS (GZD) is based on images from the DECaLS survey, which are deeper. GZD1/2/5 has not released a galaxy catalog but provides posted data [1]; we choose the automated version, which suits researchers who prefer a large sample (we need one to train a model). Following their approach, we download the images as FITS files at a pixel scale of 0.262 arcsec per pixel in the "g", "r", and "z" channels. Each image has shape 3*255*255. We have not applied data augmentation yet, but we plan to use horizontal flips, random cropping, and scaling.
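The planned augmentations can be sketched with NumPy on a single 3*255*255 array. This is only an illustration, not the pipeline used here: the function names, the crop size of 224 pixels, and the nearest-neighbour rescaling back to 255 pixels are all assumptions made for the example.

```python
import numpy as np

def random_flip(img, rng):
    """Flip a (C, H, W) image along the width axis with probability 0.5."""
    if rng.random() < 0.5:
        return img[:, :, ::-1]
    return img

def random_crop(img, size, rng):
    """Cut a random size x size window out of a (C, H, W) image."""
    _, h, w = img.shape
    top = rng.integers(0, h - size + 1)   # upper bound is exclusive
    left = rng.integers(0, w - size + 1)
    return img[:, top:top + size, left:left + size]

def rescale(img, out_size):
    """Nearest-neighbour resize of a (C, H, W) image to out_size x out_size."""
    _, h, w = img.shape
    rows = np.arange(out_size) * h // out_size
    cols = np.arange(out_size) * w // out_size
    return img[:, rows[:, None], cols[None, :]]

def augment(img, rng, crop_size=224, out_size=255):
    """Crop, scale back to the original size, then maybe flip.
    crop_size=224 is an assumed value, not taken from this page."""
    out = random_crop(img, crop_size, rng)
    out = rescale(out, out_size)
    return random_flip(out, rng)

rng = np.random.default_rng(0)
image = np.zeros((3, 255, 255), dtype=np.float32)  # stand-in for a g/r/z cutout
augmented = augment(image, rng)
print(augmented.shape)
```

Cropping before rescaling keeps the output shape equal to the input shape, so augmented and original images can share one batch.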

How to label

After getting the posted data for GZD1/2/5, we label the galaxies using the criteria in Table 1. The labels "disk/featured disk/elliptical/stars" and "none/merger/post-merger/minor-merger" sit on parallel branches of the decision tree.

(A picture will be added here once I learn how to upload one.)
No-merge galaxies: Since our goal is to classify galaxy pairs, we first download the no-merge galaxies ("none" in the decision tree) and rename each file to "ra_dec_0.fits". The "0" is the label for "none", in contrast to "1", which marks the set of merged galaxies labeled "merger", "post-merger", or "minor-merger". The files are at http://cluster.shao.ac.cn/~michen/nomerge/ (the download is not finished yet). About 310,580 of the 313,788 GZD galaxies fall into this class.
Merged galaxies: As with the no-merge galaxies, we download the FITS files and rename them to "ra_dec_1_x.fits". The first "1" marks a merged galaxy, and "x" is "0" for "merger", "1" for "post-merger", and "2" for "minor-merger".
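The naming scheme above can be written as a small pair of helper functions. This is a sketch under assumptions: the function names are hypothetical, and because the exact formatting of "ra" and "dec" in the file names is not specified here, they are passed through as strings unchanged.

```python
# Sub-label codes for merged galaxies, as described above.
MERGER_SUBCLASS = {"merger": 0, "post-merger": 1, "minor-merger": 2}

def label_filename(ra, dec, merger_class):
    """Build the file name for one galaxy from its decision-tree label.

    ra and dec are strings; their formatting is an assumption of the caller.
    """
    if merger_class == "none":
        return f"{ra}_{dec}_0.fits"
    return f"{ra}_{dec}_1_{MERGER_SUBCLASS[merger_class]}.fits"

def parse_filename(name):
    """Recover (ra, dec, merged_flag, sub_label) from a file name.

    sub_label is None for no-merge galaxies. Assumes ra and dec
    contain no underscore.
    """
    stem = name[:-len(".fits")]
    parts = stem.split("_")
    ra, dec, merged = parts[0], parts[1], int(parts[2])
    sub = int(parts[3]) if merged == 1 else None
    return ra, dec, merged, sub

print(label_filename("150.1", "2.3", "none"))
print(label_filename("150.1", "2.3", "post-merger"))
print(parse_filename("150.1_2.3_1_2.fits"))
```

Keeping the binary label in a fixed position lets a data loader read the merged/no-merge flag without opening the FITS file.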

'raw' data

The 'raw' data are the galaxies we want to classify: galaxy pairs selected from Feng. The catalog is available at [2], which also describes how the pairs were selected. The full data set should be downloaded from DESI, because DECaLS does not include dr8-north. The data are available at [3].

Model

We use a ResNet model with a batch size of 1024 and a learning rate of 0.001.

Merger results

CNN supervised classification results

What to do next

  • Combine "post-merger" and "minor-merger" into a single "after-merge" class to reduce the class imbalance (merger : post-merger : minor-merger is roughly 20:1:10).
  • Use self-supervised learning (not decided yet) to address the imbalance between merged and no-merge galaxies.
  • Download the pair FITS files from MzLS and BASS.
  • Re-download the merged FITS files and rename them on the web page.
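The effect of the first item, and the size of the imbalance in the second, can be checked with a few lines of arithmetic. The sketch below uses a toy sample in the quoted 20:1:10 proportion rather than real counts, and it assumes that every GZD galaxy not labeled "none" counts as merged; the function name is hypothetical.

```python
from collections import Counter

def combine(sub_label):
    """Map sub-label 0 (merger) to 'merger'; 1 and 2 to 'after-merge'."""
    return "merger" if sub_label == 0 else "after-merge"

# Toy sample in the quoted proportion merger : post-merger : minor-merger = 20:1:10.
toy_labels = [0] * 20 + [1] * 1 + [2] * 10

before = Counter(toy_labels)
after = Counter(combine(s) for s in toy_labels)
print("before:", dict(before))
print("after: ", dict(after))

# Approximate counts quoted in the labeling section.
n_total = 313_788
n_none = 310_580
n_merged = n_total - n_none  # assumes every galaxy that is not 'none' is merged
print("merged galaxies:", n_merged)
print("no-merge to merged ratio: %.1f" % (n_none / n_merged))
```

Combining the two rare classes brings the sub-class ratio from 20:1:10 to 20:11, but the gap between no-merge and merged galaxies remains far larger, which is why the second item is listed separately.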

References

  • Using transfer learning to detect galaxy mergers [4]
  • Deep learning predictions of galaxy merger stage and the importance of observational realism [5]
  • Identification of low surface brightness tidal features in galaxies using convolutional neural networks [6]
  • Identifying galaxy mergers in observations and simulations with deep learning [7]

Final note

This page is still being updated (2021.4.22).