Xiaoyu Luo, Sheng Zheng, Yao Huang, Shuguang Zeng, Xiangyun Zeng, Zhibo Jiang, and Zhiwei Chen
1 Center for Astronomy and Space Sciences, China Three Gorges University, Yichang 443000, China; xyzeng2018@163.com
2 Purple Mountain Observatory, Chinese Academy of Sciences, Nanjing 210023, China
3 Xinjiang Astronomical Observatory, Chinese Academy of Sciences, Urumqi, Xinjiang 830011, China
4 College of Electrical Engineering and New Energy, China Three Gorges University, 8 University Road, Yichang 443000, China
Abstract The detection and parameterization of molecular clumps is the first step in studying them. We propose a detection method based on the Local Density Clustering (LDC) algorithm, with the physical parameters of the detected clumps measured by the Multiple Gaussian Model (MGM) algorithm. One advantage of applying LDC to clump detection and segmentation is its high accuracy across different signal-to-noise levels. The MGM can handle overlapping clumps and derive their parameters reliably. Using simulated and synthetic data, we verify that the proposed algorithm accurately characterizes the morphology and flux of molecular clumps. In the ¹³CO (J=1-0) line of M16, the total flux recovery rate is 90.2%, and the detection rate and completeness limit are 81.7% and 20 K km s⁻¹, respectively.
Key words: molecular data–molecular processes–methods: laboratory: molecular
The detection of interstellar molecular hydrogen (H₂) by Carruthers (1970) in the ultraviolet band and of carbon monoxide (CO) by Wilson et al. (1970) at 2.6 mm opened a new era in the study of the molecular interstellar medium, and the discovery of organic molecules in the medium led to the birth of molecular astrophysics. As one of the fundamental components of the interstellar medium, molecular clouds consist mainly of molecular gas mixed with atoms, ions, dust, and other materials (Heyer & Dame 2015; Heiles et al. 2019). Molecular clouds in the Galaxy exhibit structure over a wide range of scales, from 20 pc or more for giant molecular clouds down to 0.05 pc for dense molecular clumps (Williams et al. 2012; Kauffmann et al. 2013; Lin et al. 2020). Stars form inside molecular clumps (Krumholz & McKee 2005; Zinnecker & Yorke 2007; Krumholz et al. 2009). Molecular clumps are therefore key to theoretical models that aim to reproduce the observed characteristics of star formation in the Galaxy (Rivera-Ingraham et al. 2017; Tang et al. 2019).
As a consequence, several telescopes (e.g., the FCRAO 14 m, the CfA 1.2 m, the Bell Laboratories 7 m, and the PMO 13.7 m telescopes) have been devoted to CO survey projects (Sanders et al. 1986; Dame et al. 2001; Lee et al. 2001; Zuo et al. 2011). These CO surveys will lead to a better understanding of the evolution of molecular clumps, the initial mass function of stars, and the structure and dynamical evolution of the Milky Way (Heyer & Dame 2015). As the CO surveys progress, it becomes impractical to process the large volumes of data manually. A stable and reliable algorithm for automatically detecting molecular clumps has therefore become a focus of research. Several algorithms have been used to detect molecular clumps, such as GaussClumps, FellWalker, ClumpFind and Reinhold (Stutzki & Guesten 1990; Williams et al. 1994; Berry et al. 2007; Berry 2015). GaussClumps was first applied to the M17 molecular cloud and has since been applied frequently to clump detection in other molecular clouds (Schneider et al. 1998; Dent et al. 2009; Lo et al. 2009). The ClumpFind algorithm was applied to the detection of compact structures in the Rosette molecular cloud. Zhan et al. (2016) found a new giant filament in a statistical study of the giant molecular cloud M16 (Sugitani et al. 2002).
Studies show that ClumpFind is very sensitive to its initial parameters and that GaussClumps can only fit a strictly elliptical shape. FellWalker exhibits the best performance in detection completeness and parameterization (Li et al. 2020). Both GaussClumps and ClumpFind depend on their initial parameters, and the algorithms themselves are designed to mimic the “human eye” in recognizing molecular clumps, which imposes certain limitations (Rosolowsky et al. 2008). Moreover, for large amounts of molecular cloud data, it is not feasible to rely on algorithms whose parameters users must set repeatedly, even though satisfactory detection results can be achieved in certain cases. We therefore need an algorithm that has fewer parameters or that can be adjusted more easily based on physical properties.
A dominant feature of molecular clumps, which have increased local intensity and varied shapes, is that they are embedded in molecular gas of lower average bulk density (Blitz & Stark 1986; Lada 1992). The Local Density Clustering (LDC) algorithm (Rodriguez & Laio 2014) is based on the assumptions that cluster centers are surrounded by neighbors with lower local density and that they lie at a relatively large distance from any point with higher local density, which resembles the characteristics of molecular clumps. We therefore adopt LDC for the detection of molecular clumps. Section 2 introduces the clump detection algorithm based on LDC and the parameterization based on the Multiple Gaussian Model (MGM). Section 3 describes the 3D simulated data sets with different number densities and compares the performance of LDC and MGM with traditional algorithms on these data sets. Section 4 investigates the completeness and parameterization of the algorithm on real molecular clump data, and Section 5 provides a summary.
2.1.1. Feature Extraction
The algorithm first computes three parameters for each point: the local density, the distance, and the gradient. The local density ρ_i of a point p_i is defined as:
where d_c represents the cut-off distance, d_ij represents the distance between p_i and p_j, and I_j represents the intensity at p_j.
The distance δ_i of a point p_i is defined as:
where δ_i is the minimum distance between p_i and any other point with higher density. As a special case, δ_i is set to the maximum δ if p_i is the point with the highest density. The distances δ_i are normalized.
Here n represents the total number of data points, and the point with the longest distance is set to 0 in the vector N(p).
The gradient ∇_i is defined as:
where ρ_j and δ_i are defined in Formula (2).
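As an illustration only, the three features can be sketched in a few lines of Python. The sketch rests on assumptions that are not confirmed by the text above: the local density is taken to be an intensity-weighted Gaussian kernel of width d_c, and the gradient is taken to be the density difference to the nearest denser point divided by δ_i. The function name `ldc_features` is hypothetical, and the normalization of δ is omitted.

```python
import math

def ldc_features(points, intensities, dc):
    """Return (rho, delta, nearest_higher, grad) for every point.

    points: list of coordinate tuples; intensities: list of floats.
    Assumption: rho is an intensity-weighted Gaussian kernel of width dc.
    """
    n = len(points)
    dist = [[math.dist(points[i], points[j]) for j in range(n)] for i in range(n)]
    rho = [sum(intensities[j] * math.exp(-(dist[i][j] / dc) ** 2) for j in range(n))
           for i in range(n)]
    delta = [0.0] * n
    nearest_higher = [-1] * n  # -1 marks the densest point (the text stores 0 in N(p))
    grad = [0.0] * n
    for i in range(n):
        higher = [j for j in range(n) if rho[j] > rho[i]]
        if higher:
            # Nearest point with a higher local density.
            j = min(higher, key=lambda k: dist[i][k])
            delta[i] = dist[i][j]
            nearest_higher[i] = j
            # Assumed gradient form: density rise per unit distance.
            grad[i] = (rho[j] - rho[i]) / delta[i]
    # The densest point has no denser neighbour: give it the maximum delta.
    top = max(range(n), key=lambda i: rho[i])
    delta[top] = max(delta)
    return rho, delta, nearest_higher, grad

points = [(0, 0), (1, 0), (2, 0), (10, 0), (11, 0)]
intensities = [1.0, 5.0, 1.0, 1.0, 3.0]
rho, delta, nearest_higher, grad = ldc_features(points, intensities, dc=2.0)
```

The pairwise distance table makes this quadratic in the number of points, so it is only suitable for small examples; a practical implementation would restrict the search to a neighbourhood of each voxel.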
2.1.2. The Clump Center Determination
After the three parameters are calculated, the distance δ is plotted against the density ρ, as shown in Figure 1; this plot is referred to as the Decision Graph. The simulated data with 10 clumps are shown in Figure 1(a), and the detected clumps are shown in Figure 1(b) with their centers marked by red stars. Figure 1(c) shows the Decision Graph, where the centers of the detected clumps are marked with circles. Whether p_i is the center of a clump is judged by:
Figure 1. Example of the algorithm on 2D simulated data. (a) The 2D data containing 10 simulated clumps. (b) Clustering result. (c) Decision Graph of the data in (a).
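A minimal sketch of the center selection on the Decision Graph is given below. It assumes that a point is accepted as a center when both its density and its distance exceed the thresholds ρ_0 and δ_0; the exact criterion may differ (for example, ρ_0 may be compared against the peak intensity rather than the local density), and the function name `find_centers` is hypothetical.

```python
def find_centers(rho, delta, rho0, delta0):
    """Indices of points lying in the upper-right corner of the Decision Graph.

    Assumption: a centre must satisfy rho > rho0 and delta > delta0.
    """
    return [i for i, (r, d) in enumerate(zip(rho, delta))
            if r > rho0 and d > delta0]

# Three points: two are dense and isolated, one is dense but close to a denser point.
centers = find_centers(rho=[6.5, 5.2, 3.7], delta=[9.0, 1.0, 9.0],
                       rho0=3.0, delta0=4.0)
```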
2.1.3. Route Clustering
where N and n_k represent the number of clumps and the number of data points in the clump C_k, respectively.
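The assignment step can be sketched as follows, assuming the usual density-peak rule in which every remaining point inherits the label of its nearest neighbour of higher density, so that each point follows a route of increasing density up to a center. This rule and the names used are assumptions for illustration, not a transcription of the formula above.

```python
def route_clustering(rho, nearest_higher, centers):
    """Label each point by following its chain of denser neighbours to a centre.

    nearest_higher[i] is the index of the nearest denser point, or -1 for the
    densest point. Points whose route never reaches a centre keep the label -1.
    """
    labels = [-1] * len(rho)
    for k, c in enumerate(centers):
        labels[c] = k
    # Visit points from densest to sparsest so the denser neighbour is labelled first.
    for i in sorted(range(len(rho)), key=lambda i: -rho[i]):
        if labels[i] == -1 and nearest_higher[i] != -1:
            labels[i] = labels[nearest_higher[i]]
    return labels

labels = route_clustering(rho=[5.26, 6.56, 5.27, 3.34, 3.78],
                          nearest_higher=[1, -1, 1, 4, 2],
                          centers=[1, 4])
```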
2.1.4. Clump Region Determination
The region of the individual clump C_k can be determined according to ρ and ∇:
2.1.5. False Clump Exclusion
Isolated noise points with a high peak intensity value could be recognized as false clumps. Even the smallest real clump should contain enough data points. Falsely detected clumps can therefore be eliminated by the following criterion:
where n_0 is the minimum number of data points in a clump.
2.1.6. Clump Characterization
The algorithm provides a pixel mask with the same shape as the supplied data array. In the mask, the pixels belonging to the same clump are marked with the same integer, while pixels that are not assigned to any clump are marked with -1. Finally, a table is obtained in which each row describes an individual clump. Among the columns of the table, Peak_i and Cen_i represent the positions of the clump's peak intensity value and centroid on axis i (i=1, 2, 3), respectively. Size_i represents the size of the clump on axis i. Sum and Peak represent the total flux and peak intensity value of the clump, respectively. The centroid is defined as follows:
The Size_i of the clump C_k on axis i is defined as:
where I_j and x_j represent the intensity and position of p_j, respectively. For a clump with a Gaussian profile, the size is equal to the standard deviation of the Gaussian.
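A short sketch of these two quantities along one axis follows. It assumes the usual intensity-weighted first moment for the centroid and the square root of the intensity-weighted second central moment for the size, which is consistent with the statement that the size of a Gaussian clump equals its standard deviation. The function name is hypothetical.

```python
import math

def centroid_and_size(positions, intensities):
    """Intensity-weighted centroid and rms size of a clump along one axis."""
    total = sum(intensities)
    cen = sum(i * x for i, x in zip(intensities, positions)) / total
    second = sum(i * x * x for i, x in zip(intensities, positions)) / total
    # Guard against tiny negative values caused by floating-point rounding.
    size = math.sqrt(max(second - cen * cen, 0.0))
    return cen, size

# A noise-free Gaussian profile centred at 2 with standard deviation 3.
xs = list(range(-50, 51))
profile = [math.exp(-0.5 * ((x - 2.0) / 3.0) ** 2) for x in xs]
cen, size = centroid_and_size(xs, profile)
```

For this noise-free profile the recovered size matches the input standard deviation, which is the property stated above for Gaussian clumps.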
2.1.7. Algorithm Summary
The input of the algorithm is a 3D (or 2D) data array. δ_0 and ρ_0 are the key hyper-parameters of the algorithm: δ_0 represents the minimum distance between the centers of two clumps, and ρ_0 represents the minimum peak intensity value of a candidate clump. n_0 is the minimum number of data points in a clump, and ∇_0 is used to determine the region of a clump. The local density of a point is calculated with the cut-off distance (d_c). The input and parameters of the LDC algorithm (https://github.com/Luoxiaoyu828/LDC-MGM) are listed in Table 1. The outputs include masks indicating the pixels that contribute to each clump, and catalogs holding clump positions, sizes, peak values and total fluxes. The outputs of the LDC algorithm are listed in Table 2.
Table 1 The Input and Parameters of LDC Algorithm
Table 2 The Output of LDC Algorithm
The detection is affected by neither the shape of the clumps nor the dimensionality of the space in which they are embedded. The detection results of LDC at different number densities and different peak signal-to-noise ratios (PSNR) are shown in Table 3. The size of each simulated data array is 100×100×100.
Table 3 The Performance of the Algorithm
Traditional algorithms segregate overlapping molecular clumps, which can lead to large deviations in the parameter estimates for those clumps. We therefore adopt the MGM for parameterization in this paper.
2.2.1. The 3D Gaussian Model
The observational data of a molecular clump form a 3D data array. The first and second dimensions of the array stand for the Galactic longitude and latitude, respectively, and the third dimension stands for the velocity. Because the spatial and velocity axes are not correlated, the tilt angles of the simulated clumps appear only on the Galactic longitude–latitude plane. The 3D Gaussian distribution is therefore described as:
where (x_0, y_0, v_0) represents the center of the distribution, and σ_x, σ_y, σ_v represent the standard deviations along the three axes, respectively. A represents the amplitude of the distribution, and θ represents the tilt angle on the x–y plane.
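One way to write such a model is sketched below: a Gaussian whose principal axes are rotated by θ in the x–y plane while the velocity axis stays independent. The rotation convention (θ measured counterclockwise from the x axis, in radians) and the function name `gaussian_3d` are assumptions made for this illustration.

```python
import math

def gaussian_3d(x, y, v, A, x0, y0, v0, sx, sy, sv, theta):
    """Tilted 3D Gaussian: rotated by theta in the x-y plane, v axis independent.

    Assumption: theta is in radians, counterclockwise from the x axis.
    """
    dx, dy = x - x0, y - y0
    # Rotate the offsets into the principal-axis frame of the clump.
    u = dx * math.cos(theta) + dy * math.sin(theta)
    w = -dx * math.sin(theta) + dy * math.cos(theta)
    return A * math.exp(-0.5 * ((u / sx) ** 2 + (w / sy) ** 2
                                + ((v - v0) / sv) ** 2))

theta = 0.7
peak = gaussian_3d(1, 2, 3, 5.0, 1, 2, 3, 2.0, 3.0, 4.0, theta)
# One standard deviation away from the centre along the tilted major axis.
one_sigma = gaussian_3d(1 + 2.0 * math.cos(theta), 2 + 2.0 * math.sin(theta), 3,
                        5.0, 1, 2, 3, 2.0, 3.0, 4.0, theta)
```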
2.2.2. The 3D MGM
Where multiple Gaussian components overlap, describing the data with a single Gaussian distribution leads to serious deviation. In the example of Figure 2, the black solid line represents the observed data, and the three dashed lines represent the underlying components. The MGM can effectively solve this problem. The MGM is defined as follows:
Figure 2. Overlapping of three Gaussian components. The black line is the sum of three Gaussian distributions. Dashed lines represent the individual Gaussian components.
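The idea can be illustrated in one dimension, in the spirit of Figure 2. The sketch assumes that the model is a plain sum of Gaussian components whose parameters are fitted jointly by least squares; the component values and function names are hypothetical, and no optimizer is included.

```python
import math

def gauss(x, A, mu, sigma):
    """Single 1D Gaussian component."""
    return A * math.exp(-0.5 * ((x - mu) / sigma) ** 2)

def mgm(x, components):
    """Sum of Gaussian components; components is a list of (A, mu, sigma)."""
    return sum(gauss(x, *c) for c in components)

def sum_sq_residual(xs, data, components):
    """Least-squares objective that a fitter would minimise over all components."""
    return sum((d - mgm(x, components)) ** 2 for x, d in zip(xs, data))

xs = [0.5 * i for i in range(41)]
true_components = [(3.0, 6.0, 1.5), (5.0, 10.0, 1.2), (2.0, 14.0, 1.8)]
data = [mgm(x, true_components) for x in xs]

# A single Gaussian cannot explain the blended profile ...
single_misfit = sum_sq_residual(xs, data, [(5.0, 10.0, 1.2)])
# ... while the three-component model reproduces it exactly.
joint_misfit = sum_sq_residual(xs, data, true_components)
```

Fitting all overlapping components together is what allows the flux in the blended region to be shared between clumps instead of being assigned to one of them.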
3.1.1. 3D Simulated Data
The simulated data sets consist of data arrays of size 100×100×100 at different number densities: the low-, medium- and high-density data contain 10, 25 and 100 simulated clumps, respectively. The peak intensity values of the clumps range from 2 to 10, the sizes along the velocity axis range from 3 to 5, and the spatial sizes along the x and y axes range from 2 to 4 (FWHM=2.35×size). The tilt angles of the simulated clumps on the x–y plane vary from 0° to 180°. Gaussian noise with a root-mean-square (rms) of 1 is added to the simulated clumps. For each number density, we generate a total of 10,000 simulated clumps. Figure 3 shows a 3D display of one simulated data array.
Figure 3. 3D display of simulated clumps.
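The parameter ranges above can be turned into a small sampling sketch. It assumes that each parameter is drawn uniformly within its stated range and that centers are placed uniformly inside the array; neither choice is specified in the text, and the dictionary keys are hypothetical.

```python
import random

def sample_clump(rng, shape=(100, 100, 100)):
    """Draw the parameters of one simulated clump from the stated ranges.

    Assumption: every parameter is uniformly distributed within its range.
    """
    return {
        "peak": rng.uniform(2, 10),
        "size_x": rng.uniform(2, 4),
        "size_y": rng.uniform(2, 4),
        "size_v": rng.uniform(3, 5),
        "theta_deg": rng.uniform(0, 180),
        "center": tuple(rng.uniform(0, n - 1) for n in shape),
    }

rng = random.Random(0)
clumps = [sample_clump(rng) for _ in range(25)]  # medium density: 25 clumps
fwhm_x = [2.35 * c["size_x"] for c in clumps]    # FWHM = 2.35 x size
```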
3.1.2. Detection Based on LDC Algorithm
Figure 4 shows, from left to right, the integrated maps on the x–y, x–v and y–v planes, respectively. The centers of the clumps are marked with red asterisks on the integrated maps.
Combining ∇ and ρ of each data point, the members and region of the clump C_k can be determined by Formula (8): (1) Using ∇_0 as the threshold, the point set A_1 with ∇ greater than ∇_0 is the main part of the clump C_k. (2) The average density is calculated over the point set A_1, and the point set A_2, with ρ greater than this average density, is also part of the main body of the clump. (3) The union of A_1 and A_2 forms the region of the individual clump C_k. The detection results are shown in Figure 5. The region of each clump is determined, and false clumps are eliminated by Formula (9). Finally, the parameter estimation described in Section 2.1.6 is performed.
Figure 4. The centers of detected clumps are marked on the integrated intensity maps with red asterisks. From left to right are integrated maps of the x–y, x–v and y–v planes, respectively.
Figure 5. The integrated intensity maps of detected clumps are marked with red asterisks. From left to right are integrated maps of the x–y, x–v and y–v planes, respectively.
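The three region-determination steps of Section 3.1.2 can be sketched directly. The function below takes the member indices of one clump together with per-point ρ and ∇ values; the function and argument names are hypothetical, and the handling of an empty A_1 is an assumption.

```python
def clump_region(members, rho, grad, grad0):
    """Region of one clump from its member indices, following steps (1)-(3)."""
    # (1) Points whose gradient exceeds the threshold form A1.
    a1 = {i for i in members if grad[i] > grad0}
    if not a1:
        return set()  # assumption: no region when no point passes the threshold
    # (2) Points denser than the mean density of A1 form A2.
    mean_rho = sum(rho[i] for i in a1) / len(a1)
    a2 = {i for i in members if rho[i] > mean_rho}
    # (3) The region is the union of A1 and A2.
    return a1 | a2

region = clump_region(members=[0, 1, 2, 3],
                      rho=[10.0, 6.0, 4.0, 1.0],
                      grad=[0.0, 2.0, 1.5, 0.1],
                      grad0=1.0)
```

In this example the peak (index 0) has zero gradient and would be missed by step (1) alone; step (2) brings it back into the region, while the faint outer point (index 3) is excluded.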
3.1.3. Evaluation Indicators
A detection is considered correct if the Euclidean distance between the center of the detected clump and the center of a simulated clump is less than 2 pixels in the three axes.
Four statistics are obtained from the detection results: True-Positive (TP), True-Negative (TN), False-Positive (FP), and False-Negative (FN) (Zhou et al. 2020). The evaluation indicators for the algorithm are the recall rate (R), the precision rate (P) and the comprehensive score (F1). R, P and F1 are defined as:
The accuracy and completeness of detection are reflected in P and R, respectively. A good detection algorithm should have high P and high R, although the two indicators usually show opposite trends. The overall performance of the algorithm is mainly reflected in F1.
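A compact sketch of the scoring follows, using the standard definitions R = TP/(TP+FN), P = TP/(TP+FP) and F1 = 2PR/(P+R). The matching rule is an assumption: each simulated clump is greedily paired with the closest still-unmatched detection, and the pair counts as correct when the 3D Euclidean distance between centers is below 2 pixels.

```python
import math

def score_detections(simulated, detected, max_dist=2.0):
    """Return (recall, precision, f1) for two lists of clump-centre tuples."""
    unmatched = list(detected)
    tp = 0
    for s in simulated:
        # Closest still-unmatched detection to this simulated centre.
        best = min(unmatched, key=lambda d: math.dist(s, d), default=None)
        if best is not None and math.dist(s, best) < max_dist:
            unmatched.remove(best)
            tp += 1
    fn = len(simulated) - tp  # simulated clumps that were missed
    fp = len(detected) - tp   # detections with no simulated counterpart
    recall = tp / (tp + fn) if simulated else 0.0
    precision = tp / (tp + fp) if detected else 0.0
    f1 = 2 * precision * recall / (precision + recall) if tp else 0.0
    return recall, precision, f1

simulated = [(10, 10, 10), (30, 30, 30), (50, 50, 50), (70, 70, 70)]
detected = [(10.5, 10, 10), (30, 31, 30), (90, 90, 90)]
recall, precision, f1 = score_detections(simulated, detected)
```

Here two of four simulated clumps are recovered and one of three detections is spurious, so R is lower than P.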
3.1.4. Detection Comparison
GaussClumps, FellWalker and LDC are used to detect the simulated clumps. Figure 6 shows the indicators R, P and F1 of the GaussClumps, FellWalker and LDC algorithms at different peak signal-to-noise ratio (PSNR) levels and different number densities. The PSNR is defined as the ratio of the peak intensity value of the simulated clump to the rms of the noise. As the PSNR decreases, R of all three algorithms decreases at every density level, especially when the PSNR is less than 4. The FellWalker and LDC algorithms generally have high P, while P of GaussClumps is worse when the PSNR is less than 6. R of all three algorithms remains high at every density when the PSNR is above 6, and P remains high when the PSNR is greater than 7, while P of GaussClumps declines gradually as the PSNR decreases.
Figure 6. The evaluation indicators R, P and F1 of the three algorithms plotted against the PSNR at different number densities. Top panel: the detection statistics of the three algorithms at high density; from left to right are R, P and F1, respectively. Middle panel: same as above but for medium density. Bottom panel: same as above but for low density.
The top panel of Figure 6 shows, from left to right, R, P and F1 of the GaussClumps, FellWalker and LDC algorithms at high density. R and P are above 80% for all three algorithms when the PSNR is greater than 7. P of GaussClumps is greater than that of FellWalker and LDC at high PSNR, and R of GaussClumps is lower than that of the other two algorithms at low PSNR. For clumps in the simulation that overlap heavily or even merge into new clumps, FellWalker and LDC are unable to distinguish the individual clumps, which decreases R. Because GaussClumps detects clumps by fitting, it can separate overlapping clumps from each other, which improves R. The middle panel shows the same indicators for medium number density. P of the three algorithms is essentially the same as at high density, but R has increased, and the gap between GaussClumps and the other two algorithms is further reduced when the PSNR is above 6. The bottom panel shows the same indicators for low number density. P of the three algorithms is again essentially the same as at high density, R is above 90% for all three algorithms when the PSNR is greater than 5, and R of GaussClumps is lower than that of the other two algorithms.
Figure 7. The statistics of the relative deviation in peak intensity for the three algorithms as a function of the PSNR. The blue, green and red dots show the distribution of the individual measurements. The special symbols and error bars represent the median and standard deviation, respectively. The two dashed horizontal lines mark a relative deviation of ±10%.
Figure 8. The statistics of the relative deviation in the total flux as a function of the PSNR. The two dashed horizontal lines mark a relative deviation of ±30%. The blue, green and red dots, special symbols and error bars have the same meaning as in Figure 7.
The experimental results show that P and R of the FellWalker and LDC algorithms are maintained at a high level, although R decreases at high density. The GaussClumps algorithm achieves high R and P only at certain PSNR levels, indicating that it is susceptible to noise. In terms of the comprehensive indicator F1, the FellWalker and LDC algorithms are essentially the same, and both outperform the GaussClumps algorithm at low PSNR.
3.2.1. Evaluation Indicators
To investigate the parameterization accuracy of the algorithm, the measured peak intensity, total flux, tilt angle, size, and position of each clump are compared with their input values. The absolute deviations of the position E(ΔX) and angle E(Δθ), and the relative deviations of the size E(ΔS), peak intensity E(ΔI) and total flux E(ΔF), are calculated. These evaluation indicators are defined as:
where N represents the number of simulated molecular clumps that are correctly detected by the algorithm. The superscripts s and m denote the parameters of the simulated clumps and those measured by the algorithm, respectively. X, S, I and F represent the position, size, peak intensity and total flux of the clump, and θ represents the tilt angle of the clump on the x–y plane.
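The deviation indicators can be sketched as simple averages over the N correctly detected clumps. The sign convention assumed here, (simulated − measured)/simulated, is chosen so that a positive value means the measurement is lower than the simulated value, which agrees with the description of Figure 7; whether the averaged quantity is signed or absolute is an assumption, and the function names are hypothetical.

```python
def relative_deviation(simulated, measured):
    """Signed relative deviation; positive when the measured value is too low."""
    return (simulated - measured) / simulated

def mean_relative_deviation(sim_values, meas_values):
    """Average relative deviation, e.g. for size, peak intensity or total flux."""
    devs = [relative_deviation(s, m) for s, m in zip(sim_values, meas_values)]
    return sum(devs) / len(devs)

def mean_absolute_deviation(sim_values, meas_values):
    """Average absolute deviation, e.g. for position (pixels) or angle (degrees)."""
    return sum(abs(s - m) for s, m in zip(sim_values, meas_values)) / len(sim_values)

e_peak = mean_relative_deviation([10.0, 4.0], [9.0, 5.0])
e_pos = mean_absolute_deviation([1.0, 2.0], [1.5, 1.0])
```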
3.2.2. Performance
We ran statistical experiments to compare the parameterization performance of FellWalker, GaussClumps, and LDC and MGM. The high-density simulated data described in Section 3.1.1 are used in these experiments.
Figure 7 shows the relative deviation of the peak intensity value as a function of the PSNR. The vertical axis represents the relative deviation of the peak intensity between the simulated and measured clumps, and the horizontal axis represents the PSNR of the clump. The blue, green and red dots represent the relative deviations of clumps detected by FellWalker, GaussClumps, and LDC and MGM, respectively. A dot above 0 means that the value measured by the algorithm is less than the simulated value; a dot below 0 means that the measured value is greater. Error bars represent the standard deviation. The blue circle, green triangle and red asterisk represent the median relative deviation measured by FellWalker, GaussClumps, and LDC and MGM, respectively.
Figure 7 shows that as the PSNR of the simulated clumps increases, the deviations of the GaussClumps and FellWalker algorithms gradually decrease, and the peak intensity values measured by both algorithms are greater than the simulated values. The deviations obtained by the LDC and MGM algorithm are close to 0, with the dispersion decreasing gradually, indicating that the peak intensities estimated by the LDC and MGM algorithm are more reliable.
The total flux is an important parameter because it is directly related to the column density and mass of a molecular clump. As can be seen in Figure 8, the total fluxes measured by the GaussClumps and FellWalker algorithms are smaller than the simulated values. The reason is that both algorithms apply a cutoff threshold for background noise when detecting molecular clumps and therefore detect only part of each clump. Most deviations of the LDC and MGM do not exceed ±30% when the PSNR is greater than 4, indicating that the LDC and MGM are stable in estimating the total flux of molecular clumps.
Figure 9. The statistics of the absolute deviation in the tilt angle as a function of the PSNR. The tilt angle of the molecular clumps on the x–y plane varies from 0° to 180°. The minimum ratio of the major axis to the minor axis in these clumps is 1.4. The blue, green and red dots, special symbols and error bars have the same meaning as in Figure 7.
Figure 9 shows the deviation of the tilt angle, with the same symbols as in Figure 7. The dispersions of the measured deviations decrease gradually with increasing PSNR for the three algorithms, and the deviation is less than 10° when the PSNR is greater than 4, indicating that the estimation of the clump tilt angle by this algorithm is stable.
The size of a molecular clump describes its shape and is a very important parameter for clump classification. From left to right, the panels of Figure 10 show the relative deviations in Size1, Size2 and Size3, respectively. Figure 10 shows that the sizes obtained by GaussClumps exhibit a large deviation. The sizes measured by the GaussClumps and FellWalker algorithms are smaller than the simulated sizes. As the PSNR increases, the deviations of the GaussClumps and FellWalker algorithms gradually decrease, while the deviations of the LDC and MGM algorithm are close to zero. The deviation of LDC and MGM is less than 10% when the PSNR is above 4, indicating that the clump sizes obtained by the algorithm are reliable.
Figure 11 shows the absolute deviation of the position as a function of the PSNR; from left to right are the deviations in Galactic latitude, Galactic longitude and velocity, respectively. The position deviations measured by FellWalker and by LDC and MGM are almost all within 1 pixel and do not exceed 0.5 pixel when the PSNR is greater than 4, while the deviation of GaussClumps is greater than that of the other two algorithms. Some horizontal bars appear in the distribution of positions measured by GaussClumps in the Galactic latitude and longitude directions. The reason is that, because of the low spatial resolution of the simulated clumps, the centers fitted by GaussClumps are mainly located on grid points.
Overall, clump detection by LDC and MGM at high number density achieves robust parameterization accuracy in terms of position, peak, total flux, size and tilt angle. The clump parameters derived by the proposed algorithm show less deviation and less dispersion than those of the FellWalker and GaussClumps algorithms when the PSNR is above 5.
The ¹³CO (J=1-0) line data of M16, covering the region within 15°15′<l<18°15′ and 0°<|b|<1°30′ from the Milky Way Imaging Scroll Painting (MWISP) survey (Sun et al. 2018), are used for molecular clump detection and parameterization. The typical noise level of the ¹³CO (J=1-0) line is about 0.23 K at a channel width of 0.167 km s⁻¹. Figure 12 shows the integrated intensity map of M16 in the ¹³CO (J=1-0) line.
Using the M16 data, Zhan et al. (2016) confirmed the identification of the giant molecular filament (GMF) G18.0-16.8 by Ragan et al. (2014) and found a new giant filament, G16.5-15.8, located 0°.8 west of G18.0-16.8. Song & Jiang (2017) calculated the properties of the clump samples under the assumption of local thermodynamic equilibrium. The virial mass and virial parameter were calculated to evaluate whether the clumps are bound or unbound. They found that the majority of ¹³CO clumps are bound, which suggests that those clumps may form stars in the future. Following their work on clump detection in M16, we use the ¹³CO (J=1-0) line of M16 to investigate the performance of our algorithm.
After tuning the algorithm parameters, the GaussClumps, FellWalker and LDC algorithms are applied to the ¹³CO (J=1-0) line data of M16. Figure 13 shows the distribution of the peak intensity values of the clumps detected by the three algorithms. The observed total flux is defined as the summed flux of the observed data above 2×rms of the background. The recovery rate is defined as the ratio of the summed flux of the clumps to the observed total flux. The recovery rates of total flux obtained by GaussClumps, FellWalker and LDC are 51.6%, 90.4% and 90.2% in the ¹³CO emission of M16, respectively.
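The recovery rate defined above amounts to a single ratio, sketched below with made-up numbers. The function name and the flat list of voxel values are hypothetical; the threshold factor of 2 follows the definition of the observed total flux.

```python
def flux_recovery_rate(voxels, clump_fluxes, rms, k=2.0):
    """Ratio of the summed clump flux to the flux of all voxels above k * rms."""
    observed = sum(v for v in voxels if v > k * rms)
    return sum(clump_fluxes) / observed

# Illustrative values only: three voxels exceed 2 x 0.23 K.
rate = flux_recovery_rate(voxels=[0.1, 0.5, 1.0, 2.0, 0.3],
                          clump_fluxes=[2.0, 1.15],
                          rms=0.23)
```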
It can be inferred from Figure 13 that the peak intensity values of clumps detected by LDC and FellWalker have a similar distribution with a flatter peak, while the distribution of the peak intensity values detected by GaussClumps deviates greatly from those of the other two algorithms. The distribution peaks at about 2 for the FellWalker and LDC algorithms and at 3.4 for GaussClumps. Given the minimum peak intensity value (about 2.1 K) of the clumps detected by GaussClumps and the noise level (0.23 K) of the ¹³CO (J=1-0) line, the PSNR of the clumps detected by GaussClumps is greater than 9, whereas in Section 3.1.4 the recall rate of the algorithm is maintained at a certain level for PSNR above 5. A possible explanation is that GaussClumps tends to fit a clump with a strictly elliptical shape and fails to fit clumps with weaker peak intensity values in real data.
Figure 10. The statistics of the relative deviation in size as a function of the PSNR. From left to right are the deviations of Size1, Size2 and Size3, respectively (Size1 and Size2 represent the major and minor spatial sizes of the detected clump, respectively; Size3 represents the size of the detected clump along the velocity axis). The two dashed horizontal lines mark a relative deviation of ±10%. The blue, green and red dots, special symbols and error bars have the same meaning as in Figure 7.
Figure 11. The statistics of the absolute deviation of the position as a function of the PSNR. From left to right are the deviations in Galactic latitude, Galactic longitude and velocity, respectively. The blue, green and red dots, special symbols and error bars have the same meaning as in Figure 7.
Figure 12. The integrated intensity map of M16 in the ¹³CO (J=1-0) line with a velocity range of 15.93–27.06 km s⁻¹.
Figure 13. The distribution of the peak intensity values of clumps detected by GaussClumps, FellWalker and LDC.
Figure 14. The detection rate of the three algorithms in the ¹³CO (J=1-0) line of M16 as a function of total flux. The detection rates of GaussClumps, FellWalker and LDC are 80.9%, 74.7% and 81.7%, respectively. The completeness limits of LDC and FellWalker are 20 K km s⁻¹ and 45 K km s⁻¹, respectively, while that of GaussClumps is 75 K km s⁻¹.
Figure 15. The histogram of the peak deviation (ΔI) for GaussClumps, FellWalker and LDC. ΔI is described in Formula (19). The mean deviations of FellWalker, GaussClumps and LDC are -10.8%, -11.0% and -8.9%, respectively. The standard deviations of FellWalker, GaussClumps and LDC are 29.4%, 31.5% and 28.3%, respectively.
The limited sensitivity of the telescope causes low-quality clumps to be missed. Other indicators of the algorithm are the completeness and the detection rate above the completeness limit. The “completeness limit” here refers to the total flux or mass above which a clump can be detected at a certain level by an algorithm. The smaller and weaker a molecular clump is, the less likely it is to be detected.
We constructed the data set by randomly inserting simulated clumps into the ¹³CO (J=1-0) line data of M16. The peak intensity values of the simulated clumps range from 2 to 5, the sizes along the velocity axis range from 2 to 4, and the sizes along the Galactic longitude and latitude axes range from 0.5 to 2. The clumps detected by the GaussClumps, FellWalker and LDC algorithms are matched with the simulated clumps. The number of clumps within each total flux interval is counted, and the completeness and the average detection rate above the completeness limit are obtained.
Figure 14 shows the detection rates of the GaussClumps, FellWalker and LDC algorithms in the ¹³CO (J=1-0) line of M16. As the total flux increases, the detection rate of GaussClumps grows slowly, while FellWalker and LDC maintain a relatively high detection rate from the completeness limit upward. The detection rates of GaussClumps, FellWalker and LDC above the completeness limit are 80.9%, 74.7% and 81.7%, respectively. From the detection rate of each algorithm, we can roughly estimate that the number density in the ¹³CO (J=1-0) line of M16 lies between the high and medium densities of the simulated data sets described in Section 3.1.1.
Figure 15 shows the histogram of ΔI for the three algorithms. Because the simulated clumps can overlap with emission in the real observations, the peak intensity values detected by the three algorithms are systematically larger than those of the simulated clumps. LDC has the smallest dispersion of deviations. The long tail on the left side of the peak deviation distribution suggests a relatively high intensity of molecular clumps in M16.
We present a molecular clump detection and parameterization algorithm based on Local Density Clustering and the Multiple Gaussian Model (LDC and MGM). The proposed algorithm is robust and general in clump detection. LDC achieves high accuracy in clump detection and segmentation across different signal-to-noise levels, while the MGM obtains reliable physical parameters for overlapping clumps.
We applied our method to simulated data sets and find the following. (1) Detection rate: the recall rate of the algorithm on the high, medium and low number density simulated data is greater than 80%, 90% and 97%, respectively, when the PSNR is above 6. The algorithm retains a high level of detection accuracy when the PSNR is greater than 3. (2) Accuracy of parameters: the parameterization of the algorithm on simulated data shows less deviation and less dispersion when the PSNR is above 5. The deviations of peak value and size are almost all within 10% when the PSNR is above 5, while the deviations of total flux rarely exceed 30% when the PSNR is greater than 4 at high number density. The deviations of the tilt angle on the x–y plane are less than 10° when the PSNR is above 4.
We applied our algorithm to the ¹³CO (J=1-0) map of the M16 nebula taken with the PMO-13.7 m telescope. The detection rate of clumps reaches 81.7% with a completeness limit of 20 K km s⁻¹ in the ¹³CO (J=1-0) line of M16. A total of 658 molecular clumps are detected by our algorithm, and the total flux recovery rate in the ¹³CO (J=1-0) line of M16 is estimated as 90.2%. The number density in the ¹³CO (J=1-0) line of M16 may lie between the high and medium densities of the simulated data sets described in Section 3.1.1.
Acknowledgments
We thank the anonymous referee for constructive comments that greatly improved the manuscript. This work was supported by the National Natural Science Foundation of China (U2031202, 11903083, and 11873093). This research made use of data from the MWISP project, which is a multi-line survey in ¹²CO/¹³CO/C¹⁸O along the northern Galactic plane with the PMO-13.7 m telescope. We are grateful to all the members of the MWISP working group, particularly the staff members at the PMO-13.7 m telescope, for their long-term support. MWISP was sponsored by the National Key R&D Program of China with grant 2017YFA0402701 and the CAS Key Research Program of Frontier Sciences with grant QYZDJ-SSW-SLH047.
Research in Astronomy and Astrophysics, 2022, Issue 1