Spectrum Classification of the Quasars from SDSS with the Redshift z ~ 3 with PCA

The spectral lines of quasars (QSOs) have been investigated, with emphasis on the weak emission lines in the Lyα forest region, using principal component analysis (PCA) in the wavelength range from 1020 to 1600 Å. The first, second and third principal component spectra (PCS) account for 63.4, 14.5 and 6.2% of the variance respectively, and the first seven PCS account for 96.1% of the total variance. The first PCS contains peaks from the Lyα and Lyβ emission lines and from high-ionization emission lines (OVI, NV, SiIV and CIV); these peaks are sharp and strong. The second PCS has peaks from low-ionization emission lines (FeII, FeIII, SiII and CII), which are broad and rounded. Using the PCS, artificial QSO spectra can be generated, which are useful for testing QSO detection and for classifying their continuum spectra. Using the weights of the first two PCS, five classes can be defined: class Zero and classes I to IV. This classification helps to determine the continuum in the Lyα forest. The continua of 21 bright SDSS QSOs (listed in Table 1) with z~3, signal-to-noise ratio greater than 20 (S/N>20) and apparent magnitude less than 18 (mg<18) were used. First, the spectrum of each QSO was examined, the QSOs were classified, and the Lyα emission-line peak of each class was investigated. Finally, the mean spectrum of the 21 QSOs was compared with the mean spectrum of the 50 QSOs with redshift 0.14<z<1.04 in Suzuki et al. (2007).


Introduction
The absorption lines of QSOs are investigated, and the continuum in the Lyα forest region is sought, in order to study the physical characteristics of the intergalactic medium (IGM) and to extract cosmological parameters. The continuum cannot be fitted directly to measurements in the Lyα forest, so a general description of the continuum requires a simple method for classifying QSO spectra (Kirkman et al. 2005, pp. 1373-1380, Suzuki et al. 2003, pp. 1050-1067 and Boroson, 2002, p. 1265). PCA, also known as the Karhunen-Loève expansion, can summarize the information contained in a large collection of data and is widely used in many areas of astronomy (Hewett et al. 1995, pp. 1498-1521, Croft et al. 1998, p. 44, Bechtold et al. 1994, pp. 1-78 and Kirkman et al. 2003, pp. 1-28). Francis et al. (1992) applied PCA to 232 QSO spectra (1.8<z<2.2, wavelength range from 1150 to 2000 Å) from the Large Bright Quasar Survey (Hewett et al. 2001, pp. 518-535). They showed that the first three PCS account for 75% of the variance. Boroson & Green (1992) used 87 low-redshift (z<0.5) QSO spectra and investigated the relationship between the first two PCS and physical characteristics. Yip et al. (2004) applied PCA to 16707 QSO spectra with 0.08<z<5.41 in the wavelength range from 900 to 8000 Å and reported that the spectral classification depends on redshift and luminosity.
In this work, the continuum spectra of 21 QSOs with z~3 were predicted with PCA in the Lyα forest region, where the continuum is difficult to observe because of the abundance of absorption lines produced by the IGM. The purpose of this article is to investigate the variety of QSO spectra. §3 explains the formulation of PCA for the quantitative description of QSO spectra using eigenspectra, or PCS. §4 describes the theory of artificial QSO spectra, and §5 introduces five classes of QSO spectra to help the qualitative investigation of their variety.

Data
The statistical sample was selected from a collection of more than 105000 QSOs released by SDSS: 21 QSOs with redshift z~3, signal-to-noise ratio greater than 20 and apparent magnitude less than 18 (Table 1). Using PCA, the continuum of these bright QSOs in the Lyα forest region was predicted (Suzuki et al. 2005, pp. 592-600). For the PCA, the observed spectra were shifted to the rest frame, and the wavelength range was set from 1020 to 1600 Å with a step of 0.5 Å. To normalize each spectrum, the mean flux of 21 pixels around 1280 Å was computed and the flux of the QSO was divided by this mean. Using MATLAB, a continuum was then fitted to each normalized spectrum under the condition that it ignores the absorption lines and only connects the emission lines to each other. To predict the continuum, it is sufficient to take the continuum on the red side of the spectrum and use it to predict the complete continuum of the QSO in both regions (Kiamehr & Aghaee, 2012).
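The preprocessing described above (rest-frame shift, resampling onto a 0.5 Å grid between 1020 and 1600 Å, and normalization by the mean of 21 pixels around 1280 Å) can be sketched as follows. This is a minimal illustration in Python rather than the MATLAB workflow used in the article; the function names and the choice of linear interpolation for the resampling are assumptions.

```python
import numpy as np

def to_rest_frame_grid(wave_obs, flux, z, lo=1020.0, hi=1600.0, step=0.5):
    """Shift an observed spectrum to the rest frame and resample it.

    wave_obs must be increasing. Linear interpolation is an assumption;
    the article does not state how the spectra were rebinned.
    """
    wave_rest = np.asarray(wave_obs) / (1.0 + z)
    grid = np.arange(lo, hi + step / 2.0, step)  # 1020, 1020.5, ..., 1600
    flux_grid = np.interp(grid, wave_rest, flux)
    return grid, flux_grid

def normalize_at_1280(grid, flux_grid, center=1280.0, n_pix=21):
    """Divide the flux by the mean of n_pix pixels centred on `center`."""
    i = int(np.argmin(np.abs(grid - center)))
    half = n_pix // 2
    mean_flux = flux_grid[i - half:i + half + 1].mean()
    return flux_grid / mean_flux
```

With these defaults the grid has (1600 − 1020)/0.5 + 1 = 1161 pixels, the pixel count quoted in the Conclusion.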

PCA and Quantitative Investigation of Rebuilt Spectra
The spectrum of a QSO can be written in Dirac bra-ket notation, which is commonly used in quantum mechanics, as |q>.
Here |q> is the spectrum of each QSO, |r_i,m> is a reconstructed spectrum, |μ> is the mean QSO spectrum, |ε_j> is the jth PCS and C is the weight of each PCS. The covariance and correlation matrices of the 50 QSO spectra in Suzuki et al. (2005) were investigated, and the PCS were obtained by diagonalizing the covariance matrix. The PCS and their weights are chosen so that the PCS are orthonormal (Francis et al. 1992, pp. 476-490).
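In the usual PCA formulation, which the symbols defined here appear to follow, the reconstruction, the weights and the orthonormality condition can be written as below. The index conventions (i for the QSO, j and k for the component, m for the number of components kept) and the subscripted form C_ij are assumptions made for this sketch.

```latex
\begin{align}
  |r_{i,m}\rangle &= |\mu\rangle + \sum_{j=1}^{m} C_{ij}\,|\varepsilon_j\rangle, \\
  C_{ij} &= \langle q_i - \mu \,|\, \varepsilon_j \rangle, \\
  \langle \varepsilon_j | \varepsilon_k \rangle &=
    \begin{cases} 1 & j = k \\ 0 & j \neq k \end{cases}
\end{align}
```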
Here N = 50 is the number of QSO spectra, which limits the number of PCS because N is smaller than the number of pixels (1161). The square root of the eigenvalue of the jth PCS is denoted λ_j. The weights of the PCS are used to describe the probability distribution function (PDF) (Bechtold, 1993, pp. 143-238, Jannuzi et al. 1996, p. 11, and Croft, 2002, pp. 20-52). The PDF of the weights can be characterized by the single parameter λ, which gives the probability that a weight C lies in the interval −x ≤ C ≤ x (Eq. 7). The spectrum of a QSO can then be written in terms of δ, the weights of the PCS, which show the deviation of the ith QSO spectrum from the mean along the jth principal component. The first, second and third PCS contain 63.4, 14.5 and 6.2% of the variance, which means that the first three PCS contain 84.3% of the variance (this percentage depends on the normalization and wavelength range). This share of the variance is larger than that reported by others (Francis et al. 1992, pp. 476-490 and Shang, 2003, p. 6122).
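The steps of this section (mean spectrum, diagonalization of the covariance matrix, weights, and the share of variance per component) can be sketched with NumPy as follows. This is an illustration under stated assumptions, not the authors' code: it uses a singular value decomposition of the mean-subtracted spectra, which is equivalent to diagonalizing their covariance matrix, and the names `pca_spectra` and `reconstruct` are hypothetical.

```python
import numpy as np

def pca_spectra(spectra):
    """PCA of an (N, n_pix) array of normalized spectra.

    Returns the mean spectrum, the PCS (orthonormal rows), the weights
    C_ij, lambda_j (square roots of the covariance eigenvalues) and the
    fraction of variance carried by each PCS.
    """
    spectra = np.asarray(spectra, dtype=float)
    mu = spectra.mean(axis=0)
    resid = spectra - mu
    # SVD of the residuals: rows of vt are eigenvectors of the covariance.
    _, s, vt = np.linalg.svd(resid, full_matrices=False)
    pcs = vt
    weights = resid @ pcs.T                       # C_ij = <q_i - mu | e_j>
    lam = s / np.sqrt(spectra.shape[0] - 1)       # std. dev. of each weight
    var_frac = s**2 / np.sum(s**2)
    return mu, pcs, weights, lam, var_frac

def reconstruct(mu, pcs, weights_i, m):
    """Reconstruct one spectrum from its first m weights."""
    return mu + weights_i[:m] @ pcs[:m]
```

The cumulative sum `np.cumsum(var_frac)` gives the kind of figures quoted in the text, such as the share of variance carried by the first three or the first seven PCS.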

Artificial Spectra
The PCS can be used to create artificial spectra, which can be useful for testing QSO detection, flux calibration, continuum prediction and cosmological simulations. Artificial spectra are created by assigning values to the weights of the PCS. The probability distribution function of the weights of the jth PCS is well described by a Gaussian function with zero mean and standard deviation λ_j, so adding the PCS multiplied by weights drawn from these distributions to the mean spectrum yields a set of artificial QSO spectra (Bechtold et al. 2002, pp. 2054-2063 and Cabanac et al. 2002, pp. 1090-1116). QSOs with a flat spectrum and few absorption lines are the ones mostly used to study the IGM. Artificial QSO spectra can therefore be useful for predicting the QSO continuum in the Lyα forest region and for calibrating these continua there (Tytler, 2004, pp. 1-28). Figure 1 shows the continuum predicted by PCA for two QSOs of the sample.
Figure 1. The continua of the QSOs in rows 13 and 4 of Table 1, obtained with the PCA method.
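The generation of artificial spectra described in this section can be sketched as below. The sketch assumes that a mean spectrum, a set of orthonormal PCS and their λ_j values are already available as NumPy arrays; the function name and its arguments are hypothetical, and the weights of different PCS are assumed to be drawn independently.

```python
import numpy as np

def artificial_spectra(mu, pcs, lam, n_spectra, m=None, seed=None):
    """Draw artificial spectra as mean + sum_j C_j * PCS_j.

    Each weight C_j is drawn from a Gaussian with zero mean and standard
    deviation lam[j]; only the first m PCS are used (all if m is None).
    """
    rng = np.random.default_rng(seed)
    lam = np.asarray(lam, dtype=float)
    m = len(lam) if m is None else m
    # Standard normal draws scaled by lambda_j, one row per spectrum.
    weights = rng.standard_normal((n_spectra, m)) * lam[:m]
    return np.asarray(mu) + weights @ np.asarray(pcs)[:m]
```

Passing a fixed `seed` makes the set of spectra reproducible, which is convenient when the same artificial sample is reused for several tests.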

Classification using PCA method
In this section, the QSO spectra are classified quantitatively using the weights of the PCS (that is, C). The weights of the first two PCS are used to show the difference between emission lines and continuum spectra, and polar coordinates (r, θ) are introduced in the plane of these two weights, where r represents the deviation from the mean spectrum and θ is related to the profiles of the emission lines. The δ_2 versus δ_1 diagram is divided into five regions, and five classes covering the variety of QSO spectra are defined. The main purpose of this classification is to distinguish between similar classes of QSO spectra (Jena et al. 2004, p. 1552).
Class Zero is defined for spectra with a limited deviation from the mean spectrum, and classes I to IV correspond to the four quadrants of the δ_2 versus δ_1 diagram. The boundary is set through the probability P(r ≤ r_0): for r_0 = 0.668, P(r ≤ r_0) = 0.2. Class Zero therefore contains the QSO spectra with r ≤ 0.668. For QSO spectra with r > 0.668, classes I to IV are defined, corresponding to the first to the fourth quadrant of the δ_2 versus δ_1 diagram (Figure 2). The weights of the PCS are plotted in the δ_2 versus δ_1 diagram in Figure 2, where class Zero is bounded by a circle of radius r = 0.668 (Suzuki et al. 2007); the number beside each point identifies the QSO i listed in Table 1.
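The classification rule can be sketched as follows. Two assumptions are made here and are not stated in the text: r and θ are taken as the ordinary polar coordinates of the point (δ_1, δ_2), and the probability is computed for two independent standard normal weights, for which P(r ≤ r_0) = 1 − exp(−r_0²/2). The latter reproduces the quoted value of about 0.2 at r_0 = 0.668. The treatment of points lying exactly on an axis is also an arbitrary choice of this sketch.

```python
import math

R_ZERO = 0.668  # radius of the class Zero circle quoted in the text

def prob_within(r0):
    """P(r <= r0) for two independent standard normal weights (assumed)."""
    return 1.0 - math.exp(-0.5 * r0 * r0)

def classify(delta1, delta2, r_zero=R_ZERO):
    """Return (class name, r, theta in degrees) for one QSO."""
    r = math.hypot(delta1, delta2)
    theta = math.degrees(math.atan2(delta2, delta1)) % 360.0
    if r <= r_zero:
        return "Zero", r, theta
    # Quadrants 1 to 4 map to classes I to IV; min() guards theta == 360.
    quadrant = min(int(theta // 90.0), 3)
    return ("I", "II", "III", "IV")[quadrant], r, theta
```

For example, a QSO with δ_1 = 1.2 and δ_2 = −0.9 lies outside the circle in the fourth quadrant and is assigned to class IV.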

The Definition and Explanation for the Characteristics of Five Classes
Figure 3 shows artificial QSO spectra generated for the four classes I to IV to illustrate the overall shape of the QSO spectra in each class; each is the sum of the mean spectrum and the first two PCS with δ_1 = ±1 and δ_2 = ±1. The four spectra in Figure 3 are plotted on the same scales so that the strength of the emission lines relative to the continuum can be compared consistently. Figures 4 to 8 show two observed spectra from each class (with the exception of class I, which contains only one QSO). The numbering in Figures 4 to 8 is the same as in Figure 2, so that the position of each QSO and its spectrum can be located in the δ_2 versus δ_1 diagram.
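The four class templates of Figure 3 can be sketched as below. The sketch assumes that δ_1 and δ_2 are weights expressed in units of λ_1 and λ_2, so that δ = ±1 corresponds to a weight of ±λ; this scaling is an assumption, as is the function name.

```python
import numpy as np

def class_templates(mu, pcs, lam):
    """Artificial spectra for classes I to IV from the first two PCS."""
    # Signs of (delta_1, delta_2) in the first to the fourth quadrant.
    signs = {"I": (1, 1), "II": (-1, 1), "III": (-1, -1), "IV": (1, -1)}
    mu = np.asarray(mu, dtype=float)
    return {
        name: mu + d1 * lam[0] * pcs[0] + d2 * lam[1] * pcs[1]
        for name, (d1, d2) in signs.items()
    }
```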

Conclusion
A wide variety of emission-line profiles in the Lyα forest region has been analyzed both qualitatively and quantitatively. The variety of QSO spectra has been described using PCA, and it was found that the 1161 pixels (wavelength range from 1020 to 1600 Å with a step of 0.5 Å) can be summarized by the weights of the first seven PCS, because the pixels are not independent but strongly correlated with each other. In Figure 9, the solid line shows the mean of the 50 QSO continua with z<1 and the dotted line shows the mean of the 21 QSO continua with z~3; the main difference between the two curves lies in the peaks of their emission lines. This difference can be explained with Figure 2: comparing Figure 2 of this article with figure 4 of Suzuki et al. (2007), only a few QSOs here fall in classes I and IV, which have high peaks, whereas in Suzuki et al. (2007) most of the QSOs are located in these two classes. The emission-line peaks in the mean spectrum of the 50 QSOs (solid line) should therefore be higher than those in the mean spectrum of the 21 QSOs (dotted line), as is clearly seen in the figure. As in Suzuki et al. (2007), artificial QSO spectra should be useful for detection, calibration and simulation. The five classes have been introduced to separate similar QSO spectra and to show how the classification of spectra can guide the search for the continuum in the Lyα forest region.

Figure 2. Distribution of the weights of the first two PCS for the 21 studied QSOs; the characteristics of each point are given in Table 1 for all of these QSOs. The five regions for the five classes are shown in the figure. Class Zero contains the QSOs with r ≤ 0.668, and classes I to IV contain those with r > 0.668.

Figure 3. Illustration of the four classes using four artificial spectra generated with the first two PCS. The four spectra are shown on the same horizontal and vertical scales for easier comparison; the vertical axis shows the normalized flux and the horizontal axis shows the wavelength. Comparing these spectra shows that classes I and IV have more prominent emission lines than classes II and III.

Figure 4. Class Zero (r ≤ 0.668). The solid line shows the observed spectrum and the dotted line shows the continuum obtained with the PCA method.

Figure 9. The solid line shows the mean of the 50 QSO continua with redshift z < 1 and the dotted line shows the mean of the 21 QSO continua with redshift z ~ 3.

Table 1. The list of QSOs with redshift of z~3, S/N more than 20 and