Characterization and Genetic Variation of Sugarcane Streak Mosaic Virus, a Poacevirus Infecting Sugarcane in Thailand

Sugarcane leaves showing yellow streak mosaic symptoms were observed in farmers’ fields in Kamphaeng Saen, Nakhon Pathom province, during disease surveys conducted in 2010. RT-PCR of symptomatic leaf samples failed to detect Sugarcane mosaic potyvirus but revealed the presence of Sugarcane streak mosaic virus (SCSMV). An SCSMV-infected sugarcane sample, designated THA-NP3, was subjected to RNA extraction and RT-PCR-based viral gene cloning and sequencing. The complete genome of THA-NP3 (JN163911) contained 9,781 nucleotides, excluding the 3′ poly(A) tail, and encoded a polyprotein of 3,130 amino acid residues. Protein sequence analysis indicated nine putative cleavage sites that yielded ten functional proteins, namely P1, HC-Pro, P3, 6K1, CI, 6K2, NIa-VPg, NIa-Pro, NIb and CP, and an additional frameshifted PIPO protein. Sequence alignment revealed that THA-NP3 shared 97.84% nucleotide identity with JP2 from China and 81.39-97.78% identity with other recorded SCSMV sequences. Surveys for streak mosaic disease were conducted from 2010 to 2014 in major sugarcane growing areas in five provinces (Nakhon Pathom, Kanchanaburi, Nakhon Ratchasima, Khon Kaen and Udon Thani) and among germplasm collections. The percentages of infected samples ranged from 43.48-90.91% in farmers' fields and 54.17-100% in germplasm fields. Genetic diversity based on coat protein (CP) coding sequences of 58 Thai SCSMV isolates showed 86.17-100% nucleotide identity among them and 85.70-99.29% identity with isolates from other countries. Phylogenetic analysis of CP sequences indicated two major clusters of virus variants, one in cropping fields and another in germplasm fields. Genetic variation among SCSMV isolates was consistent with recombination events detected in the CP coding region. These findings represent essential knowledge and should be utilized to improve the SCSMV resistance of sugarcane varieties.

The SCSMV virion is a flexuous rod of 890 × 15 nm and comprises a positive-sense single-stranded RNA genome of 9.8 kb, characteristic of members of the family Potyviridae (Hema, Sreenivasulu, & Savithri, 2002). The viral genome contains a single open reading frame (ORF) which encodes a polyprotein of 3130 amino acid residues. The polyprotein is processed by the viral protease, yielding ten different proteins, namely P1, HC-Pro, P3, 6K1, CI, 6K2, NIa-VPg, NIa-Pro, NIb and CP. The first published complete genome sequence of SCSMV, from an isolate obtained from sugarcane in Pakistan (SCSMV-PAK: GQ388116), contained 9782 nucleotides (nts), excluding the 3′ poly(A) tail (Xu, Zhou, Xie, Mock, & Li, 2010). Sequence comparison and a phylogenetic tree of the complete genome revealed that SCSMV formed a group distinct from the other genera in the family Potyviridae, and this group has recently been named Poacevirus (Xu et al., 2010; ICTV: www.ictvonline.org). More isolates of SCSMV and their complete genome sequences were reported from China (Li et al., 2011) and India (Parameswari, Bagyalakshmi, Viswanathan, & Chinnaraja, 2013). Genetic variability of SCSMV has been investigated by analysis of the CP (He et al., 2013), P1 (He et al., 2013) and HC-Pro (Bagyalakshmi et al., 2012) coding regions. Comparison of these genes revealed high variation among different SCSMV isolates resulting from recombination (Bagyalakshmi et al., 2012; He et al., 2013). In Thailand, the presence and distribution of SCSMV and the genetic variation of the existing isolates have not yet been recorded.
In this study, sugarcane leaves showing yellow streak mosaic symptoms in farmers' fields were diagnosed, and the causal virus was identified. The full-length genome of a Thai isolate of SCSMV was reconstructed by RT-PCR amplification and its nucleotide sequence was determined for the first time. Disease surveys were conducted, samples were collected for viral gene amplification, and genetic variation among Thai SCSMV isolates was analyzed based on the coat protein (CP) coding region.

Disease Survey and Sample Collection
Sugarcane disease surveys for SCSMV infection were performed from 2010 to 2014 in 5 provinces of the major sugarcane growing areas of Thailand (Nakhon Pathom, Kanchanaburi, Udon Thani, Khon Kaen and Nakhon Ratchasima). Virus-like symptomatic sugarcane samples, particularly young mosaic leaves, were collected and kept in sealed plastic bags. Samples were also collected from sugarcane germplasm collection fields, belonging to Kasetsart University, in Nakhon Pathom and Kanchanaburi provinces. More samples from the germplasm collection fields in Suphan Buri province were kindly provided by the Department of Agriculture, Ministry of Agriculture and Cooperatives.

Detection of SCSMV by Direct Antigen Coating ELISA (DAC-ELISA)
Collected sugarcane leaf tissues were tested for the presence of SCSMV by DAC-ELISA, using a locally produced antiserum raised against purified SCSMV (Kasemsin, Chiemsombat, & Hongprayoon, 2011). Briefly, one gram of leaf tissue was ground in a plastic bag containing 1 ml of extraction buffer (PBS, pH 7.4, 0.2% sodium diethyldithiocarbamate). The homogenate was diluted 1:10 in coating buffer. A 100 μl aliquot of the diluted plant extract was loaded into each well of a microtiter plate and incubated overnight at 4°C. The protocol described by Chiemsombat, Prammanee, & Pipattanawong (2014) was followed, except that the SCSMV antiserum was diluted 1:500 and incubated at 37°C for 1 h.

Primer Design
The specific primers used in this study (Table 1) for amplification of the SCSMV complete genome were designed based on the alignment of the complete genome sequence of SCSMV-PAK: GQ388116 and other partial SCSMV sequences recorded in GenBank (Y17738, EU650179, EF088799, EU650178, EU883391, EF088797, DQ421788, AM920686, AM920685, AB563503, GQ386845, GQ386843, GQ386844, AY193783, AY189681). Two specific primers for amplification of the entire coat protein gene were also designed (Table 1). Note. N in the primer A-d-T-R2 represents the degenerate bases (A, T, G and C).

Total RNA Extraction and Reverse Transcription Polymerase Chain Reaction (RT-PCR)
Total RNA was extracted from the sugarcane leaf tissues using TLES buffer (100 mM Tris-HCl, pH 8.0, 100 mM LiCl, 10 mM EDTA, pH 8.0, 1% SDS, 0.1% sodium sulphite) according to Verwoerd, Dekker, & Hoekema (1989). The viral cDNA was synthesized using a SuperScript III cDNA synthesis kit (Invitrogen, USA) following the manufacturer's protocol. This cDNA was added to PCR reactions for synthesis of the 11 overlapping fragments covering the whole genome of SCSMV, using the 11 primer pairs designed in this study (Table 1).
Each PCR reaction consisted of 1X PCR buffer, 0.4 mM dNTP mix, 2 mM MgSO4, 10 pmol of each primer, 1 U of Hi-Fidelity Taq (Invitrogen, USA), 1 μl of cDNA and RNase-free water to adjust the total volume to 25 μl. The reaction was started with an initial denaturation step at 94°C for 4 min, followed by 30 cycles of denaturation at 98°C for 30 sec, annealing for 1 min at the annealing temperature of each primer pair (Table 1) and extension at 68°C for 1-2 min according to the length of each overlapping fragment (1 kb/1 min), and 1 cycle of final extension at 68°C for 7 min. RT-PCR products were analyzed by 0.8% agarose gel electrophoresis.
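The extension rule quoted above (1 kb per minute, giving 1-2 min per fragment) can be illustrated with a minimal sketch. The function name `extension_minutes` and the round-up-to-whole-minutes behaviour are assumptions made for illustration; only the 1 kb/min rate and the 1094 bp CP amplicon length come from the text, and the other fragment lengths are hypothetical.

```python
import math


def extension_minutes(amplicon_bp, bp_per_min=1000):
    """Return the extension time in whole minutes for an amplicon.

    Assumption: times are rounded up to the next full minute, with a
    floor of 1 min, which reproduces the 1-2 min range in the text.
    """
    if amplicon_bp <= 0:
        raise ValueError("amplicon length must be positive")
    return max(1, math.ceil(amplicon_bp / bp_per_min))


# 1094 bp is the CP amplicon reported in the text; 850 and 1900 bp
# are hypothetical fragment lengths used only as examples.
for length in (850, 1094, 1900):
    print(f"{length} bp -> {extension_minutes(length)} min at 68 C")
```

Under this rounding assumption, any fragment longer than 1 kb and up to 2 kb receives a 2 min extension.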

Viral Gene Cloning and Sequencing
The RT-PCR products obtained from each primer pair were purified using a PCR/Gel purification kit following the manufacturer's protocol (Favorgen Biotech Corp, Taiwan) and separately cloned into the pGEM-T cloning vector (Promega, USA). The selected clones with a viral gene insert were submitted for sequencing in both directions at BioDesign, Thailand.

Viral Genome Sequence Assembly and Analysis
The sequenced inserts were verified as SCSMV sequences using nucleotide BLAST (www.ncbi.nlm.nih.gov/BLAST) and then assembled from 5′ to 3′ in BioEdit, version 7.2.5 (http://www.mbio.ncsu.edu/). The full-length nucleotide sequence of the complete genome was initially confirmed by searching for an ORF using ORFinder (http://www.ncbi.nlm.nih.gov). The putative cleavage sites of the deduced proteins on the polyprotein were determined by comparing the potential cleavage sites with those of the SCSMV isolates PAK: GQ388116, TPT: GQ246187, ID: JF488066, JP1: JF488064, JP2: JF488065 and IND671: JN941985.
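The ORF search described above can be illustrated with a minimal sketch. This is not the ORFinder algorithm itself; it is a simplified forward-strand scan, assuming the standard genetic code and an ATG start, and the function name `longest_orf` and the toy sequence are hypothetical.

```python
from itertools import product

# Standard genetic code, codons ordered with bases T, C, A, G.
BASES = "TCAG"
AMINO = ("FFLLSSSSYY**CC*W" "LLLLPPPPHHQQRRRR"
         "IIIMTTTTNNKKSSRR" "VVVVAAAADDEEGGGG")
CODON_TABLE = {"".join(c): aa
               for c, aa in zip(product(BASES, repeat=3), AMINO)}


def longest_orf(seq):
    """Return (start, end, protein) of the longest ATG-to-stop ORF.

    Coordinates are 1-based and inclusive; `end` includes the stop
    codon. Only the three forward frames are scanned (assumption:
    the genome is given as the positive-sense strand).
    """
    seq = seq.upper().replace("U", "T")
    best = None
    for frame in range(3):
        start = None
        for i in range(frame, len(seq) - 2, 3):
            codon = seq[i:i + 3]
            if start is None and codon == "ATG":
                start = i  # open an ORF at the first ATG in this frame
            elif start is not None and CODON_TABLE.get(codon, "X") == "*":
                length = i + 3 - start
                if best is None or length > best[1] - best[0] + 1:
                    protein = "".join(CODON_TABLE.get(seq[j:j + 3], "X")
                                      for j in range(start, i, 3))
                    best = (start + 1, i + 3, protein)
                start = None  # close the ORF and look for the next ATG
    return best


# Toy sequence (hypothetical), not part of the SCSMV genome.
print(longest_orf("CCATGGCTGCTGCTTGAAA"))  # (3, 17, 'MAAA')
```

For an ORF with a stop codon, the coding length in nucleotides equals three times the number of amino acids plus three, which is the relationship between a 3130-residue polyprotein and a 9393 nt coding region.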

Amplification, Cloning and Sequencing of the CP Coding Region
The CP coding region of SCSMV was amplified by RT-PCR using a CP specific primer pair, SCS-NIb-CP: F and SCS-NIb-CP: R (Table 1). The viral cDNA was synthesized from total RNA using ReverTraAce (TOYOBO, Japan) following the manufacturer's protocol.
The PCR reaction consisted of 1X PCR buffer, 0.4 mM dNTP mix (TOYOBO, Japan), 2 mM MgSO4, 10 pmol of each primer, 1 U of KOD-Plus-Neo (TOYOBO, Japan), 1 μl of cDNA and RNase-free water to adjust the total volume to 25 μl. The amplification cycle was the same as described above for the full-genome RT-PCR, except that annealing was performed at 61°C for 1 min. The RT-PCR products were analyzed by 0.8% agarose gel electrophoresis and submitted for direct sequencing in both directions (SolGent, South Korea). Some selected purified products were cloned into pGEM-T cloning vectors (Promega, USA), and the plasmids containing gene inserts were sequenced in both directions (SolGent, South Korea).

Sequence Analysis and Phylogenetic Tree of the CP Coding Regions
The nucleotide (nt) and amino acid (aa) sequences of the CP coding regions of the SCSMV isolates used in this study (Table 2) were aligned using ClustalW in the CLC program package (http://www.clcbio.com). Pairwise comparisons were also generated using the CLC program. The phylogenetic relationships were analyzed using the MEGA6 program (Tamura, Stecher, Peterson, Filipski, & Kumar, 2013).
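As an illustration of the pairwise comparison step, the sketch below computes percent identity from two already-aligned sequences. It is not the CLC implementation; the definition used here (identical positions divided by columns where neither sequence has a gap) is an assumption, since identity can be normalized in several ways, and the isolate names and sequences are hypothetical.

```python
from itertools import combinations


def percent_identity(a, b, gap="-"):
    """Percent identity between two aligned sequences of equal length.

    Assumption: columns with a gap in either sequence are ignored.
    """
    if len(a) != len(b):
        raise ValueError("aligned sequences must have the same length")
    compared = matches = 0
    for x, y in zip(a.upper(), b.upper()):
        if x == gap or y == gap:
            continue
        compared += 1
        matches += (x == y)
    if compared == 0:
        raise ValueError("no comparable columns")
    return 100.0 * matches / compared


def identity_table(aligned):
    """Map each unordered pair of names to its rounded percent identity."""
    return {(p, q): round(percent_identity(aligned[p], aligned[q]), 2)
            for p, q in combinations(aligned, 2)}


# Hypothetical toy alignment; names and sequences are illustrative only.
toy = {"isoA": "ACGTAC-T", "isoB": "ACGTTCGT", "isoC": "ACGTACGT"}
for pair, value in identity_table(toy).items():
    print(pair, value)
```

The same function applies to amino acid alignments, because it only compares characters column by column.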

Recombination Analysis of the CP Coding Regions
Recombination events among the SCSMV isolates used in this study (Table 2) were detected using RDP4 (Martin, Murrell, Golden, Khoosal, & Muhire, 2015). The detection algorithms used were RDP, GENECONV, Chimaera, MaxChi, BOOTSCAN, SISCAN, 3Seq and LARD, as implemented in the RDP4 program (version 4.50) with default settings.

Network Analysis Using SplitsTree
Phylogenetic networks of the CP coding regions of the SCSMV isolates (Table 2) were created using SplitsTree4.11 (Huson & Bryant, 2006). The alignment file obtained from ClustalW was used to construct the phylogenetic network with the median network method in SplitsTree4.11.

Incidence of SCSMV in Surveyed Sugarcane Fields
In this survey, diseased sugarcane plants showed typical symptoms of yellow streak mosaic, especially on young leaves (Figure 1a-b), while older leaves showed mild symptoms. Two hundred and thirty-three sugarcane leaf samples were collected from 14 farmers' fields, of which 153 samples tested positive by DAC-ELISA. Some samples obtained from older leaves gave positive reactions by DAC-ELISA, but their absorbance (A405) values (0.277-0.339) were lower than those of younger leaves. Therefore, the presence of SCSMV in some selected samples was confirmed by RT-PCR. This survey indicated that SCSMV is widespread and was present in all surveyed fields in Nakhon Pathom, Kanchanaburi, Udon Thani, Khon Kaen and Nakhon Ratchasima provinces (Figure 2). The percentage of infected samples across the surveyed fields ranged from 43.48-90.91% (Figure 2).
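The pooled incidence implied by the sample counts in this section can be worked out with a short sketch. The pooled percentages below are plain arithmetic on the reported counts (153 of 233 samples from farmers' fields, and 91 of 138 samples from germplasm collections); they are not values stated by the authors, and the helper name `percent_positive` is hypothetical.

```python
def percent_positive(positive, total):
    """Percentage of positive samples, rounded to two decimals."""
    if total <= 0 or not 0 <= positive <= total:
        raise ValueError("counts must satisfy 0 <= positive <= total, total > 0")
    return round(100.0 * positive / total, 2)


# Counts taken from the survey results in this section.
farmers_fields = percent_positive(153, 233)
germplasm_fields = percent_positive(91, 138)

print(f"farmers' fields:  {farmers_fields}%")    # 65.67%
print(f"germplasm fields: {germplasm_fields}%")  # 65.94%
```

These pooled values differ from the per-field ranges reported in the text (43.48-90.91% and 54.17-100%) because the ranges are calculated within individual fields or varieties rather than over all samples.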
In three fields containing germplasm collections, 138 samples from 73 sugarcane varieties were collected. Of these, 91 samples from 50 varieties tested positive by DAC-ELISA. The percentage of positive reactions within varieties ranged from 54.17-100% (Figure 2). In a subsequent study, we selected 36 isolates from different farmers' fields in 5 provinces and 22 isolates from fields containing germplasm collections to examine genetic variation based on sequence analysis of the viral CP gene. All selected isolates yielded the expected 1094 bp RT-PCR product. One of them, an isolate from Kamphaeng Saen District, Nakhon Pathom Province, designated THA-NP3 (Figure 1a), was subjected to full-length genome sequencing.
The long polyprotein ORF started at the codon AUG (nts 200-202), ended at the termination codon UGA (nts 9591-9593) and was followed by a 3′ UTR of 189 nts (Figure 3). The coding region, consisting of 9393 nts, encoded a polyprotein of 3130 amino acid residues with a calculated Mr of 356.53 kDa. This polyprotein had extensive amino acid sequence homology to the polyproteins of other SCSMV isolates.
Nine putative cleavage sites of the polyprotein were identified by comparison with the putative sites of the SCSMV-PAK polyprotein (Xu et al., 2010) and some others (Figure 3). All cleavage sites of the THA-NP3 proteins, as well as the positions of the amino acids, were similar to those of the SCSMV isolates PAK, ID, JP1, JP2 and TPT but not IND671. The amino acid sequences of the conserved motifs differed slightly among SCSMV isolates (Figure 3). The genome sequence of THA-NP3 was analyzed for the presence of the Pretty Interesting Potyviridae ORF (PIPO) in the P3 gene, which has a highly conserved motif, G1-2A6-7, as previously reported for potyviruses (Chung, Miller, Atkins, & Firth, 2008). The conserved motif GGAAAAAAA was found at nucleotide positions 3085-3093, similar to that of SCSMV-PAK reported by Xu et al. (2010). The deduced PIPO of THA-NP3, 139 aa long, was obtained from 420 bp in the +1 frame at nucleotide positions 3091-3510, as reported by Chandran and Gajjeraman (2015).
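The PIPO motif search described above lends itself to a short illustration. The sketch below expresses the G1-2A6-7 motif as a regular expression and checks the codon arithmetic for the reported PIPO coordinates. The function names and the toy sequence are hypothetical; the real THA-NP3 genome (JN163911) would have to be supplied by the reader.

```python
import re

# G1-2A6-7: one or two G residues followed by six or seven A residues.
PIPO_MOTIF = re.compile(r"G{1,2}A{6,7}")


def find_pipo_motifs(seq):
    """Return (start, end, motif) tuples with 1-based inclusive coordinates."""
    seq = seq.upper().replace("U", "T")  # accept RNA or DNA notation
    return [(m.start() + 1, m.end(), m.group())
            for m in PIPO_MOTIF.finditer(seq)]


def codon_count(start, end):
    """Number of codons in a region given 1-based inclusive coordinates."""
    length = end - start + 1
    if length % 3:
        raise ValueError("region length is not a multiple of 3")
    return length // 3


# Toy sequence (hypothetical), not taken from the SCSMV genome.
print(find_pipo_motifs("CCTGGAAAAAAATCC"))  # [(4, 12, 'GGAAAAAAA')]

# Reported PIPO region: nts 3091-3510, i.e. 420 nt or 140 codons,
# which corresponds to 139 amino acids plus one stop codon.
print(codon_count(3091, 3510))  # 140
```

Because `finditer` returns non-overlapping matches, a long run of A residues is reported once, with the greedy quantifiers taking up to two G and seven A residues.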
The motif scan, using the NCBI-CDD database, of the THA-NP3 polyprotein revealed 19 motifs. The P1 of this virus isolate contained a serine peptidase at the amino acid position 208-312. The peptidase_C6 conserved motif, which is contained in the HC-Pro protein, was found at the amino acid position 684-810. The conserved motif, C-71-X-H (aa 715-787), was found at the C-terminal region of the HC-Pro protein while the conserved motifs associated with aphid transmission were not found. The CI protein was the largest protein among ten functional proteins and contained RNA helicases of superfamily II at

Comparison of THA-NP3 Complete Genome with Other Genome Isolates
Comparison of the complete genome sequences showed that THA-NP3 was very similar to the JP2, JP1, ID and TPT isolates, with 97.84%, 97.78%, 97.73% and 94.80% nucleotide identities, respectively. It was less similar to the PAK and IND671 isolates, with 81.83% and 81.39% nucleotide identities, respectively. Comparison of THA-NP3 with potyviruses causing mosaic diseases in sugarcane (SCMV, SrMV, MDMV, JGMV) showed much lower similarity, with 29.01-29.38% (nt) and 17.49-18.33% (aa) identities (Table 3). Thus, THA-NP3 belongs to a genus distinct from the genus Potyvirus, whose members also infect sugarcane.

Analysis of the Complete CP Gene
RT-PCR with the primers SCS-NIb-CP: F and SCS-NIb-CP: R yielded a 1094 bp product comprising part of the polyprotein coding sequence and the 3′ UTR. Sequence analysis revealed that the complete CP gene contained 846 nucleotides encoding 281 amino acid residues. In this study, we investigated the genetic variability of the complete CP region (846 nts, 281 aa) of the 58 Thai SCSMV isolates, 36 of which were obtained from 14 farmers' fields and 22 of which were obtained from three germplasm collection fields ( identities when compared to the isolate of TriMV. The amino acid sequence alignment of all isolates from Thailand and isolates from other countries revealed more variation at amino acid positions 1-31 of the N-terminal region, whereas the core region, including the C-terminus, was more conserved.

Phylogenetic Relationships of SCSMV Based on the CP Sequences
Phylogenetic relationships of the CP gene (846 nts) from 58 Thai SCSMV isolates (Table 2) and 27 SCSMV isolates from other countries were determined using a maximum-likelihood method. The CP sequences from Thai SCSMV isolates clustered in four well-defined variant groups (Figure 4). Two sub-groups, designated 1A and 2A, contained 38 isolates from Thailand, 9 isolates from China, 2 isolates from Japan, 1 isolate from Indonesia, 2 isolates from India and the unique variant from the isolate GK76-4 (Figure 4). The second major group consisted of 2 sub-groups, 1B and 2B, which represented the isolates from germplasm in Thailand, India and China (Figure 4). Sub-group 2B contained only isolates from germplasm collections in Thailand (Figure 4). These results suggested that Thai SCSMV isolates from different farmers' fields were more closely related to the isolates from China, while the isolates from germplasm collections were closely related to the isolates from India and Pakistan (Figure 4).
The recombinant isolate M55 from China was derived from the major parental isolate GRT2007-091 and the minor parental isolate PAK, while the recombinant isolate CB671-1 from India was derived from the major parental isolate FUD10-7 from Thailand and the minor parental isolate IND671 from India. The recombination sites of M55 and CB671-1 were similar (Table 5). These results, with four recombinant isolates detected (GK76-4, GROC7, M55 and CB671-1), confirmed that recombination occurred in the CP coding region among SCSMV isolates from different geographical regions and sugarcane varieties (Table 4).

Phylogenetic Networks of Thai SCSMV Isolates
The splits network based on the alignment of the CP gene of the 58 Thai SCSMV isolates indicated that recombination events had occurred among the Thai isolates and divided them into two major network groups (Figure 5). The recombinant isolate GK76-4 was shared between these two network groups, suggesting that recombination occurred between virus isolates from farmers' fields and germplasm fields. Nine SCSMV isolates likely to be recombinants (AP, TPT, IND671, JP1, JP2, ID, M55, CB671-1 and PAK) were selected for splits network analysis together with the 58 Thai isolates. The network based on the selected 67 isolates exhibited two major network groups (Figure 5). The JP1, JP2, ID, TPT and AP isolates shared a network group with the isolates from farmers' fields. The second network group consisted of four isolates, CB671-1, M55, IND671 and PAK, which shared a network with the germplasm isolates (Figure 5).
Vol. 10, No. 4; 2016
(GQ246187), one SCSMV isolate from Pakistan (GQ388116) and an outgroup, TriMV (NC_012799).

Discussion
Yellow streak mosaic is a typical symptom of streak mosaic disease in sugarcane caused by Sugarcane streak mosaic virus (SCSMV). In addition, mosaic symptoms in sugarcane are associated with several viruses such as Sugarcane mild mosaic virus (SCMMV), Sugarcane striate mosaic associated virus (SCSMaV), Sugarcane mosaic virus (SCMV) and Sorghum mosaic virus (SrMV). The typical symptoms and host ranges are similar among these viruses (Chen, Chen, & Adams, 2002). In this study, host range tests on plant species in the family Poaceae, including sorghum cv. UT325B and the commercial corn cv. Tender58, were conducted by mechanical inoculation. The typical symptoms of yellow streak mosaic also appeared on the inoculated sorghum and corn at 15 and 5 dpi, respectively (Figure 1c-d). These streak mosaic symptoms on the inoculated sorghum and corn were similar to those caused by SCMV, a potyvirus reported to cause mosaic diseases in sugarcane, corn and sorghum in Thailand (Gemechu, 2004). The infected plants were also tested for SCSMV infection by RT-PCR, and the results revealed the presence of the SCSMV CP gene. Thus, we confirmed that the yellow streak mosaic symptom in these sugarcane leaves was caused by SCSMV, as previously reported by Chatenet et al. (2005).
The disease surveys from 2010 to 2014 revealed that SCSMV was widespread across the major sugarcane growing areas in 5 provinces and in the germplasm collection fields. The sugarcane variety groups maintained at the germplasm collection fields such as UT3, UT4, UT5, UT6, UT8 and UT10 were found to be more frequently infected with SCSMV than other sugarcane varieties such as K76-4, K88-65 and K88-87. The widespread occurrence of SCSMV in many sugarcane fields might be facilitated by mechanical transmission, for example on cutting knives, but the role of insect vectors is still uncertain. Our survey suggested that the commercial sugarcane varieties, including those in germplasm collections, were widely infected with SCSMV.
In this study, we selected the virus isolate THA-NP3, derived from sugarcane of unknown variety, for complete genome sequencing (Figure 1a). The complete genome sequence of THA-NP3 was successfully assembled from 11 overlapping sequences using a set of primers designed in this study (Table 1). All cleavage sites of the THA-NP3 proteins and the positions of their amino acids were almost identical to those of the SCSMV isolates PAK, ID, JP1, JP2 and TPT, except for the isolate IND671, whose polyprotein contained 3131 amino acid residues. Nucleotide sequence comparison among the seven complete genomes (THA-NP3, PAK, ID, JP1, JP2, IND671 and TPT) revealed more genetic variation in the P1, HC-Pro and CP genes. These three genes have also been reported to vary among SCSMV isolates originating from different sugarcane varieties (Bagyalakshmi et al., 2012; He et al., 2013).
Based on gene sequence variability, Thai SCSMV isolates were divided into two distinct groups (Figure 4), one containing isolates from farmers' fields and the other containing isolates from germplasm collection fields. However, some virus isolates obtained from farmers' fields clustered in the same group as germplasm isolates. These results suggested that the variation of the CP gene occurred among various sugarcane varieties but was not associated with the geographical origin of the isolate.
Network analysis of the 58 Thai SCSMV isolates also confirmed that recombination events occurred in the CP coding region among the virus isolates from different fields and germplasm collections (Figure 5). Further evidence from recombination detection by RDP4 revealed two recombinant isolates, GK76-4 and GROC7 (Table 4). A previous study reported that the recombinant isolate from China, CB671-1, was derived from the parents W23×IND671, and that three recombinant isolates (CB740, CB9217-1 and S-8) were derived from the same parents, THA-NP3×CB671-1 (He et al., 2013). In this study, we found that two recombinant isolates from China (CB671-1 and M55) were derived from the parents FUD10-7×IND671 and GRT2007-091×PAK, respectively. These results suggest that recombination events occurred in the CP gene among the virus isolates from Thailand, China and India. Other studies have reported that recombination events occurred throughout the HC-Pro gene of SCSMV but not in the P1 gene (Bagyalakshmi et al., 2012; He et al., 2013). Recombination has also been reported in the evolutionary history of other single-stranded RNA viruses such as Turnip mosaic virus (TuMV), in the P1, HC-Pro, P3, CI, 6K2, VPg, NIa-Pro, NIb and CP genes but not in the 6K1 gene (Ohshima et al., 2007). The recombinant isolate GK76-4 obtained in this study was found to have two recombination sites, one in the variable N-terminal region and one in the conserved sequence at the C-terminal region of the CP gene (Table 4).
In conclusion, our research indicates that the genetic base of hosts, including biological background, was an important factor for viral genetic variation and differentiation in SCSMV populations. This is the first report on the incidence of SCSMV in the commercial sugarcane varieties and the germplasm collections in Thailand. These results will assist the improvement of sugarcane varieties through screening and breeding for virus-resistant varieties.