Sample Sufficiency for Mean Estimation of Productive Traits of Sunn Hemp

Sunn hemp (Crotalaria juncea L.) is an annual cycle legume with high potential for biological nitrogen fixation, being widely used in crop rotation for biomass formation and control of nematodes. The objectives of this study were to determine the sample size for the mean estimation of productive traits of sunn hemp and verify the sample size variability between traits and sowing dates. Two uniformity trials were performed in the agricultural year 2014/2015, with sowing in October (trial 1) and December (trial 2). At the crop flowering stage, 300 plants of each trial were harvested and fresh and dry matter of leaves, stem, root, aerial part, and total weight were evaluated. The normality and randomness tests were performed for each trait and the sample size was calculated for the semi-amplitudes of the confidence interval (estimation errors) of 2, 4, 6, 8, 10, 12, 14, 16, 18 and 20% of the mean estimate. There is sample size variability between productive traits and between sowing dates. The assessment of at least 101 plants is required for mean estimation of productive traits with maximum estimation error of 20% of the mean and 95% confidence level.


Introduction
Sunn hemp (Crotalaria juncea L.) is native to India and widely adapted to tropical regions. This legume is an upright shrub, 2 to 3 m high, with a deep root system that assists in soil decompaction and nutrient recycling. The species is used in crop rotation and stands out among soil cover crops for its P and Mg accumulation, high biomass production, reaching up to 16.5 t ha⁻¹ of dry matter, and nitrogen fixation of up to 298 kg ha⁻¹ of N (Mangaravite, Passos, Andrade, Burak, & Mendonça, 2014; Silva et al., 2014; Xavier, Oliveira, & Silva, 2017). Sunn hemp is a viable option for crop rotation in areas infested with root-knot nematodes (Meloidogyne enterolobii) (Rosa, Westerich, & Wilcken, 2015).
Due to its importance, further research is essential so that technicians and growers can safely adopt new technologies related to the crop. In agricultural experiments, factors such as availability of time, labor, and financial and human resources generally prevent the evaluation of all plants of the experimental unit (plot). Thus, commonly only part of the plot is evaluated, i.e., only a few plants (a sample), in order to minimize these limiting factors. Therefore, the sample should be representative of the plants of the experimental unit (Storck, Garcia, Lopes, & Estefanel, 2016).
The results obtained from sampling are subject to a certain degree of uncertainty because the data measured in samples present random variation arising from the evaluation method and the experimental material, and because only part of the population is considered. However, these errors can be reduced by employing more accurate measuring instruments and a sample sized for the desired precision. Data heterogeneity and the desired confidence level in the mean estimation of a trait are factors that directly influence the sample size. The sample size can be calculated by setting the desired degree of precision. Lower values of admitted estimation error (greater precision) increase the number of observations to be evaluated (Bussab & Morettin, 2017).
The sample size for fresh and dry matter of sunn hemp plants was determined for the Brazilian cerrado region by Teodoro et al. (2015). However, we hypothesized that the sample sizes for the productive traits of this crop cultivated at different sowing dates in the Brazilian subtropical region, the site of this study, differ from those obtained by Teodoro et al. (2015) because of the influence of environmental factors and sowing dates on the growth and development of sunn hemp plants (Santos & Campelo Júnior, 2003; Timossi, Teixeira, Cava, Goularte, & Nascimento, 2014). We emphasize that a reduced sample size can compromise the reliability of experimental results. On the other hand, excessively large sample sizes may be unnecessary and waste time and financial resources, which demonstrates the importance of sample dimensioning. Thus, the objectives of this study were to determine the sample size for the mean estimation of productive traits of sunn hemp and verify the sample size variability between traits and sowing dates.

Material and Methods
Two uniformity trials (blank experiments) with sunn hemp (Crotalaria juncea L.), distributed in two sowing dates, were performed in the experimental area of the Department of Plant Science of the Federal University of Santa Maria, Santa Maria, Rio Grande do Sul state (latitude 29°42′S, longitude 53°49′W and 95 m of altitude) in the agricultural year 2014/2015. According to the Köppen climate classification, the climate of the region is Cfa, humid subtropical with hot summers and without a dry season (Heldwein, Buriol, & Streck, 2009). The soil of the experimental area is classified as sandy loam typic Paleudalf (Santos et al., 2013).
In both uniformity trials, all procedures (sowing, fertilization, cropping, harvesting, and evaluations) were performed homogeneously throughout the experimental area, as recommended for uniformity trials (Storck et al., 2016).
The uniformity trials were carried out at two sowing dates, i.e., the first trial on October 22, 2014 (sowing date 1) and the second trial on December 3, 2014 (sowing date 2). In both uniformity trials, sowing was performed in rows spaced 0.50 m apart, with a density of 20 seeds per row meter, totaling an area of 50 m × 52 m (2600 m²). The basic fertilization was 15 kg ha⁻¹ of N, 60 kg ha⁻¹ of P₂O₅ and 60 kg ha⁻¹ of K₂O.
Minimum and maximum air temperatures in °C and rainfall in mm were recorded daily (Figure 1). These data were collected at the Automatic Experimental Station of the Federal University of Santa Maria, located 30 m away from the experimental area. The average daily air temperature, in °C, was calculated as the arithmetic mean of the minimum and maximum air temperatures. In each uniformity trial, performed in an area of 1300 m², a grid with 300 sample points spaced 2 m × 2 m was marked with stakes in the central area (24 m × 50 m = 1200 m²), forming a matrix with 25 rows and 12 columns. The plant closest to the sampling stake was selected at each grid point, totaling 300 plants in each uniformity trial. These plants were marked and the following traits were evaluated at harvest: fresh matter of leaves (FML, in g), fresh matter of stem (FMS, in g), fresh matter of roots (FMR, in g), fresh matter of aerial part (FMAP = FML + FMS, in g), total fresh matter (TFM = FML + FMS + FMR, in g), dry matter of leaves (DML, in g), dry matter of stem (DMS, in g), dry matter of roots (DMR, in g), dry matter of aerial part (DMAP = DML + DMS, in g), and total dry matter (TDM = DML + DMS + DMR, in g). The evaluations were performed using a precision digital scale.
Plants of the first sowing date were harvested at 110 days after sowing, at flowering time. Plants of the second sowing date were harvested at 89 days after sowing, also at flowering time. After the fresh matter was weighed at harvest, the leaves, stems and roots were packed in paper bags and taken to a forced-air oven at 60 °C until the moisture content stabilized, and then weighed to assess the dry matter of leaves, stem and roots.
The mean values of FML, FMS, FMR, FMAP, TFM, DML, DMS, DMR, DMAP, and TDM were compared between sowing dates (n = 300 plants per sowing date) by means of the Student's t-test for independent samples at 5% probability (bilateral), with 598 degrees of freedom. For each of the ten traits in each uniformity trial, data normality was verified by means of the Kolmogorov-Smirnov test and data randomness was examined by means of the run test (Siegel & Castellan Júnior, 2006). The statistics minimum, maximum, mean, median, standard deviation, standard error of the mean, coefficient of variation, and variance were calculated. The sample size (n) of each trait was calculated for the semi-amplitudes of the confidence interval (estimation errors) of 2, 4, 6, 8, 10, 12, 14, 16, 18, and 20% of the mean estimate (m), i.e., from estimation error = 0.02 × m (highest precision) to estimation error = 0.20 × m (lowest precision), with a 95% confidence coefficient (1 − α), by means of the formula $n = \dfrac{t_{\alpha/2}^{2}\, s^{2}}{(\text{estimation error})^{2}}$ (Bussab & Morettin, 2017), where $t_{\alpha/2}$ is the critical value of the Student's t distribution whose area on the right is equal to α/2, i.e., the t value such that $P(t > t_{\alpha/2}) = \alpha/2$, with (n − 1) degrees of freedom (299 degrees of freedom in this study) and α = 5% error probability, and s is the standard deviation estimate.
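The sample size calculation described above can be sketched in a few lines of Python. This is an illustrative sketch, not the authors' script: it assumes the usual form n = t²s²/(estimation error)², approximates the Student's t critical value with a Cornish-Fisher expansion instead of a statistics package, and uses hypothetical values for the mean and standard deviation. The function names `t_critical` and `sample_size` are introduced here for illustration only.

```python
import math
from statistics import NormalDist


def t_critical(df: int, alpha: float = 0.05) -> float:
    """Approximate two-sided Student's t critical value t_{alpha/2}.

    Uses a Cornish-Fisher expansion around the normal quantile, which is
    accurate to about four decimals for large df (e.g. df = 299).
    """
    z = NormalDist().inv_cdf(1 - alpha / 2)
    return (z
            + (z**3 + z) / (4 * df)
            + (5 * z**5 + 16 * z**3 + 3 * z) / (96 * df**2))


def sample_size(sd: float, mean: float, error_pct: float, df: int = 299) -> int:
    """Number of plants so that the CI half-width is error_pct % of the mean."""
    half_width = error_pct / 100 * mean          # estimation error in trait units
    return math.ceil((t_critical(df) * sd / half_width) ** 2)


# Hypothetical trait: mean = 100 g, standard deviation = 50 g (CV = 50%).
for pct in range(2, 21, 2):
    print(f"error = {pct:2d}% of mean -> n = {sample_size(50.0, 100.0, pct)}")
```

Because the estimation error enters the formula squared, halving the admitted error roughly quadruples the required number of plants.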
Subsequently, with n set to 300 plants, which was the sample size used in each uniformity trial, the estimation error (semi-amplitude of the confidence interval) was calculated as a percentage of the mean estimate (m) for each trait by means of the formula $\text{estimation error} = \dfrac{t_{\alpha/2}\, s}{\sqrt{n}\, m} \times 100$, where s is the standard deviation estimate. Statistical analyses were performed with the Microsoft Office Excel and R software (R Core Team, 2018).
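The inverse calculation, in which n is fixed and the estimation error is obtained as a percentage of the mean, can be sketched as follows. The sketch assumes the form estimation error = 100 · t · s / (√n · m), again approximates the t critical value with a Cornish-Fisher expansion, and uses hypothetical values for the mean and standard deviation; `estimation_error_pct` is a name chosen here for illustration.

```python
import math
from statistics import NormalDist


def t_critical(df: int, alpha: float = 0.05) -> float:
    """Approximate two-sided Student's t critical value (Cornish-Fisher expansion)."""
    z = NormalDist().inv_cdf(1 - alpha / 2)
    return (z
            + (z**3 + z) / (4 * df)
            + (5 * z**5 + 16 * z**3 + 3 * z) / (96 * df**2))


def estimation_error_pct(sd: float, mean: float, n: int) -> float:
    """CI half-width for a sample of n plants, as a percentage of the mean."""
    return 100 * t_critical(n - 1) * sd / (math.sqrt(n) * mean)


# Hypothetical trait: mean = 100 g, standard deviation = 50 g, n = 300 plants.
print(f"{estimation_error_pct(50.0, 100.0, 300):.2f}% of the mean")
```

For a fixed n, the estimation error is proportional to the coefficient of variation (s/m), so traits with more variable data are estimated less precisely from the same 300 plants.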

Results and Discussion
Fresh matter of leaves ranged between 0.50 and 356.58 g in the first sowing date and between 0.11 and 163.23 g in the second sowing date, with mean values of 79.45 and 29.35 g, respectively. The fresh matter of stem varied between 9.14 and 745.25 g in the first sowing date and between 1.71 and 377.91 g in the second sowing date, with means of 239.60 and 93.13 g, respectively. The fresh matter of roots ranged from 1.01 to 206.70 g in the first sowing date and from 0.51 to 55.97 g in the second sowing date, with respective mean values of 38.75 and 11.49 g (Table 1). These values demonstrate the wide variability among plants under field conditions at both sowing dates. It is important that the sampled plants capture the maximum crop variability for adequate sample size determination. Thus, we can infer that the database is suitable for the proposed study.
Table 1. Minimum (Min), maximum (Max), mean (M), median (Md), standard deviation (SD), standard error (SE), coefficient of variation (CV), variance (Var), value of the Kolmogorov-Smirnov normality test (KS) and p-value of the randomness run test (Run) for fresh matter of leaves (FML, in g), fresh matter of stem (FMS, in g), fresh matter of roots (FMR, in g), fresh matter of aerial part (FMAP = FML + FMS, in g), total fresh matter (TFM = FML + FMS + FMR, in g), dry matter of leaves (DML, in g), dry matter of stem (DMS, in g), dry matter of roots (DMR, in g), dry matter of aerial part (DMAP = DML + DMS, in g) and total dry matter (TDM = DML + DMS + DMR, in g) of sunn hemp (Crotalaria juncea L.), evaluated at harvest at 110 days after sowing and at 89 days after sowing in the agricultural year 2014/2015.
Note. (1) For each trait (FML, FMS, FMR, FMAP, TFM, DML, DMS, DMR, DMAP, and TDM), the means not followed by the same letter in the column (comparison of means between sowing dates) differ at 5% probability (bilateral) by Student's t-test for independent samples with 598 degrees of freedom.
The mean values of fresh matter of aerial part (FMAP) were 319.06 and 122.48 g per plant in the first and second sowing dates, respectively (Table 1). In the study of the influence of the basic experimental unit size on the estimation of the optimum plot size developed by Facco et al. (2017) [...] 2012) (FMAP = 13.5 t ha⁻¹ and DMAP = 3.0 t ha⁻¹) in an assessment performed at 65 days after sowing, i.e., this lower accumulation of FMAP and DMAP could be caused by the shorter development cycle resulting from the effect of the photoperiod on the plant physiological maturation. The negative effect of a shorter crop cycle on biomass production was verified by Timossi et al. (2014). Therefore, the proper plant development demonstrated by the fresh and dry matter, the high number of sampled plants and the wide plant variability lend reliability to the results and safety to their use in future experiments.
The fresh and dry matter of the second sowing date were lower than those of the first sowing date (Table 1). These results may be explained by the different environmental conditions between sowing dates (Figure 1) and the shorter crop cycle of the second sowing date (flowering at 89 days) compared to the first one (flowering at 110 days).
A shorter crop cycle implies a shorter vegetative cycle, and therefore a shorter plant growth period. Results obtained by Santos and Campelo Júnior (2003) demonstrated reduced fresh and dry matter of sunn hemp in later plantings, as did the experiments performed by Timossi et al. (2014). Both studies attribute these effects to photoperiod reduction in cropping cycles with later sowing. The traits related to dry matter presented values proportional to those of fresh matter, since only the water content in each part of the plant was stabilized.
For the ten traits in the two sowing dates, mean values higher than the median indicate data deviation from the normal distribution. In 90% of the cases (10 traits × two sowing dates), the significance of the Kolmogorov-Smirnov test (P ≤ 0.05) indicates that the data did not fit the normal distribution. According to the Kolmogorov-Smirnov test, the data fitted the normal distribution only for FMS and TFM in the first sowing date (Table 1). However, according to the central limit theorem, even if the underlying population is non-normal, the distribution of the sample mean will be approximately normal for samples larger than 30 observations (Bussab & Morettin, 2017). In 80% of the cases (10 traits × two sowing dates), p-values of the randomness run test (Siegel & Castellan Júnior, 2006) greater than 0.05 revealed a random distribution, i.e., the sampling performed with 300 plants was not biased and was representative of the population. Therefore, in light of these considerations regarding data normality and randomness, we can infer that the data of these traits lend credibility to the study of sample size by means of Student's t distribution (Siegel & Castellan Júnior, 2006; Bussab & Morettin, 2017).
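The randomness check discussed above can be illustrated with a minimal runs test. The source cites Siegel and Castellan Júnior (2006) but does not state the details of the procedure, so the choices below are assumptions: observations are dichotomized about the median, values equal to the median are dropped, and the large-sample normal approximation is used for the two-sided p-value. The function name `runs_test` and the example sequences are hypothetical.

```python
import math
from statistics import NormalDist, median


def runs_test(values):
    """Runs test about the median: returns (number of runs, two-sided p-value)."""
    med = median(values)
    # True = above the median, False = below; ties with the median are dropped.
    signs = [v > med for v in values if v != med]
    n1 = sum(signs)
    n2 = len(signs) - n1
    if n1 == 0 or n2 == 0:
        raise ValueError("need observations on both sides of the median")
    # A new run starts whenever consecutive observations change side.
    runs = 1 + sum(a != b for a, b in zip(signs, signs[1:]))
    n = n1 + n2
    expected = 2 * n1 * n2 / n + 1
    variance = 2 * n1 * n2 * (2 * n1 * n2 - n) / (n**2 * (n - 1))
    z = (runs - expected) / math.sqrt(variance)
    p_value = 2 * (1 - NormalDist().cdf(abs(z)))
    return runs, p_value


# Hypothetical sequences in sampling order (not data from the trials).
print(runs_test([1, 1, 2, 2] * 5))      # moderate number of runs
print(runs_test([1] * 10 + [2] * 10))   # strongly clustered: too few runs
```

A p-value above 0.05 gives no evidence against randomness of the sequence, which is the criterion applied in the text.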
If the researcher decides to use the 300 plants sampled in each trial as the sample size, the maximum estimation error would be 8.88 and 10.64% of the mean estimate (m) for the traits related to fresh matter (FML, FMS, FMR, FMAP, and TFM) in the first and second sowing dates, respectively. Meanwhile, the mean estimate (m) of traits related to dry matter (DML, DMS, DMR, DMAP, and TDM) would present a maximum estimation error of 9.08 and 11.58% in the first and second sowing dates, respectively (Table 2). The estimation errors obtained in this study were slightly greater than those obtained by Facco et al. (2016), although these authors based their calculations on 360 pigeon pea plants.
Table 2. Sample size (number of plants) for the mean estimation of fresh matter of leaves (FML), fresh matter of stem (FMS), fresh matter of roots (FMR), fresh matter of aerial part (FMAP), total fresh matter (TFM), dry matter of leaves (DML), dry matter of stem (DMS), dry matter of roots (DMR), dry matter of aerial part (DMAP) and total dry matter (TDM) for the estimation errors equal to 2, 4, 6, 8, 10, 12, 14, 16, 18, and 20% of the mean estimate (m).
For the semi-amplitude of the confidence interval equal to 4% of the mean estimate (m) and 95% confidence level, the sample size for fresh matter of leaves was 1,477 plants in the first sowing date and 2,125 plants in the second sowing date. The sample size for fresh matter of stem was 718 and 1,429 plants for the first and second sowing dates, respectively. Moreover, the sample size for fresh matter of roots was 1,275 and 2,052 plants for the first and second sowing dates, respectively (Table 2). In general, the sample size for the same semi-amplitude of the confidence interval was greater than that obtained by Teodoro et al. (2015) for the Brazilian cerrado region. The same trend was observed for the other productive traits, i.e., larger sample sizes were required to estimate the mean of traits in the second sowing date due to the greater data variability (higher coefficients of variation) observed in that sowing date. Thus, there is sample size variability between the productive traits and between the sowing dates. Furthermore, sample size variability has been found in studies of productive traits of Crotalaria juncea (Teodoro et al., 2015), morphological and productive traits of Crotalaria spectabilis (Toebe et al., 2017), and morphological traits of Crotalaria juncea (Schabarum et al., 2018).
The large number of plants to be sampled makes mean estimation at a high precision level difficult. However, a smaller number of plants can be sampled when a lower precision is accepted, such as a semi-amplitude of the confidence interval equal to 20% of the mean estimate (the lowest precision in this study) (Table 2). In that case, 60 plants are required in the first sowing date and 85 plants in the second sowing date to evaluate fresh matter of leaves with the semi-amplitude of the confidence interval equal to 20% of the mean estimate (m) and 95% confidence level. For fresh matter of stem, the sample size is 29 and 58 plants in the first and second sowing dates, respectively. For fresh matter of roots, the sample size is 51 and 83 plants for the first and second sowing dates, respectively.
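The relation between the sample sizes reported for the 4% and 20% estimation errors follows from the sample size being inversely proportional to the square of the estimation error. The short sketch below rescales the 4% values quoted in the text to 20%; the helper `rescale_sample_size` is a hypothetical name, and because the reported values are already rounded up, rescaling them is only an approximation that could differ by one plant from a direct calculation in other cases.

```python
def rescale_sample_size(n_ref: int, error_ref_pct: int, error_new_pct: int) -> int:
    """Approximate n at a new error level from n at a reference error level.

    Uses n proportional to 1 / error**2 and rounds up with integer arithmetic.
    """
    numerator = n_ref * error_ref_pct**2
    denominator = error_new_pct**2
    return -(-numerator // denominator)  # ceiling division on integers


# Sample sizes at 4% of the mean quoted in the text (first, second sowing date).
reported_at_4pct = {"FML": (1477, 2125), "FMS": (718, 1429), "FMR": (1275, 2052)}

for trait, (first, second) in reported_at_4pct.items():
    print(trait,
          rescale_sample_size(first, 4, 20),
          rescale_sample_size(second, 4, 20))
```

For these three traits the rescaled values agree with the sample sizes quoted in the text for the 20% estimation error (60 and 85, 29 and 58, 51 and 83 plants).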
These results demonstrate the need for different sample sizes for mean estimation of different productive traits. However, the use of different sample sizes to evaluate different traits is often impractical. Alternatively, the largest sample size obtained among traits can be used as a single sample size for all traits. For example, assuming a sample size of 101 plants (estimation error of 20% of the mean) to estimate the mean of a treatment in a completely randomized experimental design with five replicates, 20 plants per replicate (101/5 ≈ 20), i.e., 20 plants per plot, should be sampled. If 10 treatments were evaluated in the experiment, 1,000 plants would be sampled (100 plants per treatment).
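The allocation arithmetic in this example can be written out explicitly. The sketch below reproduces the rounding used in the text (101/5 ≈ 20) and, as a comparison that is not part of the original study, also shows what rounding up would give; the function name `plants_per_plot` is hypothetical.

```python
import math


def plants_per_plot(required: int, replicates: int, round_up: bool = False) -> int:
    """Split the required plants per treatment evenly across replicates (plots)."""
    share = required / replicates
    return math.ceil(share) if round_up else round(share)


required_per_treatment = 101   # largest sample size at 20% estimation error
replicates = 5
treatments = 10

per_plot = plants_per_plot(required_per_treatment, replicates)
print(per_plot, per_plot * replicates, per_plot * replicates * treatments)

# Rounding up instead keeps at least 101 plants per treatment.
per_plot_up = plants_per_plot(required_per_treatment, replicates, round_up=True)
print(per_plot_up, per_plot_up * replicates, per_plot_up * replicates * treatments)
```

With rounding to the nearest integer, each treatment is represented by 100 plants, one fewer than the 101 required; rounding up to 21 plants per plot would give 105 plants per treatment and 1,050 plants for 10 treatments.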

Conclusions
The sample size (number of plants) in the sunn hemp (Crotalaria juncea L.) crop varies between productive traits and between sowing dates. The assessment of at least 101 plants is required for mean estimation of productive traits with a maximum estimation error of 20% of the mean and 95% confidence level.

Figure 1. Minimum, maximum and mean daily air temperatures (°C) and rainfall (mm)