Genetic diversity assessed in Ethiopian highland bamboo [Oldeania alpina (K. Schum) Stapleton] populations revealed by microsatellite markers

Background Ethiopian highland bamboo [Oldeania alpina (K. Schum) Stapleton] (Poaceae: Bambusoideae: Arundinarieae) is one of the economically and environmentally important plants in Ethiopia. Despite its wide presence in the country, nothing is known about the genetic diversity and population structure of the species. Methods The study relied on 150 DNA samples representing 15 O. alpina populations collected across the major O. alpina forests of Ethiopia. Following total genomic DNA isolation, SSR primers were screened using PCR, gel electrophoresis, gel doc imaging, allele scoring, and statistical analysis. Nine SSR primers from Chinese Phyllostachys edulis and seven from Ethiopian Oxytenanthera abyssinica were found informative and were used to investigate the extent of genetic diversity and the structure of O. alpina populations. Results The study revealed moderate genetic diversity (Ho = 0.262; I = 0.639) within populations and very low genetic differentiation among populations (Fst = 0.019). Cluster (UPGMA), PCoA, and STRUCTURE analyses did not group the populations into clearly defined, genetically distinct clusters according to their geographic origins. This is most likely due to the reproductive biology of the species: vegetative propagation is the main means of reproduction, flowering occurs only at intervals of 50–100 years, and seed viability is low. Conclusions Despite the limitations of employing only 16 SSR markers, the study suggested moderate genetic diversity within populations and a highly mixed population structure, resulting in very low genetic differentiation among O. alpina populations. This information could serve as a basis for designing suitable conservation strategies and for further research using more SSRs and other sequence-based informative markers.


Introduction
Bamboos are found in Asia, America, and Africa and belong to the family Poaceae and subfamily Bambusoideae. There are roughly 1670 species within 125 genera (Chaomao et al. 2006; Basak et al. 2021; Rohilla et al. 2023). Bamboos are one of the most versatile renewable resources in the plant kingdom. The majority of bamboos are distributed in tropical and subtropical climates around the world (Das et al. 2008; Qiao et al. 2014). In Africa, Madagascar is home to 43 bamboo species. Ethiopia, Tanzania, Malawi, Uganda, Sudan, and Zambia are among the African countries with bamboo resources. Only two species of bamboo are present in Ethiopia: Oldeania alpina (previously known as Arundinaria alpina or Yushania alpina) and Oxytenanthera abyssinica (Embaye 2003; Bekele 2007). Owing to their excellent carbon fixation potential and high strength-to-weight ratio, bamboos are attracting increasing ecological and economic attention (Ohrnberger 1999; Fu 2001; Peng et al. 2013). Bamboo uses include furniture, fiber and textile poles, buildings, utensils, roofing material, food, pipes, animal fodder, ornamentals, soil conservation, basketry and fence materials. Because it splits easily, O. alpina is frequently used to make mats, chairs, sofas, and tables (Bekele 2007). Bamboo is an interesting model for addressing fundamental biological questions because of its exceptionally fast growth, approximately two inches per hour and 60 feet in three months (Khalil et al. 2012); its unique rhizome-dependent system; the expansion of gene families as a result of polyploidization; and its regular and irregular flowering habits, followed by mass synchronous flowering and subsequent mass death (Tao et al. 2018; Guo et al. 2019; Zheng et al. 2020; Basak et al. 2021).
Studies of plant genetic diversity, population genetic structure, genetic differentiation, cultivar identification, and evolutionary genetics have all benefited from microsatellite markers, also known as simple sequence repeats (SSRs). SSRs are favored because they are polymorphic, co-dominant, highly informative, and relatively abundant (Gui et al. 2007; Rohilla et al. 2023). SSR markers can be developed from the whole-genome sequence of a species or transferred from related species, as several plant species show sufficient homology between genomes in microsatellite-bearing regions (Kalia et al. 2011; Tu et al. 2011; Satya et al. 2016). Despite losing the advantage of co-dominance in polyploid species, SSR markers remain a powerful tool for genotyping polyploids because of the large number of reproducible alleles available per locus (Bhandawat et al. 2014, 2019; Meena et al. 2020). The distribution of SSRs is shaped by selection pressure during evolution; hence, SSR markers derived from transcriptome data most likely offer a higher degree of resolution in population genetic studies (Du et al. 2013; Yang et al. 2022). The flowering cycle of O. alpina is about 45-50 years; as a result, vegetative propagation is the most common means of reproduction. This supports the hypothesis that genetic differences among O. alpina populations are very low.
Appropriate conservation and exploitation strategies are devised based on the genetic diversity status of a species (Bhandawat et al. 2015). SSR markers have been developed for several bamboo species (Bhandawat et al. 2014; Zhao et al. 2015; Satya et al. 2016; Jiang et al. 2017) and have been used to determine the genetic diversity, population differentiation and population structure of various bamboo species (Miyazaki et al. 2009; Jiang et al. 2013; Bhandawat et al. 2015; Rohilla et al. 2023). SSR markers developed from Phyllostachys edulis are highly conserved and transferable to some closely related bamboo species (Liu et al. 2022). To examine the extent of genetic diversity, various statistical approaches have been used, including expected and observed heterozygosity, polymorphic information content, F statistics such as genetic differentiation, principal coordinate analysis, population structure analysis, and analysis of molecular variance. Although conservation interventions rely heavily on knowledge of a species' genetic diversity status, and although O. alpina is of enormous ecological and economic importance, the population genetic diversity status of O. alpina remains unknown. This research was conducted with the objective of investigating the degree of genetic diversity, population differentiation and structure of O. alpina populations, which could serve as an important input for conservation measures, utilization and further studies.

Ethiopian highland bamboo (O. alpina) habitats and descriptions
Ethiopian highland bamboo (O. alpina) is a giant grass belonging to the tribe Arundinarieae, subfamily Bambusoideae, of the grass family Poaceae (Gramineae). O. alpina grows at altitudes between 2200 and 3500 m above sea level, with temperatures ranging from 10 to 20 °C and annual rainfall ranging from 1500 to 2500 mm. It is common in Ethiopia's moist agroclimatic zones, such as the Gojam, Shoa, Gedeo, Sidama, Awi, Illubabor, Gamo Gofa, Jima, Arsi and Bale zones (Liese 2008; Kidane et al. 2023). According to Bekele (2007), this woody grass grows in dense stands with a leafy canopy and stems so close together that passing through is impossible. O. alpina is best described as a large hollow-stemmed grass that grows to a height of 6-8 m but can reach 12-25 m. The stems (culms) are smooth, woody, hollow, and yellow-green to brown, growing from swollen underground stems (rhizomes). Stems can reach 7-10 cm in diameter (Liese 2008; Mulatu 2021). There is no well-documented information about the flowering cycle of O. alpina in the various regions of Ethiopia. However, it has been reported that in the Kosober/Injibara area of Northwest Ethiopia, O. alpina has been flowering gregariously and intermittently in small patches for three decades without being recognized by the general public (Sertse et al. 2011).

Plant materials and sample collections
Young leaf samples from 150 Ethiopian O. alpina plants, representing 15 populations (10 samples from each population), were collected separately and gently dried using silica gel. The distance between sampled plants depended on the width of the forest. Each sample consisted of ~12 leaves placed with silica gel in a zipped plastic bag, corresponding to 120 leaves from each locality/population. Samples were collected to cover the country's major O. alpina growing areas (Table 1 and Fig. 1). The habitat and stand structure of O. alpina populations are shown in Fig. 2.

Genomic DNA extraction and SSR marker screening
Total genomic DNA was extracted from 50 mg of silica gel-dried young leaves using a modified CTAB protocol as described by Doyle and Doyle (1987). A Nanodrop spectrophotometer (ND-8000, Thermo Scientific) and 1% agarose gel electrophoresis were used to check the quantity and quality of the isolated DNA. After screening and procedure optimization, a total of 16 SSR primer pairs (9 from Phyllostachys edulis and 7 from Oxytenanthera abyssinica (Zhao et al. 2015; Adem et al. 2019)) with clear and reproducible bands were selected and used to investigate genetic diversity and structure within and among O. alpina populations (Table 2).

PCR, agarose gel electrophoresis and SSR alleles scoring
PCR amplification was done in a 12.5 μl reaction containing 2 μl of genomic DNA, 1 μl of each of the two primers (at a concentration of 10 μM), 2.25 μl of nuclease-free water and 6.25 μl of the remaining PCR components. The temperature profile consisted of 5 min of preheating at 94 °C, followed by 35 cycles of denaturation at 94 °C for 30 s, annealing at 54 °C for 30 s and extension at 72 °C for 30 s, and a final extension at 72 °C for 10 min. To visualize DNA fragments, 10 μl of PCR-amplified product was fractionated by 3% agarose gel electrophoresis in 1X TAE buffer at 130 V for 3 h. The gel was stained with gel red (2 μl). A 50/100 bp Biolabs DNA ladder was used to estimate the amplicon size. The Altay UVP gel documentation system was then used to image the fragments. Clear bands were recognized as alleles of the SSR loci based on the expected PCR product size of each primer. Individuals showing two bands of different PCR product sizes (bp) were scored as heterozygous, while those showing a single band of uniform size were scored as homozygous (Additional file 1).
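As a bench aid, the per-reaction recipe above can be checked and scaled to a batch. The sketch below is illustrative only: the function name `master_mix`, the 10% pipetting overage, and the choice to leave template DNA out of the shared mix are assumptions, not details reported in the protocol.

```python
# Per-reaction volumes in microlitres, as listed in the text.
PER_REACTION_UL = {
    "genomic DNA": 2.0,
    "forward primer (10 uM)": 1.0,
    "reverse primer (10 uM)": 1.0,
    "nuclease-free water": 2.25,
    "remaining PCR components": 6.25,
}


def master_mix(n_reactions, overage=0.10):
    """Scale every component except template DNA to n_reactions plus an overage.

    The 10% overage is an illustrative default, not a value from the study.
    """
    factor = n_reactions * (1.0 + overage)
    return {
        name: round(vol * factor, 2)
        for name, vol in PER_REACTION_UL.items()
        if name != "genomic DNA"
    }


total = sum(PER_REACTION_UL.values())
print(f"total per reaction: {total} ul")
for name, vol in master_mix(150).items():
    print(f"{name}: {vol} ul")
```

The listed components sum to the stated 12.5 μl reaction volume, which is a quick consistency check on the recipe.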

Genotypic data analysis
The number of alleles per locus, expected and observed heterozygosity, Polymorphic Information Content (PIC), F statistics such as genetic differentiation (Fst) and Wright's fixation index (Fis), and principal coordinate analysis (PCoA) for the 15 populations were computed using GenAlEx 6.502 (Peakall and Smouse 2012). Analysis of Molecular Variance (AMOVA) was conducted using ARLEQUIN ver 3.5 (Excoffier and Lischer 2011). The UPGMA tree for the 15 O. alpina groups was drawn using the Poptree2 (http://www.ualberta.ca/~fyeh/fyeh) program. Genetic distance and genetic identity between the 15 O. alpina populations were computed using Darwin V6 software. Based on the computed pairwise genetic distances, a dendrogram was drawn for the 15 O. alpina populations using Darwin V6 software (Perrier and Jacquemoud-Collet 2006).
A Bayesian model-based clustering algorithm in STRUCTURE ver. 2.3.4 (Pritchard et al. 2000; Falush et al. 2003) was employed to detect admixture and infer the pattern of population structure. To decide the most likely number of populations (K), a burn-in period of 50,000 was used in each run, and data were collected over 500,000 Markov Chain Monte Carlo (MCMC) replications for K = 1 to K = 15, using 20 iterations for each K. The optimum K value was predicted following the simulation method of Evanno et al. (2005) using the web-based STRUCTURE HARVESTER ver. 0.6.92 (Earl and Holdt 2012). The bar plot for the optimum K was produced using Clumpak beta version (Kopelman et al. 2015). Gene flow among populations (Nm) was calculated from Fst as Nm = [(1/Fst)−1]/4.
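For readers unfamiliar with these indices, the following minimal sketch shows how the per-locus statistics are conventionally defined. It assumes diploid-style co-dominant scoring (two alleles per individual), uses the standard textbook formulas rather than GenAlEx's exact implementation, and the function name `locus_stats` and the example genotypes are hypothetical.

```python
from collections import Counter
from itertools import combinations
from math import log


def locus_stats(genotypes):
    """Summary statistics for one co-dominant locus.

    genotypes: list of (allele_a, allele_b) tuples, one per individual.
    """
    n = len(genotypes)
    counts = Counter(a for g in genotypes for a in g)
    freqs = [c / (2 * n) for c in counts.values()]
    sum_p2 = sum(p * p for p in freqs)
    ho = sum(1 for a, b in genotypes if a != b) / n  # observed heterozygosity
    he = 1.0 - sum_p2                                # expected heterozygosity
    # PIC: He minus the correction term summed over all allele pairs.
    pic = he - sum(2 * p * p * q * q for p, q in combinations(freqs, 2))
    return {
        "Na": len(freqs),                      # number of alleles
        "Ne": 1.0 / sum_p2,                    # effective number of alleles
        "Ho": ho,
        "He": he,
        "I": -sum(p * log(p) for p in freqs),  # Shannon information index
        "PIC": pic,
        "F": 1.0 - ho / he if he > 0 else float("nan"),  # fixation index
    }


# Made-up genotypes (fragment sizes in bp) for four individuals.
example = [(150, 150), (150, 160), (160, 160), (150, 160)]
print(locus_stats(example))
```

With two equally frequent alleles, as in this toy input, He is 0.5 and Ne is 2, which is the maximum for a biallelic locus.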
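The gene flow formula at the end of this paragraph is simple enough to check by hand. A minimal sketch, with the function name `gene_flow` being hypothetical:

```python
def gene_flow(fst):
    """Indirect estimate of gene flow: Nm = ((1 / Fst) - 1) / 4."""
    if not 0.0 < fst <= 1.0:
        raise ValueError("Fst must be in (0, 1]")
    return (1.0 / fst - 1.0) / 4.0


# Fst value reported in the abstract of this study.
print(round(gene_flow(0.019), 2))  # 12.91
```

Because Nm grows as the reciprocal of Fst, small changes in a very low Fst produce large changes in the estimate, so the value is best read as an order of magnitude.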

SSR genotyping and the extent of genetic diversity
A total of 49 alleles were detected in the 150 O. alpina individuals, ranging from 2 (primers Phe453 and Oxa5) to 4 (primers Phe734, Phe317 and Oxa1) per locus, with an average of 3.06 alleles per locus and overall moderate genetic diversity (Ho = 0.262), as presented in Table 3. Two-thirds of the major allele frequencies were covered by thirteen of the sixteen markers. The studied loci showed significant differences between Ho and He, indicating that heterozygosity deviated significantly from Hardy-Weinberg equilibrium across populations. All of the employed markers had polymorphism information content scores greater than 0.5 (0.812-1).
The observed number of alleles (Na), effective number of alleles (Ne), observed heterozygosity (Ho), expected heterozygosity (He) and fixation index (F) are presented in Table 4. Individual populations had Na values ranging from 2.063 (Agaro) to 2.563 (Debresina and Rira). The Ne value ranges from 1.587 (Agaro) to

Genetic distance and identity among the fifteen O. alpina populations of Ethiopia
For each pair of the fifteen highland bamboo populations, genetic identity was determined according to Nei (1972).The highest genetic identity (0.980) was observed between Agaro and Konta, while the highest genetic distance (0.091) was observed between Chencha and Debresina populations.Based on Nei-Li's similarity index, a high similarity rate was observed suggesting a close relationship among populations.
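To make the identity and distance values concrete, the sketch below implements Nei's (1972) standard formulation, in which identity is the normalized probability of drawing the same allele from two populations and distance is its negative logarithm. It is a minimal illustration, not the Darwin V6 or GenAlEx implementation; the function name and the allele frequencies are invented.

```python
from math import inf, log, sqrt


def nei_identity_distance(pop_x, pop_y):
    """Nei's (1972) genetic identity I and standard distance D = -ln(I).

    pop_x, pop_y: lists (one entry per locus) of {allele: frequency} dicts.
    """
    jx = jy = jxy = 0.0
    for fx, fy in zip(pop_x, pop_y):
        alleles = set(fx) | set(fy)
        jx += sum(fx.get(a, 0.0) ** 2 for a in alleles)
        jy += sum(fy.get(a, 0.0) ** 2 for a in alleles)
        jxy += sum(fx.get(a, 0.0) * fy.get(a, 0.0) for a in alleles)
    identity = jxy / sqrt(jx * jy)
    distance = inf if identity == 0 else -log(identity)
    return identity, distance


# Invented allele frequencies at two loci for two hypothetical populations.
pop_a = [{"150": 0.6, "160": 0.4}, {"200": 0.5, "210": 0.5}]
pop_b = [{"150": 0.5, "160": 0.5}, {"200": 0.4, "210": 0.6}]
print(nei_identity_distance(pop_a, pop_b))
```

Under this definition, an identity of 0.980 corresponds to a distance of about 0.020, and a distance of 0.091 corresponds to an identity of about 0.913.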

Genetic differentiation, gene flow and variance analysis of O. alpina populations
Analysis of Molecular Variance (AMOVA) among the 15 populations revealed that most of the variation was distributed within populations (98%) rather than among populations (2%) (Table 5). O. alpina populations exhibited very low genetic differentiation (Fst = 0.019) and high gene flow (Nm = 12.91) between populations. Paired comparisons among populations based on locus-by-locus analysis of the 16 microsatellite loci revealed that Fst ranged from 0.000 at locus Phe453 to 0.265 at locus Phe82.

Population structure, principal coordinate and cluster analysis
The Bayesian assignment of the 150 individual plants to different groups using STRUCTURE predicted K = 6 to be the most likely number of clusters (Fig. 3a). Based on this value, the Clumpak bar plot shows the mixed genetic structure of the populations (Fig. 3b). The principal coordinates analysis (PCoA) bi-plot shows a highly intermixed pattern among the 150 individuals representing the 15 populations. In the PCoA, the first three principal coordinates accounted for 25.36% of the total variation. The first principal coordinate accounted for 9.85% of the total variation, while the second and third principal coordinates accounted for 8.07% and 7.44%, respectively (Fig. 4).

SSR genotyping and extent of genetic diversity
Their hypervariability, high information content, and codominance make SSRs attractive markers for population genetics and germplasm identification studies in plants (Yang et al. 2022; Liu et al. 2022; Rohilla et al. 2023). The geographic range of a species, the timing of its flowering, and its mating strategy are all factors that affect its genetic diversity (Hamrick and Godt 1996; Nybom 2004). Other elements that may control the genetic diversity within a species include population size, gene flow, and genetic drift (George et al. 2009). O. alpina flowers only once every 45-50 years; hence, vegetative propagation is the most common form of reproduction. This would support the hypothesis that genetic diversity within O. alpina populations is low. The study results, however, did not support this hypothesis: O. alpina populations exhibited moderate genetic diversity within populations.
For O. alpina populations, there is no prior information on molecular marker-based genetic diversity in general or SSR marker analysis in particular. The total of 49 alleles (2-4 per locus) found in this study was higher than the 28 reported for Guadua inermis, 25 for G. amplexifolia and 13 for G. tuxtlensis (Pérez-Alquicira et al. 2021), and comparable with the 64 alleles (2 to 5 per locus) detected using 23 SSR primers in P. edulis (Zhao et al. 2015). However, it was lower than the 89 alleles (5.93 per locus) reported by Satya et al. (2016), the 89 alleles (2-16 per locus) in P. edulis (Liu et al. 2022) and the 169 alleles detected across 20 SSRs in P. edulis (Jiang et al. 2017). Cai et al. (2019) found 3-12 alleles for P. violascens and 3-20 for Phyllostachys using EST-SSR primers. This shows that the number of alleles detected varies considerably not only among species but also within the same species, owing to the use of genetic materials from different sources (Zhao et al. 2015). This study found no appreciable difference in polymorphism between the SSR markers transferred from the two source species, even though transferability and polymorphism depend on the genomic position of the markers (Lin et al. 2014).
The genetic diversity expressed in terms of expected heterozygosity was high for both loci and populations (He = 0.399 and 0.398, respectively). However, observed heterozygosity was somewhat lower: 0.262 for loci and 0.261 for populations. For a species with unusual reproductive biology, these values are taken as a good indicator of genetic diversity (Ramanayake 2006). Compared with other studies, the polymorphic information content (PIC = 0.92) indicates that the loci are highly informative and polymorphic. For instance, Liu et al. (2022) reported a PIC of 0.46, Silva et al. (2020) a PIC of 0.50, and Jiang et al. (2017) a PIC of 0.74. Several bamboo species evaluated by Chen et al. (2010) showed PIC values ranging from 0.48 to 0.987, which are closer to those of this study. As PIC reflects the level of polymorphism of a marker, any marker with PIC > 0.50 is suitable for genetic diversity studies (Lin et al. 2014; Satya 2016).
These differences may be amplified by the homoplasy of SSR markers, sampling strategy, sample size, and the source of the genotyped material, since materials of the same species collected in different areas and at different times have shown significant differences in their genetic status (Zhao et al. 2015). This could occur after a lack of genetic variation (Wong 2004). Reduced genetic differentiation among populations, combined with moderate to high genetic variation within populations of the same species, is a pattern attributed to self-incompatible, outcrossing and long-lived plants (Hamrick et al. 1992; Zawko et al. 2001; Rohilla et al. 2023). Because O. alpina is a long-lived woody bamboo species with a long vegetative phase (Sertse et al. 2011; Isagi 2016), there is likely to be a moderate amount of genetic variation within its populations, as shown in other bamboos (Jiang et al. 2013, 2017). Nybom (2004) found that the majority of the genetic variability was conserved within populations of long-lived and outcrossing
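The gap between observed and expected heterozygosity can be summarized with the fixation index. As a rough worked example, assuming the index is computed directly from the locus-level means quoted above (the mean of the per-locus values in Table 3 may differ somewhat):

```latex
F = 1 - \frac{H_o}{H_e} \approx 1 - \frac{0.262}{0.399} \approx 0.34
```

A positive value of this size indicates fewer heterozygotes than expected under random mating.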

Genetic differentiation, gene flow and analysis of variance
Analysis of molecular variance (AMOVA) revealed that far more of the genetic variability was found within individuals (62.98%) than among populations (1.99%). Nei-Li's similarity index also revealed a high rate of similarity, suggesting a close relationship among populations. This is not consistent with similar studies on bamboos that detected 14.55% of the variation among populations and 84.55% within populations (Jiang et al. 2017), and 71.74% among populations and 28.25% within populations (Pérez-Alquicira et al. 2021).
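The within-individual share quoted here and the within-population share of about 98% reported for Table 5 can be reconciled if the AMOVA is read as a three-level partition (among populations, among individuals within populations, within individuals). This reading is an assumption about the layout of Table 5, which is not reproduced here; under it, the middle component follows by subtraction:

```latex
\begin{aligned}
\%\,\text{among individuals within populations} &\approx 100 - 62.98 - 1.99 = 35.03 \\
\%\,\text{within populations (total)} &\approx 62.98 + 35.03 = 98.01
\end{aligned}
```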
According to Wright's (1978) classification of Fst values, a very low level of genetic differentiation (Fst = 0.019) was obtained. This implies that genetic differentiation was poorly pronounced. Only one study shows a lower Fst than the current study: Fst = 0.01 for Guadua amplexifolia (Pérez-Alquicira et al. 2021). In the same SSR analysis of Mexican bamboos, Pérez-Alquicira et al. (2021) found Fst = 0.04 for G. tuxtlensis and Fst = 0.29 for G. inermis. Kuruna debilis populations in Sri Lanka showed Fst = 0.113 using SSR markers (Attigala et al. 2017). In the SSR study of Phyllostachys edulis, Fst = 0.162 was found (Zhao et al. 2015). Very high genetic differentiation was found for Dendrocalamus giganteus (Fst = 0.847; Tian et al. 2012) and Dendrocalamus hamiltonii (Fst = 0.338; Nilkanta et al. 2017). According to Abreu et al. (2014), genetic differentiation among seedlings and saplings of a Brazilian bamboo after a mass flowering was Fst = 0.31 for chloroplast SSRs. Using eight SSR markers, Yang et al. (2022) found Fst = 0.306 in Dendrocalamus sinicus. According to Grant (1991), the ability of a species to develop genetic variation across its range depends in large part on the spatial organization of local populations and the associated patterns of gene flow (Ennos 1991). Pollen drift and seed dispersal are unlikely to explain gene exchange across populations over great distances, given the limited pollen dispersal brought on by rare flowering, a naturally protracted vegetative phase and differences in flowering times (Sertse et al. 2011; Zhang and Ma 1990). However, such a high value of gene flow might be attributed to the fact that O. alpina populations have likely undergone massive distribution of vegetative parts to different areas, probably many years ago. The population structure, principal coordinate and cluster analyses show the mixing and overlapping of O. alpina individuals from various populations. According to Chen et al. (2014), plants that are close to one another retain higher genetic identity than those far from each other. Consistent with this, the highest genetic identity (0.980) was observed between Agaro and Konta (157 km apart), which are much closer geographically than the pair with the highest genetic distance (0.091), Chencha and Debresina (632 km apart). The pattern would have been more convincing, however, if the highest genetic identity had occurred between the Ankober and Debresina populations, which at 97 km apart are closer than any other pair.
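The Wright (1978) classification invoked at the start of this subsection is usually quoted as four qualitative ranges. The sketch below encodes those commonly cited cut-offs; the function name and the handling of values that fall exactly on a boundary are assumptions.

```python
def classify_fst(fst):
    """Qualitative reading of Fst using Wright's (1978) commonly quoted ranges."""
    if not 0.0 <= fst <= 1.0:
        raise ValueError("Fst must be between 0 and 1")
    if fst < 0.05:
        return "little"
    if fst < 0.15:
        return "moderate"
    if fst < 0.25:
        return "great"
    return "very great"


# Fst values quoted in the discussion above.
quoted = [
    ("O. alpina (this study)", 0.019),
    ("Kuruna debilis", 0.113),
    ("Phyllostachys edulis", 0.162),
    ("Dendrocalamus giganteus", 0.847),
]
for label, fst in quoted:
    print(f"{label}: Fst = {fst} -> {classify_fst(fst)} differentiation")
```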

Population structure, principal coordinate and clustering
As shown by Delta K in the STRUCTURE analysis, the most likely number of clusters for the 15 populations is K = 6, with a high intermix of individuals. Clusters with little intermix (K = 2 and 3) were found in the G. amplexifolia, G. inermis and G. tuxtlensis populations of Mexican bamboos (Pérez-Alquicira et al. 2021). The structure analysis of 347 Brazilian bamboo individuals formed K = 5 distinct groups, showing little intermix among populations (Silva et al. 2020). The principal coordinate analysis shows that individuals from a single population were widely dispersed and did not form groups, as a result of significant genetic admixture, most likely due to the sharing of vegetative propagation material, probably many years ago. Gene flow that relies on seed dispersal and pollen drift is an unlikely reason for such high genetic admixture, since flowering of O. alpina populations requires up to five decades and seed viability is very low (Shewaye et al. 2019). There is no well-documented information about differences in flowering, pollen dispersal and seed flow between the sampled populations, except that the O. alpina population of the Injibara area has been flowering at three- to five-year intervals (Sertse et al. 2011). Other sampled populations are thought to take decades to flower.
The Unweighted Pair Group Method with Arithmetic Mean (UPGMA) analysis of the SSR data clustered the 15 populations into two main clusters and five subclusters. However, the clustering does not coincide with their geographic origins. The UPGMA analysis by Liu et al. (2022) clustered 34 bamboo accessions into two groups using 15 SSR markers. That study showed that the majority of the accessions were grouped according to their taxonomic classification. However, the primer sets used could not distinguish some accessions, pointing to the limitations of the employed SSR markers (Liu et al. 2022). Likewise, the SSR markers used in this study were not able to discriminate the populations according to their geographical origin or proximity to each other. This is contrary to the prediction that plant populations often show close genetic structure at short distances (Ennos 2001).
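The Delta K criterion referred to above can be sketched as follows. This is a simplified version that works on the mean log-likelihood across replicate runs for each K; the function name is hypothetical and the log-likelihood values are invented for illustration, not taken from the STRUCTURE output of this study.

```python
from statistics import mean, stdev


def evanno_delta_k(lnp_by_k):
    """Return {K: deltaK} from replicate ln P(D) values per K.

    lnp_by_k: {K: [ln P(D) for each replicate run]} with consecutive K values.
    deltaK is undefined for the smallest and largest K tested.
    """
    ks = sorted(lnp_by_k)
    m = {k: mean(lnp_by_k[k]) for k in ks}
    delta = {}
    for k in ks[1:-1]:
        sd = stdev(lnp_by_k[k])
        if sd == 0:
            continue  # deltaK is undefined when replicates are identical
        # Absolute second-order rate of change of the mean log-likelihood.
        second_diff = abs(m[k + 1] - 2 * m[k] + m[k - 1])
        delta[k] = second_diff / sd
    return delta


# Invented log-likelihoods for illustration only (not data from the study).
runs = {
    1: [-5000.0, -5002.0, -4998.0],
    2: [-4800.0, -4803.0, -4797.0],
    3: [-4790.0, -4795.0, -4785.0],
    4: [-4785.0, -4805.0, -4765.0],
}
dk = evanno_delta_k(runs)
best_k = max(dk, key=dk.get)
print(dk, best_k)
```

Because the statistic needs both neighbors of each K, it cannot select the smallest or largest K tested, which is one reason a Delta K peak alone does not prove that distinct clusters exist.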
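For reference, UPGMA repeatedly merges the two closest clusters and recomputes distances as size-weighted averages. The sketch below is a minimal pure-Python version that returns only the nested grouping (no branch lengths); the function name, labels and distances are hypothetical and unrelated to the tree in Fig. 5.

```python
def upgma(names, dist):
    """Cluster items by UPGMA (average linkage) and return a nested grouping.

    names: list of labels; dist: symmetric matrix (list of lists) of distances.
    """
    clusters = {i: (name, 1) for i, name in enumerate(names)}  # id -> (label, size)
    d = {(i, j): dist[i][j] for i in clusters for j in clusters if i < j}
    next_id = len(names)
    while len(clusters) > 1:
        # Pick the closest pair of current clusters.
        (a, b), _ = min(d.items(), key=lambda kv: kv[1])
        (la, na), (lb, nb) = clusters.pop(a), clusters.pop(b)
        new = {}
        for c in clusters:
            dac = d[(min(a, c), max(a, c))]
            dbc = d[(min(b, c), max(b, c))]
            # Size-weighted mean keeps every original pair equally weighted.
            new[(c, next_id)] = (na * dac + nb * dbc) / (na + nb)
        d = {k: v for k, v in d.items() if a not in k and b not in k}
        d.update(new)
        clusters[next_id] = (f"({la},{lb})", na + nb)
        next_id += 1
    return next(iter(clusters.values()))[0]


# Invented pairwise distances among three hypothetical populations.
labels = ["A", "B", "C"]
matrix = [
    [0.00, 0.02, 0.09],
    [0.02, 0.00, 0.08],
    [0.09, 0.08, 0.00],
]
print(upgma(labels, matrix))  # (C,(A,B))
```

With very small and similar pairwise distances, as in this study, the order of merges can change with minor changes in the input, which is consistent with a tree that does not track geography.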

Limitations
The use of relatively few SSRs and of agarose gels, which may not adequately resolve small size differences between SSR amplicons, may explain the very low genetic differentiation among populations alongside the substantial genetic diversity within populations. Future research should therefore employ representative, distinct O. alpina populations, a greater number of SSRs that can capture the species' entire genetic diversity, and capillary electrophoresis for improved SSR amplicon discrimination. Additionally, to better understand the genetic diversity and population dynamics of O. alpina, future research should develop species-specific SSR markers.

Conclusions
The study demonstrated moderate genetic diversity within populations and very low genetic differentiation among populations. Although the Bayesian structure analysis formed six groups (K = 6) from the 15 analyzed populations, distinct groups were not generated, as also confirmed by the UPGMA and PCoA analyses. Although conservation measures are important for all populations, this study suggests that the Masha and Tikurcheni populations should be prioritized for in situ conservation. More detailed studies focusing on isolated populations that exhibit distinctive morphology may help to evaluate claims of morphological variation. This is important because molecular-based diversity analysis makes it possible to re-evaluate the traditional taxonomy of bamboo, which relies on vegetative features.

Fig. 1
Fig. 1 Map of Ethiopia showing sample collection areas

Fig. 3
Fig. 3 A ΔK for each number of clusters (K), with the highest peak at K = 6. B STRUCTURE bar graph of highland bamboo groups inferred at K = 6 based on SSR data

Fig. 4
Fig. 4 Principal coordinates analysis (PCoA) bi-plot showing the clustering pattern of 150 samples representing 15 populations. Samples coded with the same symbol and colors belong to the same population

Genetic differentiation and population gene flow are inversely correlated. The result supported the previous assertion of high gene flow (Nm = 12.91) despite the very poor or non-existent seed dispersal mechanism of O. alpina populations. SSR analysis reported gene flow of Nm = 2.545 in Melocanna baccifera populations (Zawko et al. 2001) and Nm = 1.203 in Phyllostachys edulis (Zhao et al. 2015). Overall gene flow among Indian Dendrocalamus hamiltonii populations was Nm = 0.49 (Nilkanta et al. 2017). However, the same study observed high gene flow (Nm = 1.73) between the Kangra and Mandi populations, while Mandi and Mizoram recorded low gene flow (Nm = 0.26).

Fig. 5
Fig. 5 Un-weighted pair-group method with arithmetic mean (UPGMA) Dendrogram showing genetic relationships among the 15 populations considered based on Nei's unbiased genetic distance

Table 1
Sample collection areas with their respective latitude, longitude, altitude and region. SNPPR: Southern Nations, Nationalities, and Peoples' Region

Table 2
List of SSR primers used for the study

Table 3
Major allele frequency, number of alleles, observed, expected and unbiased heterozygosity, fixation index and PIC of 16 loci. N (sample size), Na (actual number of alleles), Ne (effective number of alleles), Ho (observed heterozygosity), He (expected heterozygosity), I (Shannon diversity index), Fst (fixation index)

Table 4
Genetic diversity indices for the 15 populations of highland bamboo

Table 5
Summary of AMOVA for the 15 Ethiopian highland bamboo populations based on 16 SSR markers