The effects of outdated data and outliers on Kenya’s 2019 Global Food Security Index score and rank

Background

The heterogeneity of food security indicators and the lack of consensus on comparing and ranking countries have driven international organisations to build composite indicators (Santeramo 2015). Composite indicators are aggregated indexes comprising individual indicators and weights, each representing an indicator’s relative importance based on a given underlying model (Nardo et al. 2005). Policymakers often rely on composite indicators as useful diagnostic tools for prioritising policies (Turan et al. 2018), while in benchmarking exercises, poor-performing countries also learn from better-performing countries. Moreover, composite indicators are essential for public communication due to ease of interpretation (Santeramo 2017).

Although composite indicators help set policy priorities when benchmarking or monitoring country performances, some factors may hinder their reliability (OECD 2008). A composite indicator may send misleading policy messages if poorly constructed or misinterpreted, leading to inappropriate policy choices. Vollmer et al. (2016) analysed how using composite indicators to assess freshwater systems’ sustainability impacts policy decisions. The study notes that the definitions and theoretical underpinnings of each indicator making the composite index may imply different methods with different policy implications and outcomes. For example, water stress can refer to the physical availability of water for production or the high cost of water for direct human use. Therefore, to quantify water as a resource, policymakers must first be clear on what constitutes a water resource.

Building composite indicators is also a complex process with several difficulties, such as the unavailability of data and the choice of individual indicators and of normalisation, weighting and aggregation methods (Mazziotta and Pareto 2013). While composite indicators can be used as solution-oriented tools for evaluating scenarios and identifying trade-offs among services and beneficiaries, there is a need for earlier engagement with end-users to help researchers find a balance between robustness, legitimacy and credibility and so improve decision-making (Vollmer et al. 2016). Moreover, the abundance of information related to the measured phenomenon makes it difficult for end-users to navigate and understand it and to identify the most appropriate assessment methods for their informational needs.

Angeon and Bates (2015), while assessing the use of composite indicators to quantify vulnerability and resilience at the macro level, highlight that in-depth analyses of composite indicators should be interpreted with caution, as not all indicators cover all the dimensions of sustainability. In addition, the multiple variables and computation methods used to build composite indicators lead to the question of whether there is a minimum set of variables that consistently describes the studied phenomenon. The considerable variation in what is being measured and how indicators are applied may make it difficult for end-users to identify suitable assessment methods. As a result, composite indicators are often subject to political disputes on their overall results and credibility (OECD 2008). Constructing a robust composite indicator requires the inclusion of only the relevant variables for efficient measurement and transparency in every step of the construction process (Angeon and Bates 2015).

Constructing a composite indicator involves seven main steps: defining the phenomena, variable selection, missing data imputation, normalisation, weighting, aggregation, and sensitivity and uncertainty analysis. However, opinions differ within studies on which of the steps are critical and subjective (Hudrliková 2013; Santeramo 2017; Caccavale and Giuffrida 2020). For example, Hudrliková (2013) states that data aggregation, normalisation and weighting methods are the fundamental and subjective steps, while Santeramo (2017) states that normalisation and weighting approaches do not significantly affect the results, although the methods of aggregation and imputation of missing data must be carefully selected. Caccavale and Giuffrida (2020) highlight that the methods used for imputing missing data, normalisation, weighting and variable selection cause variability in output, while the aggregation method has a minimal effect on the output. Regardless of the methods used for constructing a composite indicator, aggregation, normalisation, missing data imputation and weighting methods remain subjective and lead to different results (Dialga and Giang 2017). Therefore, transparency when constructing a composite indicator is critical, as every methodological decision can impact the index’s outcome and robustness (OECD 2008; Santeramo 2015).

Besides the methodology used when constructing a composite indicator, outdated data and outliers also challenge the robustness of a composite indicator. Outdated data occur when surveys are not conducted frequently enough to update databases, while outliers are extremely large or small values contained in some variables in an observation (OECD 2008). Outdated data and outliers could distort findings on countries’ performances, thereby leading to incorrect policy prescriptions in benchmarking processes (Freudenberg 2003). Unlike a simple one-dimensional measure of a phenomenon, composite indicators use numerous indicators and sub-indicators to investigate different areas to provide an understanding at local, national or international levels. However, this requires overreliance on available data collected at different spatial scales and for different purposes.

Consequently, such data increase uncertainties in the overall applicability of the data to the construction of the composite indicator in terms of the data collection methods, sampling period, credibility of the source, measurement errors and data interpretation (Burgass et al. 2017). Furthermore, due to economic, social or even historical changes, data collected at a given time might not reflect the intended results in the future (Abberger et al. 2018). Timely, high-quality data are therefore critical for accurate and credible measurement (Benin et al. 2020). To improve the robustness of a composite indicator, a critical assessment of the quality and age (outdatedness) of the data should be undertaken to ensure their timeliness, relevance and coherence before they are included in the composite indicator.

Poor data quality, inconsistency and high data collection costs could also limit a composite indicator’s robustness (Abberger et al. 2018). Benin et al. (2020) analysed how the data reporting rate (reported data values as a percentage of total data values required) and data quality (the proportion of reported data with no issues) affect African countries’ policymaking towards achieving the Malabo goals. The authors compared piloted and non-piloted countries’ performances in the 2018 and 2020 Biennial Review using the difference-in-differences method. From their results, the piloted countries improved data quality and reporting rate in the 2020 Biennial Review compared to their non-piloted counterparts. They concluded that continuous and timely data updating is critical to achieving the Malabo goals. However, more effort is needed to ensure countries update the outdated critical indicators, such as those of ending hunger (Benin et al. 2020).

Outliers also affect the robustness and reliability of composite indicators. Thomas et al. (2017) analysed the impact of outliers on the 2016 Global Food Security Index (GFSI) and found that countries with outlier data points shifted in rank after these data points were winsorised (that is, the outlier data points were assigned the next closest value to reduce their effect). The most considerable shift after the winsorisation was upwards by six places. However, Thomas et al. (2017) concluded that the presence of outlier data points did not affect the GFSI final score and rank, which warrants further investigation.

Some examples of food security composite indicators include the Coping Strategy Index (CSI), the Global Food Security Index (GFSI) and the Global Hunger Index (GHI). The CSI assesses households’ coping strategies when they do not have enough food or money to buy food. By contrast, the GFSI analyses the global food security environment in the dimensions of affordability, availability, quality and safety, and natural resources and resilience. The GHI measures and tracks hunger globally by region and country using undernourishment, child stunting, wasting and mortality indicators (Pangaribowo et al. 2013; IFPRI 2019; EIU 2019).

Kenya’s Performance in the GFSI since 2012

Kenya’s performance in all the GFSI dimensions since 2012 is shown in Fig. 1. The affordability dimension recorded its most significant score improvement since 2012 in 2019, reaching 56.7 out of 100. The GFSI replaced the food consumption as a share of household expenditure indicator with the change in average food cost indicator in the 2019 reporting (EIU 2019). As a result, Kenya scored 95.3 out of 100 in the change in average food cost indicator, higher than in the other indicators in the dimension (EIU 2019). The GFSI also updated Kenya’s proportion of the population living under the global poverty line indicator from 2005 to 2015 data, consequently improving the overall affordability dimension score from 38.2 in 2018 to 56.7 in 2019 (EIU 2019).

Fig. 1

Kenya’s scores out of 100 in the GFSI since 2012

In terms of the overall GFSI rank, Kenya has ranked among the bottom 30 countries since 2012 (Fig. 2). However, the availability dimension significantly improved in rank (101 to 93) in 2019 compared to the affordability dimension and the quality and safety dimension. EIU (2018) highlighted Kenya’s political instability as a contributing factor to its poor performance in 2018, which could explain the considerable improvement in 2019 after the presidential election. Political and social dynamics shape food systems’ economic context, particularly whether and how farmers invest in agricultural production (EIU 2018). Political instability causes economic uncertainties, making it risky for farmers to plant crops, especially when they do not expect their efforts and inputs to pay off at harvest time. Consequently, this affects food availability and overall food affordability due to shortages (EIU 2018).

Fig. 2

Kenya’s rank out of 113 countries in the GFSI since 2012

Kenya has implemented several policies in an ambitious effort to improve food security in the country and, ultimately, better performance in composite indicators. Examples include the Big Four agenda that aims to enhance the national grain reserves by increasing food production and improving the road network for market accessibility (GoK 2018). The National Nutrition Action Plan aims to improve nutrition intervention activities provided by the government and nutrition stakeholders (GoK 2012). However, Poulton and Kanyinga (2014) have highlighted that the successful implementation of these policies is still a challenge in Kenya as there has been limited progress towards achieving food security despite the numerous policies and institutional structures. Moreover, individuals are still vulnerable to cyclical shocks threatening food security, which require urgent attention (Hickey et al. 2012).

This study explored the impact of updating outdated data and correcting for outlier data points in the 2019 GFSI database and the effect on any particular country’s score and relative rank. The performance of a country relative to others is of great political importance to countries, which can challenge these outcomes for various reasons. Therefore, this study can trigger academic discussion on the performance of the countries analysed by the GFSI. Kenya’s 2019 GFSI result was used as a case study given Kenya’s poor performance in the GFSI since 2012 and the economic changes experienced after the 2017 elections. Furthermore, given Kenya’s status as an economic, commercial and logistical hub in Eastern Africa and one of the favourable investment destinations globally, this study could serve as a reference point for other countries within the region. This paper is outlined as follows: section two presents the GFSI methodology, section three sets out the study’s methodology, section four describes the results and findings and, lastly, section five outlines the conclusions and policy implications of the findings.

The methodology of the Global Food Security Index

The GFSI is a composite indicator developed by the Economist Intelligence Unit (EIU) in 2012. The GFSI is a dynamic quantitative and qualitative benchmarking model that analyses the drivers of food security across 113 countries in the dimensions of affordability, availability, quality and safety and natural resource and resilience (EIU 2020). The GFSI first incorporated the natural resource and resilience dimension into the main index in the 2020 report. The EIU selects countries included in the GFSI based on their regional diversity, economic importance and population size (EIU 2019). Countries with larger populations are chosen to represent a larger share of the global population. The GFSI aims to determine which countries are most and least vulnerable to food insecurity using 34 indicators, as shown in the Additional file 1: Table S1. The panel of experts at EIU determines the indicators to be included in the GFSI for each dimension in consultation with a panel of food security specialists (EIU 2019). The GFSI sources quantitative data from national and international statistical databases, while the qualitative indicators are created based on information from development banks and government websites. Other qualitative indicators are also drawn from various surveys and data sources and adjusted by the EIU (2019).

In the affordability dimension, the GFSI explores people’s capacity within a country to pay for food and the cost of food under normal circumstances and during price-related shocks (EIU 2019). The availability dimension explores elements that impact food supply, ease of accessing food and how structural aspects such as infrastructure determine a country’s capacity to produce and distribute food. This dimension further assesses the elements that might obstruct robust food availability within a country (Chen et al. 2019; Izraelov and Silber 2019). The quality and safety dimension analyses the nutritional quality of average diets and the food safety environment of a country (Izraelov and Silber 2019). Finally, the natural resources and resilience dimension assesses a country’s exposure to the impacts of climate change, its susceptibility to natural resource risks and how the country is adapting to these risks (EIU 2020).

The selected indicators are then normalised to rebase the raw indicator data into a standard unit to allow aggregation. Indicators sourced from databases are often in different statistical units, while others span different ranges and scales. Therefore, normalising the indicators before aggregating them into a composite indicator is critical for having a standard unit. Several normalisation methods could be used, including but not limited to z-score, ranking, distance to target and min-max. The GFSI uses the min-max normalisation method, which normalises indicators within a range of 0–1 by subtracting the minimum value and dividing by the range of the indicator values (Saisana et al. 2005).

The GFSI normalises the indicators for which a higher value indicates a favourable environment like average food supply as shown by Eq. (1):

$$X= (x- Min(x))/(Max(x) - Min(x))$$
(1)

Min(x) is the lowest value and Max(x) is the highest value in the 113 countries for any given indicator. After normalisation, the values are transformed from zero to one range into a zero to 100 score. As a result, countries with the highest raw data will score 100, while countries with the lowest raw data will score zero (EIU 2019).

The normalisation of indicators in which a high value indicates an unfavourable food security environment, like the volatility of agricultural production, is shown in Eq. (2):

$$X= (x- Max(x))/(Max(x) - Min(x))$$
(2)

Min(x) is the lowest and Max(x) is the highest in the 113 countries for any given indicator. The normalised value is then transformed into a positive number on a scale of zero to 100 to make it directly comparable with other indicators (EIU 2019).
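The two normalisation cases can be illustrated with a minimal Python sketch. It is not the EIU’s implementation: the function name `min_max_scores`, the `higher_is_better` flag and the sample values are hypothetical, and for unfavourable indicators the sketch assumes that the transformation to a positive number amounts to reversing the scale, so that the lowest raw value scores 100.

```python
def min_max_scores(values, higher_is_better=True):
    """Rescale raw indicator values to 0-100 scores using min-max normalisation."""
    lo, hi = min(values), max(values)
    if hi == lo:
        raise ValueError("indicator has no spread, so min-max is undefined")
    span = hi - lo
    if higher_is_better:
        # Favourable indicators: (x - Min(x)) / (Max(x) - Min(x)), scaled to 0-100
        return [100 * (x - lo) / span for x in values]
    # Unfavourable indicators (assumption): reversed scale, lowest raw value scores 100
    return [100 * (hi - x) / span for x in values]


# Hypothetical raw values for four countries (not GFSI data)
food_supply = [2000, 2500, 3000, 3500]   # higher is better
volatility = [0.05, 0.10, 0.20, 0.25]    # higher is worse

print(min_max_scores(food_supply))
print(min_max_scores(volatility, higher_is_better=False))
```

Because the minimum and maximum are taken across all countries, changing a single extreme value rescales every other country’s score on that indicator, which is why outliers in one country can matter for the rest.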

The GFSI uses two sets of weightings for the indicators. The first is equal weighting, which assumes all indicators are equally essential and distributes weights evenly to all indicators. The second is the peer panel recommendation weighting, which averages the weightings suggested by five members of the EIU panel of experts. Expert weighting is the default weighting the GFSI uses to score and rank countries. Although this default weighting has been criticised for being biased and subjective (Chen et al. 2019; Izraelov and Silber 2019; Maricic et al. 2016), these studies have concluded that the GFSI is robust in its attempt to measure food security.

The GFSI assigns a higher weight to the availability dimension than to the affordability and the quality and safety dimensions. Thomas et al. (2017) highlight that even though the EIU considers the affordability and availability dimensions of greater statistical importance, the quality and safety dimension is equally essential. The availability dimension is weighted 44%, while the affordability and the quality and safety dimensions are weighted 40% and 16%, respectively. Weights assigned to some individual indicators in the GFSI dimensions are also not equal to their statistical importance. Thomas et al. (2017) have stressed that the GFSI developers should justify the weights assigned and the statistical importance of indicators in the GFSI.
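As an illustration of how the dimension weights quoted above could enter an overall score, the following Python sketch applies them as a weighted arithmetic mean. The aggregation rule is an assumption made for illustration, since the text does not state how the EIU combines dimensions, and the dimension scores and the name `overall_score` are hypothetical.

```python
# Dimension weights as quoted in the text (they sum to 100%)
WEIGHTS = {"affordability": 0.40, "availability": 0.44, "quality_and_safety": 0.16}


def overall_score(dimension_scores, weights=WEIGHTS):
    """Weighted arithmetic mean of 0-100 dimension scores (assumed aggregation rule)."""
    if abs(sum(weights.values()) - 1.0) > 1e-9:
        raise ValueError("weights must sum to 1")
    return sum(weights[name] * dimension_scores[name] for name in weights)


# Hypothetical dimension scores for one country (not GFSI data)
scores = {"affordability": 50.0, "availability": 60.0, "quality_and_safety": 70.0}
print(round(overall_score(scores), 1))  # 0.40*50 + 0.44*60 + 0.16*70 = 57.6
```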

Methodology

The study used the 2019 GFSI database for the 113 countries to extract Kenya’s data for the analysis. Absolute skewness and kurtosis values, which describe the shape and distribution of each GFSI indicator, were used to identify the outlier data points. Any indicator with an absolute value above two for skewness and 3.5 for kurtosis was considered to contain outlier data points. The skewness and kurtosis values were particularly relevant for testing normality as they are robust and generate percentiles, useful for further identifying an indicator with an outlier data point (Thomas et al. 2017; OECD 2008). The winsorisation method was then used to treat the identified outlier data points in the GFSI and determine their statistical significance to the GFSI countries’ scores and rankings. Winsorisation involved replacing each identified outlier data point with the largest or smallest value among the remaining observations, that is, excluding the outlier data points (Thomas et al. 2017). Winsorisation was chosen due to its robustness and because the resulting winsorised values are consistent with the original data points (Kwak and Kim 2017).
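The screening and winsorisation steps can be sketched in Python as follows. This is a simplified illustration rather than the study’s Stata and Excel procedure: it assumes population moments, excess kurtosis, and that the extreme value on the skewed tail is replaced by the nearest remaining value until the thresholds are met. All function names and the sample data are hypothetical.

```python
def skewness_and_kurtosis(values):
    """Return (skewness, excess kurtosis) computed from population central moments."""
    n = len(values)
    mean = sum(values) / n
    m2 = sum((x - mean) ** 2 for x in values) / n
    if m2 == 0:
        return 0.0, 0.0  # constant indicator: no spread, nothing to flag
    m3 = sum((x - mean) ** 3 for x in values) / n
    m4 = sum((x - mean) ** 4 for x in values) / n
    return m3 / m2 ** 1.5, m4 / m2 ** 2 - 3.0


def has_outliers(values, skew_limit=2.0, kurt_limit=3.5):
    """Flag an indicator when both thresholds from the text are exceeded."""
    skew, kurt = skewness_and_kurtosis(values)
    return abs(skew) > skew_limit and abs(kurt) > kurt_limit


def winsorise(values, max_rounds=5):
    """Replace the most extreme value with the nearest remaining value until no flag is raised."""
    data = list(values)
    for _round in range(max_rounds):
        if not has_outliers(data):
            break
        skew, _kurt = skewness_and_kurtosis(data)
        if skew > 0:   # outlier sits in the upper tail
            extreme = max(data)
            replacement = max(x for x in data if x < extreme)
        else:          # outlier sits in the lower tail
            extreme = min(data)
            replacement = min(x for x in data if x > extreme)
        data = [replacement if x == extreme else x for x in data]
    return data


# Hypothetical indicator values for ten countries (not GFSI data)
raw = [1, 2, 3, 4, 5, 6, 7, 8, 9, 100]
print(has_outliers(raw))   # True
print(winsorise(raw))      # [1, 2, 3, 4, 5, 6, 7, 8, 9, 9]
```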

For outdated data, the study considered any indicators from 2018 or older in Kenya’s 2019 GFSI database to be outdated. This is because the GFSI releases an annual benchmarking report based on data from the preceding year and the reference year for the study was 2019. The identified outdated indicators in Kenya’s GFSI database were then updated based on data from the Kenya National Bureau of Statistics (KNBS) database. The updated indicators were then normalised using the GFSI min-max normalisation method to rebase the raw indicator data into a standard unit, allowing data aggregation to new scores and ranks. Finally, paired t-tests and Spearman’s rank correlation were conducted to test the statistical significance of the winsorisation of the outlier data points and of the updates to Kenya’s 2019 GFSI database.
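Once scores have been recomputed, ranks follow from sorting countries by score. A minimal sketch is given here, assuming that a higher score means a better rank and that tied countries share the better rank; the text does not describe how the GFSI handles ties, so that rule, the function name `rank_countries` and the country labels are all hypothetical.

```python
def rank_countries(scores):
    """Map each country to its rank (1 = highest score); tied scores share a rank."""
    ordered = sorted(scores.items(), key=lambda item: item[1], reverse=True)
    ranks = {}
    for position, (country, score) in enumerate(ordered, start=1):
        previous = ordered[position - 2] if position > 1 else None
        if previous is not None and previous[1] == score:
            ranks[country] = ranks[previous[0]]  # tie: reuse the earlier rank
        else:
            ranks[country] = position
    return ranks


# Hypothetical overall scores before and after updating country B's data
before = {"A": 70.0, "B": 50.0, "C": 60.0}
after = {"A": 70.0, "B": 65.0, "C": 60.0}
print(rank_countries(before))  # {'A': 1, 'C': 2, 'B': 3}
print(rank_countries(after))   # {'A': 1, 'B': 2, 'C': 3}
```

In this toy case, updating one country’s data changes the rank of another country whose own data did not change, which mirrors the relative nature of the index.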

The paired t-test was used in this study to determine if the GFSI mean before and after the winsorisation of outlier data points and after updating Kenya’s outdated indicators would differ from the 2019 GFSI result. A paired t-test is essential when determining if the mean of a dependent variable is the same in two related groups who undergo two different conditions, which makes it favourable for the study. The difference between the paired values is assumed to be normally distributed, while the null hypothesis is that the expected value equals zero.
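The test statistic behind the paired t-test can be sketched in a few lines of Python. The sketch only computes the statistic and its degrees of freedom from the paired differences; obtaining a p-value requires the t distribution, which a statistical package such as the one used in the study provides. The function name and the scores are hypothetical.

```python
import math
import statistics


def paired_t_statistic(before, after):
    """Return (t, degrees of freedom) for H0: the mean paired difference is zero."""
    if len(before) != len(after) or len(before) < 2:
        raise ValueError("need two equally long samples with at least two pairs")
    differences = [a - b for a, b in zip(after, before)]
    mean_diff = statistics.mean(differences)
    sd_diff = statistics.stdev(differences)  # sample standard deviation (n - 1)
    if sd_diff == 0:
        raise ValueError("all differences are identical, so t is undefined")
    standard_error = sd_diff / math.sqrt(len(differences))
    return mean_diff / standard_error, len(differences) - 1


# Hypothetical dimension scores for four countries before and after winsorisation
before = [50.0, 60.0, 70.0, 80.0]
after = [48.0, 57.0, 69.0, 76.0]
t, df = paired_t_statistic(before, after)
print(round(t, 3), df)  # -3.873 3
```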

The Spearman rank correlation test was used to test for the changes in the GFSI rank before and after the winsorisation of outlier data points and the updating of Kenya’s outdated 2019 GFSI data. The Spearman rank correlation test is a nonparametric test used to determine the strength of association between two variables measured on an ordinal or continuous scale. Spearman’s rho values close to one indicate a strong positive correlation between the two groups: in this case, the 2019 GFSI result and the winsorised and updated Kenya’s 2019 GFSI indicators. All the analyses for the study were carried out in Stata version 15 and Excel.
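Spearman’s rho is the Pearson correlation computed on ranks. The following Python sketch illustrates that calculation, assuming tied values receive their average rank; it is not the Stata routine used in the study, and the function names and sample ranks are hypothetical.

```python
import math


def average_ranks(values):
    """Rank values from 1 (smallest) upwards; tied values share their average rank."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    start = 0
    while start < len(order):
        end = start
        while end + 1 < len(order) and values[order[end + 1]] == values[order[start]]:
            end += 1
        shared_rank = (start + end) / 2 + 1
        for k in range(start, end + 1):
            ranks[order[k]] = shared_rank
        start = end + 1
    return ranks


def pearson(x, y):
    """Pearson correlation coefficient of two equally long sequences."""
    mean_x, mean_y = sum(x) / len(x), sum(y) / len(y)
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    return cov / math.sqrt(var_x * var_y)


def spearman_rho(x, y):
    """Spearman's rank correlation: Pearson correlation of the average ranks."""
    return pearson(average_ranks(x), average_ranks(y))


# Hypothetical ranks of five countries before and after winsorisation
original_rank = [1, 2, 3, 4, 5]
winsorised_rank = [1, 3, 2, 4, 5]
print(round(spearman_rho(original_rank, winsorised_rank), 2))  # 0.9
```

A rho of exactly one would mean the rank order is unchanged; the further rho falls below one, the more the treatment has reshuffled the ranking.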

Results and discussion

The study’s first objective was to determine the proportion of outdated data and outliers in the 2019 GFSI result. The results for the analyses are presented in the subsections that follow.

The proportion of outdated data and outliers in the 2019 GFSI database

Measuring food security is a multifaceted problem, requiring the use of several indicators and sub-indicators and often involving numerous data points and theoretical assumptions. However, data availability and outdated data are critical methodological limitations that hamper effective food security measurement by composite indicators, such as the GFSI. From the results, sixteen (44%) of the 34 indicators in the GFSI database were outdated. Data from 2005 was the oldest data point the GFSI used for reporting the proportion of the population living under the global poverty line. The quality and safety dimension had the highest number of outdated data entries. Seven (47%) of the 11 indicators for this dimension were data from 2018 or older. Micronutrient availability (a composite indicator measuring dietary availability of vitamin A, iron and zinc), dietary diversity and protein quality were all reported based on 2013 data.

Composite indicators often rely exclusively on existing data from various databases collected by sources other than index developers. Consequently, the use of these outdated data by the GFSI arises in part due to the high cost of frequent data collection, especially among developing countries. The EIU (2019) highlights that only 70% of the assessed countries in the 2019 GFSI had completed data collection on undernourishment and nutrient deficiencies over the past five years. Moreover, some countries in the GFSI exceeded the five-year threshold without collecting nutrition or undernourishment data (EIU 2019). While the dietary diversity and protein quality indicators play a critical role in informing the nutritional quality of average diets in a country, using outdated data to report the indicator’s performance could hinder the indicators’ helpful information (Thomas et al. 2017).

Table 1 shows the proportion of indicators identified as outlier data points in the 2019 GFSI. The quality and safety dimension had one outlier data point, namely the indicator on the agency to ensure the safety and health of food, which was an outlier data point for all countries. This indicator is a qualitative indicator created and scored by the EIU for all countries based on the subjective judgement of the team of experts who designed it, making the indicator an outlier data point for all countries (Thomas et al. 2017). The availability dimension had six (60%) of the outlier data points. This finding was similar to the 2016 GFSI result, where the availability dimension had the highest number of outlier data points (three out of six) (Thomas et al. 2017). The agency to ensure food safety and health, agricultural import tariffs, food losses, public expenditure on agricultural research and development and urban absorption capacity were outlier data points in both the 2016 and 2019 GFSI data (Thomas et al. 2017).

Table 1 Outlier data points identified in the 2019 GFSI database

Winsorisation of the identified outlier data points was carried out to prevent the outliers from acting as unintended benchmarks. However, the agency to ensure the safety and health of food, the existence of adequate crop storage facilities and the presence of food safety-net programmes were qualitative indicators and were not winsorised. The qualitative indicators cannot be winsorised as their scoring is based on the subjective judgement of the EIU panel of experts who designed the indicators (Thomas et al. 2017).

Kenya’s proportion of outdated data and outliers in the 2019 GFSI database

Kenya had 13 (38%) outdated indicators in the 2019 GFSI (Table 2). Six (46%) of the outdated indicators were based on 2013 data, five of which were in the quality and safety dimension. The oldest data points were Kenya’s indicators measuring micronutrient availability, dietary diversity and protein quality. However, Kenya has considerably improved micronutrient availability through mandatory fortification of staple foods to provide essential micronutrients to individuals (Linda et al. 2020).

Table 2 Kenya’s outdated data points in the 2019 GFSI database

The volatility of agricultural production indicator measures agricultural productivity fluctuations (in standard deviation) over the past five years to predict and plan for a consistent future food supply. The volatility in agricultural production could arise due to unpredictable shocks, such as bad weather, diseases and pests or price changes, consequently increasing dependence on chronic food aid (FAO 2017). For example, Kenya suffered a severe drought in 2017, where more than 2.7 million Kenyans were affected, almost entirely depending on emergency food aid (FAO 2017; FEWS.NET 2018). Such a catastrophe could result in volatility in agricultural production, thereby obstructing future food supply and affordability, translating to food insecurity.

The public expenditure on agricultural research and development indicator measures the agricultural share of government expenditure divided by the share of the agricultural value-added to the GDP (EIU 2019). Kenya’s government expenditure allocation to agriculture was reduced by a ratio of more than 12 in the 2019 GFSI (EIU 2019). Kenya’s low expenditure on agriculture was also evident in Kenya being off track in the Malabo goals (Benin et al. 2018). The 2018 Biennial Review recommended that Kenya increase its expenditure on agricultural investment (Benin et al. 2018).

Overall, the study found that Kenya’s 2019 GFSI database did not have outlier data points, even though other countries’ outlier data points affected its scores and ranking, as described in Sect. "The impact of winsorisation of outlier data points to GFSI ranks". Therefore, the null hypothesis for the first objective, that the 2019 GFSI database did not contain outdated data and outliers, was rejected because the GFSI database contained both outdated data and outlier data points.

Paired t-test results for the winsorised outliers in the 2019 GFSI database

The study’s second objective was to determine whether the outlier data points had a statistically significant effect on the GFSI dimension scores and ranking. Table 3 shows the results for the paired t-test. The GFSI mean score in the affordability and availability dimensions reduced by 6.257 and 3.195 points, respectively, after the winsorisation of the outlier data points. However, the paired t-test result was not significant for the quality and safety dimension (no indicator was winsorised). Note: scores for indicators in all the GFSI dimensions are out of 100.

Table 3 Paired t-test result for the winsorisation of outlier data points in the GFSI dimensions

As highlighted in Sect. “Kenya’s proportion of outdated data and outliers in the 2019 GFSI database”, Kenya did not have any outlier data points, even though outlier data points in other countries’ databases affected Kenya’s score and rank. Winsorising other countries’ data points improved Kenya’s overall score from 41.9 in 2018 to 50.7 in 2019, while the affordability dimension improved by 18.5 points from 38.2 in 2018 to 56.7 in 2019. However, the outlier data points for these countries could bias the GFSI results and undermine intended policy implementation by acting as an unintended benchmark. This section assesses the impact of winsorisation on the overall GFSI scores and ranks of the countries with the outlier data points (Egypt, Sierra Leone, Singapore, Syria, Venezuela, Zambia) relative to the other countries.

Venezuela’s affordability score reduced by 1.3 points (15.8 to 14.5) after the winsorisation of the change in the average food cost data point, which could imply that the outlier data point inflated Venezuela’s score. The change in the average food costs measures the percentage change in the cost of an average food basket in a country since 2010 as captured through the Food Consumer Price Index (FCPI) using 2010 = 100 as the base year (EIU 2019). A sharp increase in the cost of an average food basket could reduce food affordability, especially among low-income households who spend significant proportions of their income on food. Venezuela’s high FCPI (2695.2%) in 2019 could be attributed to the political turmoil the country has experienced since 2016 (EIU 2019). Compared to 2019, the cost of an average food basket in Venezuela had increased only by 649.4% from 2010 to 2015 (EIU 2019).

Venezuela also had an outlier data point in the urban absorption capacity. Venezuela’s urban absorption capacity was 22% lower in 2019 than any country. Venezuela’s real GDP has hugely reduced, coupled with a high cost of living in urban areas, explaining the negative urban absorption capacity (EIU 2019). For example, in 2013, Venezuela’s GDP per capita (US$ at PPP) was 18,237.2 US dollars, while in 2019, the GDP was reduced to 8,800.0 US dollars (EIU 2019). The negative urban growth rate and the high inflation rates could explain the reduction in Venezuela’s GDP (EIU 2019). As a result, Venezuela’s availability score was reduced by 1.1 points (32 to 30.9) after the urban absorption capacity data point was winsorised.

The agricultural import tariff indicator measures (as a percentage) the most-favoured-nation (MFN) tariff on all agricultural imports. High agricultural import tariffs can increase the cost of food imports and result in high food costs for consumers. In the 2019 GFSI, Egypt’s agricultural import tariff was 63% higher than all countries. Egypt’s high agricultural import tariff is mirrored in its high food prices. EIU (2019) highlights that Egypt’s average food basket price has nearly tripled in the past five years, affecting food affordability. Egypt’s affordability score was reduced by 20 points (57.6 to 37.6) after the agricultural imports tariff data point was winsorised.

Egypt’s irrigation infrastructure was also an outlier data point. The irrigation infrastructure measures the proportion of cultivated agricultural land area equipped for irrigation in a country. The availability of irrigation infrastructure in a country can support farmers’ ability to provide consistent water supply to crops, reducing the dependence on rainfed agriculture. Egypt’s cultivated agricultural land area for irrigation was 99.55% higher than all countries. Being a desert country largely dependent on irrigation explains Egypt’s high proportion of cultivated agricultural land area equipped for irrigation. Egypt’s availability dimension score reduced by four points from 70 to 66 after the irrigation infrastructure indicator was winsorised.

The change in the dependency on chronic food aid indicator measures the change in a country’s dependency on emergency food aid per capita over the past five years (EIU 2019). A country’s dependence on chronic food aid increases when the available food supply is insufficient to meet the population’s demand (EIU 2019). Due to persistent conflict and insecurity, Syria was almost entirely (90%) reliant on emergency food aid in 2019. The GFSI has highlighted that despite increases in the average food supply in most regions globally, Syria’s food security has deteriorated (EIU 2019). Syria’s availability dimension score was reduced by 0.6 points, from 38.9 to 38.3, after the change in dependency on chronic food aid data point was winsorised.

Public expenditure on agricultural research and development measures the ratio of the agricultural share of government expenditure divided by the share of the agricultural value-added to the GDP. Singapore had the highest public expenditure on agricultural research and development, while Zambia was the only African country with a high share of public investment in agriculture at 75% in the 2019 GFSI (EIU 2019). However, Zambia’s poor performance in the indicator (scored 20.4) contradicts Zambia’s high public investment in agriculture compared to other African countries. Unlike developing countries, Singapore’s agricultural investments are mainly towards extensive agricultural research and technology development (EIU 2019). After the public expenditure on agricultural research and development data point was winsorised, Singapore’s availability dimension score was reduced by 0.7 points (from 83 to 82.3). In contrast, Zambia’s availability dimension score increased by 2.9 points, from 51 to 53.9, illustrating that winsorisation can move different countries’ scores in different directions.

High food losses can reduce overall food availability in a country and increase food costs and, consequently, food insecurity. Moreover, food losses can reduce farmers’ incomes and necessitate overproduction to account for lost food (EIU 2019). The GFSI measures food losses (post-harvest and pre-consumer food losses) as a ratio of the total domestic supply of crops, livestock and fish commodities in tonnes to total food losses (EIU 2019). Sierra Leone’s food losses were 34.8 tonnes higher than those of any other country in the 2019 GFSI. The lack of storage facilities, inadequate infrastructure and cold chains are some of the factors attributed to the high food losses in Sierra Leone (EIU 2019). As a result, Sierra Leone’s availability dimension score was reduced by 6.6 points (40.3 to 33.7) after the food losses data point was winsorised. A graphical presentation of the countries with the winsorised data points is provided in Additional file 1: Fig. S1.

Figure 3 shows the changes in the overall 2019 GFSI scores for the specific countries with outlying data points after the winsorisation. Egypt had the highest overall 2019 GFSI score reduction by 9.9 points from 64.5 to 54.6, while Singapore and Venezuela’s scores reduced by 1.1 points each (87.4 to 86.3 and 31.2 to 30.1), respectively.

Fig. 3 Change in overall scores for countries with outlying data points after the winsorisation of outliers

All countries with outlying data points had lower overall scores after the winsorisation, implying that outliers in these countries’ data points inflated their scores (Thomas et al. 2017). Among countries without outliers in the 2019 GFSI, some scores increased, some decreased and some did not change. The changes in scores for the 113 countries after winsorisation of the outlier data points are provided in Additional file 1: Table S2.

Although Kenya did not have any outlier data point, Fig. 4 shows how winsorising outlier data points for other countries impacted Kenya’s scores. Kenya’s overall 2019 GFSI score reduced by six points from 50.7 to 44.7 after the winsorisation, implying that outlier data points for other countries inflated Kenya’s score. Kenya’s affordability dimension score was also reduced by 11.5 points (56.7 to 45.2) after the winsorisation of the agricultural import tariffs and the change in average food costs data points for the dimension.

Fig. 4 Changes in Kenya’s 2019 GFSI dimension scores after winsorisation of outliers

Kenya’s availability dimension score was reduced by 3.3 points, from 48.0 to 44.7, after winsorising outlier data points in the availability dimension. Overall, outliers in other countries’ data points inflated Kenya’s 2019 overall GFSI score and its affordability and availability dimension scores. The implication is that outliers in other countries’ data points could act as unintended benchmarks, so that the scores do not mirror the actual food security situation in Kenya or in the other assessed countries.
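The benchmark effect described above can be illustrated with a minimal sketch. It assumes min-max normalisation to a 0-100 scale, an indicator where higher raw values are worse (such as an import tariff), and a simple winsorisation rule that caps the single largest value at the second-largest one. The country labels, raw values and helper names are hypothetical, and the capping rule is an assumption rather than the procedure used in this study.

```python
def min_max_scores(raw, higher_is_better=True):
    """Min-max normalise a dict of raw values to 0-100 scores."""
    lo, hi = min(raw.values()), max(raw.values())
    scores = {}
    for country, value in raw.items():
        frac = (value - lo) / (hi - lo)  # assumes hi > lo
        scores[country] = 100 * frac if higher_is_better else 100 * (1 - frac)
    return scores


def winsorise_top(raw):
    """Cap the single largest value at the second-largest value (assumed rule)."""
    cap = sorted(raw.values())[-2]
    return {country: min(value, cap) for country, value in raw.items()}


# Hypothetical tariff data (%): country E holds the outlier.
tariffs = {"A": 5.0, "B": 10.0, "C": 15.0, "D": 20.0, "E": 60.0}

before = min_max_scores(tariffs, higher_is_better=False)
after = min_max_scores(winsorise_top(tariffs), higher_is_better=False)

for country in tariffs:
    print(f"{country}: {before[country]:.1f} -> {after[country]:.1f}")
```

In this toy example, country B has no outlier of its own, yet its score falls from about 90.9 to about 66.7 once the outlier no longer stretches the range against which every country is scored.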

The impact of winsorisation of outlier data points to GFSI ranks

A Spearman’s rank test was conducted to determine if the 2019 GFSI rank differed after the winsorisation of outlier data points. The Spearman rho values (Table 4) for the affordability and availability dimensions and for the overall 2019 GFSI rank were close to, but not equal to, one, indicating that the ranks were not identical after the winsorisation of outlier data points. This suggests that the winsorisation affected the 2019 GFSI ranks. However, the Spearman rho value for the quality and safety dimension was equal to one, indicating that countries’ ranks for this dimension were the same before and after the winsorisation of outlier data points. This is because no indicator was winsorised for this dimension. Note: the GFSI dimension ranks for countries are out of 113 for the entire section.

Table 4 Spearman’s rank correlation results for winsorisation of outliers in the GFSI database
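To make the interpretation of rho concrete, the following is a minimal sketch of Spearman’s rank correlation between a ranking before and after an adjustment. It uses the textbook formula for rankings without ties, which is a simplification because the GFSI ranks can include ties. The rankings and the function name are hypothetical and are not taken from the GFSI database.

```python
def spearman_rho(ranks_before, ranks_after):
    """Spearman's rho for two rankings without ties.

    rho = 1 - 6 * sum(d^2) / (n * (n^2 - 1)), where d is the rank difference.
    """
    n = len(ranks_before)
    d_squared = sum((b - a) ** 2 for b, a in zip(ranks_before, ranks_after))
    return 1 - 6 * d_squared / (n * (n ** 2 - 1))


# Hypothetical ranks for five countries.
original = [1, 2, 3, 4, 5]
unchanged = [1, 2, 3, 4, 5]  # no country moves
swapped = [1, 3, 2, 4, 5]    # two neighbouring countries trade places

rho_same = spearman_rho(original, unchanged)
rho_swapped = spearman_rho(original, swapped)
print(rho_same, rho_swapped)
```

A value of exactly one means no country changed position, while a value slightly below one means that at least some countries moved even though the overall ordering was largely preserved.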

Figure 5 shows how countries with outlying indicators shifted in the overall GFSI rank after the winsorisation of outliers in the 2019 GFSI database. Singapore and Venezuela did not shift in the overall 2019 GFSI rank. However, Egypt had an enormous shift of 16 positions from 55 to 71. Egypt’s enormous shift in rank could be explained by the winsorisation of the agricultural import tariffs and irrigation infrastructure data points in the affordability and availability dimensions.

Fig. 5 Shifts in overall 2019 GFSI rank for countries with outlying data points after winsorisation of outliers

Sierra Leone’s overall 2019 GFSI rank shifted by two positions from 106 to 108, while Syria shifted by one position from 107 to tie with Sierra Leone at 108. The winsorised data points in the availability dimension were food losses for Sierra Leone and the change in dependency on chronic food aid for Syria.
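The rank shifts reported here follow from re-ranking countries on their adjusted scores. A minimal sketch of that step is given below, assuming that countries are ranked by descending score and that tied scores share the same (best) rank. The tie rule, the country labels and the scores are assumptions for illustration only.

```python
def competition_ranks(scores):
    """Rank countries by descending score; tied scores share the best rank."""
    ordered = sorted(scores.values(), reverse=True)
    return {country: ordered.index(score) + 1 for country, score in scores.items()}


def rank_shifts(before, after):
    """Positive values mean a country moved down the ranking."""
    ranks_before = competition_ranks(before)
    ranks_after = competition_ranks(after)
    return {country: ranks_after[country] - ranks_before[country] for country in before}


# Hypothetical scores before and after an adjustment.
scores_before = {"P": 40.0, "Q": 38.0, "R": 36.0}
scores_after = {"P": 35.0, "Q": 37.0, "R": 35.0}

ranks_after = competition_ranks(scores_after)
shifts = rank_shifts(scores_before, scores_after)
print(ranks_after, shifts)
```

In this example P and R end up tied at rank 2, so a country can change rank either because its own score moved or because a neighbour’s score moved to the same value.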

Singapore retained position two in the affordability dimension, while Syria and Venezuela ranked second and last, respectively. Egypt had the highest shift of 16 positions, down from 81 to 97 in the affordability dimension. Sierra Leone shifted down in rank by two positions from 106 to 108. The shifts in rank for the 113 countries after the winsorisation of outlier data points are provided in Additional file 1: Table S3.

Syria had the highest shift in rank by ten positions up from 109 to 99 among countries with outlying data points in the availability dimension. Egypt had the second-highest shift in rank by six positions, down from 23 to 29 in the availability dimension. Singapore shifted from the second to fourth positions while Venezuela shifted up by three positions from 111 to 108. Sierra Leone maintained position 106 in the availability dimension after the winsorisation of the outliers in the dimension. Overall, Egypt and Sierra Leone were outliers in the agricultural import tariff and food losses data points in the 2016 and 2019 GFSI, respectively (Thomas et al. 2017). Egypt also shifted in rank by more than four positions in the 2016 and 2019 GFSI after winsorisation of the outliers (Thomas et al. 2017).

Figure 6 shows how Kenya shifted in rank in the GFSI dimensions after the winsorisation of outlier data points for other countries in the 2019 GFSI database. Kenya’s overall 2019 GFSI rank shifted down by one position, from 86 to 87. Kenya also shifted down in rank by five positions from 83 to 88 in the affordability dimension. However, Kenya’s availability dimension’s rank improved after the outliers’ winsorisation for other countries’ data points.

Fig. 6 Kenya’s shifts in rank in the GFSI dimensions after the winsorisation of outliers

Kenya shifted up in rank by seven positions in the availability dimension, from 93 to 86, implying that Kenya’s rank in the availability dimension was affected by outlier data points in the GFSI dimensions (Thomas et al. 2017). Thus, we rejected the null hypothesis that there was no statistically significant effect of outlier data points on the GFSI scores and ranking. The outliers affected countries even if a country did not have an outlier data point.

Significance of updating Kenya’s outdated data to its GFSI scores and rank relative to the 113 countries

The study’s third objective was to determine if updating Kenya’s outdated data in the 2019 GFSI resulted in a statistically significant change in Kenya’s overall GFSI score and rank relative to the 113 countries. Overall, Kenya’s 2019 GFSI database contained 13 outdated indicators. However, only five (38%) of these outdated indicators were updated due to data unavailability, which underlines the critical challenge of data availability (OECD 2008).

A paired t-test was conducted to determine whether updating Kenya’s outdated data points had a statistically significant effect on its score. Kenya’s overall 2019 GFSI score increased by 0.5 points from 50.7 to 51.2 after updating the five indicators, while Kenya’s affordability dimension score increased by 0.2 points from 56.7 to 57.2. The GFSI mean score also increased by 0.003 and 0.021 points in the affordability and the quality and safety dimensions, respectively (Table 5). The overall 2019 GFSI mean score increased by 0.010 points. While the changes in the GFSI dimension scores were not statistically significant, the change in the overall 2019 GFSI score was significant, even though the result was not different from zero. The result implies that while updating Kenya’s outdated indicators increased Kenya’s scores, the impact was too small to change the overall 2019 GFSI mean score for all countries.

Table 5 Paired t-test result for updating Kenya’s 2019 GFSI outdated data
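The paired comparison described above can be sketched as follows. The example computes the mean difference and the paired t statistic from the per-country differences between the original and updated scores. The scores and the function name are hypothetical; obtaining a p-value would additionally require the t distribution with n - 1 degrees of freedom (for instance via `scipy.stats.ttest_rel`), which is left out to keep the sketch dependency-free.

```python
import math
import statistics


def paired_t_statistic(before, after):
    """Return (mean difference, t statistic) for paired observations."""
    diffs = [a - b for b, a in zip(before, after)]
    mean_diff = statistics.mean(diffs)
    # Standard error of the mean difference, using the sample standard deviation.
    std_err = statistics.stdev(diffs) / math.sqrt(len(diffs))
    return mean_diff, mean_diff / std_err


# Hypothetical scores for four countries before and after a data update.
scores_original = [50.0, 60.0, 70.0, 80.0]
scores_updated = [50.5, 60.1, 70.0, 80.2]

mean_diff, t_stat = paired_t_statistic(scores_original, scores_updated)
print(f"mean difference = {mean_diff:.3f}, t = {t_stat:.3f}")
```

Because the test works on differences across all countries, a change concentrated in a single country tends to produce a very small mean difference, which is consistent with the small changes in the mean scores reported here.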

The change in average food costs and gross domestic product per capita (US$ PPP) indicators were updated for the affordability dimension. Kenya’s quality and safety dimension score increased by 2.4 points (43.2 to 45.6) after updating the ability to store food safely, dietary diversity and the proportion of the population with access to potable water indicators. Overall, Kenya’s scores increased after updating the outdated indicators. However, the change in score was too small to change the overall 2019 GFSI mean score for all countries.

The Spearman’s rho values obtained after updating Kenya’s outdated indicators were equal to one, implying that the 2019 GFSI and the updated GFSI rank for Kenya relative to the 113 countries were not different. However, Angola, Benin, Cambodia and Pakistan shifted in rank. Kenya shifted up in overall rank by one position (86 to 85) to replace Benin’s initial rank at position 85. Kenya displaced Cambodia from position 83 to 84 in the affordability dimension and shifted up to tie with Honduras at position 82. Kenya also shifted up in rank by two positions (94 to 92) in the quality and safety dimension, displacing Angola and Pakistan down in rank by one position each. Angola and Pakistan shifted from position 92 to 93 and 93 to 94, respectively. Therefore, the null hypothesis that updating Kenya’s 2019 GFSI outdated indicators did not result in a statistically significant change to Kenya’s score and rank relative to the 113 countries was accepted.

Conclusion

While the GFSI attempts to assess the factors contributing to food insecurity globally, outdated data and outliers could threaten its reliability. This study set out to assess the impact of outdated data and outliers on Kenya’s 2019 GFSI result relative to the other countries. The GFSI database contained seven outlier data points. The winsorisation of these data points affected countries’ scores and ranks even if a country had no outlying data point. Contrary to the conclusion by Thomas et al. (2017) that outliers in the GFSI had minimal impact on countries’ scores and ranking, this study concluded that the outlier data points affect the GFSI scores and ranking. These outliers could act as unintended benchmarks and bias the GFSI results, affecting countries’ policy implementation, given that most countries rely on the GFSI results for policymaking.

Outdated data also affects the GFSI benchmarking exercise. Kenya’s 2019 GFSI score and rank increased after updating the affordability and quality and safety dimension indicators, consequently increasing Kenya’s overall GFSI score. If not updated, these outdated indicators could impede the GFSI’s objective of assessing the food security environment in different countries while obstructing the GFSI from conveying helpful information on food security situations, drivers or progress towards achieving global goals such as the SDGs. One recommendation is that countries should frequently update and release national data for open access by the public. Open access to such data will improve the food systems’ performance towards achieving global and national food security while easing the benchmarking process by composite indicators. Open access to updated national databases would also trigger further research. The available data could be utilised to create new knowledge and new products and formulate essential programmes for achieving food security. The lack of updated national data is one of the hindering factors toward effective policymaking, especially in most African countries (Benin et al. 2020).

One limitation of this study that future research could focus on is the normalisation method for the indicators. The GFSI uses min-max normalisation to convert data from different sources into a comparable unit. While this normalisation method is linked to the GFSI country ranking, updating the outdated indicators for Kenya from different databases required the data to be re-normalised to make it comparable with the GFSI data. This re-normalisation could affect the GFSI’s normalised data for the other indicators, consequently affecting the indicators’ weighting.
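A small sketch can show why re-normalisation matters under min-max scaling. Assuming scores are scaled to 0-100 against the minimum and maximum across countries, an updated value that stays inside the existing range changes only that country’s score, whereas a value that sets a new minimum or maximum changes the scores of other countries as well. The country labels, values and helper names below are hypothetical.

```python
def min_max_scores(raw):
    """Min-max normalise a dict of raw values to 0-100 scores."""
    lo, hi = min(raw.values()), max(raw.values())
    return {country: 100 * (value - lo) / (hi - lo) for country, value in raw.items()}


def other_countries_affected(raw, country, new_value):
    """List the other countries whose score changes when one raw value is updated."""
    old_scores = min_max_scores(raw)
    updated = dict(raw)
    updated[country] = new_value
    new_scores = min_max_scores(updated)
    return sorted(
        c for c in raw
        if c != country and abs(new_scores[c] - old_scores[c]) > 1e-9
    )


# Hypothetical raw indicator values for three countries.
raw_values = {"X": 20.0, "Y": 40.0, "Z": 80.0}

inside_range = other_countries_affected(raw_values, "Y", 50.0)   # range stays 20-80
outside_range = other_countries_affected(raw_values, "Y", 90.0)  # Y sets a new maximum
print(inside_range, outside_range)
```

In the second case the score of Z falls from 100 to about 85.7 although its raw value did not change, which is the kind of knock-on effect that re-normalising updated data can introduce.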

Availability of data and materials

The datasets generated and analysed during the current study are available in the Global Food Security Index repository at https://foodsecurityindex.eiu.com/.

Acknowledgements

Not applicable.

Funding

Mastercard Foundation Bursary.

Author information

Contributions

Not applicable.

Corresponding author

Correspondence to Sheryl L Hendriks.

Ethics declarations

Ethics approval and consent to participate

Not applicable.

Consent for publication

Not applicable.

Competing interests

The authors declared that they have no competing interest.

Additional information

Publisher’s Note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

Supplementary information

Additional file 1: Table S1. GFSI dimensions, indicators and sub indicators. Table S2. Changes in scores for the 113 countries after winsorisation of the outlier data points. Table S3. Shifts in ranks for the 113 countries after the winsorisation of the outlier data points in the 2019 GFSI database.

Rights and permissions

Open Access This article is licensed under a Creative Commons Attribution 4.0 International License, which permits use, sharing, adaptation, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons licence, and indicate if changes were made. The images or other third party material in this article are included in the article's Creative Commons licence, unless indicated otherwise in a credit line to the material. If material is not included in the article's Creative Commons licence and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder. To view a copy of this licence, visit http://creativecommons.org/licenses/by/4.0/. The Creative Commons Public Domain Dedication waiver (http://creativecommons.org/publicdomain/zero/1.0/) applies to the data made available in this article, unless otherwise stated in a credit line to the data.

About this article

Cite this article

Atieno, P., Hendriks, S.L. The effects of outdated data and outliers on Kenya’s 2019 Global Food Security Index score and rank. CABI Agric Biosci 4, 6 (2023). https://doi.org/10.1186/s43170-023-00140-y

  • DOI: https://doi.org/10.1186/s43170-023-00140-y