Data Issues

Country Names-- One of the most frustrating problems in dealing with data bases from different sources is linking the data by country name. Each organization has its own spellings, and in some cases the spellings differ from table to table within the same organization. When country codes are given, organizations are rarely consistent in their use of all codes, even if most codes are standard ISO three-character codes. This means that users must often create their own tables to match country names from each of the sources they are using. Fortunately, there is an internationally recognized spelling and a set of two- and three-character identification codes provided by the International Organization for Standardization (ISO). The data base of country names and two-character codes can be downloaded from the ISO. The ISO provides country lists in both English and French, but both lists use the same two-character code for each country.
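As a rough illustration of the kind of matching table a user might build, the Python sketch below maps several spellings of a country name to a single ISO three-character code. The table name NAME_TO_ISO3, the function to_iso3, and the variant spellings are assumptions chosen for illustration; they are not taken from any of the data bases discussed here, and a real table would need one entry for every spelling found in each source.

```python
# Hypothetical matching table: source-specific spellings -> ISO three-character code.
# The variant spellings are illustrative only.
NAME_TO_ISO3 = {
    "bolivia": "BOL",
    "bolivia (plurinational state of)": "BOL",
    "cote d'ivoire": "CIV",
    "côte d'ivoire": "CIV",
    "ivory coast": "CIV",
    "russia": "RUS",
    "russian federation": "RUS",
}


def to_iso3(name):
    """Return the ISO three-character code for a spelling, or None if it is not in the table."""
    # Normalize case and collapse repeated whitespace before the lookup.
    key = " ".join(name.strip().lower().split())
    return NAME_TO_ISO3.get(key)
```

Returning None for an unknown spelling, rather than guessing, makes it easy to list the names that still need to be added to the table by hand.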

To show the lack of uniformity of spellings, we have compared the country name spellings in some of the more prominent multi-sectoral data bases with the ISO standard. The data bases used include FAOSTAT provided by the Food and Agriculture Organization of the UN (FAO), the Human Development Indicators (HDI) provided by the United Nations Development Programme (UNDP), the World Economic Outlook provided by the International Monetary Fund (IMF), the United Nations (UN) official country list, UNAIDS time series data, the Population Information (POPIN) data base of the United Nations, population data provided by the US Census Bureau (USCB), World Development Indicators (WB) provided by the World Bank, and data provided by the World Food Programme (WFP).

The table below shows the data bases and the countries for which there is a spelling discrepancy (marked in the table) between the data base and the ISO standard short spelling. For example, there are 29 spelling differences in the WB data base and 12 in UNDP. Coordinating the spellings between all of the data bases is not practical; however, if the various providers included the ISO three-character alphabetic code in each of their data bases, it would be possible to link data from various data bases with much less difficulty.
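To suggest why a shared code makes linking so much easier, the short Python sketch below combines two small tables that are both keyed by the ISO three-character code. The function name link_by_iso3, the table labels, and all of the values are made up for illustration and do not come from any of the data bases named above.

```python
def link_by_iso3(*sources):
    """Combine (label, {iso3_code: value}) pairs into one table keyed by ISO code."""
    linked = {}
    for label, table in sources:
        for code, value in table.items():
            # Create the country's row on first sight, then record this source's value.
            linked.setdefault(code, {})[label] = value
    return linked


# Made-up values, for illustration only.
population = {"CIV": 1.0, "RUS": 2.0}
cereal_output = {"CIV": 10.0, "BOL": 30.0}

linked = link_by_iso3(("population", population), ("cereal_output", cereal_output))
```

Because every source uses the same key, no spelling table is needed, and a country missing from one source simply lacks that label in its row.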

Copyright 1998/2015 GRI    Updated 5 May 2015