Empirics of Economic Growth Replicated, Revisited I

In this post I replicate the findings of Mankiw, Romer and Weil’s ‘A Contribution to the Empirics of Economic Growth’. This should be a pretty simple task since the paper only uses 121 observations; has been replicated before by Bernanke and Giirkaynak and uses a straightforward econometric framework but I run into a few issues. Ultimately, I successfully replicate the authors’ main findings but fail to satisfactorily replicate the authors’ estimates of the Solow residual (i.e. we get different intercepts). The ‘new’ old data I collect is available on GitHub.

Data

MRW use data from the Real National Accounts collated by Summers and Heston - now known as the Penn World Table - as well as the World Bank’s World Tables and World Development Report 1988.¹ Unfortunately, the PWT 4.0, which MRW use, is no longer available online (the earliest version available is 5.6 but this has revised figures) however the advantage of replicating a paper written in 1992 is that all the data can be found tabulated in the appendix:

Converting the data from an appendix pdf table into R requires the tabulizer library (and its dependency rJava):

library(tabulizer)
library(dplyr)


MRW_pdf <- extract_tables('assets/Original Data/MRW_appendix.pdf')

# Each list entry corresponds to a page of the appendix
df.1 <- MRW_pdf[[1]] %>% 
  as_tibble()

df.2 <- MRW_pdf[[2]] %>% 
  as_tibble()

df.3 <- MRW_pdf[[3]] %>% 
  as_tibble()
knitr::kable(head(df.1))

V1	V2	V3	V4	V6	V7	V8	V9	V10	V11
Number	Country	N	I	1960	1985	GDP	age pop
1	Algeria	1	1	2485	4371	4.8	2.6	24.1	4.5
2	Angola	1	0	1588	1171	0.8	2.1	5.8	1.8
3	Benin	1	0	1116	1071	2.2	2.4	10.8	1.8
4	Botswana	1	1	959	3671	8.6	3.2	28.3	2.9
5	Burkina Faso	1	0	529	857	2.9	0.9	12.7	0.4

Unfortunately not everything is perfect: the variables are saved as strings, the column names appear as the first row and some values are in the columns they shouldn’t be; here the OECD dummy and GDP in 1960 have both been stored in V5:

print(class(df.3$V6))

[1] "character"

knitr::kable(df.3 %>% filter(V2 == 'United States'))

V1	V2	V3	V4	V5	V6	V7	V8	V9	V10	V11
104	United States	1	1	1 12,362		18,988	3.2	1.5	21.1	11.9

The column names are pretty easy to fix using a generic cleaning function and mapping the three appendix page dataframes into a single tibble:

library(purrr)

clean.table <- function(table, columns){
  table <- table[2:nrow(table), ]
  colnames(table) <- columns
  
  return(table)
}
cols <- c("number",
          "country",
          "N",
          "I",
          "O",
          "1960",
          "1985",
          "gdp_ch",
          "working_age_pop_ch",
          "s",
          "school")

MRW_data <- list(df.1, df.2, df.3) %>% 
  map_df(~clean.table(table = .x, columns = cols)) %>% 
  mutate(nchar_oecd = nchar(O))

The jumbled values are a bit harder to solve. Spotting that errors aren’t random but only occur in observations with 5 digit (i.e. 10,000+) GDP figures speeds up the process - this is where nchar_oecd above comes in. We can use separate to split the OECD column whenever it encounters a space character.

library(tidyr)
MRW_data_ok_oecd <- MRW_data %>% 
  filter(nchar_oecd == 1) %>% 
  select(-nchar_oecd)


MRW_data_not_ok_oecd <- MRW_data %>% 
  filter(nchar_oecd > 1) %>% 
  separate(col = 'O', sep = ' ', into = c('O', '1960')) %>% 
  select(-nchar_oecd)

Next we convert the strings to numeric values whilst taking into account the presence of commas in the 1960 and 1985 GDP variables. Finally, we add 5 to the working_age_pop_ch variable - this corresponds to MRW using 0.05 as the value of depreciation and technology growth.²

MRW_clean <- bind_rows(MRW_data_not_ok_oecd,
                       MRW_data_ok_oecd) %>%
  mutate_at(c('N',
              'number',
              'I',
              'O',
              'gdp_ch',
              'working_age_pop_ch',
              's',
              'school'), as.numeric) %>% 
  mutate('1985' = as.numeric(gsub(',', '', `1985`)),
         '1960' = as.numeric(gsub(',', '', `1960`))) %>% 
  mutate(n_g_d = working_age_pop_ch + 5)

The data has the right number of OECD, intermediate and non-oil countries and no particularly unusual observations suggesting anything has gone awry. I have checked every OECD datapoint against the original appendix and a (ad-hoc) random sample of non-OECD observations. (Missing values come from those missing in the original appendix.)

library(plotly)
library(ggplot2)

skimr::skim(MRW_clean) # skimr's histograms are known to be buggy in Rmd.

Skim summary statistics
 n obs: 121 
 n variables: 12 

-- Variable type:character ---------------------------------------------
 variable missing complete   n min max empty n_unique
  country       0      121 121   4  18     0      121

-- Variable type:numeric -----------------------------------------------
           variable missing complete   n    mean      sd    p0     p25
               1960       5      116 121 3681.82 7492.88 383    973.25
               1985      13      108 121 5683.26 5688.67 412   1209.25
             gdp_ch       4      117 121    4.09    1.89  -0.9    2.8 
                  I       0      121 121    0.62    0.49   0      0   
                  N       0      121 121    0.81    0.39   0      1   
              n_g_d      14      107 121    7.28    1      5.3    6.7 
             number       0      121 121   61      35.07   1     31   
                  O       0      121 121    0.18    0.39   0      0   
                  s       0      121 121   18.16    7.85   4.1   12   
             school       3      118 121    5.53    3.53   0.4    2.4 
 working_age_pop_ch      14      107 121    2.28    1      0.3    1.7 
     p50     p75    p100     hist
 1962    4274.5  77881   <U+2587><U+2581><U+2581><U+2581><U+2581><U+2581><U+2581><U+2581>
 3484.5  7718.75 25635   <U+2587><U+2583><U+2581><U+2581><U+2582><U+2581><U+2581><U+2581>
    3.9     5.3      9.2 <U+2581><U+2582><U+2585><U+2587><U+2586><U+2583><U+2582><U+2581>
    1       1        1   <U+2585><U+2581><U+2581><U+2581><U+2581><U+2581><U+2581><U+2587>
    1       1        1   <U+2582><U+2581><U+2581><U+2581><U+2581><U+2581><U+2581><U+2587>
    7.4     7.9     11.8 <U+2583><U+2583><U+2587><U+2586><U+2581><U+2581><U+2581><U+2581>
   61      91      121   <U+2587><U+2587><U+2587><U+2587><U+2587><U+2587><U+2587><U+2587>
    0       0        1   <U+2587><U+2581><U+2581><U+2581><U+2581><U+2581><U+2581><U+2582>
   17.7    24.1     36.9 <U+2586><U+2586><U+2587><U+2587><U+2587><U+2585><U+2585><U+2581>
    4.95    8.17    12.1 <U+2587><U+2587><U+2586><U+2583><U+2585><U+2586><U+2583><U+2585>
    2.4     2.9      6.8 <U+2583><U+2583><U+2587><U+2586><U+2581><U+2581><U+2581><U+2581>

A few interactive plots using plotly:

MRW_clean_subset <- MRW_clean %>% 
  select(-O)
q <- ggplot(MRW_clean, aes(y = log(`1960`), x = school, text = country)) +
  geom_point(data = MRW_clean_subset, colour = 'grey', alpha = 0.2, fill = 'grey') +
  geom_point(data = MRW_clean, colour = 'deeppink') +
  facet_wrap(~O) +
  ylab('Log Real GDP pc in 1960') +
  ggtitle('Schooling vs GDP in OECD and non-OECD countries') +
  theme_minimal()

rm(MRW_clean_subset)
ggplotly(q)

p <- plot_ly(data = MRW_clean,
             x = ~s,
             y = ~`1985`,
             size = ~school,
             text = ~country,
             type = 'scatter',
             mode = 'markers'
             ) %>% 
  layout(yaxis = list(title = 'Log Real GDP pc in 1985'),
         xaxis = list(title = 'Savings rate (I/Y)'),
         title = 'GDP vs Savings rate \n size = schooling')
p

Estimating the Solow Model and Augmented Solow Model

The regression equation Mankiw, Romer and Weil estimate can be written as \(ln(GDP_i) = \alpha + ln(s_i) + ln(n_i+g_i+d_i) + e_i\) where GDP is actually GDP per working age person (although the appendix displays this as GDP/adult in the interests of brevity); \(s_i\), the savings rate, is calculated using investment as a share of GDP; \(n_i\) is the average change in working age population between 1960 and 1985 and \(g_i+d_i\) are assumed to sum to 0.05.

The augmented model is the same but we add in a school variable to control for human capital.

Formulating the above in R and splitting countries into subsets gives:

solow_formula <- log(`1985`) ~ log(s) + log(`n_g_d`)

## Non-Oil
non_oil_data <- MRW_clean %>% 
  filter(N == 1) %>% 
  filter(!is.na(school)) # MRW drop missing school observations to make comparisons easier

non_oil_model <- lm(solow_formula,
                    data = non_oil_data)

## Intermediate
intermediate_data <- MRW_clean %>% 
  filter(I == 1) %>% 
  filter(!is.na(school))

intermediate_model <- lm(solow_formula,
                         data = intermediate_data)

## OECD
oecd_data <- MRW_clean %>% 
  filter(O == 1) %>% 
  filter(!is.na(school))

oecd_model <- lm(solow_formula,
                 data = oecd_data)

#### Augmented Model
augmented_formula <- log(`1985`) ~ log(s) + log(n_g_d) + log(school)

## Non-Oil
aug_non_oil_model <- lm(augmented_formula,
                        data = non_oil_data)


## Intermediate
aug_intermediate_model <- lm(augmented_formula,
                             data = intermediate_data)


## OECD

aug_oecd_model <- lm(augmented_formula,
                     data = oecd_data)

Now collecting and comparing results using broom’s tidy. We use map2_dfr because we want each observation to have a variable describing which model it came from and I haven’t figured out a simpler way to do this:

library(broom)
collect.results <- function(models, names){
  
  results <- map2_dfr(.x = models,
                      .y = names,
                      .f = function(x, y){tidy(x) %>% mutate(subset = y)}) %>% 
    as_tibble()
  return(results)
}

solow_rep <- collect.results(list(non_oil_model,
                                 intermediate_model,
                                 oecd_model),
                            list('Non-Oil',
                                 'Intermediate',
                                 'OECD'))

augmented_rep <- collect.results(list(aug_non_oil_model,
                                     aug_intermediate_model,
                                     aug_oecd_model),
                                list('Non-Oil',
                                     'Intermediate',
                                     'OECD'))

solow_MRW <- tribble(
  ~term, ~estimate, ~std.error, ~statistic, ~p.value, ~subset,
  '(Intercept)', 5.48, 1.59, NA, NA, 'Non-Oil',
  'log(s)', 1.42, 0.14, NA, NA, 'Non-Oil',
  'log(n_g_d)', -1.97, 0.56, NA, NA, 'Non-Oil',
  '(Intercept)', 5.36, 1.55, NA, NA, 'Intermediate',
  'log(s)', 1.31, 0.17, NA, NA, 'Intermediate',
  'log(n_g_d)', -2.01, 0.53, NA, NA, 'Intermediate',
  '(Intercept)', 7.97, 2.48, NA, NA, 'OECD',
  'log(s)', 0.5, 0.43, NA, NA, 'OECD',
  'log(n_g_d)', -0.76, 0.84, NA, NA, 'OECD'
)

augmented_MRW <- tribble(
  ~term, ~estimate, ~std.error, ~statistic, ~p.value, ~subset,
  '(Intercept)', 6.89, 1.17, NA, NA, 'Non-Oil',
   'log(s)',  0.69, 0.13, NA, NA, 'Non-Oil',
  'log(n_g_d)', -1.73, 0.41, NA, NA, 'Non-Oil',
  'log(school)', 0.66, 0.07, NA, NA, 'Non-Oil',
  '(Intercept)', 7.81, 1.19, NA, NA, 'Intermediate',
  'log(s)', 0.7, 0.15, NA, NA, 'Intermediate',
  'log(n_g_d)', -1.5, 0.4, NA, NA, 'Intermediate',
  'log(school)', 0.73, 0.1, NA, NA, 'Intermediate',
  '(Intercept)', 8.63, 2.19, NA, NA, 'OECD',
  'log(s)', 0.28, 0.39, NA, NA, 'OECD',
  'log(n_g_d)', -1.07, 0.75, NA, NA, 'OECD',
  'log(school)', 0.76, 0.29, NA, NA, 'OECD'
)

The MRW data comes from just manually transferring estimates and standard errors from Table 1 and Table 2 into a tibble using tribble.

Comparing Results

We need to collect parameters from each of our models and compare between replicated and original estimates. This function selects the right parameter (and re-orders them for easier comparison); merges the two dataframe estimates into one and calculates the estimated differences. We use grep to find replicated and original columns and pull lets us access the values in each selected tibble (I’m not entirely sure why we can’t subtract tibble A from tibble B here):

compare.results <- function(parameter, parameter_type = 'estimate', results_replicated, results_original){
  replicated_comparison <- results_replicated %>% 
    filter(term == parameter) %>% 
    select(term, subset, parameter_type)
  
  original_comparison <- results_original %>% 
    filter(term == parameter) %>% 
    select(term, subset, parameter_type)
  
  comparison_df <- full_join(x = replicated_comparison,
                             y = original_comparison,
                             by = c('term', 'subset'),
                             suffix = c('_replicated', '_original'))
  comparison_df$diff <- comparison_df[, grep('_original', colnames(comparison_df))] %>% 
    pull - comparison_df[, grep('_replicated', colnames(comparison_df))] %>% 
    pull
  comparison_df$rounded_diff <- round(comparison_df$diff, 2)
  comparison_df$pct_orig <- comparison_df$diff / comparison_df[, grep('_original', colnames(comparison_df))] %>% pull * 100
  return(comparison_df)
}

We then use map_df to apply the function to each parameter type and collect our output in a dataframe:

parameters <- c('(Intercept)',
                'log(s)',
                'log(n_g_d)')

solow_comparison_estimates <- parameters %>% 
  map_df(compare.results,
         parameter_type = 'estimate',
         results_replicated = solow_rep,
         results_original = solow_MRW)

augmented_comparison_estimates <- parameters %>% 
  map_df(compare.results,
         parameter_type = 'estimate',
         results_replicated = augmented_rep,
         results_original = augmented_MRW)


solow_comparison_std_error <- parameters %>% 
  map_df(compare.results,
         parameter_type = 'std.error',
         results_replicated = solow_rep,
         results_original = solow_MRW)

augmented_comparison_std_error<- parameters %>% 
  map_df(compare.results,
         parameter_type = 'std.error',
         results_replicated = augmented_rep,
         results_original = augmented_MRW)

Looking at Table 1’s replicated estimates is a mixed bag:

term	subset	estimate_replicated	estimate_original	diff	rounded_diff	pct_orig
(Intercept)	Non-Oil	8.0353051	5.48	-2.5553051	-2.56	-46.6296552
(Intercept)	Intermediate	8.5678580	5.36	-3.2078580	-3.21	-59.8480966
(Intercept)	OECD	9.1352049	7.97	-1.1652049	-1.17	-14.6198864
log(s)	Non-Oil	1.4240143	1.42	-0.0040143	0.00	-0.2826960
log(s)	Intermediate	1.3175527	1.31	-0.0075527	-0.01	-0.5765386
log(s)	OECD	0.4998895	0.50	0.0001105	0.00	0.0220925
log(n_g_d)	Non-Oil	-1.9897745	-1.97	0.0197745	0.02	-1.0037809
log(n_g_d)	Intermediate	-2.0171995	-2.01	0.0071995	0.01	-0.3581838
log(n_g_d)	OECD	-0.7419215	-0.76	-0.0180785	-0.02	2.3787554

The variable estimates are on point but we’re pretty off on the intercept. In fact, the same is true for both the standard errors and the augmented Solow model (which can also be found on GitHub.):

all_estimates <- bind_rows(solow_comparison_estimates,
                         augmented_comparison_estimates)
all_std_errors <- bind_rows(solow_comparison_std_error,
                            augmented_comparison_std_error)

mean_est_diff <- all_estimates %>% 
  group_by(term) %>% 
  summarise(mean(diff),
            mean(pct_orig))
mean_est_diff

# A tibble: 3 x 3
  term        `mean(diff)` `mean(pct_orig)`
  <chr>              <dbl>            <dbl>
1 (Intercept)     -1.52            -25.4   
2 log(n_g_d)       0.00490          -0.0608
3 log(s)          -0.00244          -0.0802

mean_std_diff <- all_std_errors %>% 
  group_by(term) %>% 
  summarise(mean(diff),
            mean(pct_orig))
mean_std_diff

# A tibble: 3 x 3
  term        `mean(diff)` `mean(pct_orig)`
  <chr>              <dbl>            <dbl>
1 (Intercept)      0.272             16.7  
2 log(n_g_d)      -0.00576           -0.972
3 log(s)          -0.00171           -0.995

Conclusion

Our replicated intercepts are off on average by -1.525 in absolute terms or -25.4%.³

Everything else, however, is pretty spot on. So, what’s happening here?

The fact that the variable estimates are near perfect i.e. we’re identifying the same slope as MRW but the wrong ‘location’ points to a couple of things. Since the intercept is off and there was a little confusion as to how add a constant to the model, recall MRW’s +0.05 for depreciation and technology growth, it seems plausible at first glance that we’ve got the wrong scale when we added +5 to working_age_pop_ch and should have perhaps added 0.05.

However, I don’t think this holds up simply because whilst the behaviour exhibited perfectly matches the case of adding a constant to a regressor⁴ the relationship isn’t linear but logarithmic and therefore not linearly separable⁵. Therefore I believe if this is the case our variable estimates should be off and whilst they’re not absolutely perfect they’re pretty close.

This close but not perfect leads me to my second hypothesis - the data MRW use is subtly different to the data in their appendix. Namely, appendix data are all presented as either integers or to two decimal places but MRW probably used more precise data. This would explain why our variables, which are in logs, are close but not perfect. That is, they’re less sensitive to small fluctuations in precision and our constant which is the only factor ‘measured’ in absolute terms is influenced more by these precision differences.

Finally, it’s totally possible that I’ve completely misinterpreted part of the paper or mis-specified the models. I hope this isn’t the case but the two ideas above look like a reasonable starting point to investigate further and will feature in my next post. The former will involve playing around with a lot of different transformations to \(s\) and \(n+g+d\) whilst the latter will look similar to wild block bootstrapping where the block is the data.

If you spot any mistakes or have any suggestions, please do get in touch.

Ben S. Bernanke & Refet S. Gürkaynak, 2002. “Is Growth Exogenous? Taking Mankiw, Romer, and Weil Seriously,” NBER Chapters,in: NBER Macroeconomics Annual 2001, Volume 16, pages 11-72 National Bureau of Economic Research, Inc.↩
Mankiw, Romer and Weil aren’t entirely consistent in their treatment of percentages in the paper - the appendix shows that working age population change has mean value 2.28 but the authors refer to \(g + d = 0.05\) in the main body of the paper. Common sense and footnote 6 suggest that this means a 5% depreciation rate and the mean working age population should be treated as 2.28% (and not 200%…), I return to this later when the Solow residual fails to match.↩
This metric is a little tricky to interpret since we’re using different models and three different subsets of countries.↩
\(y = \alpha + \beta(x+1)\) can be written as \(y = \alpha + \beta + \beta x\) and identifies the ‘same’ \(\beta\) but a different constant.↩
\(log(1 + x) \neq log(x) + 1\) although they’re similar if \(x\) is small which arguably it is here.↩