Skip to contents

Installation

You can install the development version of demor from GitHub with:

# install.packages("devtools")
devtools::install_github("vadvu/demor")

Get Rosbris data

DEPRECATED

For getting data from The Russian Fertility and Mortality database (2024) there is a function get_rosbris() that can download data on mortality/fertility by 1/5-year age groups from 1989 to the last available year (in 2023 its 2022).
Worth noting: downloading and preparing the final file can get some time.
In the chunk below mortality data (type = "m", for fertility see Fertility section) for 1-age groups with population in “long” format is loading. For more function description use ?get_rosbris

# dbm <- get_rosbris(
#   #mortality data
#   type = "m",
#   #what age group download
#   age =  1,
#   #to get "long" data
#   initial = F,
#   #last available year (the name of the downloading file contains years, so for the downloading the last year is required)
#   lastyear = 2022
# )

Recently RosBris has been switched to new website, so the function get_rosbris does not work now. I hope in a several months all mistakes due to new website will be fixed and function will work.

Now RosBris data is presented in the demor as datasets in the long-format. The example of usage that replicates the data from the previous chunk with get_rosbris:

dbm <- demor::rosbris_mortality_pop_5

# for 1-year age interval
# dbm <- demor::rosbris_mortality_pop_1

Lets see the data for Russia in 2010 for males and for total population (both urban and rural)

dbm[dbm$year==2010 & dbm$code==1100 & dbm$sex=="m" & dbm$territory=="t",]
#>       year code territory sex age       mx       N        Dx
#> 20445 2010 1100         t   m   0 0.008823  869388   7670.61
#> 20446 2010 1100         t   m   1 0.000604 3222450   1946.36
#> 20447 2010 1100         t   m   5 0.000355 3613276   1282.71
#> 20448 2010 1100         t   m  10 0.000387 3398894   1315.37
#> 20449 2010 1100         t   m  15 0.001186 4353344   5163.07
#> 20450 2010 1100         t   m  20 0.002546 6193325  15768.21
#> 20451 2010 1100         t   m  25 0.004491 6002262  26956.16
#> 20452 2010 1100         t   m  30 0.006808 5395865  36735.05
#> 20453 2010 1100         t   m  35 0.007934 4973298  39458.15
#> 20454 2010 1100         t   m  40 0.009782 4464789  43674.57
#> 20455 2010 1100         t   m  45 0.013354 5140274  68643.22
#> 20456 2010 1100         t   m  50 0.018567 5207919  96695.43
#> 20457 2010 1100         t   m  55 0.026250 4333619 113757.50
#> 20458 2010 1100         t   m  60 0.037143 3117320 115786.62
#> 20459 2010 1100         t   m  65 0.049925 1573662  78565.08
#> 20460 2010 1100         t   m  70 0.068599 2149929 147482.98
#> 20461 2010 1100         t   m  75 0.097635 1077916 105242.33
#> 20462 2010 1100         t   m  80 0.138043  714191  98589.07
#> 20463 2010 1100         t   m  85 0.198593  231350  45944.49
#>                       name
#> 20445 Российская Федерация
#> 20446 Российская Федерация
#> 20447 Российская Федерация
#> 20448 Российская Федерация
#> 20449 Российская Федерация
#> 20450 Российская Федерация
#> 20451 Российская Федерация
#> 20452 Российская Федерация
#> 20453 Российская Федерация
#> 20454 Российская Федерация
#> 20455 Российская Федерация
#> 20456 Российская Федерация
#> 20457 Российская Федерация
#> 20458 Российская Федерация
#> 20459 Российская Федерация
#> 20460 Российская Федерация
#> 20461 Российская Федерация
#> 20462 Российская Федерация
#> 20463 Российская Федерация

Mortality

Life table

Now one can create life table based on gotten data for 2010-Russia using LT().
Note, axa_x for age 0 is modeled as in Evgeny M. Andreev and Kingkade (2015), while the function can use user-specific axa_x from the argument ax. However, despite there is a plethora of methods to construct life table that “are based upon very different assumptions, when applied to actual mortality rates they do not result in significant differences of importance to mortality analysis.” (WHO, 1977, p. 70, cite from Preston, Heuveline, and Guillot 2001, 47). For example, changing a0a_0 to 0.5 only increases the life expectancy e0e_0 by +0.01\approx +0.01 for the 2010 Russian male population, which is about 4 days when applied to human life span.

rus2010 <- dbm[dbm$year==2010 & dbm$code==1100 & dbm$sex=="m" & dbm$territory=="t",]

LT(
  age = rus2010$age, 
  sex = "m", 
  mx = rus2010$mx #age specific mortality rates (mx)
  )
#>       age      mx    ax      qx      lx      dx      Lx       Tx    ex
#>  [1,]   0 0.00882 0.132 0.00876 1.00000 0.00876 0.99885 63.04122 63.04
#>  [2,]   1 0.00060 2.000 0.00241 0.99124 0.00239 3.96019 62.04238 62.59
#>  [3,]   5 0.00036 2.500 0.00177 0.98885 0.00175 4.93988 58.08218 58.74
#>  [4,]  10 0.00039 2.500 0.00193 0.98710 0.00191 4.93072 53.14231 53.84
#>  [5,]  15 0.00119 2.500 0.00591 0.98519 0.00582 4.91139 48.21159 48.94
#>  [6,]  20 0.00255 2.500 0.01265 0.97937 0.01239 4.86586 43.30020 44.21
#>  [7,]  25 0.00449 2.500 0.02221 0.96698 0.02147 4.78120 38.43434 39.75
#>  [8,]  30 0.00681 2.500 0.03347 0.94550 0.03165 4.64841 33.65314 35.59
#>  [9,]  35 0.00793 2.500 0.03890 0.91386 0.03555 4.48042 29.00473 31.74
#> [10,]  40 0.00978 2.500 0.04774 0.87831 0.04193 4.28672 24.52431 27.92
#> [11,]  45 0.01335 2.500 0.06461 0.83638 0.05404 4.04679 20.23759 24.20
#> [12,]  50 0.01857 2.500 0.08872 0.78234 0.06941 3.73817 16.19080 20.70
#> [13,]  55 0.02625 2.500 0.12317 0.71293 0.08781 3.34513 12.45263 17.47
#> [14,]  60 0.03714 2.500 0.16994 0.62512 0.10623 2.86003  9.10751 14.57
#> [15,]  65 0.04992 2.500 0.22193 0.51889 0.11516 2.30657  6.24748 12.04
#> [16,]  70 0.06860 2.500 0.29278 0.40374 0.11821 1.72316  3.94091  9.76
#> [17,]  75 0.09764 2.500 0.39240 0.28553 0.11204 1.14754  2.21775  7.77
#> [18,]  80 0.13804 2.500 0.51313 0.17349 0.08902 0.64489  1.07021  6.17
#> [19,]  85 0.19859 5.035 1.00000 0.08447 0.08447 0.42532  0.42532  5.04

Note, from life table one can compute other functions (not just exe_x) and quantities of interest:

  1. Crude death rate CDR=1/e0=xDx/xNxCDR = 1/e_0 = \sum_x D_x / \sum_x N_x or death rate above some age x:1/exx: 1/e_x
  2. Probability of surviving from age xx to age yy: p(x,y)=ly/lxexp(mx×n)p(x,y)=l_y/l_x \approx exp(-m_x \times n)
  3. Probability that a newborn will die between ages xx and x+nx+n: dxn/l0d_x^n/l_0
  4. Probability that a newborn will die between ages xx and yy: (lxly)/l0(l_x-l_y)/l_0
  5. Probability that a newborn will survive to age xx: p(x)=lx/l0p(x) = l_x/l_0
  6. Probability that a newborn will die in age xx: q(x)=1pxq(x) = 1-p_x
  7. Life course ratio from age xx to yy that is the fraction of person-years lived from age xx onward: Ty/TxT_y/T_x
  8. Crude estimate of the number of births needed to “replace” expected deaths: P/e0P/e_0 where PP is total population

See also Preston, Heuveline, and Guillot (2001).

The life table is the one of the most important demographic tools that can be used not only for mortality, but for any other decrement processes (for ex., marital, occupation, migration status and etc).

Human Life Indicator (HLI)

A good alternative to the human development indicator (HDI) is the human life indicator (HLI) proposed by Ghislandi, Sanderson, and Scherbov (2019). It requires just mxm_x (and it is based on life table). It is calculated as geometric mean of lifespans: HLI=xω(x+ax)dxHLI = \prod_x^\omega{(x+a_x)^{d_x}} where xx is age and axa_x, dxd_x are functions from life table that corresponds to age xx.

Calculation in the demor is as follows (one need only mxm_x):


hli(
  age = rus2010$age, 
  mx = rus2010$mx,
  sex = "m"
  )
#> [1] 57.18468

Gini coefficient

As Shkolnikov, Andreev, and Begun (2003) note, “at present, the average level of life expectancy is high in many countries and it is interesting to study to what extent this advantage is equally accessible to all people”. (p. 306). One of the tools for analysing mortality inequality within the population (or, rather, the life table) is the Gini index, which is similar to the “usual” index widely used in economics. Life table Gini, G0G_0, shows a degree of inter-individual variability in age at death and can be interpreted as in other fields (G0[0,1]G_0 \in [0,1]; higher G0G_0, higher inequality and vice versa). Also one can be interested in “absolute” Gini coefficient (aka AID - Absolute Inter-individual Difference) that is G0abs=G0×e0G_0^{abs} = G_0 \times e_0 which “is equal to the average inter-individual difference in length of life and is measured in years.” (Shkolnikov, Andreev, and Begun 2003, 312)

Below is the function gini that computes Gini index as well as produces data for Lorenz curve - the graphical representation of inequality. The formulas for the calculations are derived from Shkolnikov, Andreev, and Begun (2003) (that is based on the Hanada (1983) formulation).

dbm.1 <- demor::rosbris_mortality_pop_1
mx <- dbm.1[dbm.1$code==1100 & dbm.1$territory=="t" & dbm.1$sex=="f" & dbm.1$year == 1995,]$mx
res = gini(age = 0:100, mx = mx, sex = "f")
res$Gini
#> $G0
#>        ex 
#> 0.1295455 
#> 
#> $G0_abs
#>      ex 
#> 9.27805

Note, in Shkolnikov, Andreev, and Begun (2003) the G0=0.13G_0 = 0.13 (see p. 310, figure 1), while using gini one obtains 0.1295. Also note that the larger the age interval, the less accurate the estimate will be, since it is based on a discrete approximation of a continuous function.

Below is an example of Lorenz curve that can be created using ggplot2 and gini output.


res$plot %>% 
  ggplot(aes(Fx, Phix))+
  geom_line(color = "red")+
  geom_abline(intercept = 0, slope = 1)+
  scale_x_continuous(limits=c(0,1), expand = c(0, 0)) +
  scale_y_continuous(limits=c(0,1),expand = c(0, 0)) +
  theme_classic()+
  labs(x = "Proportion in population", y = "Proportion in person-years of life")+
  ggtitle("Lorenz curve: Russia, 1995, females", 
          subtitle = paste0("G0 = ", round(res$Gini$G0, 2), ", ", 
                            "G0 abs = ", round(res$Gini$G0_abs, 2), ", ",
                            "e0 = ", round(res$Gini$G0_abs/res$Gini$G0, 2))
          )

Note, one can calculate not only Gini inequality index, but also Drewnowski’s index of equality that is D0=1G0D_0 = 1 - G_0 proposed by Aburto et al. (2022) (so, for Russian 1995 females it is 10.13=0.871-0.13=0.87) that also can serve as an indicator of the shape of mortality patterns.

e-dagger

Another measure of mortality inequality is “e-dagger”, exe^{\dagger}_x, proposed by Vaupel and Romo (2003), that is approximation of “the average life expectancy lost because of death” (Vaupel and Romo 2003, 206).

In demor there is a function edagger for calculation “e-dagger” for age xx, exe^{\dagger}_x:


mx <- dbm.1[dbm.1$code==1100 & dbm.1$territory=="t" & dbm.1$sex=="f" & dbm.1$year == 1995,]$mx
res = edagger(age = 0:100, mx = mx, sex = "f")
res[c(1, 26, 51, 76)]
#>         0        25        50        75 
#> 12.719536 10.968361  9.137110  5.201578

Hence, e0e_0^{\dagger} for Russian female population in 1995 is 12.72. From exe^{\dagger}_x one can also calculate life table entropy HxH_x(see, for ex., Wrycza, Missov, and Baudisch 2015) that is simply Hx=ex/exH_x = e^{\dagger}_x/e_x. Usually only one quantity as entropy is presented that is H0H_0, which sometimes is denoted as H\bar H, and it is e0/e0e^{\dagger}_0/e_0.

Note that the larger the age interval, the less accurate the estimate will be, since it is based on a discrete approximation of a continuous function.

For more inequality indicators and their comparisons, see Shkolnikov, Andreev, and Begun (2003), Wrycza, Missov, and Baudisch (2015) and Wilmoth and Horiuchi (1999) as well as LifeIneq package.

Years of Life Lost (YLL)

One of the most popular (and relatively young) measure of lifespan inequality is “years of life lost” (YLL) proposed in Martinez et al. (2019). As authors claim, “YLL is a valuable measure for public health surveillance, particularly for quantifying the level and trends of premature mortality, identification of leading causes of premature deaths and monitoring the progress of YLL as a key indicator of population health” (Martinez et al. 2019, 1368).

Authors proposed different metrics of YLL:

  1. Absolute number of YLL: YLLx,t,c=Dx,t,c*SLExYLL_{x,t,c}=D_{x,t,c}*SLE_x that is calculated for age x, time t and cause of death c. YLL for the whole population is just sum of YLLxYLL_x. SLE is the standard life expectancy that is invariant over time, sex and population (it’s meaning is straightforward: it is the potential maximum life span of an individual, who is not exposed to avoidable health risks or severe injuries and receives appropriate health services), and DxD_x is a number of deaths at age x. Of course, one can calculate YLL not for specific cause c, but for overall mortality that is called all-causes YLL.
  2. YLL as proportion: YLLx,t,cp=YLLx,t,c/YLLx,tYLL^p_{x,t,c}=YLL_{x,t,c}/YLL_{x,t} that is just cause specific YLL divided by all-causes YLL.
  3. YLL rate: YLLx,t,cr=[YLLx,t,c/Px,t]*100000YLL^r_{x,t,c}=[YLL_{x,t,c}/P_{x,t}] * 100'000 where Px,tP_{x,t} is population.
  4. Age-standardized YLL rate: ASYRx,t,c=xω[YLLx,t,cr*Wx]ASYR_{x,t,c} = \sum_x^\omega{[YLL^r_{x,t,c}*W_x]} where WxW_x is the standard population weight at age x, where ω\omega is the oldest, closing age (for ex., 85+ or 100+). In other words, it’s just direct standardization of YLLx,t,crYLL^r_{x,t,c}.

Let’s calculate all-cause YLL, Yll rate and ASYR using Rosbris data that we have downloaded.

#YLL
yll(rus2010$Dx, type = "yll")
#> $yll_all
#> [1] 33640561
#> 
#> $yll
#>  [1]  705159.2  174024.0  108414.6  104611.4  384855.2 1096994.4 1741367.9
#>  [8] 2190511.0 2157177.1 2171936.4 3075902.7 3860081.6 3989475.5 3502545.3
#> [15] 2002623.9 3063221.5 1729131.5 1233349.3  349178.1
#YLL rate
yll(rus2010$Dx, type = "yll.r", pop = rus2010$N)
#> $yll.r_all
#> [1] 50945.02
#> 
#> $yll.r
#>  [1]  81109.836   5400.365   3000.453   3077.806   8840.451  17712.527
#>  [7]  29011.861  40596.105  43375.182  48645.890  59839.275  74119.462
#> [13]  92058.751 112357.578 127258.833 142480.124 160414.307 172691.796
#> [19] 150930.678

For ASYR one needs standard population. Let’s use 2010 population as standard (note, in this case ASYR equals YLL rate because we use 2010 mortality).

#ASYR
yll(rus2010$Dx, type = "asyr", pop = rus2010$N, w = rus2010$N/sum(rus2010$N))
#> $asyr_all
#> [1] 50945.02
#> 
#> $asyr
#>  [1] 1067.8879  263.5407  164.1824  158.4227  582.8220 1661.2803 2637.1149
#>  [8] 3317.2939 3266.8132 3289.1646 4658.1245 5845.6793 6041.6326 5304.2289
#> [15] 3032.7590 4638.9202 2618.5841 1867.7751  528.7928

Also one can calculate different YLL measures using standards that are provided by demor as dataframe.

demor::sle_stand

Age decomposition of differences in life expectancies

Also one can do simple decomposition between 2 populations. Lets use Russia-2000 as base population and Russia-2010 as compared population. This function implements two almost identical discrete methods that were proposed almost simultaneously: first in Андреев (1982) and the second in Arriaga (1984).

rus2010 <- dbm[dbm$year==2010 & dbm$code==1100 & dbm$sex=="m" & dbm$territory=="t",]
rus2000 <- dbm[dbm$year==2000 & dbm$code==1100 & dbm$sex=="m" & dbm$territory=="t",]

dec <- decomp(mx1 = rus2000$mx, 
              mx2 = rus2010$mx, 
              age = rus2000$age, 
              method = "andreev")
head(dec)
#>   age   ex1   ex2     lx2  dex ex12  ex12_prc
#> 1   0 58.99 63.04 1.00000 4.05 0.54 13.235294
#> 2   1 59.04 62.59 0.99124 3.55 0.11  2.696078
#> 3   5 55.29 58.74 0.98885 3.45 0.07  1.715686
#> 4  10 50.45 53.84 0.98710 3.39 0.05  1.225490
#> 5  15 45.59 48.94 0.98519 3.35 0.22  5.392157
#> 6  20 41.05 44.21 0.97937 3.16 0.46 11.274510

Than let us plot the result of decomp using ggplot2:

library(ggplot2)

ggplot(dec, aes( as.factor(age), ex12))+
  geom_bar(stat = "identity", color = "black", fill = "orange3")+
  theme_minimal()+
  labs(x = "Age-groups", 
       y = "Сontribution to the e0 difference")+
  annotate("text", x = "70", y = 0.5, label = paste0("Total difference in e0: ", sum(dec$ex12)))+
  geom_text(aes(label = ex12), vjust = 1.5, color = "white", size = 3.5)

Age and cause decomposition of differences in life expectancies

Also one can do decomposition between 2 populations by age and causes. Lets use example from E. M. Andreev and Shkolnikov (2012) where data for US and England and Wales men mortality by some causes are presented.

Lets see the data

data("mdecompex")
head(mdecompex)
#> # A tibble: 6 × 9
#>     age neoplasms circulatory respiratory  digestive  accident     other     all
#>   <dbl>     <dbl>       <dbl>       <dbl>      <dbl>     <dbl>     <dbl>   <dbl>
#> 1     0 0.0000349  0.000173    0.000188   0.000151   0.000377  0.00669   7.62e-3
#> 2     1 0.0000309  0.0000155   0.0000220  0.00000930 0.000163  0.000112  3.53e-4
#> 3     5 0.0000313  0.00000560  0.00000687 0.00000324 0.0000792 0.0000409 1.67e-4
#> 4    10 0.0000305  0.0000127   0.00000896 0.00000286 0.000127  0.0000490 2.31e-4
#> 5    15 0.0000424  0.0000322   0.0000141  0.00000401 0.000765  0.0000821 9.40e-4
#> 6    20 0.0000600  0.0000511   0.0000158  0.00000928 0.00114   0.000130  1.41e-3
#> # ℹ 1 more variable: cnt <chr>

For mdecomp 2 lists with arrays for 2 population are required.

#US men
mx1 <- mdecompex %>% 
  filter(cnt == "usa") %>% 
  select(-cnt, -age) %>% select(all, everything()) %>% 
  as.list()

#England and Wales men
mx2 <- mdecompex %>% 
  filter(cnt == "eng") %>% 
  select(-cnt, -age) %>% select(all, everything()) %>% 
  as.list()

decm <- mdecomp(mx1 = mx1, 
              mx2 = mx2, 
              sex = "m", 
              age = unique(mdecompex$age), 
              method = "andreev"
              )
head(decm)
#>   age ex12     neoplasms  circulatory   respiratory     digestive   accident
#> 1   0 0.13  0.0009474709 7.090137e-03  0.0042212672  7.167768e-03 0.02282901
#> 2   1 0.03 -0.0025030458 5.483996e-04  0.0020247182  6.245908e-04 0.03637029
#> 3   5 0.02 -0.0016468368 8.262058e-05 -0.0001300012  9.929857e-05 0.02178479
#> 4  10 0.01 -0.0014084238 4.993427e-04 -0.0002451017  8.338450e-05 0.01117647
#> 5  15 0.13 -0.0032690840 2.657675e-03  0.0005053688 -2.052103e-04 0.13796276
#> 6  20 0.17 -0.0009648903 2.773336e-03 -0.0016773417 -8.983808e-04 0.18353643
#>           other
#> 1  0.0877443429
#> 2 -0.0070649563
#> 3 -0.0001898743
#> 4 -0.0001056675
#> 5 -0.0076515085
#> 6 -0.0127691565

Than let us plot the result of mdecomp using ggplot2. This requires some data transformations

library(ggplot2)

decm_plot <- decm[,c(1,3)]
decm_plot$group = colnames(decm)[3]
colnames(decm_plot)[2]<-"ex12"
for(i in 4:ncol(decm)){
  decm_plot_i <- decm[,c(1,i)]
  decm_plot_i$group = colnames(decm)[i]
  colnames(decm_plot_i)[2]<-"ex12"
  decm_plot <- rbind(decm_plot,decm_plot_i)
  rm(decm_plot_i)
}

for (i in unique(decm_plot$group)){
  decm_plot[decm_plot$group==i,]$group <- paste0(i, " (", round(sum(decm_plot[decm_plot$group==i,]$ex12),2), ")")
}

ggplot(data = decm_plot, aes(x = as.factor(age), y = ex12, fill = group))+
  geom_bar(stat="identity", colour = "black")+
  theme_minimal()+
  scale_fill_brewer(palette="Set1")+
  labs(x = "Age", y = "Contribution to difference in ex", fill = "Cause (contribution):", 
       caption  = paste0("Total difference in ex = ", sum(decm[,2]))
       )

Multiple Decrement Life Table

Also one can construct the Multiple Decrement Life Table that expands the usual life table adding additional columns (qxi,dxi,lxicausesiq_x^i, d_x^i, l_x^i \ \forall \ \text{causes} \ i) for specific decrement causes. Note, mx=imxim_x = \sum_i m_x^i and user should specify mxm_x (the first array in the list). Let us use the shorten vesrion of data from the previous example, using only overall mxm_x and mxim_x^i from neoplasm.


mx <- mx1[c("all", "neoplasms")]
age = unique(mdecompex$age)
mlt.res = MLT(age, mx)
mlt.res
#>       age      mx    ax      qx      lx      dx      Lx       Tx    ex
#>  [1,]   0 0.00762 0.134 0.00757 1.00000 0.00757 0.99899 74.65476 74.65
#>  [2,]   1 0.00035 2.000 0.00141 0.99243 0.00140 3.96693 73.65578 74.22
#>  [3,]   5 0.00017 2.500 0.00084 0.99103 0.00083 4.95310 69.68885 70.32
#>  [4,]  10 0.00023 2.500 0.00115 0.99021 0.00114 4.94818 64.73574 65.38
#>  [5,]  15 0.00094 2.500 0.00469 0.98907 0.00464 4.93373 59.78756 60.45
#>  [6,]  20 0.00141 2.500 0.00702 0.98443 0.00691 4.90486 54.85384 55.72
#>  [7,]  25 0.00134 2.500 0.00670 0.97752 0.00655 4.87122 49.94898 51.10
#>  [8,]  30 0.00149 2.500 0.00744 0.97097 0.00723 4.83679 45.07775 46.43
#>  [9,]  35 0.00207 2.500 0.01028 0.96374 0.00990 4.79395 40.24097 41.75
#> [10,]  40 0.00306 2.500 0.01521 0.95384 0.01450 4.73293 35.44701 37.16
#> [11,]  45 0.00455 2.500 0.02248 0.93933 0.02112 4.64387 30.71408 32.70
#> [12,]  50 0.00654 2.500 0.03216 0.91822 0.02953 4.51726 26.07021 28.39
#> [13,]  55 0.00962 2.500 0.04697 0.88869 0.04174 4.33909 21.55295 24.25
#> [14,]  60 0.01476 2.500 0.07119 0.84695 0.06029 4.08400 17.21387 20.32
#> [15,]  65 0.02266 2.500 0.10724 0.78665 0.08436 3.72235 13.12987 16.69
#> [16,]  70 0.03534 2.500 0.16235 0.70229 0.11402 3.22641  9.40752 13.40
#> [17,]  75 0.05524 2.500 0.24269 0.58827 0.14277 2.58445  6.18111 10.51
#> [18,]  80 0.08654 2.500 0.35575 0.44551 0.15849 1.83131  3.59666  8.07
#> [19,]  85 0.16258 6.151 1.00000 0.28702 0.28702 1.76535  1.76535  6.15
#>       qx_neoplasms dx_neoplasms lx_neoplasms ex_no_neoplasms
#>  [1,]      0.00003      0.00003      0.24395           84.29
#>  [2,]      0.00012      0.00012      0.24392           83.93
#>  [3,]      0.00015      0.00015      0.24380           80.03
#>  [4,]      0.00015      0.00015      0.24365           75.08
#>  [5,]      0.00021      0.00021      0.24350           70.14
#>  [6,]      0.00030      0.00030      0.24329           65.40
#>  [7,]      0.00037      0.00036      0.24299           60.76
#>  [8,]      0.00059      0.00057      0.24263           56.06
#>  [9,]      0.00102      0.00098      0.24206           51.37
#> [10,]      0.00218      0.00208      0.24108           46.73
#> [11,]      0.00451      0.00424      0.23900           42.18
#> [12,]      0.00855      0.00785      0.23476           37.72
#> [13,]      0.01529      0.01359      0.22691           33.34
#> [14,]      0.02537      0.02149      0.21332           29.02
#> [15,]      0.03829      0.03012      0.19183           24.82
#> [16,]      0.05411      0.03800      0.16171           20.73
#> [17,]      0.06992      0.04113      0.12371           16.72
#> [18,]      0.08265      0.03682      0.08258           12.54
#> [19,]      0.15944      0.04576      0.04576            7.32

From this table one can calculate, for ex.,

  • Proportion of newborn that will eventually die from cause i: l0i/l0l_0^i/l_0
  • Proportion of people who survive to age xx that will die from cause i: lxi/lxl_x^i/l_x

Associated single decrement life table

There is asdt() function that calculates associated single decrement life table (ASDT) for causes of death (cause-deleted life table). In other words, by this function one can answer the question “what will be the life expectancy if there is no mortality from cause i?” It is a natural expansion of Multiple Decrement Life Table (MLT function, see above)

For example in the demor data (as it is easy to guess, taken from E. M. Andreev and Shkolnikov (2012)) on mortality of US men in 2002 by some causes is added. Let me show what would be exe_x if there is no deaths from neoplasm (i).

data("asdtex")

asdt_neoplasm <- asdt(age = asdtex$age, 
                      sex = "m",
                      m_all = asdtex$all, 
                      m_i = asdtex$neoplasms, 
                      full = F, 
                      method = "chiang1968")

head(asdt_neoplasm[,c("age", "ex", "ex_without_i")])
#>   age    ex ex_without_i
#> 1   0 74.65        84.29
#> 2   1 74.22        83.93
#> 3   5 70.32        80.03
#> 4  10 65.38        75.08
#> 5  15 60.45        70.14
#> 6  20 55.72        65.40

One can plot the results using ggplot2:


ggplot(data = asdt_neoplasm, aes(x = age))+
  geom_line(aes(y = ex, color = "ex all"), size = 1)+
  geom_line(aes(y = ex_without_i, color = "ex without neoplasms"), size = 1)+
  theme_linedraw()


ggplot(data = asdt_neoplasm, aes(x = age))+
  geom_line(aes(y = lx, color = "lx all"), size = 1)+
  geom_line(aes(y = l_not_i, color = "lx without neoplasms"), size = 1)+
  geom_hline(yintercept = 0.5, linetype = "dashed")+
  theme_linedraw()

Mortality models for mx approximation

In demor there is a function mort.approx for modeling mx. Now “Gompertz” and “Brass” are supported (see Preston, Heuveline & Guillot, 2001 for more details on the functions).

Function returns list with estimated model and dataframe with predicted mx.

The example below shows the model that try to approximate mx of russian men in 2010 using Brass function and russian mortality of 2000 as standard mortality.

rus2010 <- dbm[dbm$year==2010 & dbm$code==1100 & dbm$sex=="m" & dbm$territory=="t",]
rus2000 <- dbm[dbm$year==2000 & dbm$code==1100 & dbm$sex=="m" & dbm$territory=="t",]


brass_2010 <- mort.approx(mx = rus2010$mx,
                          age = rus2010$age,
                          model = "Brass",
                          standard.mx = rus2000$mx,
                          sex = "m")
brass_2010[[1]]
#> Nonlinear regression model
#>   model: 0.5 * log((qx/(1 - qx))) ~ a + b * stand.logit
#>    data: parent.frame()
#>        a        b 
#> -0.06074  1.07357 
#>  residual sum-of-squares: 0.09628
#> 
#> Number of iterations to convergence: 1 
#> Achieved convergence tolerance: 1.178e-08

Lets plot the modeled and observed mx.

brass_2010[[2]] %>% 
  mutate(mx = rus2010$mx) %>% 
  ggplot(aes(x = age))+
  geom_line(aes(y = log(mx), color = "Observed"), linewidth = 1)+
  geom_line(aes(y = log(mx.pred), color = "Predicted"), linewidth = 1)+
  theme_linedraw()+
  scale_color_manual(values = c("blue", "red"))+
  labs(x = "Age", y = "ln mx")

Fertility

Get fertility data

DEPRECATED

Lets get basic fertility data (asFR or fxf_x) from RosBris using get_rosbris()

# dbf <- get_rosbris(
#   #fertility data
#   type = "f",
#   #what age group download
#   age =  1,
#   #to get "long" data
#   initial = F,
#   #last available year
#   lastyear = 2022
# )

Recently RosBris has been switched to new website, so the function get_rosbris does not work now. I hope in a several months all mistakes due to new website will be fixed and function will work.

Now RosBris data is presented in the demor as datasets in the long-format. The example of usage that replicates the data from the previous chunk with get_rosbris:

For the example Russia-2010 is gotten

rus2010f <- dbf[dbf$year==2010 & dbf$code==1100 & dbf$territory=="t",]
head(rus2010f)
#>       year code territory age       fx       N       Bx                 name
#> 16155 2010 1100         t  15 0.002828  718636  2032.30 Российская Федерация
#> 16156 2010 1100         t  16 0.007868  738942  5814.00 Российская Федерация
#> 16157 2010 1100         t  17 0.019538  806058 15748.76 Российская Федерация
#> 16158 2010 1100         t  18 0.036598  915096 33490.68 Российская Федерация
#> 16159 2010 1100         t  19 0.054816 1035519 56763.01 Российская Федерация
#> 16160 2010 1100         t  20 0.068020 1128146 76736.49 Российская Федерация

TFR

Now one can compute total fertility age (TFR) - the most popular measure of fertility - that is standardized measure of “the average number of children a woman would bear if she survived through the end of the reproductive age span and experienced at each age a particular set of age-specific fertility ages” (Preston, Heuveline, and Guillot 2001, 95).

Surely, using tfr function one can compute TFR by parity whereby using fxpf_x^p instead of overall fxf_x.

tfr(
  #asFR
  rus2010f$fx,
  #age interval
  age.int = 1
    )
#> [1] 1.565588

However, period “TFR is a very problematic measure for assessing both the need for and the impact of policy changes and, more generally, for studying fertility trends in conjunction with selected social and economic trends” (Sobotka and Lutz 2010, 639). One correction to usual TFR is a tempo (meaning fertility schedule) adjusted TFR, TFR’, proposed by Bongaarts and Feeney (1998) (see also Bongaarts and Feeney (2000)). To calculate it, one can use tatfr function:


dbf5 <- demor::rosbris_fertility_pop_5 %>% dplyr::filter(code == 1100 & year %in% 2009:2011 & territory == "t")

past_fx = dbf5[dbf5$year == 2009,] %>% select(6:10) %>% as.list()
present_fx = dbf5[dbf5$year == 2010,] %>% select(6:10) %>% as.list()
post_fx = dbf5[dbf5$year == 2011,] %>% select(6:10) %>% as.list()

tatfr(past_fx = past_fx, present_fx = present_fx, post_fx = post_fx, age = seq(15, 50, 5))
#> $tatfr
#> [1] 1.726415
#> 
#> $tatfr_i
#> [1] 0.91379429 0.57933163 0.16968478 0.04180952 0.02179487
#> 
#> $tfr
#> [1] 1.584185
#> 
#> $tfr_i
#> [1] 0.799570 0.567745 0.156110 0.039510 0.021250

MAC

MAC is a mean age at childbearing. One can compute it using fxf_x. Surely, using mac function one can compute mean age at childbearing by parity whereby using fxpf_x^p instead of overall fxf_x.

mac(
  #asFR
  rus2010f$fx,
  #array with ages
  age = 15:55
    )
#> [1] 27.65

Fertility models for ASFR approximation

In demor there is a function fert.approx for modeling ASFR. Now “Hadwiger”, “Gamma” and “Brass” are supported (see Peristera and Kostaki (2007) for more details on the functions).

Hadwiger model (optimal choice with balance of simplicity and accuracy) is as follows:

f(age)=abccage3/2exp[b2(cage+agec2)]f(age) = \frac{ab}{c} \frac{c}{age}^{3/2} exp[-b^2(\frac{c}{age}+\frac{age}{c}-2)]

Gamma model (sophisticated and accurate, but not really sustainable due to convergence issues) is as follows: f(age)=RΓ(b)cb(aged)b1exp[(agedc)]f(age) = \frac{R}{\Gamma(b)c^b}(age-d)^{b-1} exp[-(\frac{age-d}{c})]

Brass model (the simplest and the most inaccurate) is as follows:

f(age)=c(aged)(d+wage)f(age) = c(age-d)(d+w-age)

Function returns list with estimated model (parameters, R-squared, variance-covariance matrix of parameters) and dataframe with predicted ASFR.

The example below shows the model that try to approximate ASFR of russian women in 2010 using Gamma function.

approximation_2010 = fert.approx(fx = rus2010f$fx, age = 15:55, model = "Gamma", boot = T)

approximation_2010[[1]]
#> $type
#> [1] "Gamma"
#> 
#> $params
#>        R        b        c        d 
#> 1.603204 8.512910 2.161928 9.048034 
#> 
#> $Rsq
#> [1] 0.9954142
#> 
#> $covmat
#>               R            b             c             d
#> R  0.0002956914 -0.002461275  0.0006176426  0.0006317489
#> b -0.0024612750  1.668815802 -0.2388863457 -1.6943450227
#> c  0.0006176426 -0.238886346  0.0349105043  0.2375133186
#> d  0.0006317489 -1.694345023  0.2375133186  1.7605961160

Lets plot the modeled and observed ASFR with bootstrapped 95% CI. One can see that model perfectly approximates real ASFR from 15 to 40 ages, while after the fit is not really good.

data = approximation_2010[[2]]
ggplot()+
  theme_linedraw()+
  geom_line(data = data, aes(x = age, y = predicted, color = "fx from Gamma model"), linewidth = 1)+
  geom_ribbon(data = data,
              aes(ymin = low_2.5prc, ymax = high_97.5prc, x = age, color = "95% CI"), alpha = 0, linetype = "dashed")+
  geom_line(data = data.frame(real = rus2010f$fx, age = 15:55), aes(x = age, y = real, color = "Observed fx"), linewidth = 1)+
  scale_color_manual(values = c("red", "red", "blue"))+
  labs(y = "fx", x = "Age")

Lets now do the same procedure but with Hadwiger function:

data = fert.approx(fx = rus2010f$fx, age = 15:55, model = "Hadwiger", boot = T)[[2]]
ggplot()+
  theme_linedraw()+
  geom_line(data = data, aes(x = age, y = predicted, color = "fx from Hadwiger model"), linewidth = 1)+
  geom_ribbon(data = data,
              aes(ymin = low_2.5prc, ymax = high_97.5prc, x = age, color = "95% CI"), alpha = 0, linetype = "dashed")+
  geom_line(data = data.frame(real = rus2010f$fx, age = 15:55), aes(x = age, y = real, color = "Observed fx"), linewidth = 1)+
  scale_color_manual(values = c("red", "red", "blue"))+
  labs(y = "fx", x = "Age")

Lets now do the same procedure but with Brass function:

data = fert.approx(fx = rus2010f$fx, age = 15:55, model = "Brass", boot = T)[[2]]
ggplot()+
  theme_linedraw()+
  geom_line(data = data, aes(x = age, y = predicted, color = "fx from Brass model"), linewidth = 1)+
  geom_ribbon(data = data,
              aes(ymin = low_2.5prc, ymax = high_97.5prc, x = age, color = "95% CI"), alpha = 0, linetype = "dashed")+
  geom_line(data = data.frame(real = rus2010f$fx, age = 15:55), aes(x = age, y = real, color = "Observed fx"), linewidth = 1)+
  scale_color_manual(values = c("red", "red", "blue"))+
  labs(y = "fx", x = "Age")

Projections

Lee-Carter model

In the demor there is leecart() function that provides users with basic Lee-Carter model (proposed by Lee and Carter (1992) and that now has a lot of extensions, which are not yet implemented in the demor) for mortality forecasting:


dbm.1 <- demor::rosbris_mortality_pop_1

leecart_forecast <- leecart(data = dbm.1[dbm.1$code==1100 & 
                                           dbm.1$territory=="t" & 
                                           dbm.1$sex=="m" & 
                                           dbm.1$year %in% 2004:2019,c("year", "age", "mx")], 
                            n = 5, 
                            sex = "m", 
                            concise = T
                            )
head(leecart_forecast)
#>   year age           mx     mx_low95    mx_high95    ex ex_low95 ex_high95
#> 1 2020   0 0.0051383321 0.0048555214 0.0054376152 69.45    68.87     70.03
#> 2 2020   1 0.0004663746 0.0004393745 0.0004950340 68.80    68.24     69.37
#> 3 2020   2 0.0002934932 0.0002756235 0.0003125214 67.83    67.27     68.40
#> 4 2020   3 0.0002396593 0.0002248883 0.0002554005 66.85    66.29     67.42
#> 5 2020   4 0.0002001969 0.0001869453 0.0002143880 65.87    65.31     66.43
#> 6 2020   5 0.0001918685 0.0001805702 0.0002038738 64.88    64.32     65.44

One can plot the results using ggplot2 to compare predicted data with actual that, however, requires some data handling:

#LE data calculation
ledata <- data.frame(year = 2010:2022, lem = NA, lem.predicted = NA, ci.low = NA, ci.high = NA)
for(i in 2010:2022){
  ledata[ledata$year==i,2] <- LT(age = unique(dbm$age), 
                                 sex = "m", 
                                 mx = dbm[dbm$year==i & 
                                            dbm$territory == "t" & 
                                            dbm$code == 1100 & 
                                            dbm$sex == "m",]$mx)[1,"ex"]
  if(i>=2019){
    if(i==2019){
      ledata[ledata$year==i,3:5] <- rep(ledata[ledata$year==i,2],3)
    }else{
      ledata[ledata$year==i,3:5] <- leecart_forecast[leecart_forecast$age=="0" & 
                                                     leecart_forecast$year == i,6:8] %>% as.numeric()
    }
  }
}

ggplot(data = ledata, aes(x = year))+
  geom_point(aes(y = lem, color = "Observed"))+
  geom_line(aes(y = lem, color = "Observed"))+
  geom_point(aes(y = lem.predicted, color = "Predicted\n(counterfactual)"))+
  geom_line(aes(y = lem.predicted, color = "Predicted\n(counterfactual)"))+
  geom_ribbon(aes(ymin = ci.low, ymax = ci.high), fill = "pink", alpha = 0.5)+
  geom_vline(xintercept = 2019, linetype = "dashed", color = "darkgrey", size = 1)+
  scale_x_continuous(breaks = 2010:2022)+
  scale_y_continuous(breaks = 63:70)+
  theme_linedraw()+
  theme(axis.text.x = element_text(angle = 45, vjust = 1, hjust=1))+
  scale_color_manual(values = c("black", "darkred"))+
  labs(x="Year", y = "Male's LE at birth", colour = "Data:")

Leslie matrix & Cohort-component model

Leslie matrix is a powerful tool for demographic analysis that was introduced by Leslie (1945). For a nice and detailed introduction to it and matrix projections in general, see Wachter (2014), p. 98-122. Also see R package demogR and its tutorial (Jones 2007) for more on “matrix methods” in demography.

In the demor one can compute its using mxm_x and fxf_x that is age-specific mortality and fertility rates respectively.

mx <- dbm.1[dbm.1$code==1100 & dbm.1$territory=="t" & dbm.1$sex=="f" & dbm.1$year == 2022,]$mx
fx <- dbf[dbf$code==1100 & dbf$territory=="t" & dbf$year == 2022,]$fx

les <- leslie(mx = mx, fx = fx, age1 = 0:100, age2 = 15:55, sex = "f")

les[1:5, 1:5]
#>           [,1]      [,2]      [,3]      [,4] [,5]
#> [1,] 0.0000000 0.0000000 0.0000000 0.0000000    0
#> [2,] 0.9964981 0.0000000 0.0000000 0.0000000    0
#> [3,] 0.0000000 0.9996486 0.0000000 0.0000000    0
#> [4,] 0.0000000 0.0000000 0.9997589 0.0000000    0
#> [5,] 0.0000000 0.0000000 0.0000000 0.9998292    0

Leslie matrix 𝐋\pmb L can be expressed as 𝐋=𝐅+𝐌\pmb L = \pmb F + \pmb M where 𝐅\pmb F is the fertility component, which has nonzero values only across first row, and 𝐌\pmb M is the “survival” component, which is “shifted” down diagonal matrix. From the 𝐌\pmb M one can compute life expectancy vector 𝐞\pmb e as column sum of the matrix 𝐄\pmb E

𝐄=(𝐈𝐌)1 \pmb E = (\pmb I - \pmb M)^{-1} 𝐞=𝐄T×𝟏 \pmb e = \pmb E^T \times \pmb 1

where 𝐈\pmb I is an identity matrix of the same size as 𝐌\pmb M (so n×nn \times n) and 𝟏\pmb 1 is a column vector of size nn of 1.

So, in the R it is

M <- les
M[1,] <- 0
E <- solve(diag(nrow(M))-M)
ex = t(E) %*% rep(1, nrow(E))
head(ex)
#>          [,1]
#> [1,] 77.89959
#> [2,] 77.16983
#> [3,] 76.19661
#> [4,] 75.21474
#> [5,] 74.22741
#> [6,] 73.23845

The graph below compares life expectancy at age x, exe_x, from the usual life table and from Leslie matrix (they are almost identical). However, there is a discrepancy in the last age group that is, according to the life table, has higher life expectancy than in Leslie model. This is because of the last age-group survival rate calculating. In the classical Leslie model (default in leslie) it is 0. It can be changed by assuming that it is Tx/Tx1T_x/T_{x-1} and then the exe_x from life table and Leslie matrix will be identical (in leslie one can write fin = TRUE to get this; by default, it is FALSE as in this example).

Also from 𝐋\pmb L one can compute stable population properties: stable age distribution, asymptotic growth rate, etc. For more details, references and functions see Jones (2007), p. 10-24 where R package demogR for matrix demographic models and their analysis was introduced. Below is some simple example how to do it “by hand”:

lambda = Re(eigen(les)$values) # largest real eigenvalue
lvector = Re(eigen(les)$vectors[,which(lambda == max(lambda))]) # corresponding eigenvector
stable_age = lvector / sum(lvector) # stable age distribution (normalized to sum to 1) 

The graph below compares observable and stable age distributions

Finally, one of the most important implications of Leslie matrix is the demographic projection in a concise and efficient matrix form. 𝐍(t+n)=𝐋×𝐍(t) \pmb N(t+n) = \pmb L \times \pmb N(t) where 𝐍(t)\pmb N(t) and 𝐍(t+n)\pmb N(t+n) are column-vectors of population in the time tt and t+nt+n respectively. If the mortality and fertility rates are constant over time, the population in the last year of projection (of the horizon h) is simply 𝐍(t0+n×h)=𝐋h×𝐍(t0) \pmb N(t_0+n \times h) = \pmb L^h \times \pmb N(t_0)

Usually to calculate the 𝐋h\pmb L^h it is easier (and more accurate) to use iterative procedure since the modern computers are able to do it in a second (for reasonable horizon), but one surely can diagonalize 𝐋\pmb L and find its transition matrix to reach more efficient (though less accurate) calculations.

Below is a small function to produce such constant-rates projections assuming closed population and using iterative procedure:

constant.proj <- function(les, h, N){
  dat <- cbind(N)
  for(i in 1:h){
    dat <- cbind(dat, N) 
  }
  colnames(dat) <- paste0("t", 0:h)
  for(i in 1:h){
    dat[,i+1] <- les %*% dat[,i]
  }
  return(round(dat))
}

N0 <- dbm.1[dbm.1$code==1100 & dbm.1$territory=="t" & dbm.1$sex=="f" & dbm.1$year == 2022,]$N
Nt <- constant.proj(les, 78, N0)
head(Nt)[,1:10]
#>          t0     t1     t2     t3     t4     t5     t6     t7     t8     t9
#> [1,] 644191 615841 597083 580146 565350 552982 543234 536141 531754 530166
#> [2,] 675909 641935 613684 594992 578114 563370 551046 541331 534264 529892
#> [3,] 696760 675671 641710 613469 594783 577911 563172 550852 541141 534076
#> [4,] 736044 696592 675509 641555 613321 594640 577772 563036 550720 541011
#> [5,] 786765 735918 696473 675393 641445 613216 594538 577673 562940 550625
#> [6,] 856488 786646 735807 696368 675291 641349 613123 594449 577586 562855

And two graphs below show dynamics of total female population (in mln) and female population age structure (in %) that converges to stable population age structure.

plot(2022:2100, colSums(Nt)/1e6, type = "l", 
     xlab = "year", ylab = "Female Population (mln)")

Nt %>% 
  as.data.frame() %>% 
  mutate(age = 0:100) %>% 
  pivot_longer(!age, names_to = "time", values_to = "pop") %>% 
  mutate(time = as.numeric(sub('.', '', time))) %>% 
  group_by(time) %>% 
  mutate(pop.s = 100*pop / sum(pop)) %>% 
  as.data.frame() %>% 
  mutate(time = time + 2022) %>% 
  ggplot(aes(x = age, y = pop.s))+
  geom_line(aes(group = time, color = time))+
  scale_color_gradient(low = "pink", high = "steelblue")+
  geom_line(data = data.frame(age = 0:100, stable_age = 100*stable_age), 
            aes(x = age, y = stable_age, linetype = "Stable age\nstructure"), 
            color = "darkblue")+
  theme_classic()+
  scale_linetype_manual(values = c("longdash"))+
  labs(x = "age", y = "Age structure (%)", color = "Year:", linetype = "")

Other functions

Population pyramid

plot_pyr plots population pyramid using ggplot2

Lets create population pyramid using midyear population from Rosbris mortality data. We already have data in dbm.


plot_pyr(
  popm = dbm[dbm$year==2022 & dbm$code==1100 & dbm$territory=="t" & dbm$sex=="m",]$N,
  popf = dbm[dbm$year==2022 & dbm$code==1100 & dbm$territory=="t" & dbm$sex=="f",]$N, 
  age = dbm[dbm$year==2022 & dbm$code==1100 & dbm$territory=="t" & dbm$sex=="f",]$age)

Also one can redesigned plot using ggplot2 commands:


plot <- 
  plot_pyr(
  popm = dbm[dbm$year==2010 & dbm$code==1100 & dbm$territory=="t" & dbm$sex=="m",]$N,
  popf = dbm[dbm$year==2010 & dbm$code==1100 & dbm$territory=="t" & dbm$sex=="f",]$N, 
  age = dbm[dbm$year==2010 & dbm$code==1100 & dbm$territory=="t" & dbm$sex=="f",]$age)

plot + 
  labs(y = "My y-axis name", 
       x = "My age name", 
       caption = "Data sourse: Rosbris")+
  ggtitle("Rissian population in 2010")+
  scale_fill_manual(labels = c("f", "m"), values = c("pink", "lightblue"))+
  theme_minimal()


plot_pyr(
  popm = dbm.1[dbm.1$year==2010 & dbm.1$code==1100 & dbm.1$territory=="t" & dbm.1$sex=="m",]$N,
  popf = dbm.1[dbm.1$year==2010 & dbm.1$code==1100 & dbm.1$territory=="t" & dbm.1$sex=="f",]$N, 
  age = dbm.1[dbm.1$year==2010 & dbm.1$code==1100 & dbm.1$territory=="t" & dbm.1$sex=="f",]$age, 
  age.cont = T)

Median age

Also one can compute median age of some population, using vectors of population sizes in the age groups and that age groups.

#Using 1-year age interval
med.age(N = dbm.1[dbm.1$year==2010 & dbm.1$code==1100 & dbm.1$territory=="t" & dbm.1$sex=="m",]$N,
        age = dbm.1[dbm.1$year==2010 & dbm.1$code==1100 & dbm.1$territory=="t" & dbm.1$sex=="m",]$age,
        int = 1)
#> [1] 34.97

#Using 5-year age interval
med.age(N = dbm[dbm$year==2010 & dbm$code==1100 & dbm$territory=="t" & dbm$sex=="m",]$N,
        age = dbm[dbm$year==2010 & dbm$code==1100 & dbm$territory=="t" & dbm$sex=="m",]$age,
        int = 5)
#> [1] 34.99

Recoding age

ages converts continuous ages to age groups. For example, array with {0,1,2,3...100}\{ 0,1,2,3...100 \} it can convert to {<5,69,1014...85+}\{<5,6-9,10-14 ... 85+ \} using user-specific lower boundaries.

groups <- c(0, 1, 5, 10, 20, 45, 85) #lower boundaries of age groups
age <- 0:100 #array with ages

new.age <- ages(x = age, groups = groups, char = TRUE)

table(new.age) #see new groups with the number of values in the group 
#> new.age
#>     0   1-4   5-9 10-19 20-44 45-84   85+ 
#>     1     4     5    10    25    40    16

References

Aburto, José Manuel, Ugofilippo Basellini, Annette Baudisch, and Francisco Villavicencio. 2022. “Drewnowski’s Index to Measure Lifespan Variation: Revisiting the Gini Coefficient of the Life Table.” Theoretical Population Biology 148: 1–10. https://doi.org/10.1016/j.tpb.2022.08.003.
Andreev, E. M., and V. M. Shkolnikov. 2012. “An Excel Spreadsheet for the Decomposition of a Difference Between Two Values of an Aggregate Demographic Measure by Stepwise Replacement Running from Young to Old Ages.” Rostock: Max Planck Institute for Demographic Research. https://www.demogr.mpg.de/en/publications_databases_6118/publications_1904/mpidr_technical_reports/an_excel_spreadsheet_for_the_decomposition_of_a_difference_between_two_values_of_an_aggregate_demographic_4591.
Andreev, Evgeny M., and W. Ward Kingkade. 2015. Average age at death in infancy and infant mortality level: Reconsidering the Coale-Demeny formulas at current levels of low mortality.” Demographic Research 33 (13): 363–90. https://doi.org/10.4054/DemRes.2015.33.13.
Arriaga, Eduardo E. 1984. “Measuring and Explaining the Change in Life Expectancies.” Demography 21 (1): 83–96. https://doi.org/10.2307/2061029.
Bongaarts, John, and Griffith Feeney. 1998. “On the Quantum and Tempo of Fertility.” Population and Development Review, 271–91.
———. 2000. “On the Quantum and Tempo of Fertility: Reply.” Population and Development Review 26 (3): 560–64.
Ghislandi, Simone, Warren C. Sanderson, and Sergei Scherbov. 2019. “A Simple Measure of Human Development: The Human Life Indicator.” Population and Development Review 45 (1): 219–33. https://doi.org/10.1111/padr.12205.
Hanada, Kyo. 1983. “A Formula of Gini’s Concentration Ratio and Its Application to Life Tables.” Journal of the Japan Statistical Society, Japanese Issue 13 (2): 95–98. https://doi.org/10.11329/jjss1970.13.95.
Jones, James Holland. 2007. “demogR: A Package for the Construction and Analysis of Age-Structured Demographic Models in r.” Journal of Statistical Software 22 (10): 1–28. https://doi.org/10.18637/jss.v022.i10.
Lee, Ronald D, and Lawrence R Carter. 1992. “Modeling and Forecasting US Mortality.” Journal of the American Statistical Association 87 (419): 659–71. https://doi.org/10.1080/01621459.1992.10475265.
Leslie, P. H. 1945. “On the Use of Matrices in Certain Population Mathematics.” Biometrika 33 (3): 183–212. https://doi.org/10.2307/2332297.
Martinez, Ramon, Patricia Soliz, Roberta Caixeta, and Pedro Ordunez. 2019. “Reflection on Modern Methods: Years of Life Lost Due to Premature Mortality—a Versatile and Comprehensive Measure for Monitoring Non-Communicable Disease Mortality.” International Journal of Epidemiology 48 (4): 1367–76. https://doi.org/10.1093/ije/dyy254.
Peristera, Paraskevi, and Anastasia Kostaki. 2007. “Modeling Fertility in Modern Populations.” Demographic Research 16: 141–94. https://doi.org/10.4054/DemRes.2007.16.6.
Preston, Samuel H., Patrix Heuveline, and Michel Guillot. 2001. Demography: Measuring and Modeling Population Process. Malden: Blackwell.
Shkolnikov, Vladimir, Evgeny M. Andreev, and Alexander Begun. 2003. Gini coefficient as a life table function: Computation from discrete data, decomposition of differences and empirical examples.” Demographic Research 8 (11): 305–58. https://doi.org/10.4054/DemRes.2003.8.11.
Sobotka, Tomáš, and Wolfgang Lutz. 2010. “Misleading Policy Messages Derived from the Period TFR: Should We Stop Using It?” Comparative Population Studies 35 (3).
The Russian Fertility and Mortality database.” 2024. Database. The Centre for Demographic Research at the New Economic School. http://demogr.nes.ru/index.php/ru/demogr_indicat/data.
Vaupel, James W, and Vladimir Canudas Romo. 2003. “Decomposing Change in Life Expectancy: A Bouquet of Formulas in Honor of Nathan Keyfitz’s 90th Birthday.” Demography 40: 201–16. https://doi.org/10.1353/dem.2003.0018.
Wachter, Kenneth W. 2014. Essential Demographic Methods. Harvard University Press.
Wilmoth, John R, and Shiro Horiuchi. 1999. “Rectangularization Revisited: Variability of Age at Death Within Human Populations.” Demography 36 (4): 475–95. https://doi.org/10.2307/2648085.
Wrycza, Tomasz F, Trifon I Missov, and Annette Baudisch. 2015. “Quantifying the Shape of Aging.” PloS One 10 (3): e0119163. https://doi.org/10.1371/journal.pone.0119163.
Андреев, ЕМ. 1982. “Метод Компонент в Анализе Продолжительности Жизни.” Вестник Статистики 9: 42–47.