Evert Van Imhoff
Wendy Post
Microsimulation methods for population projection
In: Population, 10e année, n°1, 1998, pp. 97-136.

Cite this document:
Van Imhoff Evert, Post Wendy. Microsimulation methods for population projection. In: Population, 10e année, n°1, 1998, pp. 97-136.
http://www.persee.fr/web/revues/home/prescript/article/pop_0032-4663_1998_hos_10_1_6824

Abstract
Van Imhoff (Evert), Post (Wendy). - Microsimulation methods for population projections
Microsimulation
differs from traditional macrosimulation in using a sample rather than the total population, in operating at
the level of individual data rather than aggregated data, and in being based on repeated random
experiments rather than average numbers. Here are presented the circumstances in which
microsimulation can be of greater value than the more conventional methods. It is particularly relevant
when the results of the process being studied are complex whereas the forces driving it are simple. A
particular problem in microsimulation results from the fact that the projections are subject to random
variation. Various sources of random variations are examined but the most important is the one we refer
to as specification randomness: the more explanatory variables are included in the model, the greater
the degree of random variation affecting the output of the model. After a brief survey of the
microsimulation models which exist in demography, a number of the essential characteristics of microsimulation are illustrated using the KINSIM model for projecting the future size and structure of
kinship networks.
MICROSIMULATION METHODS
FOR POPULATION PROJECTION
Evert VAN IMHOFF* and Wendy POST**
I. - Introduction
Population projections are almost invariably produced with the so-
called cohort-component method. In its simplest form, this method boils
down to the following. The population is classified by sex (males and females) and age group (cohorts). For each combination of sex s and age x, the initial population is transformed into a projected final population of sex s and age x+1 by projecting the population changes, distinguished by type (components). Typical components are mortality and fertility. These calculations are repeated for successive time intervals, where the final population of one interval serves as the initial population for the next interval, until the end of the projection period has been reached.
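To make the mechanics concrete, here is a minimal Python sketch of one cohort-component step; the data structures, rate values and function name are our own illustrative assumptions, not part of any model described in this paper. Feeding each interval's output back in as the next interval's input reproduces the iteration just described.

```python
# Illustrative cohort-component step (not from the paper): the population is a
# dict keyed by (sex, age); mortality and fertility rates are assumed given per cell.

def project_one_interval(population, mortality, fertility):
    """Advance the population by one projection interval (one year)."""
    projected = {}
    births = 0.0
    for (sex, age), count in population.items():
        # Survivors of each sex-age cell move up one age group.
        survivors = count * (1.0 - mortality[(sex, age)])
        projected[(sex, age + 1)] = projected.get((sex, age + 1), 0.0) + survivors
        # Births are produced by women according to age-specific fertility rates.
        if sex == "f":
            births += count * fertility.get(age, 0.0)
    # Newborns enter at age 0; an even sex ratio is assumed for simplicity.
    projected[("f", 0)] = births / 2.0
    projected[("m", 0)] = births / 2.0
    return projected

pop = {("f", 25): 100_000.0, ("m", 25): 100_000.0}
deaths = {("f", 25): 0.001, ("m", 25): 0.0015}
print(project_one_interval(pop, deaths, {25: 0.10}))
```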
The basic idea behind the cohort-component model is that the population changes because individuals experience certain demographic events,
and that the mechanisms underlying these events differ between the sexes,
age groups, and the type of event. The total number of events of a certain
type, for each combination of age and sex, is projected as the result of
two factors: the size of the population exposed to the risk of experiencing
the event; and the level (or intensity) of the risk for individual persons,
which may be interpreted as a measure of demographic behaviour.
Suppose that we want to project the number of children born during
the year out of 100,000 women aged 25. The population consists of
100,000 women and each 25-year-old woman has a probability of 0.10 to bear a child during the year (i.e. the age-specific fertility 'rate' is 0.10).
Now according to the traditional methods of demographic projection, which
might be called macrosimulation, the projected number of births is obtained
by applying the fertility probability to the size of the group of women:
0.10 x 100,000 yields 10,000 projected births.
* Netherlands Interdisciplinary Demographic Institute (NIDI), The Hague, Netherlands.
** Department of Medical Statistics, Faculty of Medicine, Leyden University, Netherlands.
Population: An English Selection, special issue New Methodological Approaches in the Social Sciences, 1998, 97-138.
In contrast, microsimulation would proceed as follows:(1)
— first, a sample of, say, 1,000 women is drawn from the population;
— next, for each woman in the sample, a random experiment is done
with 0.10 probability of success. More specifically, in each experiment a
random number is drawn from the uniform distribution over the (0,1) interval. If the number drawn is less than 0.10, the woman is deemed to
have a child. For understandable reasons, this roulette-like procedure is
known as the Monte Carlo technique. On average, 1,000 experiments will
yield 100 successes, i.e. births. However, in a particular model run there
can be either fewer or more than 100 simulated births;
— finally, the number of births in the sample is scaled to
the population level: 100 births among a sample of 1,000 implies
10,000 births for a population of 100,000.
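The three steps above can be written down almost literally; the sketch below is only an illustration (sample size, seed and function name are assumptions of ours), with the sample-drawing step represented simply by the sample size. A single run typically lands near, but not exactly on, the macro result of 0.10 x 100,000 = 10,000 births.

```python
import random

def microsimulate_births(sample_size=1_000, prob=0.10, population_size=100_000, seed=1):
    """Monte Carlo version of the fertility example: one random experiment per woman."""
    rng = random.Random(seed)
    # Step 2: one uniform draw on (0,1) per sampled woman; a draw below `prob` counts as a birth.
    births_in_sample = sum(1 for _ in range(sample_size) if rng.random() < prob)
    # Step 3: scale the sample count back to the population level.
    return births_in_sample * (population_size / sample_size)

print(microsimulate_births())  # close to, but generally not equal to, 10,000
```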
Now in this particular example, microsimulation is needlessly complicated. The projection problem is so trivial that no demographer would ever use microsimulation to solve it. Nevertheless, the example illustrates the three essential ingredients of the microsimulation approach that distinguish it from traditional macrosimulation:
— the model uses a sample rather than the total population;
— it works on the level of individual data rather than grouped data;
— it relies on repeated random experiments rather than on average
fractions.
Together, these three ingredients imply several strengths, as well as
several weaknesses for microsimulation. These strengths and weaknesses
will be discussed extensively in this article, but at the outset it should be
stressed that microsimulation can do certain things that macrosimulation
cannot. For this reason alone, microsimulation should definitely be taken
seriously as a potentially powerful tool for demographic as well as for
non-demographic projection purposes.
The origins of the microsimulation approach go back to the late 1950s
(Orcutt, 1957; Orcutt et al., 1961). With the advances in computer technology, the method has gained increasing popularity in recent years. However, many of the advantages and principles of the approach have been recognized independently and dispersed over several disciplines (Clarke, 1986). In demography, quite a number of applications of microsimulation exist today - an overview will be given later on in this paper - but a coherent literature on the essentials of demographic microsimulation is lacking. With this paper, we hope to provide a first attempt in this direction.
The outline of this paper is as follows. In section II, the conceptual
similarities and differences between micro- and macrosimulation, already
briefly introduced above, will be elaborated. In section III, we will outline
the strengths of microsimulation, and the resulting relevance of microsimulation for demographic projection purposes. However, microsimulation does have its drawbacks, too. In particular, the issue of randomness is extremely important in microsimulation, which is why we devote a separate section (section IV) to it. Section V deals with several other issues that are specific to microsimulation. Existing microsimulation models in demography, which can be viewed as concrete applications of the concept of microsimulation, are briefly reviewed in section VI. The flavour of microsimulation can best be obtained by having a closer look at a particular application. Therefore, in section VII we briefly discuss the microsimulation model KINSIM developed at NIDI, not because KINSIM is a particularly spectacular representative from the microsimulation family, but rather because it neatly illustrates several of the essential characteristics of microsimulation that were discussed from a more general perspective in earlier sections. The paper ends with a summary of the main conclusions.

(1) This description applies to microsimulation in its conventional form. Variance-reduction techniques like the sorting method work slightly differently. This issue will be elaborated in section IV.
II. - Microsimulation versus macrosimulation
There are numerous conceptual and practical differences between
microsimulation methods and macrosimulation methods. However, the importance of these differences should not be overstated. Microsimulation and macrosimulation have a lot of fundamental principles in common. Therefore, in
order to get a better understanding of the differences, we start this section
with a discussion of the common properties of both methods.
1. Common properties

Making a population projection is making statements about the future of the population. If such statements about the future are to be meaningful, they must be based on a more or less valid description of the various processes that govern the population system. In short, population projections must be based on a model. For the purpose of this paper, we define a model as a simplified, quantitative description of reality. This definition excludes several meanings in which the term 'model' is also commonly used. All models are simplifications, most models describe reality (with varying degrees of success), but only some are quantitative (Van Imhoff et al., 1995).
In particular, a projection model does not contain any non-specified parameters. This is in contrast to theoretical models, which describe how variables are linked without specifying the exact functional form of the
relationship. This is also in contrast to estimation models, in which the
functional form is specified but the parameters of the function are not.
All demographic projection models are simplified, quantitative descriptions of the processes that determine population structures. They are simplified in the sense that not all variables affecting population structures are included in the model (they are also simplified in terms of functional form). They are quantitative in the sense that one set of numbers goes in and another set of
numbers comes out. As a matter of fact, because of the obvious efficiency
gains that are achieved if all necessary calculations are made by computer
rather than by hand, many projection models are also concrete computer programs. Strictly speaking, the computer program that actually produces the
numbers coming out of the model should not be confused with the model
itself. In practice, the term 'model' is frequently also applied to the computer
program, and, admittedly, the dividing line is not always easy to discern.
This latter observation is particularly true for microsimulation models,
which lose virtually all of their usefulness once the computer is taken away.
In principle, an algebraic representation of a model can be manipulated to
study the properties of the model. A macro model which is not too complicated
might be analysed in this way. For more complex macro models, analytic
manipulation becomes infeasible, so that numerical simulation methods have
to be used to study the implications of the model. Micro models typically
are such that either an algebraic representation is impossible, or the algebraic
representation is so complicated that it does not allow analytic manipulation.
Thus, a micro model almost by definition implies numerical simulation, and
numerical simulation implies a computer program.
Once we have a quantitative description of the population system, it
can be used for describing how this system will develop over time, from
the present into the future. Since the model is a simplified description of
reality, it will always contain certain elements that are exogenous to the
model, i.e. their quantitative value is not explained inside the model. For
projection purposes, therefore, the model will have to be supplemented by
hypotheses concerning the future values of these exogenous elements.
Within the context of the projection model, such elements are
usually referred to as model parameters.
Since projection models make statements about the future, they must
always contain the time element in one way or another. In this sense, all
projection models are dynamic by definition. However, there are many
ways in which time can be included into the model. Merely adding an
index t to all model variables and parameters hardly warrants the term
'dynamic'. A truly dynamic model should not only specify what the system
looks like in the year 2000, but also how the system is supposed to get
there. In other words, the processes that underlie the changes in the system
variables should be explicitly included in the model. (2) In a truly dynamic
model, the focus is on
"events rather than things, processes rather than states, as the ultimate com
ponent of the world of reality" (Ryder, 1964, p. 450).
(2) In a way, the dynamic-versus-static issue is gradual rather than fundamental. For
instance, if a demographic forecaster hypothesizes that mortality rates will fall by 10%
between now and the year 2000, one could argue that the mortality component of the model
is static since he does not specify how the mortality rates are going to fall. Similarly, the
headship rate method for producing household projections is generally termed static since
the changes in age-specific headship are not explicitly specified; however, the changes in the
age-specific population size to which these headship rates are applied are explicitly modelled,
so the headship rate model is dynamic at least to some extent.
Now if we recall the elementary example given in the introduction
to this paper, microsimulation and macrosimulation are essentially two alternative methods for making similar statements about the future. Given
a description of reality ("the number of births is determined by the number
of women and the age-specific probability of bearing a child") and given
a hypothesis about the future value of the model parameters ("the fertility
rate will be 0.10"), both methods arrive at the same statement about the
future ("the expected number of births will be 10,000"). Of course, this
does not imply that both approaches are equally suitable implementations
for all descriptions of reality. However, conceptually the two approaches
share the essential feature of being based on a simplified description of
the real world. Just as the term 'simulation' suggests, the method of simulation, whether micro or macro, is based on the idea of imitating the process under consideration. Although the real world is imitated by a
simulation model - and, when supplemented by hypotheses on future
values of the parameters, by a projection model as well - a model
is just a model and therefore not capable of reproducing the real world.
"What we make when we simulate is not a likeness of the operation of the
world, but a likeness of some sets of our own ideas concerning the operation
of the world" (Wachter, 1987).
2. Differences

Both the microsimulation approach and the macrosimulation approach simulate a dynamic process: they describe the development of a system over time in terms of the events that underlie the changes of the central variables in the model. Now the essential feature of an event is that it is of the either/or type: either it happens, or it doesn't. At the population level one can speak of the 'average' occurrence of a certain type of event, but this average ultimately remains based on the individual occurrences. Thus, events are random variables that occur with a certain probability. When making a statement about a certain future number of events, we are in fact making a statement about the expected
value of a random variable. In doing so, both the microsimulation and the
macrosimulation approach rely upon the Law of Large Numbers. However,
they do so in different ways. A macro model assumes that the size of the
population (100,000 women) is so large that the projected number of events
(births) may be set equal to its expected value (which is 10,000). A micro
model assumes that the number of repetitions of the random experiment
in the sample (1,000) is so large that the resulting projected number of
events (which might be anything) will approximately equal its expected
value (100 in the sample; 10,000 after scaling to the population level).
Since the simulated process is inherently random, any projection into
the future is subject to random variation. The descriptive model is probabilistic. Therefore, the corresponding projection model should, in principle,
not only produce an expected value but also an indication of the variation
around the expected value. In macrosimulation, the random nature of the
process is generally disregarded altogether. Things like standard errors
could be calculated in principle also in macro models, but it is hardly ever
done in practice, primarily because the necessary calculations are extremely
complicated. In contrast, the random nature of the process is explicitly
modelled in microsimulation, viz. in the form of repeated probabilistic experiments (drawing random numbers and deciding whether or not the event
should be deemed to have taken place). Thus, the projections produced by
microsimulation are subject to random variation. Performing several model
runs in succession produces different projections, from which standard errors can be directly calculated. However, it should be added that this is still insufficiently done in practice. Too often, microsimulators just produce one model run and leave it at that.
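As an illustration of how directly such standard errors can be obtained, the short sketch below repeats the fertility example from the introduction under different random seeds and summarizes the spread across runs (all names and values are ours, for illustration only).

```python
import random
import statistics

def one_projection(sample_size=1_000, prob=0.10, scale=100, seed=0):
    """One microsimulation run of the fertility example, scaled to the population level."""
    rng = random.Random(seed)
    return scale * sum(1 for _ in range(sample_size) if rng.random() < prob)

# Repeated runs give different projections; their spread estimates the random
# variation attached to any single projected number of births.
results = [one_projection(seed=i) for i in range(50)]
print("mean projection:", statistics.mean(results))
print("standard deviation across runs:", statistics.stdev(results))
```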
At the heart of any modelling exercise lies the specification of the state
space: the representation of the components of the system of interest. At the
individual level, the state space consists of a number of characteristics or
attributes, each of which can take a certain value. At the population level,
the state space consists of all possible combinations of attribute values: it is
a breakdown of the individuals comprising the population by relevant characteristics. If there are K attributes and Mi categories for attribute i = 1, ..., K, the state space at the macro level consists of M1 × M2 × ... × MK cells: a matrix of this size is required for a complete description of the population by relevant characteristics. In contrast, at the micro level each individual is characterized by a vector of attribute values of length K; a total population of N individuals can then be described by a matrix with N × K cells.
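The two storage requirements can be compared directly from these expressions; the attribute counts and sample size below are invented purely to give an order of magnitude.

```python
from math import prod

def storage_cells(categories_per_attribute, sample_size):
    """Macro storage (cross-classification table) versus micro storage (individual list)."""
    macro_cells = prod(categories_per_attribute)               # M1 x M2 x ... x MK
    micro_cells = sample_size * len(categories_per_attribute)  # N x K
    return macro_cells, micro_cells

# K = 8 attributes with these category counts, versus a sample of N = 10,000 individuals.
print(storage_cells([2, 100, 6, 3, 5, 4, 2, 10], 10_000))  # -> (1440000, 80000)
```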
In macrosimulation, the calculations required for the projection are
carried out in terms of the cells in the aggregate cross-classification table:
for each cell, the projection model should evaluate how the number it contains will change over time. Microsimulation, on the other hand, does its
calculations in terms of the individual records: for each individual, the
attribute vector is updated according to the specifications of the model
and the results of the Monte Carlo experiments. This has two important
consequences for the distinction between microsimulation and macrosimulation. First, in microsimulation the behavioural equations of the underlying descriptive model should be reformulated into model specifications at the individual level; in macrosimulation they should be translated into behavioural equations at the aggregate level. Second, the storage and retention
of information in microsimulation occurs via a list of individuals and their
attributes; in macrosimulation this is done via the aggregate cross-classification table. For most applications where a relatively large number of attributes is considered, the size of the aggregate table, which consists of M1 × M2 × ... × MK cells, is much larger than the size of the list, which consists of N × K cells only. We will return to this issue in the next section.
A further difference between microsimulation and macrosimulation
is that the latter works in terms of the population as a whole, while the
former typically works in terms of a sample. There are two main reasons for this. First, it would be very impractical - and infeasible even with modern
computer technology - to include a record for each individual member of
the population. Second, microsimulation models typically take into account
a much larger number of covariates than do macro models. The joint distribution of all state variables and covariates is generally unknown at the
population level. Therefore, the necessary data are obtained from sample surveys, either cross-sectional surveys or longitudinal panels. These survey data can be fed directly into the database of individual records on which the microsimulation model operates. Naturally, macro models also frequently rely on
survey data for estimating information that is not available on the population
level. However, the link between the sample and the model is much more
explicit in the case of microsimulation: the list underlying microsimulation
is a sample, while the aggregate table underlying macrosimulation is the total
population, possibly supplemented by sample information.
Another difference, closely related to the previous one, concerns the
relationship between the empirical data feeding the model and the specification of the behavioural equations. In every modelling exercise, there
is always a moment during the stage of model building at which the data
have to be taken into account. In a macro approach, there is a fair degree
of flexibility on this point. Naturally, when specifying a macro model the
state space has to be properly taken into account right from the start, but
for most of the covariates the estimation phase usually comes at a later
stage and relationships can be specified in an indirect way. In contrast, in
a microsimulation approach all the data have to be taken into account from
the very beginning. To see this, we must recall that in microsimulation all
behavioural equations, in principle, operate at the individual level. Therefore, all explanatory variables should be available in the records of individual attributes (possibly including links to other individuals in the
database). What is more, to the extent that these explanatory variables are
allowed to change over time, the model should also include a behavioural
equation for this change. Thus, microsimulation models can be regarded
as models which generate their own explanatory variables.
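A hypothetical fragment of such a model makes the point: each individual record carries its covariates, and the model must advance those covariates itself before they can explain anything (the attribute names and transition probabilities below are invented for illustration).

```python
import random

rng = random.Random(42)

def update_individual(person, rates):
    """Advance one individual record by one year, generating the covariate as well."""
    person["age"] += 1
    # 'employed' is an explanatory variable, so its own changes must be simulated too.
    if rng.random() < rates["job_change"]:
        person["employed"] = not person["employed"]
    # A demographic event whose risk depends on the self-generated covariate.
    death_risk = rates["mortality"] * (0.8 if person["employed"] else 1.0)
    person["alive"] = rng.random() >= death_risk
    return person

record = {"age": 30, "employed": True, "alive": True}
print(update_individual(record, {"job_change": 0.05, "mortality": 0.002}))
```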
If we look at this problem from a different angle, we cannot avoid
the fundamental trade-off that any effort in modelling human behaviour
must face. This is the trade-off between information intensity, on the one
hand, and the capacity to make meaningful predictions, on the other hand.
The dependent variables of human behaviour are always stochastic.
Equally, our knowledge of the determinants of human behaviour is far from
being complete. These two facts together imply that there are limits to the
complexity of a projection model: beyond a certain point, the model becomes so complex that the resulting projections are no longer meaningful, being dominated by randomness. This holds both for macro and micro
models. However, in macro models it is much easier to isolate the central
process from its surroundings, by treating certain variables as being truly
exogenous. In doing so, one in fact acknowledges that partial processes
are insufficiently understood to justify their inclusion in the model. In
micro models, on the other hand, all explanatory variables must be included
at the individual level, and as a consequence, processes for generating time-
dependent explanatory variables (explanatory for the main process) must
be included in the model as well. Thus, macro models suffer from information loss, while micro models suffer from high data requirements and
a much larger influence of disturbance terms.
A final difference is that, because of the tight link between data and
model in microsimulation models, standardization of computer software is
much more difficult for micro models than for macro models. Existing
microsimulation computer applications are almost impossible to transfer,
and the software is not really user-friendly. Many macro models are much
more accessible because of the availability of excellent software.
III. - The usefulness of microsimulation for demographic
projection purposes
In the preceding section, we have discussed several similarities as
well as differences between microsimulation and macrosimulation approaches. It was stated that microsimulation and macrosimulation are essentially two alternative methods for making similar statements about the future. However, despite this essential similarity, for practical purposes one will virtually never be indifferent between microsimulation and macrosimulation. Some types of statements about the future are more conveniently arrived at using a microsimulation approach, while others will require a macrosimulation approach. For concrete research questions, the differences between the two approaches imply that one method has certain advantages
over the other. In this section, we will indicate the circumstances under which
microsimulation might be more useful than more conventional methods.
1. Strong points of microsimulation

A first strong point of microsimulation is its performance under conditions of a sizeable state space. If the number of individual attributes included in the model and the number of values that these can take become larger and larger, macro models tend to become unmanageable: the size of the state space increases exponentially with the number of categories included in the model. Recall that the aggregate table in a macro model contains M1 × M2 × ... × MK cells, while the size of the list (or database) in a micro model consists of N × K cells. For even moderately sized problems,
the former will be much larger than the latter. As an example, consider a purely
demographic model for France, in which the population is classified by:
— sex (males/females: 2 categories);
— parity (females only; 0,..., 5+: 6 categories);
— current age (0, ..., 99+: 100