Modeling complex spatial dynamics of two-population interaction in urbanization process

This paper aims to lay an empirical foundation for further research on the complex spatial dynamics of two-population interaction. Based on US population census data, a model of rural-urban population interaction is developed. A logistic equation for the percentage of urban population is then derived from the urbanization model, so that spatial interaction can be connected mathematically with logistic growth. Numerical experiments with the discretized urban-rural population interaction model show period-doubling bifurcation and chaotic behavior, whose patterns are identical to those of the simple mathematical models of logistic growth in ecology. This suggests that the complicated dynamics of logistic growth may arise from some kind of nonlinear interaction. The results of this study help to understand urbanization, urban-rural population interaction, chaotic dynamics, and the spatial complexity of geographical systems.


Introduction
The study of the logistic equation in ecology indicates that a simple deterministic system can exhibit periodic oscillation and chaotic behavior as the model parameter changes (May, 1976). However, why such a simple model contains complex dynamics remains unclear. Urban studies offer a means of exploring the source of the complicated dynamics of the logistic process. In the urbanization process, the level of urbanization follows a sigmoid curve and can be described with the logistic function (Karmeshu, 1988; United Nations, 1980; United Nations, 1993). Moreover, urban systems and ecological systems are comparable in several respects (Dendrinos, 1992; Dendrinos and Mullally, 1985), which implies that the process of urbanization might exhibit period-doubling bifurcation or chaotic dynamics. In theory, urban systems and the process of urbanization can generate complex behaviors such as chaos (e.g. Dendrinos, 1996; Dendrinos and El Naschie, 1994; Nijkamp, 1990; Nijkamp and Reggiani, 1998; Van der Leeuw and McGlade, 1997; Wong and Fotheringham, 1990). Many studies of chaotic cities are related to spatial interaction and logistic growth.
On the other hand, a great number of simulation analyses and empirical studies show that urban systems have a fractal structure (e.g. Batty and Longley, 1994; Chen and Zhou, 2003; Frankhauser, 1994; White and Engelen, 1994). Fractal structure and chaotic behavior coexist in many systems. The fractal property of urban systems suggests complex dynamics of urban evolution.
What concerns us is not only the bifurcation and chaos found in pure numerical simulation experiments but also those that can be captured from observational data. One viewpoint is that fractals actually appear at the edge of chaos, and that the coexistence of fractals and chaos does not imply a definite correlation between them (Bak, 1996). Perhaps this is true, but we still intend to investigate the question from the standpoint of urban systems and urbanization dynamics in order to reveal the relation between the chaotic behavior and the fractal structure of nonlinear systems. Chaotic cities and fractal cities have now become important branches of the study of self-organized cities (Portugali, 2000). Fractal cities are cities with self-similarity or scaling invariance, while chaotic cities are cities with spatial regularity behind random behaviors. The studies of fractal cities and systems of cities are supported by a great number of observations (e.g. Batty and Longley, 1994; Chen and Zhou, 2004; Chen and Zhou, 2006; Frankhauser, 1994; White et al, 1997). However, most applications of chaos theory in the social sciences lack empirical content (Nijkamp and Reggiani, 1992). This situation has changed little for more than ten years. In fact, cities and networks of cities are typical complex systems suitable for exploring complicated dynamics (Allen, 1997; Wilson, 2000). The key lies in how to associate theory with practice and reality. This paper has two principal aims. One is to lay an empirical foundation for research on chaotic cities, and the other is to prepare for revealing the essence of the complicated behaviors of simple models and the relation between chaotic cities and fractal cities.
The remainder of this paper is structured as follows. In Section 2, we build a nonlinear dynamical model of urban-rural interaction based on the population census data of the United States of America (USA), and then derive the logistic equation of the urbanization level from the model. In Section 3, with the aid of the US census data, we demonstrate the feasibility and rationality of the model through statistical analysis, logistic analysis and numerical simulation analysis. This part consolidates the empirical foundation of the model. In Section 4, we carry out numerical simulation experiments with the model of urbanization dynamics, testing whether such a model exhibits all the behavioral characteristics of the logistic equation, including periodic oscillation and chaotic behavior. Finally, in Section 5, the discussion is concluded with some remarks on the significance of research into complicated dynamics from the perspective of two-population interaction in urban geography.

Urban-rural interaction models
A variety of mathematical models have been proposed to describe the spatial dynamics of urban-rural population migration. Two of them are noteworthy. One is the Keyfitz-Rogers linear model (Keyfitz, 1980; Rogers, 1968), and the other is the United Nations nonlinear model (United Nations, 1980; Karmeshu, 1988). The United Nations adopted a pair of nonlinear equations to characterize the urbanization dynamics
$$\frac{dr(t)}{dt} = a\,r(t) + \varphi\,u(t) - b\,\frac{r(t)u(t)}{r(t)+u(t)}, \qquad \frac{du(t)}{dt} = c\,u(t) + \psi\,r(t) + d\,\frac{r(t)u(t)}{r(t)+u(t)}, \tag{1}$$
where r(t) and u(t) denote the rural and urban population at time t respectively, and a, b, c, d, φ and ψ are parameters. If φ=ψ=0, we can derive the logistic model of the level of urbanization from the UN model. For many years, the United Nations has been using the logistic function to forecast the level of urbanization of each country in the world (United Nations, 1993; United Nations, 2004). With φ=ψ=0, the model becomes
$$\frac{dr(t)}{dt} = a\,r(t) - b\,\frac{r(t)u(t)}{r(t)+u(t)}, \qquad \frac{du(t)}{dt} = c\,u(t) + d\,\frac{r(t)u(t)}{r(t)+u(t)}. \tag{2}$$
According to equation (2), the rural population cannot spontaneously flow into the cities, and vice versa. The exchange of urban and rural population relies mainly on the urban-rural interaction.
The urban-rural interaction is analogous to the predator-prey interaction in ecology (Dendrinos and Mullally, 1985). The size of the urban population is influenced by the size of the rural population and in turn reacts on it. So both the growth rate of the urban population and that of the rural population depend to a great extent on the coupling, or cross correlation, between the urban and rural populations. For a closed region, b=d is theoretically expected. As will be shown later, the US model of urbanization dynamics might be simpler than equation (2): in reality, c=0.

Derivation of the logistic model
To examine the above model, we need to approach it in two ways: one is logical analysis, and the other is empirical analysis. The logical analysis involves at least two aspects. First, whether the level of urbanization derived from the above model follows logistic growth, and whether the total population in a region is limited. Second, whether the result of the numerical simulation agrees with that of the mathematical deduction.
First, we derive the well-known logistic model of the level of urbanization, i.e. the urbanization ratio. The level of urbanization is defined as the proportion or share of urban population in relation to the total population in a region (United Nations, 2004). Thus, we have
$$L(t) = \frac{u(t)}{P(t)} = \frac{u(t)}{r(t)+u(t)} = \frac{V(t)}{1+V(t)}, \tag{3}$$
where L(t) refers to the level of urbanization, P(t)=r(t)+u(t) to the total population, and V(t)=u(t)/r(t) to the urban-rural ratio of population. Differentiating, we get
$$\frac{dL(t)}{dt} = \frac{r(t)\,du(t)/dt - u(t)\,dr(t)/dt}{[r(t)+u(t)]^{2}}. \tag{4}$$
Substituting equation (2) into equation (4) yields
$$\frac{dL(t)}{dt} = (c-a)\,\frac{r(t)u(t)}{[r(t)+u(t)]^{2}} + \frac{r(t)u(t)\,[d\,r(t)+b\,u(t)]}{[r(t)+u(t)]^{3}}. \tag{5}$$
For simplicity, taking a region as a closed system, we have b=d. In terms of the definition of the urbanization level, equation (5) can be transformed into the following form
$$\frac{dL(t)}{dt} = (b+c-a)\,\frac{L(t)^{2}}{V(t)}. \tag{6}$$
According to equation (3), we have the urban-rural ratio
$$V(t) = \frac{L(t)}{1-L(t)}. \tag{7}$$
This implies 1/V(t)=r(t)/u(t)=1/L(t)-1. Therefore, equation (6) can be transformed into a logistic equation
$$\frac{dL(t)}{dt} = (b+c-a)\,L(t)\,[1-L(t)]. \tag{8}$$
Thus, we have constructed a mathematical relation between models of two interacting populations and logistic growth. Let k=b+c-a=c+d-a represent the intrinsic rate of growth. Then equation (8) can be simplified to the usual form
$$\frac{dL(t)}{dt} = k\,L(t)\,[1-L(t)]. \tag{9}$$
Solving equation (9) yields the well-known expression of the logistic curve
$$L(t) = \frac{1}{1+(1/L_{0}-1)\,e^{-kt}}, \tag{10}$$
where L_0 represents the initial value of L(t). That is, when t=0, we have L(t)=L_0.
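The link between the interaction model and the logistic curve can be checked numerically. The following is a minimal sketch, assuming that equation (2) has the form dr/dt = a r - b r u/(r+u), du/dt = c u + d r u/(r+u) with b = d, and that equation (10) is the standard logistic curve with k = b + c - a. The function names, the Runge-Kutta integrator, the step size and the parameter values are illustrative assumptions, not taken from the original study.

```python
import math

def rhs(r, u, a, b, c, d):
    """Right-hand side of the assumed form of equation (2)."""
    interaction = r * u / (r + u)
    return a * r - b * interaction, c * u + d * interaction

def rk4_step(r, u, h, params):
    """Advance (r, u) by one classical Runge-Kutta step of size h."""
    k1r, k1u = rhs(r, u, *params)
    k2r, k2u = rhs(r + 0.5 * h * k1r, u + 0.5 * h * k1u, *params)
    k3r, k3u = rhs(r + 0.5 * h * k2r, u + 0.5 * h * k2u, *params)
    k4r, k4u = rhs(r + h * k3r, u + h * k3u, *params)
    r_next = r + h * (k1r + 2.0 * k2r + 2.0 * k3r + k4r) / 6.0
    u_next = u + h * (k1u + 2.0 * k2u + 2.0 * k3u + k4u) / 6.0
    return r_next, u_next

def logistic_level(t, level0, k):
    """Closed-form logistic curve for the level of urbanization."""
    return 1.0 / (1.0 + (1.0 / level0 - 1.0) * math.exp(-k * t))

def max_gap(r0, u0, params, t_end=300.0, h=0.1):
    """Largest gap between the simulated level u/(r+u) and the logistic curve."""
    a, b, c, d = params
    k = b + c - a  # intrinsic growth rate, valid when b == d
    level0 = u0 / (r0 + u0)
    r, u = r0, u0
    gap = 0.0
    for i in range(1, int(round(t_end / h)) + 1):
        r, u = rk4_step(r, u, h, params)
        gap = max(gap, abs(u / (r + u) - logistic_level(i * h, level0, k)))
    return gap

# Hypothetical parameter values (a, b, c, d) with b = d and c = 0.
GAP = max_gap(0.95, 0.05, (0.025, 0.05, 0.0, 0.05))
```

Under these assumptions the two curves should agree up to the integration error, which is the numerical counterpart of the derivation from equation (2) to equation (10).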
A key criterion for judging the urbanization model is the rationality of the growth curve of the total population. Taking the derivative of the population P(t) with respect to time t gives
$$\frac{dP(t)}{dt} = \frac{dr(t)}{dt} + \frac{du(t)}{dt}. \tag{11}$$
Substituting equation (2) into equation (11) yields, for b=d,
$$\frac{dP(t)}{dt} = a\,r(t) + c\,u(t). \tag{12}$$
Obviously, from equation (12) we can get two inconsistent equations as follows
$$\frac{dP(t)}{dt} = c\,P(t) + (a-c)\,r(t), \tag{13}$$
$$\frac{dP(t)}{dt} = a\,P(t) - (a-c)\,u(t). \tag{14}$$
According to equation (13), when a>c, the total population grows more quickly; while according to equation (14), when a>c, the total regional population grows more slowly. These two equations conflict with each other. The inconsistency can be eliminated by one of two conditions: a=c or c=0. If a=c, then the total population will grow without limit in the exponential way predicted by Malthus (1798/1996). On the other hand, if c=0, the total population will stop growing once it has increased to a certain extent. Under the latter circumstance, according to equation (13), since the rural population r(t)→0, the growth rate of the total population P(t) will gradually decrease to 0; according to equation (14), because the whole population will be completely urbanized, i.e. u(t)→P(t), the growth rate of the total population will ultimately tend toward 0. In the real world, we do have c=0, as will be illustrated in the following empirical analysis.
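The two cases a=c and c=0 can be compared with a short simulation. The sketch below assumes a unit-step discretization of equation (2) in the form r' = r + a r - b r u/(r+u), u' = u + c u + d r u/(r+u) with b = d. The initial values and parameters are hypothetical and serve only to illustrate the qualitative difference between saturating and exponential growth of the total population.

```python
def total_population(r0, u0, a, b, c, d, steps):
    """Iterate the assumed unit-step map and return the series P_t = r_t + u_t."""
    r, u = r0, u0
    totals = [r + u]
    for _ in range(steps):
        interaction = r * u / (r + u)
        r, u = r + a * r - b * interaction, u + c * u + d * interaction
        totals.append(r + u)
    return totals

# Case c = 0: growth of the total population dies out as r(t) -> 0.
saturating = total_population(0.95, 0.05, 0.025, 0.05, 0.0, 0.05, 2000)

# Case a = c (with b = d): the total grows by the factor (1 + a) at every step.
exponential = total_population(0.95, 0.05, 0.025, 0.05, 0.025, 0.05, 100)
```

In the first case the increments of P_t shrink toward zero; in the second case P_t follows the Malthusian geometric growth P_0 (1+a)^t.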
It is easy to see that b (or d) is a very significant parameter in equation (2). On the one hand, it controls the trend and size of the total population; on the other hand, it affects the intrinsic growth rate k of the logistic equation for the level of urbanization. As we know, the parameter k dominates the behavioral characteristics of the dynamical system. When k>2.57, the logistic map obtained by discretizing equation (9) presents very complicated behaviors (May, 1976). So what is the case in reality? In the next section, we will validate the above models using the US observational data. Then we will perform numerical simulation experiments to reveal some intrinsic regularities of the urbanization dynamics.
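How the value of k shapes the long-run behavior can be illustrated by iterating a discretized logistic equation. The sketch below assumes the unit-step form L_{t+1} = L_t + k L_t (1 - L_t). The helper `attractor_period`, its tolerance and the chosen values of k are hypothetical and only meant to show the period-doubling pattern below the threshold k ≈ 2.57.

```python
def attractor_period(k, level0=0.1, transient=2000, max_period=64, tol=1e-9):
    """Estimate the period of the attractor of L -> L + k*L*(1 - L).

    Returns the smallest p <= max_period for which the orbit repeats within
    tol after the transient, or None if no such p is found (for example,
    when the orbit is chaotic).
    """
    level = level0
    for _ in range(transient):
        level = level + k * level * (1.0 - level)
    orbit = [level]
    for _ in range(max_period):
        previous = orbit[-1]
        orbit.append(previous + k * previous * (1.0 - previous))
    for p in range(1, max_period + 1):
        if abs(orbit[p] - orbit[0]) < tol:
            return p
    return None

# Illustrative values of k below the onset of chaos.
PERIODS = {k: attractor_period(k) for k in (1.5, 2.2, 2.5)}
```

For these three values the estimated periods are 1, 2 and 4 respectively, i.e. a fixed point followed by two successive period doublings.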

Data and method
The main purpose of this study, as indicated above, is to lay an empirical foundation for further research on complex spatial dynamics of urban-rural interaction. So it is necessary to make relevant statistical analysis of the dynamical equations. There are two central variables in the study of spatial dynamics of urban development: population and wealth (Dendrinos, 1992). According to our theme, we only choose the first variable, population, to test the models. Generally speaking, the population measure falls roughly into four categories: rural population r(t), urban population u(t), total population P(t)= r(t)+u(t), and level of urbanization, L(t)= u(t)/ P(t).
The American data comes from the population censuses whose interval is about 10 years.
[...] (Dendrinos and Mullally, 1985; Lotka, 1956; Volterra, 1931) [...] (Note: Although the definition of the city after 1960 differs from the earlier one, the two calibers generally agree with each other.)

Parameters estimation and model selection
In order to carry out the statistical analysis, we must discretize the United Nations model so that it is transformed from differential equations into difference expressions, i.e., a 2-dimensional map. The analysis of continuous dynamics then becomes that of discrete dynamics. Taking ∆t=10, let r(t), u(t) and r(t)*u(t)/[r(t)+u(t)] be the independent variables, and ∆u(t)/∆t and ∆r(t)/∆t be the dependent variables. A multivariate stepwise regression analysis based on least squares computation gives the following model
$$\frac{\Delta r(t)}{\Delta t} = 0.02584\,r(t) - 0.03615\,\frac{r(t)u(t)}{r(t)+u(t)}, \qquad \frac{\Delta u(t)}{\Delta t} = 0.05044\,\frac{r(t)u(t)}{r(t)+u(t)}. \tag{15}$$
This is a pair of difference equations for which all the statistics, including the F statistic, P value (or t statistic), variance inflation factor (VIF) and Durbin-Watson (DW) value, pass the tests at the significance level of α=0.01 (Appendix 1). In this model, c=0. Although we should have b=d in theory, they are not equal in the empirical results. There might be two reasons for this.
One is that the US is not a truly closed system because of mass foreign migration; the other is that the natural growth of the urban population is dependent on the urban-rural interaction. The second reason might be more important. On the whole, however, the equations, as a special case of the United Nations model, describe the American urban and rural population migration process over the past 200 years well.
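The estimation step can be sketched with ordinary least squares on synthetic data. The example below does not use the census data and does not perform stepwise selection: it generates a series from an assumed unit-step map with c = 0 and then recovers a, b and d by regressing the increments on r(t) and r(t)u(t)/[r(t)+u(t)] without an intercept. The function names and the initial values are hypothetical.

```python
def simulate(r0, u0, a, b, d, steps):
    """Generate a synthetic (r, u) series from the assumed map with c = 0."""
    series = [(r0, u0)]
    for _ in range(steps):
        r, u = series[-1]
        x = r * u / (r + u)  # interaction term
        series.append((r + a * r - b * x, u + d * x))
    return series

def fit_parameters(series):
    """Least squares without intercept for dr = a*r - b*x and du = d*x."""
    srr = srx = sxx = sry = sxy = sxz = 0.0
    for (r, u), (r_next, u_next) in zip(series, series[1:]):
        x = r * u / (r + u)
        dr, du = r_next - r, u_next - u
        srr += r * r
        srx += r * x
        sxx += x * x
        sry += r * dr
        sxy += x * dr
        sxz += x * du
    det = srr * sxx - srx * srx          # determinant of the normal equations
    a_hat = (sry * sxx - sxy * srx) / det
    coef_x = (srr * sxy - srx * sry) / det  # coefficient of x, equal to -b
    d_hat = sxz / sxx
    return a_hat, -coef_x, d_hat

# Noise-free synthetic data generated with the coefficient values quoted in the text.
A_HAT, B_HAT, D_HAT = fit_parameters(simulate(0.95, 0.05, 0.02584, 0.03615, 0.05044, 200))
```

Because the synthetic data contain no noise, the fit returns the generating parameters up to rounding error. With real census data the estimates would also carry sampling error and the effects of the ten-year interval.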
In light of equation (10), the level of urbanization should follow the logistic curve. It is easy to calculate the urbanization ratio from the census data and to fit the logistic function to it, which gives equation (16). The goodness of fit is R²=0.9839. For convenience, we set t=year-1790 (Figure 2). Thus we have k=0.02238 as the estimated value of the intrinsic growth rate. On the other hand, we can estimate the intrinsic growth rate k from equation (15) in two ways: one is k_1=b-a=0.03615-0.02584=0.01031, and the other is k_2=d-a=0.05044-0.02584=0.02460. The intrinsic growth rate should fall between k_1=0.01031 and k_2=0.02460, and indeed it does. The parameter values estimated from the dynamical system model, equation (15), are similar to those from the logistic model, equation (16). The differences between the estimates are due mainly to three factors: the first is that the region is not closed, the second is imprecise data, and the third is the computational error resulting from the transformation from continuous equations to discrete expressions.
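The comparison of the growth-rate estimates is plain arithmetic and can be reproduced in a few lines. The values of a, b, d and k are those quoted in the text; the variable names are only illustrative.

```python
# Coefficients of the fitted interaction model and the logistic estimate of k.
a, b, d = 0.02584, 0.03615, 0.05044
k_logistic = 0.02238

k1 = b - a  # lower estimate of the intrinsic growth rate
k2 = d - a  # upper estimate of the intrinsic growth rate

# True when the logistic estimate lies between the two model-based estimates.
k_within_bounds = k1 < k_logistic < k2
```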
For comparison and selection, we also fit the American rural and urban data to the discretization of the predator-prey interaction model. Let r(t), u(t) and r(t)*u(t) be independent variables and ∆u(t)/∆t or ∆r(t)/∆t dependent variables. The multivariable stepwise regression based on least squares computation gives an abnormal result, which cannot be accepted. If we loosen the requirements, then the American urbanization process could be expressed with the Keyfitz model.
However, this mathematical expression has two vital shortcomings, which prevent us from accepting it.

Figure 4 Numerical simulation curve of American urbanization level (1790-2400)
(Notes: The numerical simulation is based on equation (15). The saturation value is 1. The curve is identical in shape to that of the logistic growth indicated by equation (16).)
To sum up, the American model of urban-rural population interaction can be expressed by equation (2) with the parameter c=0. This is the empirical foundation for the theoretical analysis of discrete urbanization dynamics. So far, we have built the model of urbanization on the basis of population observations in the real world. In the following section, we will discuss the complicated behaviors of the above model of urbanization dynamics in the theoretically possible world.

Complex Behaviors of Urbanization Dynamics
One of the purposes of this work is to prepare for revealing the essence of the complicated behaviors of simple models. The discrete equations of two-population interaction between urban and rural systems can exhibit all the complex dynamics arising from the logistic map, including period-doubling bifurcation and chaos. Chaos theory studies the seemingly random behavior and latent order of certain dynamical systems that are highly sensitive to initial conditions (Malanson, et al, 1990).
Chaos is often defined as the intrinsic unpredictability of deterministic systems. In other words, a difference equation is regarded as a chaotic system if its solution depends sensitively on the initial conditions.
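Sensitive dependence on initial conditions can be illustrated with the quadratic map x_{t+1} = (1+k) x_t (1 - x_t). The sketch below follows two orbits whose starting points differ by 1e-10 and records their largest separation. The values k = 3.0 (a chaotic case) and k = 1.5 (a regular case), the starting point and the number of steps are hypothetical choices for illustration.

```python
def quadratic_orbit(x0, k, steps):
    """Iterate x -> (1 + k) * x * (1 - x) and return the whole orbit."""
    orbit = [x0]
    for _ in range(steps):
        x = orbit[-1]
        orbit.append((1.0 + k) * x * (1.0 - x))
    return orbit

def max_separation(x0, delta, k, steps):
    """Largest distance between two orbits started delta apart."""
    first = quadratic_orbit(x0, k, steps)
    second = quadratic_orbit(x0 + delta, k, steps)
    return max(abs(p - q) for p, q in zip(first, second))

CHAOTIC_GAP = max_separation(0.2, 1e-10, 3.0, 100)  # tiny difference is amplified
REGULAR_GAP = max_separation(0.2, 1e-10, 1.5, 100)  # tiny difference stays tiny
```

In the chaotic case the initial difference grows to the size of the attractor itself, whereas in the regular case both orbits settle on the same fixed point.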

The discrete urban-rural interaction model can show richer details of complicated behavior than the logistic map does and, in particular, it offers a new way of looking at the complex dynamics of simple mathematical models. According to equation (15), the parameter c=0, so equation (8) reduces to
$$\frac{dL(t)}{dt} = (b-a)\,L(t)\,[1-L(t)] = k\,L(t)\,[1-L(t)], \tag{17}$$
where the intrinsic rate of growth is k=b-a. The discretization of equation (17) is a finite-difference equation
$$L_{t+1} = (1+k)\,L_{t} - k\,L_{t}^{2}. \tag{18}$$
Defining a new variable x_t=kL_t/(1+k), we can turn equation (18) into the familiar parabola, i.e., a 1-dimensional map x_{t+1}=(1+k)x_t(1-x_t).
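The change of variable can be verified numerically. The sketch below assumes that the finite-difference equation has the form L_{t+1} = L_t + k L_t (1 - L_t), iterates it alongside the parabola x_{t+1} = (1+k) x_t (1 - x_t), and measures how far x_t drifts from k L_t/(1+k). The function names and the test values of k are illustrative assumptions.

```python
def level_step(level, k):
    """One step of the assumed discretization L -> L + k*L*(1 - L)."""
    return level + k * level * (1.0 - level)

def parabola_step(x, k):
    """One step of the quadratic map x -> (1 + k)*x*(1 - x)."""
    return (1.0 + k) * x * (1.0 - x)

def max_mismatch(level0, k, steps):
    """Largest deviation between x_t and k*L_t/(1 + k) along the two orbits."""
    level = level0
    x = k * level0 / (1.0 + k)
    worst = 0.0
    for _ in range(steps):
        level = level_step(level, k)
        x = parabola_step(x, k)
        worst = max(worst, abs(x - k * level / (1.0 + k)))
    return worst

MISMATCH_SMOOTH = max_mismatch(0.1, 0.5, 50)    # S-shaped growth
MISMATCH_CYCLE = max_mismatch(0.1, 2.2, 50)     # period-2 oscillation
```

Both mismatches stay at the level of floating-point rounding, which is consistent with the two maps being the same dynamics in different coordinates.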
As we know, according to May (1976), the quadratic map can present periodic oscillation and even more complicated chaotic behaviors under certain conditions. Since equation (17) is derived from equation (2), the behaviors of equation (18) should also be produced by the discretization of equation (2). To test this hypothesis, we can perform numerical simulation experiments using equation (2), which can be discretized as a 2-dimensional map
$$r_{t+1} = (1+a)\,r_{t} - b\,\frac{r_{t}u_{t}}{r_{t}+u_{t}}, \qquad u_{t+1} = (1+c)\,u_{t} + d\,\frac{r_{t}u_{t}}{r_{t}+u_{t}}. \tag{19}$$
The conversion between differential and difference equations results in some subtle changes in the parameter values. For simplicity, however, we do not modify the parameter symbols after converting equation (2) into equation (19). The numerical solutions of equation (19) show that when the difference between b and a increases (note that k=b-a), the growth curve of the urbanization level L_t indeed changes from simplicity to complexity, from an S shape to periodic oscillation and even to chaos. In short, all the behaviors of the logistic map revealed by May (1976) can be exhibited by the discrete two-population interaction model (Figure 5).
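The simple end of this range of behaviors can be reproduced with a few lines of code. The sketch below assumes the 2-dimensional map r_{t+1} = (1+a) r_t - b r_t u_t/(r_t+u_t), u_{t+1} = (1+c) u_t + d r_t u_t/(r_t+u_t), and uses small hypothetical parameter values close to the empirical ones (a = 0.025, b = d = 0.05, c = 0), for which the level of urbanization should rise along an S-shaped curve toward 1.

```python
def urbanization_levels(r0, u0, a, b, c, d, steps):
    """Iterate the assumed 2-dimensional map and return the series L_t."""
    r, u = r0, u0
    levels = [u / (r + u)]
    for _ in range(steps):
        interaction = r * u / (r + u)
        r, u = (1.0 + a) * r - b * interaction, (1.0 + c) * u + d * interaction
        levels.append(u / (r + u))
    return levels

# Hypothetical initial state with 5 percent urban population.
LEVELS = urbanization_levels(0.95, 0.05, 0.025, 0.05, 0.0, 0.05, 600)
```

Raising b and d in the call (keeping b = d) is one way to explore how the curve changes as b - a grows.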

Figure 6 A 1-dimension map and a 2-dimension map reach the same goal by different routes
Further, if we ignore the connection between the urban-rural interaction model and the logistic equation by permitting b≠d, then the behavior of the dynamical system becomes much richer. When we fix a, c, and d and vary b, the system will exhibit periodic oscillation or even chaos; however, when we fix b, the behavior of the system does not change as the other parameters change. It is obvious that the key parameter that determines the system behavior is b or, strictly speaking, the difference between a and b. For instance, consider a=0.025, c=0, and d=0.05 according to the aforementioned empirical analysis. As b changes, the curve of the urbanization level changes in the same way as in the case b=d, but the critical values of the period-doubling bifurcation route to chaos increase.
Under these circumstances, when the system enters the chaotic state, it still presents periodic oscillation. However, the period is no longer restricted to a power of 2; it can be an arbitrary integer. For example, when b=3.2, the system enters a period-5 state (Figure 7). Further experimental results show that the system can present period 3 or period 6 in the chaotic state. This illustrates the well-known Sharkovsky theorem, and reminds us of the discovery of Li and Yorke (1975). The proposition "period three implies chaos" can be expressed equivalently as "any period that is not a power of 2 implies chaos". Based on the above simulation analysis, we can reach the following conclusions. First, the key element of the urbanization process lies in the rural population rather than the urban population.
According to the dynamical model of urbanization for America, the nonlinear term of urban-rural interaction connects the city on one end and the village on the other. It is the difference between the parameters a and b that dominates the behavior of the urbanization dynamics. This implies that it is the rural region and the urban-rural interaction that determine the progress of urbanization. Second, only when b=d is there a strict mathematical relation between the urban-rural interaction model and the logistic equation. On the one hand, the logistic model parameter derived mathematically is k=b-a, whose value controls the behavior of the logistic map; on the other, numerical simulation shows that it is the value of (b-a) that determines the behavior of the urban-rural interaction model. Evidently, the precondition for connecting the urban-rural interaction model with the logistic equation is b=d. Third, the periodic oscillation and chaotic behavior of urban-rural interaction may belong only to the results of pure theoretical analysis. Since the urban and rural populations cannot be negative in reality, namely r(t)≥0 and u(t)≥0, it is certain that L(t)≤1 in terms of equation (3). However, if the dynamical system exhibits periodic oscillation or even chaotic behavior, the simulated value of the rural population becomes negative, namely r(t)<0. The numerical simulation then shows the abnormal phenomenon of an urbanization ratio L(t)>1 (see Figure 7). In reality, this case is impossible. So period-doubling bifurcation and chaos of urbanization seem not to happen in the real world; they appear only in the imaginary world as a theoretical product (Chen, 2009).
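The third conclusion can be illustrated by iterating the discrete model with a large interaction coefficient. The sketch below assumes the map r_{t+1} = (1+a) r_t - b r_t u_t/(r_t+u_t), u_{t+1} = (1+c) u_t + d r_t u_t/(r_t+u_t), and uses the hypothetical values a = 0.025, c = 0 and b = d = 2.3, chosen only so that b - a exceeds 2. The initial state is also an assumption.

```python
def simulate_map(r0, u0, a, b, c, d, steps):
    """Iterate the assumed 2-dimensional map; return a list of (r, u, L)."""
    r, u = r0, u0
    states = [(r, u, u / (r + u))]
    for _ in range(steps):
        interaction = r * u / (r + u)
        r, u = (1.0 + a) * r - b * interaction, (1.0 + c) * u + d * interaction
        states.append((r, u, u / (r + u)))
    return states

# Five iterations from a hypothetical state with 10 percent urban population.
STATES = simulate_map(0.9, 0.1, 0.025, 2.3, 0.0, 2.3, 5)

# Steps at which the simulated rural population is negative.
NEGATIVE_STEPS = [t for t, (r, _, _) in enumerate(STATES) if r < 0.0]
```

With these values the level of urbanization overshoots within the first few iterations: the simulated rural population turns negative and L exceeds 1, which has no counterpart in a real population.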
To understand the essence of the complex dynamics, let us carry out a simple mathematical transformation. Removing the nonlinear terms indicating interaction from equation (2) yields a pair of linear growth equations, equation (20). The difference between the growth rates of the urban and the rural population is then given by equation (21). Taking equation (7) into consideration, we obtain equation (22), which can be transformed into a logistic equation, equation (23). Although equation (20) also leads to a logistic equation, the dynamical patterns of the linear differential equations are very simple. The numerical simulation based on the discretization of equation (20) shows no periodic oscillation, to say nothing of chaos. This suggests that complicated dynamics such as the period-doubling bifurcation and chaos arising from the logistic map are in fact rooted in interaction associated with nonlinearity.
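The steps of this transformation can be written out as follows. This is a sketch under an assumption: the pair of exponential growth equations is taken to be dr/dt = a r and du/dt = b u, with b denoting the urban growth rate, which is the form implied by the coefficient η=u_0 r_0^{-b/a} of the allometric relation. The intermediate lines need not coincide with the original equations (21) and (22).

```latex
\begin{align}
  \frac{dr(t)}{dt} &= a\,r(t), \qquad \frac{du(t)}{dt} = b\,u(t)
    && \text{(assumed linear growth equations)} \\
  \frac{1}{u(t)}\frac{du(t)}{dt} - \frac{1}{r(t)}\frac{dr(t)}{dt}
    &= \frac{1}{V(t)}\frac{dV(t)}{dt} = b - a
    && \text{(with } V = u/r \text{)} \\
  \frac{1}{V(t)}\frac{dV(t)}{dt}
    &= \frac{1}{L(t)\,[1-L(t)]}\frac{dL(t)}{dt}
    && \text{(using } V = L/(1-L) \text{)} \\
  \frac{dL(t)}{dt} &= (b-a)\,L(t)\,[1-L(t)]
    && \text{(logistic equation)}
\end{align}
```

The same logistic form is reached as from the interaction model, but here r(t) and u(t) are plain exponentials, so the underlying dynamics contain no interaction.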
An allometric scaling relation between urban population and rural population can be derived from equation (20):
$$u(t) = \eta\,r(t)^{\sigma},$$
in which η=u_0 r_0^{-b/a} is a proportionality coefficient, and r_0 and u_0 denote the initial values of the rural and urban population respectively. The allometric scaling exponent σ is given by
$$\sigma = \frac{b}{a} = \frac{D_{u}}{D_{r}},$$
where D_u refers to the fractal dimension of the urban population, and D_r to the fractal dimension of the rural population. This suggests that the logistic equation and the related transformation may be the mathematical link between the chaos and fractals of urban evolution.
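The allometric relation can be checked directly for a pair of exponential growth curves. The sketch below assumes r(t) = r_0 e^{at} and u(t) = u_0 e^{bt}, so that u = η r^{b/a} with η = u_0 r_0^{-b/a}. The numerical values of r_0, u_0, a and b are hypothetical.

```python
import math

def exponential_pair(t, r0, u0, a, b):
    """Rural and urban populations under the assumed exponential growth."""
    return r0 * math.exp(a * t), u0 * math.exp(b * t)

def allometric_urban(r, r0, u0, a, b):
    """Urban population predicted from the rural one by u = eta * r**(b/a)."""
    eta = u0 * r0 ** (-b / a)  # proportionality coefficient
    return eta * r ** (b / a)

def max_relative_error(r0, u0, a, b, times):
    """Largest relative gap between the direct and the allometric urban values."""
    worst = 0.0
    for t in times:
        r, u = exponential_pair(t, r0, u0, a, b)
        worst = max(worst, abs(allometric_urban(r, r0, u0, a, b) / u - 1.0))
    return worst

MAX_REL_ERROR = max_relative_error(3.0, 0.2, 0.01, 0.03, range(0, 201, 10))
```

The two expressions agree up to rounding error, so the scaling exponent of the assumed exponential pair is the ratio b/a of the two growth rates.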
Another interesting finding is that the logistic equation lies between the exponential growth models indicating simplicity, equation (20), and the two-population interaction model indicating complexity, equation (2). This suggests that the logistic model may imply a mathematical transform between simple expressions and complex dynamics. As space is limited, it is impossible to clarify all these questions here, and the pending questions will be discussed in future reports.

Conclusions
Urbanization is a complex process of spatial dynamics with two-population interaction. A mathematical model based on US observational data is proposed to describe this nonlinear evolution. From the urban-rural interaction model, we can strictly derive the logistic equation for the level of urbanization. As logistic growth exists widely in nature and human society, the two-population interaction model may reflect a ubiquitous kind of dynamical system. Therefore, some of the research conclusions can be generalized to other fields, including ecology, economics, and geology. Furthermore, since the logistic model is related to chaos, the process of urbanization has the potential to exhibit periodic oscillation and chaotic behavior.
Accordingly, the urban-rural nonlinear interaction models could help us better understand the urban system through chaos theory and, at the same time, better comprehend the nature of chaos through urban evolution. The main conclusions of this article are summarized as follows. Secondly, complicated dynamics such as period-doubling bifurcation and chaos result from interaction rather than from pure logistic processes. The logistic model of the level of urbanization can be derived not only from urban-rural population interaction models, but also from the allometric equations of rural and urban population growth. However, the behavior patterns of the exponential growth models are very simple. In other words, periodic oscillation and chaotic behavior can never be generated by the exponential growth equations. This implies that no complicated dynamics appears without interaction between the rural and urban populations.
Thirdly, the logistic equation may form a mathematical transform relation between simplicity and complexity, which provides us with a new way of looking at complexity. As indicated above, the logistic model can be derived from the urban-rural interaction models or from a pair of exponential equations. This suggests that the logistic equation may act as a mathematical transform from the nonlinear interaction models to the simple linear equations. Such a relation may become a bridge connecting simplicity and complexity. In particular, if such a transform can be shown to have a universal property, it can be developed into a brand-new logistic transform method.
If so, the transform will probably associate the unanalysable nonlinear equation with simple rules

A.1 Regression analysis results of the US urbanization model
The regression analysis results of the US model of urbanization based on least squares computation are tabulated as follows (Table A1, Table A2). The contents include the ANOVA summary, the estimated values of the model's coefficients and related statistics.
If this model is employed to describe the growth of the rural population, the saturation value of the urbanization ratio will be less than 1, which tallies with the actual situation better. However, this model gives rise to two problems. First, the model cannot avoid multi-collinearity, which could be detected from the VIF value in Table A2. Second, based on the model, the total population will not converge but increase infinitely.

A.2 Regression analysis results based on the Lotka-Volterra model
The multivariable stepwise regression based on least squares computation gives a pair of fitted equations. The first equation has serious problems. Firstly, the estimated values of the parameters cannot pass the logical test. The coefficient of the linear term u(t) should be positive, but here it is negative. The physical meaning of the negative coefficient is inexplicable. What is more, the coefficient of the nonlinear term, i.e., the cross term r(t)u(t), should be negative, indicating that the urban-rural interaction transforms rural population into urban population, but here it is positive.
This conflicts with the sign of the nonlinear term of the second equation, in which the coefficient of the cross term is positive, too.
Secondly, some values also fail to pass the statistical tests. The VIF value is far greater than 10, which implies serious multicollinearity among the three independent variables. If we eliminate the nonlinear term, new problems arise. The test of serial correlation of the residuals is not acceptable (DW=1.082), and the sign problem of the linear term u(t) remains unresolved. If we further remove the linear term u(t), the DW value decreases to 0.527, which is even less acceptable (Table A3, Table A4). This suggests that the variables are insufficient, or that the serial correlation is serious, either of which violates the basic rules of regression modeling.
As for the second equation, nothing seems wrong statistically, but it cannot be understood logically.
According to this equation, the rural population will automatically flow into cities without the urban-rural interaction, and the urban population will increase without any relation to itself. This mathematical expression has two vital shortcomings. The first is the logical problem. According to the model, the urban population, the rural population and the total population will increase exponentially without any limit, which is against common sense. The second is the statistical problem. That is, the second equation cannot pass the DW test (Table A3, Table A4).