Numerical Methods for Differential Games: Capital Structure in an R&D Duopoly

This paper compares two numerical methods for solving the same differential game. In differential games, the strategies of individual players are represented as continuous functions of time and are typically solutions to the players' optimal control problems. The game is an R&D duopoly with two players: an upstream firm that is primarily engaged in research and development (the R&D firm) and whose value comes from the market valuation of these activities, and a downstream firm primarily engaged in distribution and marketing (the D&M firm). The first method is taken as the benchmark because it is based on discretizing the first order conditions of each player's optimal control problem. The second method makes random guesses of the parameters of a second order polynomial and searches for optimal solutions. The results suggest that the second method, which is more automated and has the potential to be applied to games of higher dimensionality, can give approximate solutions to differential games similar to the one considered here. The results also provide an important theoretical outcome. They illustrate that, contrary to the tradeoff and pecking order models of capital structure, there are many markets in which capital structure is driven not by reversion to a target debt-to-equity ratio or a pecking order, but by maximizing firm value under strategic considerations.


Introduction
The main characteristic of differential games is that the decision space is continuous and the optimal strategies of the players are represented by intertemporal functions. The game considered here is a duopoly market characterized by an upstream firm that is primarily engaged in research and development (the R&D firm) and whose value comes from the market valuation of these activities, and a downstream firm primarily engaged in distribution and marketing (the D&M firm). Given an initial endowment of capital, the R&D firm chooses the level of debt over time that maximizes firm value, and the D&M firm chooses the level of investment in the R&D firm's research that maximizes firm value. A finite planning horizon is assumed for the analysis, and a Nash equilibrium is obtained using two different numerical methods.
For differential games there is a limited class of models for which analytical solutions can be obtained. For this reason it is useful to have reliable numerical methods to obtain solutions. Here two different methods are considered. The first numerical method discretizes the first order conditions derived from the Hamiltonian of each player's optimal control problem. Recursive calculations are used to find a solution that satisfies the initial and terminal constraints, and from this the path of the state variable is determined. A Nash equilibrium is obtained when the optimal strategies are consistent in the sense that neither firm has an incentive to change its strategy.
The second method randomly generates parameters for second degree polynomials for the decision variables of the R&D and D&M firms and from these derives an estimate of the path of the state variable and a Nash equilibrium. Because the first method is based on the first order conditions of the Hamiltonians, it is assumed to provide the better estimate; it therefore serves as a benchmark for determining whether the second method provides an adequate approximation to the solution. Because of its iterative nature, the second method lends itself to automation and can potentially be applied to problems of higher dimensionality.
The model used here is similar to real world cases in which the IPO often represents the firm's only issuance of stock. This model fits high-technology industries that require R&D and the development of new products to sustain themselves. Examples include the pharmaceutical, computer software, and medical device industries. The focus is the R&D firm's choice of debt and how this is reflected in its financial distress and capital structure. This paper builds on an earlier paper by the author (Beach, 2015). The literature discussion in Section 2 draws from that paper, as does the theoretical model of an R&D duopoly in Section 3. The earlier paper also applies a collocation method similar to the one used here to obtain a solution. However, there are important differences: in this paper a target level of R&D capital constrains the model and adds a further consideration to the solution algorithm; a second numerical method is considered, based on the first order conditions of the Hamiltonians of the two players' optimal control problems; and the focus of this paper is a comparison of the two methods. Part of the motivation for this paper is to address a question that arose in the previous paper: how do we know whether the estimated solution is accurate?
This paper contributes to the literature on oligopolies and duopolies by suggesting two solution methods for obtaining a Nash equilibrium in differential games similar to the one here. Both methods can be easily implemented and do not require any additional optimization software. Further, the results suggest that there are many markets in which a firm's capital structure evolves not according to a tradeoff theory or a pecking order theory, but in strategic reaction to decisions made by other firms in the market.

Capital Structure and Markets for R&D
How does a firm's capital structure evolve over time? In their now classic paper, Modigliani and Miller (1958) argue that, given a number of assumptions including the absence of taxes, financial decisions are independent of operational decisions and free cash flow. Firm value does not depend on whether the firm is financed with debt or equity. This result quickly breaks down in the presence of taxes because of the so-called tax shield: the tax savings that result from deducting interest expenses from taxable income (Modigliani & Miller, 1963). This suggests that firm value is maximized when the firm is mostly funded with debt. Since this is seldom observed in the capital structure of actual firms, other theories have been proposed to explain the evolution of a firm's capital structure. The two most popular are the tradeoff and pecking order theories. The tradeoff theory, as represented by Stiglitz (1972) and later Castanias (1983), says there is a tradeoff between the advantage of debt, which comes from the tax shield that arises because interest payments are treated as an expense, and the increased probability of financial distress as the level of debt rises. The pecking order theory, as developed by Myers (1984) and others, argues that firms prefer internal financing whenever possible, since outside investors require a premium because of asymmetric information about the actual financial and operational health of the firm. Other theories approach capital structure from the agency cost and corporate governance perspective. For example, Galai and Masulis (1976), Jensen and Meckling (1976), and Stultz (1990) argue that debt encourages decision makers to make riskier investments since the downside is borne disproportionately by bondholders.
All of these models deal with financial distress in one way or another. Financial distress is usually defined as a prelude to a possible bankruptcy in which the firm has trouble meeting its financial obligations. Additional costs due to financial distress include higher costs of capital, increased rates required from suppliers and other supply chain issues, and human resource issues having to do with hiring new personnel and keeping key personnel. See Van Binsbergen, Graham, and Yang (2010) for a thorough discussion of the costs of debt and financial distress.
There is also a body of literature that considers R&D decisions in an oligopoly. As an example, Tishler and Milstein (2009) predict a relationship in which innovation through R&D initially declines to maintain net income as more firms enter the market, but as competition becomes increasingly intense, investment in R&D can increase in order to maintain product differentiation and market share. Lambertini and Rossini (2004) look at R&D vertical integration and differentiation in an oligopoly. They ask whether a firm in this market should vertically integrate its R&D activities. They find that in Cournot competition, where firms decide on the level of output, integration is a dominant strategy. However, Bertrand competition based on price leaves room for non-integration. They also consider the case where there are upstream R&D firms and downstream nonintegrated firms and find that these firms may invest more in R&D and perform better than their integrated counterparts.
This paper considers capital structure and financial distress within the context of a duopoly market characterized by an upstream firm that is primarily a research and development firm and whose value comes from the market valuation of these activities, and a larger downstream firm primarily engaged in distribution and marketing. An actual market that fits this model is that of R&D in the pharmaceutical industry. These R&D firms provide a good example because they seldom have positive free cash flow and can go years without generating any revenue at all. Their value comes from the potential of their R&D program and the possibility of a breakthrough resulting in a license agreement, an IPO, or the sale of the firm to a larger, usually downstream firm. This is supported by Higgins and Rodrigues (2006), who show that pharmaceutical companies experiencing pipeline deterioration, due mainly to projected drug treatments failing critical clinical tests, are more likely to acquire other pharmaceutical firms with complementary pipelines. In a similar vein, Graham and Higgins (2008) argue it can be more efficient for a large, mature pharmaceutical firm to specialize in downstream activities such as distribution and marketing and to use smaller firms engaged in research to develop its drug pipeline. This allows the larger firm to minimize the risk of failed clinical tests by choosing among the more promising R&D firms.

Differential Game Model
The differential game has two players: an R&D firm whose principal value comes from its R&D pipeline and license fees from the sale of some of its research to the D&M firm; and a D&M firm whose value comes from investing in R&D by purchasing licenses or patents from the R&D firm and producing a marketable product from this.
The R&D firm has an initial endowment of capital that comes from an IPO or some other source. It must then choose the level of debt, D_t, that maximizes firm value over a finite planning horizon such that its target level of R&D capital is met. The D&M firm chooses the level of R&D investment to purchase, I_t, that maximizes firm value over the same finite planning horizon. Since these are continuous variables, the levels of R&D debt and D&M investment are continuous functions that span the T-period planning horizon. A Nash equilibrium occurs when neither the R&D firm nor the D&M firm has an incentive to change its strategy.
The argument for a finite planning horizon is twofold. First, a typical exit strategy for investors in small cap, high growth firms is to grow the company and then be acquired by or merge with a larger firm. Usually the investors come in the form of a venture capital fund or some other form of private equity. These funds typically focus on one or two firms and have a life of five to ten years. Second, the benchmark numerical method employed here discretizes the decision space and uses a type of "shooting" algorithm to converge to the decision functions that maximize firm value and achieve the target R&D capital over the planning horizon. A fixed planning horizon facilitates this method and allows for convergence to a Nash equilibrium in which neither firm has an incentive to change its strategy.
The R&D firm's optimal control problem is to maximize the objective function in Equation 1, where J_RD is the value of the R&D firm; ρ is the weighted average cost of capital (WACC), which serves as the firm's discount rate; K_t represents capital assets devoted to R&D at time t; A is a scale coefficient and α is the exponent of the R&D production function, which represents the value of the firm's R&D; I_t is the D&M firm's investment in R&D projects; D_t is new debt; r is the cost of acquiring debt; and the cost of financial distress enters through a term in which the parameter c is positive and determines the level of financial distress and δ is the exponent of the financial distress function. To simplify the model, the market price for R&D is determined by a downward sloping price function: g - hK_t. This is meant to capture the effect of an increase in the supply of R&D on the market price. The capital assets here refer specifically to those assets used in the production of R&D. Any other capital created from net earnings is assumed to be available to shareholders at some future point in time.
The D&M firm's optimal control problem is to maximize the objective function in Equation 2, where J_DM represents the value of the D&M firm; ω is the firm's WACC; b is a scale coefficient and β is the exponent of the D&M production function, which represents the value of a marketable final product; (g - hK_t)I_t is the cost of investing in R&D; and a further term allows for additional costs of acquiring intellectual property. All other variables are as defined above.
The state equation for the system given by Equations 1 and 2, Equation 3, defines the instantaneous growth of capital for the R&D firm. It is subject to the boundary conditions in Equation 4, which fix the initial and terminal values of R&D capital: K_0 represents the initial equity of the firm from, say, an IPO, and K_T represents the target capital at the end of the planning horizon, time T. Equation 3 states that R&D capital increases as the R&D firm takes on more debt and decreases as the D&M firm invests in R&D for the production of a marketable good by purchasing it from the R&D firm.
The Hamiltonian equation and the first order conditions based on Pontryagin's maximum principle for the R&D firm, for the system represented by Equations 1 through 4, are stated in Equations 5 through 7, where λ_t is the costate variable and all other variables are defined above.
The Hamiltonian equation and the first order conditions for the D&M firm are stated in Equations 8 through 10, where Φ_t is the costate variable and all other variables are defined above.
With respect to the model parameters, two key assumptions are made: (i) production functions for both the R&D firm and the D&M firm exhibit diminishing marginal productivity (α < 1 and β < 1), a standard assumption in economic models; and (ii) the financial distress function exhibits increasing marginal costs (δ > 1). For a study that supports the second assumption, see Van Binsbergen, Graham, and Yang (2010).

Numerical Solution Results
There are two basic considerations in applying numerical methods to obtain a solution in the form of a Nash equilibrium to a differential game. First, a method to solve each player's optimal control problem is required. Second, given solutions to the optimal control problems, a method is required for converging to a Nash equilibrium in which the functions for the control variables, the strategies, are consistent with each other and the players have no incentive to change their strategies.
A number of numerical methods have been developed to solve optimal control problems. These can be classified as those that seek a solution to a system of ordinary differential equations and those that seek a solution to a nonlinear optimization problem. Both approaches have a number of well-established methods that can be applied.
With respect to solving a nonlinear optimization problem, the framework for the approach here, there are two broad categories: indirect and direct methods. Indirect methods include shooting methods, in which an initial guess is made of the conditions at one end of the interval and the terminal conditions are then checked. If they are not met, another guess is made. Direct methods include global collocation methods, in which intertemporal functions for the control variable are generated across the time interval and tested for optimality.
These methods evolved to solve optimal control problems with a large number of variables in aerospace and engineering. Many economic and financial models have a smaller number of variables. Because of the much reduced dimensionality, the methods developed here are more straightforward and do not require software packages such as MATLAB. The first method relies on the first order conditions of the Hamiltonian. The decision space is discretized and an initial guess of the costate variable of the Hamiltonian is made. Using the recursive nature of the Hamiltonian conditions, the path of the intertemporal function for the control variable and the path of the state variable are determined. The terminal value is calculated, and if the terminal condition is not met a new initial guess of the costate variable is made. This is referred to as the "shooting" method. Because it is based on the first order conditions of the Hamiltonian, it is taken as the benchmark method.
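The shooting idea can be illustrated on a deliberately simple problem that is not the paper's model. The sketch below assumes a toy objective, maximize the integral of -(u^2 + x^2)/2 subject to x' = u with fixed x(0) and x(T). Its Hamiltonian H = -(u^2 + x^2)/2 + λu gives the first order conditions u = λ and λ' = x. The bisection update on the initial costate, the bracketing interval, and all names and parameter values are assumptions chosen for illustration; the appendix algorithm may update its guess differently.

```python
def simulate(lam0, x0, T, n):
    """One forward pass through the discretized first order conditions.

    Toy problem: H = -(u^2 + x^2)/2 + lam*u, so u = lam and lam' = x.
    Returns the state path (n + 1 points) and the control path (n points).
    """
    h = T / n
    x, lam = x0, lam0
    xs, us = [x], []
    for _ in range(n):
        u = lam                          # dH/du = 0 gives u = lam
        us.append(u)
        x, lam = x + h * u, lam + h * x  # Euler steps for state and costate
        xs.append(x)
    return xs, us


def shoot(x0, x_target, T=1.0, n=100, lo=-100.0, hi=100.0, tol=1e-9):
    """Bisect on the initial costate until the terminal state hits the target.

    In this toy problem the terminal state is increasing in the initial
    costate, so an overshoot means the guess was too high.
    """
    mid, xs, us = lo, [], []
    for _ in range(200):
        mid = 0.5 * (lo + hi)
        xs, us = simulate(mid, x0, T, n)
        miss = xs[-1] - x_target
        if abs(miss) < tol:
            break
        if miss > 0:
            hi = mid
        else:
            lo = mid
    return mid, xs, us


if __name__ == "__main__":
    lam0, xs, us = shoot(x0=1.0, x_target=2.0)
    print("initial costate:", lam0)
    print("terminal state:", xs[-1])
```

Each trial value of the initial costate produces a complete control path and state path, so the search is over a single number rather than over whole functions.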
The second method uses a randomization method to generate guesses for the parameters of a second order polynomial across the time interval. If the terminal condition is not met, the guess is thrown out. The parameterization that results in the highest value for the objective function is taken as the solution. This is referred to as the "collocation" method. (See Rao, 2009, for a survey of numerical methods for solving optimal control problems.)
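A minimal sketch of this random search, again on a toy problem rather than the paper's model, is given below. It assumes the objective "maximize the integral of -(u^2 + x^2)/2 subject to x' = u", a control of the form u(t) = a0 + a1*t + a2*t^2, uniform sampling ranges for the coefficients, and a tolerance band around the terminal target. All of these choices, and every name in the code, are hypothetical; the text does not state the sampling ranges or the tolerance used.

```python
import random


def evaluate(coeffs, x0, T, n):
    """Objective value and terminal state for a quadratic control path."""
    a0, a1, a2 = coeffs
    h = T / n
    x, value = x0, 0.0
    for k in range(n):
        t = k * h
        u = a0 + a1 * t + a2 * t * t       # second order polynomial control
        value += -0.5 * (u * u + x * x) * h  # toy running payoff
        x += h * u                          # toy state equation x' = u
    return value, x


def random_search(x0, x_target, T=1.0, n=50, draws=20000, tol=0.05, seed=0):
    """Keep the best-scoring guess among those meeting the terminal condition."""
    rng = random.Random(seed)
    best = None
    for _ in range(draws):
        coeffs = (rng.uniform(0.0, 2.0),
                  rng.uniform(-2.0, 2.0),
                  rng.uniform(-2.0, 2.0))
        value, x_end = evaluate(coeffs, x0, T, n)
        if abs(x_end - x_target) > tol:
            continue                        # terminal condition missed: discard
        if best is None or value > best[0]:
            best = (value, coeffs, x_end)
    return best


if __name__ == "__main__":
    value, coeffs, x_end = random_search(x0=1.0, x_target=2.0)
    print("best objective:", value)
    print("coefficients:", coeffs)
    print("terminal state:", x_end)
```

Because the search only ever evaluates candidate paths, it needs no derivatives of the objective, which is one reason this kind of method is easy to automate.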
Both methods seek a Nash equilibrium in which the R&D firm's optimal debt is consistent with the D&M firm's optimal choice of investment in R&D, in the sense that neither firm has an incentive to move from the equilibrium. Both algorithms for the numerical solutions to the differential game are outlined in the appendix.
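One common way to reach a mutually consistent pair of strategies is best-response iteration: each player re-optimizes against the other's latest strategy until neither changes. Whether the appendix algorithms follow exactly this scheme is an assumption. The sketch below uses a static Cournot duopoly as a stand-in, with inverse demand a - q1 - q2 and zero cost, only because its best responses have a closed form; in the differential game each best response would itself be the solution of an optimal control problem. All names and the value of a are hypothetical.

```python
def best_response(other_q, a=12.0):
    """Quantity maximizing q * (a - q - other_q) over q >= 0."""
    return max(0.0, (a - other_q) / 2.0)


def iterate_to_nash(q1=0.0, q2=0.0, a=12.0, tol=1e-10, max_iter=1000):
    """Alternate best responses until neither player wants to move."""
    for _ in range(max_iter):
        new_q1 = best_response(q2, a)
        new_q2 = best_response(new_q1, a)
        if abs(new_q1 - q1) < tol and abs(new_q2 - q2) < tol:
            return new_q1, new_q2
        q1, q2 = new_q1, new_q2
    return q1, q2


if __name__ == "__main__":
    # For this stand-in game the Nash equilibrium is q1 = q2 = a / 3.
    print(iterate_to_nash())
```

The stopping rule is the numerical counterpart of the equilibrium condition: when a full round of re-optimization leaves both strategies unchanged, neither player has an incentive to deviate.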
The parameterization of the model defines a typical growth case. The initial issue of equity from, say, an IPO is set to $50 million. (Although arbitrary, all dollar amounts are expressed in millions of dollars.) The terminal or target value is set to $75 million and the planning horizon is set to ten years. This scenario is roughly similar to a firm emerging from the startup stage and entering a rapid growth stage. Often this is financed by a private equity or venture capital fund with a merger or acquisition exit strategy in mind. These funds typically have a lifespan of five to ten years.
Recall that the initial level of capital, K_0, is equal to the equity raised from, say, an IPO. The terminal condition specifies the target for the state variable at the end of the planning horizon, K_T. Hence the R&D firm and the D&M firm maximize firm value given an initial and terminal value for R&D capital. What drives the model is the relationship between R&D debt, D_t; D&M investment, I_t; and R&D capital, K_t. This relationship is captured in the state equation, Equation 3. To analyze the impact of financial distress, two cases are considered: low financial distress (c = 0.1) and high financial distress (c = 0.2). The financial distress parameter captures the additional costs incurred as the level of debt and the probability of default increase. The parameters must also meet the two key assumptions discussed above: (i) production functions for both the R&D firm and the D&M firm exhibit diminishing marginal productivity; and (ii) the financial distress function exhibits increasing marginal costs.
Each firm's weighted average cost of capital (WACC) is used as its discount rate. The assumption is that the R&D firm is younger and smaller and faces a higher WACC than the D&M firm, which is assumed to be larger and to have deeper pockets. All parameters, the assigned values, and descriptions are listed in Table 1.
Since the shooting method is used as the benchmark, the basic results from this method are presented in Table 2 and Figures 1 through 4. The results of the comparison between the shooting method and the collocation method are discussed in Section 5 and presented in Tables 3A, 3B, and 4; and Figures 5 through 8.
Firm Value. From Table 2, comparing terminal values at the end of the planning horizon for low financial distress to high financial distress, R&D firm value decreases from $1,867.62 million in the low financial distress case to $1,664.64 million in the high financial distress case, a decrease of 10.86%. The loss of R&D firm value comes from the reduced level of capital formation across the planning horizon.
Note. Table 2 compares values at the end of the planning horizon for low financial distress (c = 0.1) to high financial distress (c = 0.2) given T = 10, K_0 = $50, K_T = $75. Dollars in millions.
Firm value for the D&M firm goes from $1,585.57 million to $1,317.89 million, a decrease of 16.88%.This decrease is primarily due to the increase in the price of R&D that results from the lower level of capital formation in the high financial distress case.Note that this means that an increase in financial distress for the R&D firm also has a negative impact on firm value for the D&M firm.
D&M Investment. From Table 2, total D&M investment decreases from $88.98 million for low financial distress to $65.81 million for high financial distress, a decrease of 26.04%. From the D&M Investment Function chart in Figure 1, it is clear that the optimal path for the D&M investment function over the planning horizon is somewhat higher under low financial distress than under high financial distress.
R&D Debt Formation. From Table 2, total debt is $97.09 million for low financial distress and $76.00 million for high financial distress, a decrease of 21.7%. From the R&D Debt Function chart in Figure 2, the path of the debt function for low financial distress is higher in the earlier years and decreases in the later years as it converges with that for high financial distress. This occurs because under low financial distress the R&D firm is less constrained by the burden of financial distress and can move debt formation and capital formation forward in time, where they are subject to less discounting.
R&D Capital. From Figure 3, R&D capital formation in the low financial distress case increases to a peak of $108 million and then winds down to meet the target of $75 million. The high financial distress path lies below the low financial distress path across the planning horizon, peaking at $83 million before meeting the target of $75 million. That R&D capital formation is lower in the high financial distress case and only catches up to the low financial distress case at the end of the planning horizon is directly related to the path of the debt function. Under high financial distress, debt formation is pushed further back in the planning horizon as described above. This means that R&D capital increases at a higher rate in the later years in order to reach the terminal condition for R&D capital of $75 million.
Debt Ratio. The debt ratio is calculated by dividing the book value of debt by the book value of total assets. Total assets are the sum of R&D capital and the retained earnings generated by the R&D firm's sale of research to the D&M firm. The terminal debt ratio actually goes up from 0.202 for low financial distress to 0.208 for high financial distress, an increase of 2.97%. The higher debt ratio for high financial distress is primarily due to the reduced assets in this case, which comes for the most part from a reduction in the creation of net earnings.
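The ratio and the percentage change can be stated compactly in code. In the sketch below, the function names and the sample balance sheet figures (20, 75, and 25) are hypothetical; only the terminal ratios 0.202 and 0.208 come from the text.

```python
def debt_ratio(debt, rd_capital, retained_earnings):
    """Book value of debt divided by book value of total assets.

    Total assets are taken as R&D capital plus retained earnings.
    """
    total_assets = rd_capital + retained_earnings
    if total_assets <= 0:
        raise ValueError("total assets must be positive")
    return debt / total_assets


def percent_change(old, new):
    """Change from old to new, expressed as a percentage of old."""
    return 100.0 * (new - old) / old


if __name__ == "__main__":
    # Hypothetical balance sheet, dollars in millions.
    print(debt_ratio(20.0, 75.0, 25.0))            # 0.2
    # Terminal debt ratios reported for low and high financial distress.
    print(round(percent_change(0.202, 0.208), 2))  # 2.97
```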
From the debt ratio chart in Figure 4, it can be seen that there is no single optimal debt-to-equity ratio. It varies over time as the R&D firm and the D&M firm make optimal strategic decisions with respect to debt formation and investment.
Summary. In comparing the results for low financial distress to those for high financial distress, firm value decreases for both the R&D firm and the D&M firm. Again, this means that not only the R&D firm but also the D&M firm is affected by increased financial distress. Both D&M investment and R&D debt decrease; the path of D&M investment is somewhat higher for low financial distress and grows over the planning horizon; the path of R&D debt is higher in the early periods for low financial distress and converges in the later periods with the curve for high financial distress; the terminal debt ratio, somewhat counterintuitively, is higher under high financial distress due to a lower level of asset formation; and R&D capital formation under low financial distress is higher across the planning horizon. Finally, from Figure 4 it is clear that the debt ratio evolves over time as the R&D firm responds strategically to the decisions of the D&M firm.

Comparison of Methods
The collocation method is compared to the shooting method in three ways. First, terminal values are compared for the two methods. Table 3A compares terminal values for both numerical methods in the low financial distress case. Table 3B compares terminal values for both methods in the high financial distress case. Second, a graphical comparison is made for all the cases for D&M investment, R&D debt, R&D capital, and the debt ratio. Figure 5 compares D&M investment functions for low financial distress and high financial distress for both methods; Figure 6 compares R&D debt functions for low financial distress and high financial distress for both methods; Figure 7 compares the path of R&D capital for low financial distress and high financial distress for both methods; and Figure 8 compares debt ratios for low financial distress and high financial distress for both methods. The comparisons are presented as side-by-side graphs. The closer they match, the more the collocation method replicates the shooting method; the more they diverge, the less the collocation method replicates the shooting method. As above, points on low financial distress curves are represented by diamonds and points on high financial distress curves are represented by triangles. Third, a metric was developed to measure the closeness of fit of the compared curves. The metric is defined as the sum of the squared differences at each of the discretized points which, to adjust for scale, is then divided by the average of the values for the shooting method at these points. The larger this number, the less the two curves coincide and the worse the approximation for the collocation method. These results are displayed in Table 4.
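The closeness-of-fit metric described above translates directly into a few lines of code. The function name fit_metric and the sample curves are hypothetical; the formula follows the verbal definition, the sum of squared differences divided by the average of the shooting values.

```python
def fit_metric(shooting, collocation):
    """Sum of squared differences, scaled by the mean of the shooting values.

    Both arguments are sequences of values at the same discretized points.
    Larger results mean the collocation curve tracks the benchmark less well.
    """
    if len(shooting) != len(collocation) or not shooting:
        raise ValueError("curves must be non-empty and of equal length")
    squared_diff = sum((s - c) ** 2 for s, c in zip(shooting, collocation))
    mean_shooting = sum(shooting) / len(shooting)
    return squared_diff / mean_shooting


if __name__ == "__main__":
    # Hypothetical three-point curves: they differ only at the last point.
    print(fit_metric([1.0, 2.0, 3.0], [1.0, 2.0, 4.0]))  # 0.5
```

Dividing by the benchmark mean makes the metric roughly comparable across series of different magnitude, such as debt in millions of dollars and a debt ratio below one.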
Note. Table 3A compares values at the end of the planning horizon for the shooting method and the collocation method for low financial distress (c = 0.1) given T = 10, K_0 = $50, K_T = $75. Dollars in millions.
Considering terminal values in the low financial distress comparison, Table 3A: the difference for R&D firm value is less than 1%; for D&M firm value less than 10%; for D&M investment less than 10%; for R&D debt less than 1.1%; and for the debt ratio less than 20%. Considering terminal values in the high financial distress comparison, Table 3B: the difference for R&D firm value is less than 2%; for D&M firm value less than 2%; for D&M investment less than 7%; for R&D debt less than 12%; and for the debt ratio less than 10%.
Next, graphical comparisons of the two methods are made. For the D&M investment function, Figure 5, the curves for high financial distress are very similar with respect to the start and end points and the shapes of the curves. For low financial distress, the start point is off (close to 7.0 for the collocation graph as opposed to 4.0 in the shooting method) and the collocation curve is concave up rather than concave down.
Comparing the R&D debt results, Figure 6, the start and end points in both low financial distress and high financial distress are very close. The one difference is that the R&D debt function for high distress in the collocation method is higher in the early years than in the shooting method.
The path of R&D capital, Figure 7, is very close in all cases for the start and end points and the shape of the curves. This suggests that the anomalies discussed above for R&D debt and D&M investment have only a small impact on R&D capital.
The debt ratios for low and high financial distress, seen in Figure 8, are also very similar. The start and end points are very close in all cases, and in both methods the ratio rises to a peak in the early years and then trails off as the R&D firm converges to the target capital level of $75 million. The one inconsistency occurs in the high distress case, where the debt ratio under the collocation method rises faster in the early years than under the shooting method.
Finally, Table 4 reports the metric for comparing the graphs in Figures 5 through 8. Based on Table 4, the worst cases are R&D debt under high financial distress and D&M investment under low financial distress, with metrics of 3.57 and 5.98 respectively. This can also be seen from the graphs. Recall that for R&D debt under high distress, the collocation curve has approximately the same start and end points as the shooting curve, but R&D debt remains higher in the early years. For D&M investment under low distress, the collocation curve has a higher starting point and the opposite curvature from the shooting curve.
Note. Table 4 compares the methods based on the squared differences of the discretized values over the planning horizon, adjusted for scale by dividing by the average value of the benchmark shooting method.

Conclusion
This paper argues that, for oligopoly models similar to the one considered here, numerical methods provide a way to study models with no clear analytical solution. Two approaches are considered: a shooting method and a collocation method. The shooting method was used as the benchmark since it is based on the first order conditions of the Hamiltonian of each firm's optimal control problem. The comparisons show that the global collocation method of randomly selecting the parameters of a second order polynomial over multiple iterations also provides reasonable estimates of a solution. The collocation method appears to lend itself more readily to automation and to offer more potential for application to problems with higher dimensionality, either in terms of the number of players or in terms of the number of decision variables.
The results also provide an important theoretical outcome. They illustrate that, contrary to the tradeoff and pecking order models, still the most widely accepted and widely studied models of capital structure, there are many markets in which capital structure is driven not by reversion to a target debt-to-equity ratio or a pecking order, but by maximizing firm value under strategic considerations.

Figure 5. Comparison of D&M investment. Note. Compares D&M investment for the shooting and collocation methods for low and high financial distress.

Figure 6. Comparison of R&D debt. Note. Compares R&D debt for the shooting and collocation methods for low and high financial distress.

Figure 7. Comparison of R&D capital. Note. Compares R&D capital for the shooting and collocation methods for low and high financial distress.

Figure 8. Comparison of R&D debt ratios. Note. Compares R&D debt ratios for the shooting and collocation methods for low and high financial distress.

Table 1. Parameter values. Note. This table lists the model parameters, their values, and a description.

Table 2. Terminal values for shooting method

Table 3A. Terminal values compared: low financial distress

Table 3B. Terminal values compared: high financial distress. Note. This table compares values at the end of the planning horizon for the shooting and collocation methods for high financial distress (c = 0.2) given T = 10, K_0 = $50, K_T = $75. Dollars in millions.

Table 4. Comparison of methods