Measuring the Efficiency of Tax Collection among Economic Sectors in Paraíba State, Northeastern Brazil (2013-2015)

This study investigates the efficiency of collecting the tax on the circulation of goods and interstate services (ICMS) in Paraíba State, in the far east of Brazil. Efficiency was estimated using quarterly electronic invoice data for the period from January 2013 to December 2015. In addition, we aim to identify the key determinants of the levy across distinct sectors, disaggregated into 489 sub-classes according to the National Classification of Economic Activities. A stochastic frontier analysis suggests that the average technical efficiency of tax collection among sectors was 73.75% of potential tax revenues. The amount of uncollected tax during the period studied was approximately US$7 billion. There is technical inefficiency in the tax levy among important sectors of the economy of the state of Paraíba, as shown by the estimate that 88.88% of the deviation from the frontier is due to the inefficiency of tax collection itself. These sectors include clothing, wholesale of personal care products and leather shoes, among others. We find that an increase in oversight actions by the tax collection agency helps to reduce the inefficiency of the tax levy.


Introduction
The Brazilian tax system is complex, with subnational entities such as states and municipalities sharing distinct sources of revenue (Varsano et al., 1998; Piancastelli, 2001). Tax management is subject to many types of inefficiency caused by opportunism in different areas of government (OECD, 2017). This framework creates opposing interests, and the problem worsens when we take into account that: 1) there is a great distance between the taxpayer and government decisions; 2) local tax offices are responsible for collecting the financial resources of local governments (Huang, Yu, Hwang, Wei & Chen, 2017); and 3) political instability also reduces the efficiency of tax collection (Aizenman & Jinjarak, 2008).
Moreover, Barros (2007) and Brun and Diakité (2016) propose that inefficiency in taxation depends more on policy decisions than on tax administration performance. They also argue that low-income countries have higher tax effort, even if their tax effort declines, while Alm and Duncan (2014) claim that countries considered inefficient can improve their fiscal position by using inputs more strategically. In Brazil, the continuous increase in the tax burden recorded in recent decades constrains tax collection by the government (Mariano, 2005). Base erosion, which constitutes a serious risk to tax revenues, and recession affect the inefficiency of tax collection and constitute a serious barrier to local development (Cordero, Díaz-Caro, Pedraja-Chaparro & Tzeremes, 2018). In this scenario, a decrease in public investment and competition for savings are inevitable (Schwengber & Ribeiro, 2000).
However, measuring the inefficiency of the tax levy, in Brazil as in other countries, is full of complexities (Ferrigno, 2006; Mattos, Rocha & Arvate, 2011; Postali, 2015). This inefficiency affects the tax framework as well as the funding of the Brazilian public sector itself. The advent of Brazilian Complementary Law No. 101/2000 (the Fiscal Responsibility Law) has increased transparency in revenue forecasting and promoted control over public spending. Hence, it is important to analyze variations in the efficiency of tax collection in order to understand their causes and to provide information for fiscalization and evasion prevention. Accordingly, our paper estimates the technical efficiency of collecting the tax on operations related to the movement of goods and on interstate and intermunicipal transport and communication services (ICMS) using stochastic frontier analysis.
In Paraíba, an eastern state of Northeast Brazil, the economy is mainly based on the commerce and services sector, which accounts for 94.33% of the total tax levy. The innovation of the stochastic frontier analysis developed by Battese and Coelli (1995) is to consider that external random factors also affect technical efficiency. Following a strategy similar to that of Ferrigno (2006), the dataset is composed of the sub-classes of sectors disaggregated according to the National Classification of Economic Activities. The panel data on the electronic invoices issued in the state of Paraíba between 2013 and 2015 were provided by the Secretary of State of Revenue of Paraíba (SER/PB). The electronic invoice is a system with digital accounting and digital fiscal accounting dimensions. It is part of the new information technologies within the Brazilian federal public administration, integrating the e-Government framework.
Our goal is to improve the strategy of the tax collection agency, which plays a role similar to that of the Internal Revenue Service in the United States, with a view to reducing the inefficiency of the tax levy. The frontier is estimated by sector from selected inputs, e.g. control variables for local government policies, market characteristics, the tax system and stochastic variables. Technical inefficiency is likewise decomposed into control and stochastic variables. Proxies such as the hours worked per sector, the number of tax rates per sector and the concentration of the economic sectors are used to characterize the complexity of the locus and of the tax system. The model also lets us test whether sectors that receive more attention from the tax collection agency tend to be more efficient than the others.
The hypotheses of this paper are that (i) sectors with a small number of firms and lower labor intensity have better efficiency scores; (ii) sectors that receive more attention from the tax collection agency have higher tax levy efficiency scores; and (iii) the higher the number of tax rates per sector, the lower the tax levy efficiency scores. This paper is divided into five sections, including this brief introduction. Section 2 details the empirical strategy. Section 3 deals with the design of the sample and data processing. The results are presented in Section 4, and Section 5 presents the conclusions regarding the efficiency of tax collection.

Stochastic Frontier Analysis
This paper estimates the efficiency of tax collection using the approach initially proposed by Aigner, Lovell and Schmidt (1977) and Meeusen and Van den Broeck (1977). The efficiency score is decomposed into technical and allocative (price) efficiency (Charnes, Cooper & Rhodes, 1978). Generally, non-parametric methods for obtaining these scores do not require the specification of a functional form in the estimation process. Although non-parametric methods are more flexible in terms of the algorithm used to obtain the scores, their results are generally insensitive to stochastic factors (Syrjänen, Bogetoft & Agrell, 2006). Thus, the main difference between the models used to estimate efficiency scores is whether or not they consider the random errors inherent in the estimation process (Coelli & Battese, 1998).
The original theoretical specification of efficiency involves a production function with a two-component error term: one component to capture technical inefficiency and another to represent random effects. A limitation of such models is the large number of assumptions that must be made, e.g. the estimator type, the functional form that fits the data, and the distribution of the error term $\mu_{i,t}$. We are interested in an estimator that incorporates the technical inefficiency component of production for panel data, as first proposed by Pitt and Lee (1981). In addition to estimating the frontier, Battese and Coelli (1995) suggest analyzing inefficiency by decomposing it into a vector of explanatory variables. The result is a regression model estimated by maximum likelihood with an error term that is asymmetric and non-normal.

Frontier Estimation
In their model, Battese and Coelli (1995) decompose the error term ($e_{i,t}$) into two components: $\mu_{i,t}$, a non-negative random variable representing technical inefficiency, and $v_{i,t}$, a random error term that reflects stochastic influences on the cross-sectional sectoral units that cannot be controlled. The main advantage of this model is that it allows statistical inference. The equation for the tax levy efficiency score, adapted from Battese and Broca (1997), is:
$$R_{i,t} = C_{i,t}(x_{i,t}) \, Iefic_{i,t} \qquad (1)$$
where $R_{i,t}$ is the actual revenue, $x_{i,t}$ is a $(1 \times k)$ vector of factors that influence the ability to levy taxes, $C_{i,t}(x_{i,t})$ is the tax capacity and $Iefic_{i,t} \in [0, 1]$ is the tax collection efficiency index. In this model, the observed tax revenue is also affected by random factors, generally captured by a white noise error. Therefore, function (1) becomes:
$$R_{i,t} = C_{i,t}(x_{i,t}; \beta_{i,t}) \, \exp(v_{i,t}) \, Iefic_{i,t} \qquad (2)$$
where $v_{i,t}$ denotes independent and identically distributed (i.i.d.) random errors; $\mu_{i,t}$ is a non-negative random variable; $Iefic_{i,t} = \exp(-\mu_{i,t})$ is the efficiency index specific to each firm at a given time; $\beta_{i,t}$ is a $(k \times 1)$ vector of unknown parameters to be estimated; and $x_{i,t}$ is a $(1 \times k)$ vector of values corresponding to a function of the institutional and structural variables, i.e. the inputs of the tax-producing function. The efficiency index is predicted from the conditional expectation of $\exp(-\mu_{i,t})$ given the composed error $e_{i,t}$:
$$Iefic_{i,t} = E\left[\exp(-\mu_{i,t}) \mid e_{i,t}\right] \qquad (3)$$
Deviations of tax collection from the frontier function also reflect failures in the optimization of the public administration process. In order to measure them properly, the error term $e_{i,t}$ is decomposed into a symmetric term, which captures random factors beyond the control of the tax collection agency, and a term that represents the technical inefficiency of each sector.
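The conditional expectation of exp(-μ) given the composed error can be illustrated numerically. The sketch below uses the standard closed-form predictor for a normal noise term and a truncated-normal inefficiency term; this closed form is not printed in the text, so it is an assumption here, as are the function names and the illustrative parameter values (sigma_v, sigma_u, mu and the residuals).

```python
import math


def norm_cdf(z: float) -> float:
    """Standard normal CDF, computed from the complementary error function."""
    return 0.5 * math.erfc(-z / math.sqrt(2.0))


def predicted_efficiency(e: float, sigma_v: float, sigma_u: float, mu: float = 0.0) -> float:
    """E[exp(-u) | e] for e = v - u, with v ~ N(0, sigma_v^2) and
    u ~ N(mu, sigma_u^2) truncated at zero (assumed distributional form)."""
    s2 = sigma_v ** 2 + sigma_u ** 2
    # Mean and standard deviation of u conditional on the composed error e.
    mu_star = (mu * sigma_v ** 2 - e * sigma_u ** 2) / s2
    sigma_star = math.sqrt(sigma_v ** 2 * sigma_u ** 2 / s2)
    a = mu_star / sigma_star
    ratio = norm_cdf(a - sigma_star) / norm_cdf(a)
    return ratio * math.exp(-mu_star + 0.5 * sigma_star ** 2)


# Hypothetical composed residuals, from far below the frontier to slightly above it.
residuals = [-0.8, -0.3, 0.0, 0.2]
scores = [predicted_efficiency(e, sigma_v=0.17, sigma_u=0.49) for e in residuals]
```

A more negative residual yields a lower predicted efficiency, and every score lies strictly between 0 and 1, which matches the interpretation of the index as the collected share of potential revenue.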
More specifically, the degree of an industry's tax collection efficiency relative to its potential will be estimated using (4), which reflects the radial distance between the origin and the frontier (Kumbhakar & Lovell, 2000). This measure is written as a ratio, and a sector is considered efficient if the ratio approaches unity:
$$TE_{i,t} = \frac{y_{i,t}}{f(x_{i,t}, \beta_{i,t}) \, e^{v_{i,t}}} \qquad (4)$$
where $f(x_{i,t}, \beta_{i,t}) \, e^{v_{i,t}}$ is the time-variant stochastic production frontier, and $v_{i,t}$ is a random variable unrestricted in sign. The deterministic frontier $f(x_{i,t}, \beta_{i,t})$ is common to all sectors, and the term $e^{v_{i,t}}$ captures the effect of random shocks that are outside the control of the tax collection agency for each sub-class of economic activities. This relationship means that $y_{i,t}$ is the output (tax levy) and $TE_{i,t}$ is the fraction of the maximum possible tax levy that is actually collected. Thus, assuming that $f(x_{i,t}, \beta_{i,t})$ is linear in logarithms, we have the following model:
$$\ln y_{i,t} = \ln f(x_{i,t}, \beta_{i,t}) + v_{i,t} - \mu_{i,t} \qquad (5)$$
where $\mu_{i,t} = -\ln TE_{i,t} \geq 0$. In equation (5), $\mu_{i,t}$ captures the shortfall of the level of tax collection $y_{i,t}$ relative to the production frontier, while $v_{i,t}$ is a symmetric error term that captures any random shock outside the control of the tax collection agency (Kumbhakar & Lovell, 2000).
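The link between the efficiency ratio and the log-linear model can be checked with a small worked example. This is only an illustrative sketch: the function name and the numbers for the observed levy, the deterministic frontier and the random shock are hypothetical.

```python
import math


def technical_efficiency(y: float, frontier: float, v: float) -> float:
    """Ratio of the observed levy y to the stochastic frontier f(x, beta) * exp(v)."""
    return y / (frontier * math.exp(v))


# Hypothetical sector-quarter: observed levy, deterministic frontier, random shock.
y_obs, f_det, shock = 73.0, 100.0, 0.05

te = technical_efficiency(y_obs, f_det, shock)
mu = -math.log(te)  # inefficiency term implied by the ratio

# The log-linear model reproduces the observed levy: ln y = ln f + v - mu.
ln_y_from_model = math.log(f_det) + shock - mu
```

Because the favourable shock raises the stochastic frontier above the deterministic one, the efficiency score here falls slightly below the naive ratio of 73/100.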
It is assumed that $v_{i,t}$ is i.i.d. with a normal distribution of zero mean and variance $\sigma^2_v$, and that $\mu_{i,t}$ is an error term that captures the effect of technical inefficiency. The latter does not take negative values and follows a truncated normal distribution, where the conditional mean of the inefficiency is parameterized as a linear function of the proposed regressors. These terms measure the deviations of tax collection from the frontier. In addition, the stochastic frontier parameters and the technical inefficiency are estimated by maximum likelihood (Wang & Schmidt, 2002). Since $e_{i,t} = v_{i,t} - \mu_{i,t}$, we have:
$$\ln y_{i,t} = \ln f(x_{i,t}, \beta_{i,t}) + e_{i,t} \qquad (6)$$
This framework is defined by a system of two equations. The first (Equation I) is given by equation (4), i.e. the estimation of the stochastic frontier. Equation II consists of estimating the efficiency deviations as a function of the control variables, using linear regression. The functional form of the stochastic frontier is defined as a Cobb-Douglas type function, which is justified (i) because most functional forms violate one or more desirable econometric properties when conditioned on the envelope functional form; (ii) to avoid bias in the parameter estimates; and (iii) to avoid the multicollinearity problems that often occur in translog-type functions.
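To illustrate what maximum likelihood estimation of a composed error involves, the sketch below evaluates the log-likelihood of a simplified frontier on simulated data. It is not the specification estimated in this paper: it assumes a half-normal inefficiency term with no covariates in its mean, a single regressor, and hypothetical parameter values, and it only compares two candidate parameter sets instead of running an optimizer.

```python
import math
import random


def norm_logpdf(z: float) -> float:
    """Log density of the standard normal distribution."""
    return -0.5 * z * z - 0.5 * math.log(2.0 * math.pi)


def norm_logcdf(z: float) -> float:
    """Log CDF of the standard normal distribution."""
    return math.log(0.5 * math.erfc(-z / math.sqrt(2.0)))


def log_likelihood(b0, b1, sigma_v, sigma_u, data):
    """Normal / half-normal frontier: ln y = b0 + b1 * x + v - u."""
    sigma = math.hypot(sigma_v, sigma_u)  # sqrt(sigma_v^2 + sigma_u^2)
    lam = sigma_u / sigma_v
    total = 0.0
    for x, ln_y in data:
        e = ln_y - b0 - b1 * x  # composed residual e = v - u
        total += (math.log(2.0 / sigma) + norm_logpdf(e / sigma)
                  + norm_logcdf(-e * lam / sigma))
    return total


# Simulated (hypothetical) data generated from known parameters.
rng = random.Random(0)
data = []
for _ in range(500):
    x = rng.uniform(0.0, 5.0)
    v = rng.gauss(0.0, 0.2)       # symmetric noise
    u = abs(rng.gauss(0.0, 0.5))  # non-negative inefficiency
    data.append((x, 1.0 + 0.6 * x + v - u))

ll_true = log_likelihood(1.0, 0.6, 0.2, 0.5, data)
ll_shifted = log_likelihood(2.0, 0.6, 0.2, 0.5, data)  # intercept set too high
gamma = 0.5 ** 2 / (0.2 ** 2 + 0.5 ** 2)  # share of sigma_u^2 in the total variance
```

In a full estimation the parameters would be chosen by a numerical optimizer to maximize this function; here the comparison only shows that the data-generating parameters attain a higher likelihood than a misplaced frontier.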

Empirical Model
The parameters of equation (7) represent the estimated empirical form (Equation I), obtained by maximum likelihood, which allows the calculation of tax collection efficiency scores by sub-class of sector. Hence, it will be possible to identify inefficient sectors that have a greater impact on the tax levy and to improve the fiscalization strategies for them.
In equation (7), TAX is the value of tax collected by sector; ALIQ is the average of the different tax rates existing in each sector; DIS represents discharges as a percentage of the total value of sales; EXIT is the fraction of goods destined for other Brazilian states relative to the book value of total sales; AGRI is a dummy variable that controls for the sectors related to agriculture; IND is a dummy variable for industry-related sectors; WHO is a dummy variable for the sectors related to wholesale commerce; L represents the average weekly hours worked per sector; B (also referred to as BASE) is the ratio of the ICMS tax base by sector to the total value of the products sold; and S is given by the Herfindahl index, calculated from the ratio between the number of firms and the total book value of sales by sector.
ALIQ characterizes how each sector responds to taxation. However, this relationship may exhibit some concavity, similar to that proposed by the Laffer curve (Laffer, 1986). According to Allingham and Sandmo (1972), an increase in the tax rate reduces revenues; as with a risky asset, ceteris paribus, the risk-averse agent will take less risk and avoid tax evasion. The average tax rate also determines the potential tax collection in each sector. The percentage of discharge determines the potential tax levy (Ferrigno, 2006) and is considered to be uncollected tax owing to particular provisions of the tax legislation.
The average weekly hours worked per sector were used to verify whether tax collection is more inefficient in sectors that are more labor intensive and less capital intensive. The Herfindahl-Hirschman index, or IHH (Herfindahl, 1950), measures the concentration of the economic sectors. It is based on the market share of each firm i in sector j. We expect that the higher the index, the higher the tax levy, since a higher index means that the tax collection agency (SER/PB) has fewer firms to supervise and lower fiscalization costs. Moreover, we observe how fiscalization efforts and tax collection are related to the inefficiencies not explained by (7). Equation II is given by a simultaneous linear regression whose dependent variable is the inefficiency term of (7), i.e. $\mu_{i,t}$:
$$\mu_{i,t} = \delta_0 + \delta_1 PEN_{i,t} + \delta_2 OS_{i,t} + \delta_3 NALIQ_{i,t} + w_{i,t} \qquad (8)$$
where $\delta_0, \dots, \delta_3$ are parameters to be estimated and $w_{i,t}$ is a random error; PEN is the number of penalties generated by the fiscalization carried out; OS is the number of fiscalization orders by sector, i.e. the number of fiscalization actions carried out by the tax collection agency in each sector; and NALIQ represents the number of different tax rates by sector.
The fines generated by the fiscalization carried out and the number of fiscalization orders are proxies for the probability of fiscalization across sectors. We test whether sectors that receive more attention from the tax collection agency show higher tax levy efficiency. The number of different tax rates per sector is also a proxy for the complexity of the tax system. According to Fenochietto and Pessino (2013), the more complex the tax system is, the more likely it is to mislabel a multi-output firm, i.e. one that produces more than one type of good or service, causing inefficiency in tax collection.
The dataset provided by SER/PB, expressed in natural logarithms, contains 5,277 quarterly observations on 489 sub-classes of sectors (out of a total of 597) between 2013 and 2015. The data were deflated using the broad consumer price index (IPCA). Information on hours worked per sector was obtained from the National Household Sample Survey (PNAD/IBGE) from 2013 to 2015. Electronic invoices were implemented in Paraíba from 2013. Thus, we used a short time window of electronic invoice data, which are less subject to negative bias than data from taxpayers' own declarations. Table 2 presents the descriptive statistics (source: SER/PB and IBGE, Brazil). Table 3 presents the estimated correlation matrix of the variables. The variables are only weakly correlated with each other, except for the tax base (BASE) and the Herfindahl index (S), which exhibit a correlation of 74.86%, and the total input, which is also correlated with BASE, S and the number of tax rates by sector (NALIQ), respectively.
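Under the Cobb-Douglas form adopted for the frontier, equation (7) can be sketched as a log-linear regression on the variables defined in the next paragraph. The exact display is not reproduced here, so the version below is an assumption: in particular, taking logarithms of the continuous regressors, leaving the three dummies in levels, and the coefficient labels $\beta_0, \dots, \beta_9$ are illustrative choices.

```latex
\begin{equation}
\begin{aligned}
\ln TAX_{i,t} ={}& \beta_0 + \beta_1 \ln ALIQ_{i,t} + \beta_2 \ln DIS_{i,t} + \beta_3 \ln EXIT_{i,t} \\
 &+ \beta_4\, AGRI_{i} + \beta_5\, IND_{i} + \beta_6\, WHO_{i} \\
 &+ \beta_7 \ln L_{i,t} + \beta_8 \ln B_{i,t} + \beta_9 \ln S_{i,t} + v_{i,t} - \mu_{i,t}
\end{aligned}
\tag{7}
\end{equation}
```

The composed error $v_{i,t} - \mu_{i,t}$ is the same as in equation (5), so the efficiency score of each sub-class follows from $\exp(-\mu_{i,t})$.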
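The concentration measure can be illustrated with the textbook Herfindahl-Hirschman formula, the sum of squared market shares within a sector. This is a minimal sketch under that textbook definition, which is an assumption here: the construction of S actually used in the estimation may differ, and the sales figures are hypothetical.

```python
def herfindahl_index(sales):
    """Sum of squared market shares of the firms in one sector (between 1/n and 1)."""
    total = sum(sales)
    if total <= 0:
        raise ValueError("total sales must be positive")
    return sum((s / total) ** 2 for s in sales)


# Hypothetical sectors: one dominated by a single firm, one with ten equal firms.
concentrated = herfindahl_index([90.0, 5.0, 5.0])
fragmented = herfindahl_index([10.0] * 10)
```

The concentrated sector scores far higher than the fragmented one, which is the sense in which a higher index means fewer relevant firms for the agency to supervise.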
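The deflation and log transformation described above can be sketched as follows. The index values and tax figures are hypothetical placeholders, not IPCA or SER/PB data, and the choice of the first quarter as the base period is an assumption made only for the example.

```python
import math


def deflate(nominal: float, index: float, base_index: float) -> float:
    """Express a nominal value in base-period prices using a price index."""
    return nominal * base_index / index


# Hypothetical quarterly nominal collections and price index levels.
nominal_tax = [100.0, 104.0, 110.0]
price_index = [100.0, 102.0, 105.0]
base = price_index[0]  # first quarter taken as the base period (assumption)

real_tax = [deflate(n, p, base) for n, p in zip(nominal_tax, price_index)]
log_real_tax = [math.log(r) for r in real_tax]  # variable as it would enter the model
```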

Results
Similar to Mattos, Rocha, and Toporcov (2013), we found no evidence that electronic invoices have reduced tax collection inefficiency. However, this technology helps to organize databases for identifying relevant information about the tax levy and for producing reports, strategies and policy improvements. Table 4 presents the results of the estimation of the tax collection stochastic frontier (7) using the R statistical software. ALIQ was positively related to the tax levy, showing that the higher the tax rate, the higher the tax levy. The exit of goods and services to other states (EXIT), BASE and S were also positively related to the tax levy. The results also show that the greater the EXIT, the higher the BASE and the higher the tax levy. The relationship between the ratio of total discharges (DIS), EXIT and tax collection is negative, i.e. the larger the discharges by sector, the lower the tax levy. This suggests that any reductions in tax inefficiency and increases in economic activity promoted by discharges are not sufficient to offset the tax levy losses. The parameter on hours worked appears with a positive sign, indicating that the more labor intensive the sector, the less the tax levy. This outcome suggests the difficulty faced by the tax collection agency in monitoring firms or sectors whose value added is based on human work.
The dummy variables AGRI, IND and WHO have statistically significant parameters, at the 5%, 10% and 1% levels respectively, indicating that the wholesale commerce sectors have a greater marginal effect (0.8020) on increasing the efficiency of the tax levy. The estimated value of γ (gamma), 0.8888, is statistically significant and means that 88.88% of the error resulting from (7) is caused by the technical inefficiency of collection, while the other 11.12% is attributed to random events. This indicates that tax revenues depend on tax administration performance, as also found by Brun and Diakité (2016). A γ different from zero means that the inclusion of control variables by the tax collection agency to reduce the inefficiency could be strategic. Moreover, the parameter $\sigma^2_{\mu}$ (0.23570), i.e. the classic idiosyncratic error variance, represents the variance of tax levy inefficiency. Table 5 presents the results of (8). The F test rejects, at the 1% significance level, the hypothesis that all the parameters of the technical inefficiency equation (8) are equal to zero. The adjusted $R^2$ shows that 12.18% of the deviations in the efficiency scores, in other words of the error term estimated in (7), can be explained by the selected regressors. The parameters of PEN and OS, used to control the inefficiency scores by sector, are both negative. This suggests that the larger the number of fiscalization actions and the higher the fines imposed on the supervised firms, the lower the inefficiency of the tax levy. The parameter of NALIQ is negatively related to TAX, as expected. Some sectors are subject to many different tax rates, up to a maximum of 40, as reported in Table 2. Hence, the higher the complexity of the tax system, the lower the tax levy.
The average efficiency of tax collection during the period from 2013 to 2015 was 0.7376. This means that the tax authority was able to collect about 73.76% of the total potential tax. This is close to the average efficiency estimated by Ferrigno (2006) for the Federal District, Brazil, which was 74.78%. The histogram in Figure 1 shows that there are many sectors whose efficiency is concentrated close to 80%. However, there are also sectors, albeit at low frequency, in all the lower efficiency quantiles. Some sectors show high tax levy efficiency, such as the distribution of electric power and cellular mobile telephony, since the number of firms is small (9 and 22, respectively), as illustrated in Table 1. This facilitates fiscalization by the tax collection agency. In addition, where tax is collected through the instrument of tax substitution, e.g. fuel manufacturing industries; the wholesale trade of alcohol, biodiesel and gasoline; the distribution of electric energy; fixed-switched telephone services; cellular mobile telephony; and the wholesale cigarette trade, efficiency is above 90%.
In order not to overestimate the stochastic frontier, these sectors were removed from the sample, resulting in the 489 sub-classes used to obtain the efficiency scores. The sum of uncollected tax from 2013 to 2015 across all sectors is approximately US$7 billion. Fiscalization is more difficult in other sectors, such as the retail trade in clothing, furniture and footwear, which are also responsible for high tax revenues but where tax collection was lower. There are also sectors that account for a large share of the tax levy but demand more attention from the tax collection agency because of their low efficiency scores. These are the manufacturing of leather shoes and the manufacturing of footwear from non-leather materials, with tax collection efficiency of 66.66% and 50.86%, respectively.
Other sectors, namely the wholesale of personal care products and drugs for human use and the wholesale trade in general merchandise, rank 7th and 10th among the largest collections, respectively, although their efficiency is below the overall average of 73.75%. This result may be explained by the fact that it is more difficult to control the large number of small firms, often operating in private family-owned houses located in the countryside of the state. Following Ferrigno (2006), the worst tax collection efficiency results were found in the retail trade in sanitizing products and the wholesale trade of bags, suitcases and travel articles, with efficiency scores of 15.38% and 17.31%, respectively. The determinants of tax collection inefficiency indicate that the sectors with the highest number of fiscalization actions were above the average tax levy efficiency. This suggests that better-supervised sectors evade less tax than others.
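The split of the composed error between inefficiency and random events can be reproduced with simple arithmetic. The sketch below assumes the usual parameterization in which γ is the inefficiency variance divided by the sum of the inefficiency and noise variances, and it takes 0.23570 as the inefficiency variance, as stated above; both points are assumptions about how the reported figures relate, and the implied noise variance is derived here, not reported in the paper.

```python
def implied_noise_variance(sigma2_mu: float, gamma: float) -> float:
    """Noise variance implied by gamma = sigma2_mu / (sigma2_mu + sigma2_v)."""
    if not 0.0 < gamma <= 1.0:
        raise ValueError("gamma must lie in (0, 1]")
    return sigma2_mu * (1.0 - gamma) / gamma


gamma = 0.8888       # reported estimate
sigma2_mu = 0.23570  # reported inefficiency variance

random_share = 1.0 - gamma  # share attributed to random events
sigma2_v = implied_noise_variance(sigma2_mu, gamma)  # derived, not reported
```

Under these assumptions the random share is 11.12%, matching the figure quoted in the text.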
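The relation between an efficiency score and the uncollected amount can be made explicit with a short worked example. It treats efficiency as the ratio of collected to potential tax and ignores the random shock term, which is a simplifying assumption; the collected amount of 100 is a hypothetical unit value, not a figure from the dataset.

```python
def uncollected_tax(collected: float, efficiency: float) -> float:
    """Gap between potential and collected tax when efficiency = collected / potential."""
    if not 0.0 < efficiency <= 1.0:
        raise ValueError("efficiency must lie in (0, 1]")
    potential = collected / efficiency
    return potential - collected


# Per 100 units collected (hypothetical), at the scores quoted in the text.
gap_average = uncollected_tax(100.0, 0.7375)        # overall average efficiency
gap_leather_shoes = uncollected_tax(100.0, 0.6666)  # leather shoe manufacturing
gap_non_leather = uncollected_tax(100.0, 0.5086)    # non-leather footwear
```

At the average score, roughly 35.6 units go uncollected for every 100 collected, and the gap widens quickly as efficiency falls, which is why large but inefficient sectors weigh heavily in the total.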

Conclusion
This paper estimates the technical efficiency of the tax levy for each sub-class of sectors in the state of Paraíba, Brazil. It also provides empirical evidence on factors related to increased collection, such as fiscalization, the fines applied through infraction notices, the number of fiscalization orders and the number of tax rates by sector. Tax collection was found to be negatively related to discharges. Moreover, the technical inefficiency, given by gamma (88.88%), may be reduced by increasing the number of fiscalization and strategic actions by the tax collection agency. In addition to confirming our first two hypotheses, the empirical outcomes also suggest that the complexity of tax legislation and the high number of small firms per sector make it impossible to implement policies that cover all sectors of the economy.
Therefore, these characteristics show that in-depth research is necessary before developing proper tax collection strategies for each of the distinct economic activities in the state of Paraíba. The estimated inefficiency of tax collection suggests that the tax collection agency has an important role in reducing the dependence of the state on federal transfers, i.e. the State Participation Fund. Taking into account that 1) the goal of this fund is to reduce (but not eliminate) the inequality between Brazilian states, and 2) the determinant criterion of revenue allocation is the tax collection capacity of the state, it is extremely important that each Brazilian federative entity establish tax collection policies to reduce its inefficiency.
Both the tax discharges, through their effect on tax revenue, and the growth of indirect taxes, especially those that focus on profit, have privileged less competitive firms. It is widely recognized that the current Brazilian tax structure needs to be reformed. However, this reform clearly demands a new federative pact that establishes incentives for economic activities based on the comparative and competitive advantages of each state and region. Thus, a tax reform agenda must include (i) more equitable distribution and transparent assessment criteria for oil royalties and the State Participation Fund, (ii) transparent indexation of state public debts, and (iii) an updated classification of new business activities, given the rapid expansion of new markets and technological innovations, to face base erosion and expose tax revenues.
Our paper reinforces the evidence found in the international literature and points to policy revision to strengthen the governance framework. It is necessary to reduce tax distortions, which have created incentives for tax wars between regions in Brazil. Such tax competition could lead to a reduction of taxation on capital. For future research, we recommend estimating the efficiency of tax collection in other regions and countries and testing other efficiency measures. A map with more detailed tax efficiency information for each region could be used for fiscal policy, mainly for vulnerable states, e.g. Minas Gerais, Rio de Janeiro, Rio Grande do Norte and Rio Grande do Sul.