High-Order Neural Networks are Equivalent to Ordinary Neural Networks

Equivalence of computational systems can assist in obtaining abstract systems, and thus enable a better understanding of issues related to their design and performance. For more than four decades, artificial neural networks have been used in many scientific applications to solve classification problems as well as other problems. Since the time of their introduction, the multilayer feedforward neural network, referred to here as the Ordinary Neural Network (ONN), which contains only summation activation (Sigma) neurons, and the multilayer feedforward High-order Neural Network (HONN), which contains Sigma neurons and product activation (Pi) neurons, have been treated in the literature as different entities. In this work, we studied whether HONNs are mathematically equivalent to ONNs. We have proved that every HONN can be converted to some equivalent ONN. In most cases, one just needs to modify the neuronal transfer function of the Pi neuron to convert it to a Sigma neuron. The theorems that we have derived clearly show that the original HONN and its corresponding equivalent ONN give exactly the same output, which means they can both be used to perform exactly the same functionality. We also derived equivalence theorems for several other non-standard neural networks, for example, recurrent HONNs and HONNs with translated multiplicative neurons. This work rejects the hypothesis that HONNs and ONNs are different entities, a conclusion that might initiate a new research frontier in artificial neural network research.


Introduction
Inspired by biological neuronal systems, several computational-intelligence-based classification systems have been developed in the past few decades and are widely known as computational neural networks. These computational networks, which possess powerful nonlinear classification ability, are also known in the literature under many other names, such as artificial neural networks, statistical neural networks, first-order neural networks, and multi-layer perceptrons (Minsky & Papert, 1969). In this work, we refer to multilayer feedforward artificial neural networks that contain only summation activation (Sigma) neurons as ordinary neural networks (ONNs), where the term ordinary refers to first order. When such networks contain product activation (Pi) neurons, they are frequently called high-order neural networks (HONNs).
HONNs were originally proposed in the 1960s for performing nonlinear discrimination but were discarded due to the tremendous number of high-order terms they require (Minsky & Papert, 1969). Starting from the mid-nineties of the last century, several researchers relied on HONNs rather than ONNs to solve classification problems (Jeffries, 1995), (Foltyniewicz, 1995), and (Kosmatopoulos, Polycarpou, Christodoulou, & Ioannou, 1995). Prior to that, (Giles & Maxwell, 1987) discussed HONNs and stated that high-order weights capture high-order correlations in data. They introduced their HONN to solve a challenging computer vision problem known as invariance. (Hughen & Hollon, 1991) stated that HONNs have the advantage of being easier to train than multilayer perceptrons and claimed to achieve better classification of radar data than a Gaussian classifier. (Jeffries, 1995) employed HONNs for tracking, code recognition, and memory management. (Foltyniewicz, 1995) developed a Pi-Sigma-Pi network structure for effective recognition of human faces in gray-scale images irrespective of their position, orientation, and scale, and claimed that it has a small number of adjustable weights, rapid learning convergence, and excellent generalization properties. (Abdelbar, 1998) showed, and claimed, that Sigma-Pi HONNs perform better than ONNs in classifying age groups on the abalone shellfish benchmark. (Kosmatopoulos, Polycarpou, Christodoulou, & Ioannou, 1995) showed that, by allowing enough high-order connections in a recurrent HONN, they were able to use it to approximate arbitrary dynamical systems. Their explanation is that the dynamic components distribute throughout the network in the form of dynamic neurons. (Rovithakis, 1999) employed HONNs in control, and then (Rovithakis, Chalkiadakis, & Zervakis, 2004) presented an algorithm to determine the structure of HONNs for function approximation. A HONN with prior knowledge of the training binary patterns reduces computation time and memory (Artyomov & Yadid-Pecht, 2005). (Al-Rawi & Tarakji, 2005) developed a software tool that can be used interactively to build HONNs; then, a HONN that learns the recognition of affine-transformed images without any prior knowledge of the imaging geometry was presented in (Al-Rawi, 2006). Recently, HONNs have been used in diverse classification applications; see for example (Dunis, Laws, & Sermpinis, 2010).
We can see from the literature that several researchers prefer HONNs over ONNs after performing a comparison on the problem at hand, but how is that justifiable without further mathematical analysis? As we noted earlier, the major difference between HONNs and ONNs is how activations are calculated in each of their neurons, i.e., only Sigma neurons are used to construct ONNs, while Sigma and Pi neurons, or just Pi neurons, are used to construct HONNs. In computer architecture, however, a multiplication operation can be implemented via an algorithm that performs several addition operations (Knuth, 1997) and (Kulisch, 2002). In fact, multiplication is defined for the whole numbers in terms of repeated addition, and even multiplication of real numbers is defined by systematic generalization of this basic idea. With this in mind, one could, theoretically and/or hypothetically, convert a HONN to a very complicated, large-sized, constrained web-like ONN, but does that mean they are equivalent? In such a case, it is unfair to compare the computational cost and/or the expressive power of some HONN to an ONN that has the same number of neurons and synaptic connections and claim superiority of HONNs over ONNs.
Despite the fact that many ideas have been proposed for using HONNs to solve various pattern classification problems and comparing their classification accuracies to ONNs, there have been no theoretical studies of how the two are related. Different modifications to HONNs are heuristic, imagination-driven, and/or model-inspired. In most HONN research, experimental results are obtained to support the proposed HONN and show its superiority over ONNs; we, however, decided to take the comparison into a mathematical conversion model. Thus, the rationale of this work is that, if any HONN is equivalent to some ONN, then we can avoid a great deal of unnecessary HONN optimization modeling (which is usually very complicated) by just using its equivalent ONN along with the available ONN optimization algorithms. The equivalence would also provide answers to many controversial comparative issues between HONNs and ONNs. As for the structure of this work: in section 2, we show how to define the activation of a Pi neuron and that of a Sigma neuron, and we discuss how to overcome the numerical instability in the generalized HONN form that we have adopted. The definition of neural network equivalence suitable for this work, and how a Pi neuron is related to a Sigma neuron, is the topic of section 3. In section 4, we show using simple mathematical analysis that it is possible to convert any HONN to a similarly sized equivalent ONN; we then give a systematic conversion method in section 5. We demonstrate other non-standard HONNs and a few conversion examples in sections 6 and 7, respectively, and finally we wrap up with the conclusions in section 8.

High-Order Neural Networks Versus Ordinary Neural Networks
ONNs have only Sigma neurons. For example, the output of a d-2-1 ONN (i.e., having d inputs with a bias, two hidden neurons, and one output) can be written as (Giles & Maxwell, 1987):

y = f\left( \sum_{j=1}^{2} v_j \, f\left( \sum_{i=0}^{d} w_{ji} x_i \right) \right), \quad (1)

where \{w_{ji}, v_j\} are the weights that connect neurons in the input layer to neurons in the hidden layer and the hidden neurons to the output, and x_i is the value at the i-th input unit, for the input x = [x_0, x_1, ..., x_d]. In contrast, a HONN must have at least one Pi neuron; thus, for example, the output of an up-to-second-order HONN can be written as (Giles & Maxwell, 1987):

y = f\left( w_0 + \sum_i w_i x_i + \sum_i \sum_k w_{ik} x_i x_k \right). \quad (2)

The above equation is a very simple, non-general HONN example. The basic units in any HONN, however, are Pi neurons. Many forms of Pi neurons have been used to construct HONNs, and this work adopts a generalized Pi neuron form that has adaptable exponent weights, as given below.

HONNs that We Have Adopted in this Work
Within the context discussed in this work, highly expressive HONNs are given in a form such that their weights are adjustable real-valued numbers (contrary to most previous works, where HONN weights were assumed to be non-negative integers). Since HONNs may also contain Sigma neurons, and we will need the Sigma neuron to prove the equivalence later, it is necessary first to define this type of neuron.

Definition I: The activation of a Sigma neuron
For an input x = [x_1, x_2, ..., x_d], each Sigma neuron has the following activation:

A_j^{(\mathrm{Sigma})}(x) = \sum_{i=1}^{d} w_{ji} x_i, \quad (3)

where j is the index of the Sigma neuron such that j = 1, ..., n, for a total of n Sigma neurons in a layer, w_{ji} \in \mathbb{R} is the synaptic weight between the previous i-th neuron and the current j-th Sigma neuron, and \mathbb{R} is the set of real numbers. The output of a Sigma neuron is usually obtained by modulating its activation by a nonlinear sigmoid activation (transfer) function f(\cdot); thus, we can write the output of a Sigma neuron as Y_j = f(A_j^{(\mathrm{Sigma})}(x)).
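As a concrete illustration (a minimal sketch; the input values, weights, and function names below are our hypothetical choices, not taken from the text), the activation and output of a single Sigma neuron can be computed as:

```python
import math

def sigma_activation(x, w):
    # Weighted-sum activation of a Sigma neuron: A = sum_i w_i * x_i
    return sum(wi * xi for wi, xi in zip(w, x))

def logistic(a):
    # Standard sigmoid transfer function f(A) = 1 / (1 + exp(-A))
    return 1.0 / (1.0 + math.exp(-a))

x = [0.5, 0.2, 0.9]    # hypothetical input vector
w = [1.0, -2.0, 0.5]   # hypothetical synaptic weights
A = sigma_activation(x, w)   # 0.5 - 0.4 + 0.45 = 0.55
Y = logistic(A)              # output of the Sigma neuron
```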

Definition II: The activation of a Pi neuron
For an input x = [x_1, x_2, ..., x_d], each Pi neuron has the following activation:

A_j^{(\mathrm{Pi})}(x) = \prod_{i=1}^{d} x_i^{w_{ji}}, \quad (4)

where j is the index of the Pi neuron such that j = 1, 2, ..., m, for a total of m Pi neurons, w_{ji} \in \mathbb{R} is the synaptic weight between the previous i-th neuron and the j-th Pi neuron, and \mathbb{R} is the set of real numbers. The output of a Pi neuron is usually obtained by modulating its activation by a nonlinear sigmoid activation function f(\cdot); thus, we can write the output of a Pi neuron as Y_j = f(A_j^{(\mathrm{Pi})}(x)).
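For comparison, a minimal sketch of the Pi neuron's product activation under (4), with hypothetical real-valued exponent weights (the inputs must be strictly positive, as discussed in the next subsection):

```python
import math

def pi_activation(x, w):
    # Product activation of a Pi neuron: A = prod_i x_i ** w_i.
    # Real-valued exponents require strictly positive inputs.
    a = 1.0
    for wi, xi in zip(w, x):
        a *= xi ** wi
    return a

A = pi_activation([2.0, 4.0], [1.0, 0.5])   # 2 * sqrt(4) = 4.0
Y = 1.0 / (1.0 + math.exp(-A))              # sigmoid output of the Pi neuron
```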
Many forms of HONNs have been proposed in the literature; they are all based on some structure of Pi and Sigma neurons and on the way their synaptic weights are represented, as fixed or adjustable, real-valued or positive integer-valued. To discuss the general case, we consider in this work Pi neurons with adaptable real-valued exponent weights, as shown in (4).

A Complex Valued Synapse in a HONN
The general form used in this work to construct HONNs, shown in (4), can yield complex values. To demonstrate this issue, let the value of a two-dimensional input vector to the Pi neuron defined in (4) be x = [-1, 1]; then its activation is

A^{(\mathrm{Pi})}(x) = (-1)^{w_1} (1)^{w_2}, \quad (5)

and since the weights are real-valued, the resulting activation can be complex-valued, e.g., w_1 = 0.25 gives (-1)^{0.25} = 0.707 + 0.707\sqrt{-1}. If, on the other hand, the input vector has a zero-valued component, e.g., x = [-1, 0], the activation is suppressed to zero. It is therefore necessary for the input to a Pi neuron of the form shown in (4) to be above zero. We shall introduce a normalization scheme for the input to handle this problem, as well as a modification to the activation function so that the output of any neuron is above zero.
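This failure mode is easy to reproduce; Python's `**` operator makes the complex result explicit (a small sketch of the two cases above, with hypothetical weights):

```python
# A negative input raised to a fractional real weight yields a complex value
z = (-1) ** 0.25
print(type(z))            # <class 'complex'>

# A zero-valued input suppresses the whole product activation
a = 1.0
for xi, wi in zip([0.5, 0.0], [1.3, 0.7]):
    a *= xi ** wi
print(a)                  # 0.0
```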

Handling Numerical Extremes in HONN Implementation
Due to the multiplications and exponents used in HONNs, several numerical issues might arise in their computation. These extreme cases are likely to occur in any HONN that implements a Pi neuron of the form shown in (4). In this section, we shed light on two of the most important computational issues in this regard.
In any HONN of the form given in (4), it is possible to obtain a complex neuron output, e.g., when the input to a Pi neuron is negative. This problem can be resolved by making all signals propagating inside the HONN positive-valued. There are two aspects to making the signals of any HONN positive: the first concerns the input patterns, and the second concerns all other signals propagating inside the network. For the first, it is easy to normalize the inputs to lie within the (nonnegative) hypercube I^d, where I = (0, 1], or to be above zero in general. Support for this type of normalization is given in the pattern classification book by (Duda, Hart, & Stork, 2000), where they hinted that "In any particular problem, we can always scale the input region to lie in a hypercube, and this condition is not limiting," which means that the normalization process is generally applicable. In any case, the normalization may simply be performed by shifting the inputs to have above-zero values.
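One sketch of such a shift-and-scale normalization into the hypercube (0, 1] (the function name and the margin `eps` are our hypothetical choices, not prescribed by the text):

```python
def normalize_above_zero(values, eps=0.1):
    # Map a list of raw input values into (0, 1], keeping them strictly
    # positive so Pi neurons with real exponents stay real-valued.
    lo, hi = min(values), max(values)
    span = hi - lo
    if span == 0.0:           # constant feature: map everything to 1.0
        return [1.0 for _ in values]
    return [eps + (1.0 - eps) * (v - lo) / span for v in values]

print(normalize_above_zero([-3.0, 0.0, 5.0]))   # all values within [0.1, 1.0]
```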
To make sure that all signals propagating inside the network are positive, it is important to carefully choose the activation function(s) so that they always output positive values, preferably above zero. For example, we propose the following positive-valued logistic activation function:

f(A) = \frac{1}{1 + \exp(-T(A - c))} + \varepsilon, \quad (6)

where T is the temperature of the neuron, A is the activation of the neuron, \varepsilon > 0 is a parameter that we use to make sure that the range of the function is always above zero, and c > 0 is a parameter that we use to ensure the sigmoid shape of the activation function when the input is positive. The value of c can be fixed or variable; in the latter case it can be found using backpropagation of error along with the other weights of the network. The value of T affects the sigmoid function: for very high values its shape is similar to a step function, while for low values its shape is close to linear. Typical values for these parameters are T = 0.5, c = 1, and \varepsilon = 0.1. If one chooses another squashing function, e.g., one based on tanh(\cdot), then, to ensure above-zero values, \varepsilon > 1 should be used, since the lower bound of tanh(\cdot) is -1.
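A sketch of this positive-valued logistic function, assuming the parameter-to-value mapping T = 0.5, c = 1, ε = 0.1 (our reading of the typical values mentioned above; the original symbols were lost in extraction):

```python
import math

def positive_logistic(a, T=0.5, c=1.0, eps=0.1):
    # Logistic transfer shifted up by eps so its range is (eps, 1 + eps),
    # keeping every signal propagating through the network strictly positive.
    return 1.0 / (1.0 + math.exp(-T * (a - c))) + eps

# Even for strongly negative activations the output stays above zero
print(positive_logistic(-50.0) > 0.0)   # True
```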

On the Equivalence of Neurons
Conversion and/or equivalence methods exist in many mathematical models as well as computer science models. For example, equivalence methods are found in automata theory, as in the case of the equivalence (or conversion) between nondeterministic finite automata and deterministic finite automata (Linz, 2006). They are also found in the study of complex systems (Abu Dalhoum, 2004) and in computational models of natural processes (Madain, Abu Dalhoum, & Sleit, 2018). Before diving into artificial neural network equivalence theorems, it is necessary to give a few definitions.

Definition III: Neural networks equivalence
Two neural networks are equivalent if they both satisfy the following criteria: they have the same number of weights, they have the same total number of neurons, and they give exactly the same output; their activation (transfer) functions may differ, regardless of the neuron types in each network.
Thus, mathematically converting every Pi neuron (that one may encounter in a HONN) to a Sigma neuron may help in proving HONN-to-ONN equivalence. This can be done by showing that a Pi neuron is related in some way to a Sigma neuron, i.e., that one is a function of the other. In what follows, we provide a few theorems, as well as conversion methodologies, that establish the equivalence between HONNs and ONNs with respect to Definition III.
Theorem I: The activation of every Pi neuron is nonlinearly related to the activation of a Sigma neuron.

Proof: By making use of x = \exp(\ln(x)), where x is any dummy variable, equation (4) can be rewritten as:

A_j^{(\mathrm{Pi})}(x) = \prod_{i=1}^{d} \exp(w_{ji} \ln x_i), \quad (7)

which we further reduce to:

A_j^{(\mathrm{Pi})}(x) = \exp\left( \sum_{i=1}^{d} w_{ji} \ln x_i \right), \quad (8)

and if we let u_i = \ln x_i and u = [\ln x_1, \ln x_2, ..., \ln x_d], then we can rewrite (8) as follows:

A_j^{(\mathrm{Pi})}(x) = \exp\left( \sum_{i=1}^{d} w_{ji} u_i \right), \quad (9)

or, using the more compact form,

A_j^{(\mathrm{Pi})}(x) = \exp\left( A_j^{(\mathrm{Sigma})}(u) \right), \quad (10)

which fulfills the proof.
Equation (10) clearly shows that the activation of a Pi neuron relates nonlinearly to the activation of a Sigma neuron: any Pi neuron can be converted to a Sigma neuron by taking \ln(\cdot) of its inputs and applying \exp(\cdot) to the result of the summation.
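The identity in (10) is easy to check numerically; the sketch below (with hypothetical positive inputs and real-valued weights) compares the direct product activation with exp applied to a Sigma activation over log-transformed inputs:

```python
import math

x = [0.3, 2.5, 1.7]          # hypothetical positive inputs
w = [1.7, -0.4, 0.9]         # hypothetical real-valued exponent weights

# Direct Pi activation: prod_i x_i ** w_i
a_pi = 1.0
for xi, wi in zip(x, w):
    a_pi *= xi ** wi

# Same quantity via Theorem I: exp(Sigma activation over u_i = ln x_i)
u = [math.log(xi) for xi in x]
a_sigma = math.exp(sum(wi * ui for wi, ui in zip(w, u)))

print(abs(a_pi - a_sigma) < 1e-12)   # True: the two activations coincide
```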
Corollary I: From Theorem I, any Pi neuron can be converted and/or reduced to a Sigma neuron.
Theorem II: The activation of a Sigma neuron is nonlinearly related to the activation of a Pi neuron.
Proof: The proof is similar to that of Theorem I, after making use of x = \ln(\exp(x)), where x is any dummy variable. In such a case, equation (3) can be rewritten as:

A_j^{(\mathrm{Sigma})}(x) = \ln \prod_{i=1}^{d} \exp(w_{ji} x_i), \quad (12)

and by making use of the form shown in (4), we can rewrite the above as follows:

A_j^{(\mathrm{Sigma})}(x) = \ln\left( \prod_{i=1}^{d} v_i^{w_{ji}} \right) = \ln\left( A_j^{(\mathrm{Pi})}(v) \right), \quad (15)

where v = [v_1, ..., v_i, ..., v_d], such that v_i = \exp(x_i). Equation (15) shows that the activation of a Sigma neuron is nonlinearly related to the activation of a Pi neuron, which fulfills the proof.
It is quite interesting, and surprising, that starting with the Sigma neuron we ended up with the generalized Pi form that we adopted in this work; this was beyond our original intention.
Corollary II: From Theorem II, any Sigma neuron can be converted and/or reduced to a Pi neuron.
Remark I: Theorem II will not be used in this work, but is stated here for completeness. Theorem II can be used to show that any ordinary neural network can be converted to a high-order neural network.
In what follows, we shall make use of Theorem I to convert high-order synaptic connections to ordinary synaptic connections, which can be used to convert HONNs to ONNs.

Converting High-Order Interconnections to Ordinary Interconnections
In this section, we show how to convert an interconnection containing Pi neurons (which we shall call a high-order interconnection) to an interconnection that contains only Sigma neurons (which we shall call an ordinary interconnection); this concept can later be used to convert any HONN to its equivalent ONN. If we consider a HONN with various Pi and Sigma neurons, then four different interconnections are possible between each pair of neurons: Sigma-Sigma, Sigma-Pi, Pi-Sigma, and Pi-Pi. ONNs, on the other hand, have only Sigma-Sigma interconnections. In theory, one can convert any HONN to an ONN by converting all of its high-order interconnections to Sigma-Sigma interconnections.
Definition IV: For any neuron1-neuron2 interconnection, neuron1 refers to the current neuron, which is connected to neuron2, the next neuron.
Theorem III: Every Sigma-Pi interconnection can be converted to a Sigma-Sigma interconnection.
Proof: Let the input vector to the current Sigma neuron be x; then the output of the current Sigma neuron is given by:

Y = f\left( A^{(\mathrm{Sigma})}(x) \right), \quad (16)

and the output of the next Pi neuron is given by:

z = g\left( A^{(\mathrm{Pi})}(Y) \right). \quad (17)

Now, substituting (10) into (17) yields:

z = g\left( \exp\left( A^{(\mathrm{Sigma})}(u) \right) \right), \quad (18)

or,

z = \psi\left( A^{(\mathrm{Sigma})}(u) \right), \quad (19)

where \psi = g(\exp(\cdot)) and u = \{..., u_k, ...\} such that u_k = \ln(Y_k). Now, by moving \ln(\cdot) of this Pi neuron to the previous neuron (previous with respect to the Pi neuron is the current Sigma neuron; see Figure 1 for illustration), the following equation defines Y:

Y = \zeta\left( A^{(\mathrm{Sigma})}(x) \right), \quad (20)

where \zeta = \ln(f(\cdot)) is the function resulting from the composition of \ln(\cdot) and f(\cdot). It is obvious from (19) and (20) that a Sigma-Pi interconnection is reducible and equivalent to a Sigma-Sigma interconnection.

Theorem IV: Every Pi-Sigma interconnection can be converted to a Sigma-Sigma interconnection.

Proof: Let the input vector to the current Pi neuron be x; then the output of the current Pi neuron is given by:

Y = f\left( A^{(\mathrm{Pi})}(x) \right), \quad (21)

and this output Y is feedforward connected to the next Sigma neuron, which has the following output:

z = g\left( A^{(\mathrm{Sigma})}(Y) \right). \quad (22)

Now, by substituting (10) into (21) we get:

Y = f\left( \exp\left( A^{(\mathrm{Sigma})}(u) \right) \right), \quad (23)

where, as shown previously, u_i = \ln(x_i). By defining the composite function \psi = f(\exp(\cdot)), it is more appropriate to write (23) as follows:

Y = \psi\left( A^{(\mathrm{Sigma})}(u) \right). \quad (24)

It is clear from (24) and (22) that a Pi-Sigma interconnection is equivalent to a Sigma-Sigma interconnection.
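A numeric sketch of the Sigma-Pi case (all weights and inputs below are hypothetical): a layer of two Sigma neurons feeding one Pi neuron gives exactly the same output as its converted Sigma-Sigma form, where the current transfer becomes ln(f(·)) and the next transfer becomes g(exp(·)):

```python
import math

def logistic(a):
    return 1.0 / (1.0 + math.exp(-a))

x = [0.4, 0.7]                   # hypothetical input vector
W = [[0.9, -0.3], [0.2, 1.1]]    # weights of the two current Sigma neurons
v = [1.3, -0.6]                  # exponent weights of the next Pi neuron

# Original Sigma-Pi interconnection
Y = [logistic(sum(wi * xi for wi, xi in zip(row, x))) for row in W]
prod = 1.0
for yk, vk in zip(Y, v):
    prod *= yk ** vk
z_original = logistic(prod)

# Converted Sigma-Sigma interconnection: zeta = ln(f(.)), psi = g(exp(.))
Y2 = [math.log(logistic(sum(wi * xi for wi, xi in zip(row, x)))) for row in W]
z_converted = logistic(math.exp(sum(vk * yk for vk, yk in zip(v, Y2))))

print(abs(z_original - z_converted) < 1e-12)   # True
```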

Theorem V: Every Pi-Pi interconnection can be converted to a Sigma-Sigma interconnection
Proof: Let x be a vector denoting the inputs to the current Pi neuron. Analogous to the proofs shown above, which demonstrated the conversion of Sigma-Pi and Pi-Sigma interconnections to Sigma-Sigma interconnections, it is straightforward to deduce using (10) that a Pi-Pi interconnection, whose current neuron has output Y = f(A^{(\mathrm{Pi})}(x)) connected to a next neuron with output z = f(A^{(\mathrm{Pi})}(Y)), can be written as follows:

Y = \zeta\left( A^{(\mathrm{Sigma})}(u) \right), \quad (25)

z = \psi\left( A^{(\mathrm{Sigma})}(Y) \right), \quad (26)

where \zeta = \ln(f(\exp(\cdot))), \psi = f(\exp(\cdot)), u is the vector with u_i = \ln(x_i), Y is the output of the current neuron after conversion, and z is the output of the next neuron after conversion. Thus, equations (25) and (26) clearly show that a Pi-Pi interconnection can be converted to a Sigma-Sigma interconnection, which fulfills the proof.
Above, we showed that it is possible to convert any high-order interconnection to an ordinary interconnection. Does this mean, however, that every HONN has an equivalent ONN? In the next section, we give a very simple method that applies these interconnection conversions to turn a whole HONN into an equivalent ONN.

A Simple Method to Convert a High-Order Neural Network to an Ordinary Neural Network
We proved in section 4 that one can directly convert any high-order interconnection to an equivalent ordinary interconnection by manipulating the activation function acting at each Pi and/or Sigma neuron.

The Conversion Method
To convert a HONN to an equivalent ONN (denoted ONN(E)) in an easy manner, we propose the following method. Define a network with the same number of layers and neurons as the HONN to be converted. Let \{\psi_k(\cdot)\} be the set of activation functions acting at the neurons in the k-th layer of the newly defined network, and \{f_k(\cdot)\} be the set of activation functions acting at the neurons in the k-th layer of the HONN that we seek to convert to ONN(E). According to the previous theorems, the major goal is finding \psi_k as a composite function built from elements of the set \{..., \ln(\cdot), \exp(\cdot), f_k(\cdot), ...\}; by just doing this, the conversion is complete. Using the equivalence theorems presented in the previous section, the activation function at each Sigma neuron in the newly defined ONN(E) network can be found according to the following criteria:

Case 1: Current neuron in the k-th layer is Sigma, connected to a next Pi neuron in the (k+1)-th layer:

\psi_k = \ln(f_k(\cdot)). \quad (27)

Case 2: Current neuron in the k-th layer is Pi, connected to a next Sigma neuron in the (k+1)-th layer:

\psi_k = f_k(\exp(\cdot)). \quad (28)

Case 3: Current neuron in the k-th layer is Pi, connected to a next Pi neuron in the (k+1)-th layer:

\psi_k = \ln(f_k(\exp(\cdot))). \quad (29)

Thus, the newly defined network, after finding every \psi_k, is ONN(E). We note that \ln(\cdot) should be applied to the input patterns whenever they are connected to a Pi neuron at the first hidden layer. Another important point is that the conversion criteria follow a bottom-up approach, i.e., one should start from the input layer and sequentially move toward the output layer.
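The three conversion cases can be packaged as composite-transfer builders; in this sketch (function names are ours), `composite_transfer` returns the function to install at a converted neuron given its own type and the type of the next neuron, and the Pi-to-Sigma case is checked numerically with hypothetical inputs and weights:

```python
import math

def logistic(a):
    return 1.0 / (1.0 + math.exp(-a))

def composite_transfer(f, current, nxt):
    # Composite transfer acting at the converted Sigma neuron, per (27)-(29)
    if current == 'sigma' and nxt == 'pi':
        return lambda a: math.log(f(a))            # Case 1: ln(f(.))
    if current == 'pi' and nxt == 'sigma':
        return lambda a: f(math.exp(a))            # Case 2: f(exp(.))
    if current == 'pi' and nxt == 'pi':
        return lambda a: math.log(f(math.exp(a)))  # Case 3: ln(f(exp(.)))
    return f                                       # Sigma-Sigma: unchanged

# Check Case 2: a Pi neuron over positive inputs x with exponent weights w
x, w = [0.8, 1.9], [0.7, -1.2]
a_pi = (x[0] ** w[0]) * (x[1] ** w[1])
y_pi = logistic(a_pi)                              # original Pi neuron output

psi = composite_transfer(logistic, 'pi', 'sigma')
u = [math.log(xi) for xi in x]                     # ln(.) taken at the inputs
y_converted = psi(sum(wi * ui for wi, ui in zip(w, u)))

print(abs(y_pi - y_converted) < 1e-12)             # True
```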

The Conversion Method and Training with Backpropagation of Error
In training with backpropagation of error, one needs the first derivative of each of the above activation functions. For a transfer function denoted \psi_k, let its derivative be denoted \psi_k'. For the convenience of other interested researchers, we list the derivatives of the conversion cases shown in (27), (28), and (29) below:

Case 1: Current neuron in the k-th layer is Sigma, connected to a next Pi neuron in the (k+1)-th layer:

\psi_k'(A) = \frac{f_k'(A)}{f_k(A)}. \quad (30)

Case 2: Current neuron in the k-th layer is Pi, connected to a next Sigma neuron in the (k+1)-th layer:

\psi_k'(A) = \exp(A) \, f_k'(\exp(A)). \quad (31)

Case 3: Current neuron in the k-th layer is Pi, connected to a next Pi neuron in the (k+1)-th layer:

\psi_k'(A) = \frac{\exp(A) \, f_k'(\exp(A))}{f_k(\exp(A))}. \quad (32)

If the output layer contains a Pi neuron, one can assume that a hypothetical next neuron is a Sigma neuron and thus use the Pi-to-Sigma conversion rule. Finally, there is no need to perform any conversion for a Sigma-to-Sigma interconnection, if the HONN that we want to convert contains any.
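These derivative forms can be sanity-checked with finite differences; a sketch for Case 1, \psi(A) = \ln(f(A)) with \psi'(A) = f'(A)/f(A), assuming the standard logistic for f (for which f' = f(1-f), so \psi' = 1 - f):

```python
import math

def f(a):
    # standard logistic; its derivative is f'(a) = f(a) * (1 - f(a))
    return 1.0 / (1.0 + math.exp(-a))

def psi(a):
    return math.log(f(a))         # Case 1 composite transfer: ln(f(.))

def psi_prime(a):
    fa = f(a)
    return fa * (1.0 - fa) / fa   # f'(a) / f(a), which simplifies to 1 - f(a)

a = 0.37
h = 1e-6
numeric = (psi(a + h) - psi(a - h)) / (2.0 * h)    # central difference
print(abs(numeric - psi_prime(a)) < 1e-6)           # True
```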

Sigma and Pi Neurons in the Same Layer
Although uncommon in the literature, a HONN may have both Pi and Sigma neurons in the same layer. Fortunately, it is possible to convert HONNs to ONNs even with this kind of mixed Sigma-Pi layer (MSP-HONN) topology. The main difficulty with having two types of neurons in the same layer is that our proposed conversion methodology is based on the types of the current and the next neuron, so a single activation function cannot be assigned in the resulting ONN(E) (since the next neuron may be either Sigma or Pi). To illustrate this issue, assume that the current neuron is Pi and the next layer contains both Pi and Sigma neurons. In this case, one needs two conversions for the current neuron, one Pi-Pi and the other Pi-Sigma, which results in two possible activation functions acting at the current converted neuron. To resolve this issue, we propose a new type of neuron, which we call the dual-output neuron. We then find the equivalent activation functions for all the neurons of ONN(E) in a way similar to what we described in section 5. The following method shows how one can directly convert an MSP-HONN to ONN(E): -Define a network with the same number of layers and neurons as the MSP-HONN.
-Find the activation function \psi of each neuron of the newly defined network topology in a way similar to that of section 5. However, when converting an MSP-HONN to an ONN, each neuron has a dual-output activation function that depends on the current neuron and the next neurons of the MSP-HONN. For a feasible and easy conversion, we deduced the following from the rules of section 5:

Case 1: Current neuron is Sigma: the dual-output transfer emits f(\cdot) toward next Sigma neurons and \ln(f(\cdot)) toward next Pi neurons.

Case 2: Current neuron is Pi: the dual-output transfer emits f(\exp(\cdot)) toward next Sigma neurons and \ln(f(\exp(\cdot))) toward next Pi neurons.

Thus, we have shown that even a HONN with mixed Sigma and Pi neurons in a layer is equivalent to an ONN, and the conversion can be performed easily with the method proposed here. The derivatives for these forms can be found from (30)-(32); for example, Case 1 toward a next Pi neuron has the derivative form f'(\cdot)/f(\cdot), as in (30).

Does the Neuron Output in the Equivalent ONN Have a Sigmoid Shape?
The last issue we discuss in this section is to illustrate the characteristics of a typical activation function, such as the one we proposed in (6), in relation to the conversion methodology. From Fig. 2 we can see that the converted transfer functions keep an asymmetric sigmoid shape.

Other Non-Standard HONNs from the Literature
In this section, we provide a few equivalence theorems on non-standard HONNs and show how they can be converted to ONNs.
Theorem VI: Every high-order recurrent neural network can be converted to an ordinary recurrent neural network.

Proof: The MSP-HONN conversion method demonstrated in section 5 is directly applicable to converting a recurrent HONN to a recurrent ONN(E), because the activation functions at any of ONN(E)'s feedback connections can be found using the MSP-HONN conversion rules.
This case is depicted in Fig. 3, where the product neuron P1 has an input from x1 and a feedback from S2. This S2 neuron should have a dual-output activation function after conversion, due to the S2-S3 and S2-P1 interconnections. If, however, S2 were feedforward connected to a next layer that contains only Pi neurons, say P3, then S2 would have the usual unary-output activation function.
Figure 3. A recurrent high-order interconnection that may appear in recurrent high-order neural networks. Here we have connections from S2 to S3 and from S2 to P1. This means that converting a recurrent high-order neural network proceeds in a way similar to converting mixed Sigma and Pi neurons in the same layer, as previously shown.
Theorem VII: Every higher-order Functional Neural Network (Giles & Maxwell, 1987) can be converted to a four-layer ordinary neural network.
Proof: The higher-order functional neural network (HOFNN) (Giles & Maxwell, 1987) is used as a one-layer neural network in which multiplications of the inputs are connected directly to the output units. The output of a neuron of a second-order HOFNN is the same as the definition in (2), a sum of first-order terms and products of input pairs. To convert it, each Pi (product) term is converted to a Sigma neuron using the method shown in Theorem I, which requires restructuring the network to have an additional hidden Sigma layer acting on the log-transformed augmented input vector; the input layer, the two hidden Sigma layers, and the output layer together form a four-layer ONN. Moreover, the double summation that appears in the second-order form poses no problem at all, since it can always be expanded linearly, term by term over the input pairs, which is the normal layer shape in any artificial neural network. This fulfills the proof that a second-order HOFNN is equivalent to a four-layer ONN.

The output layer of an up-to-the-N-th-order HOFNN has the same form extended with product terms up to order N. Using the same method as for the up-to-second-order HOFNN, and then mathematical induction, it follows that an up-to-the-N-th-order HOFNN can likewise be converted to a four-layer ONN.

Figure 1. The left part shows a Sigma-Pi interconnection converted to a Sigma-Sigma interconnection, which is shown in the right part of the graph.
Let us consider a neuron that appears after converting a Sigma-Pi interconnection into a Sigma-Sigma interconnection, whose transfer is \psi(A) = 1/(1 + \exp(-T(\exp(A) - c))) + \varepsilon. Fig. 2 shows the characteristics of this function for different parameter values, where we can see in Fig. 2-a that the activation function keeps its sigmoid shape for T = 1 and c = 1. For T = 1 and c = 2, Fig. 2-b shows that the lower part is smoother than the upper part. A rather more complicated activation function is found after converting a Pi-Pi interconnection into a Sigma-Sigma interconnection, namely \psi(A) = \ln(1/(1 + \exp(-T(\exp(A) - c))) + \varepsilon); its characteristic graph is shown in Fig. 2-c, where in this case the activation function is an asymmetric squashing function.

Figure 2. Activation functions that can be used to yield positive outputs. a) The activation function that may result from converting a Sigma-Pi interconnection into a Sigma-Sigma interconnection, \psi(A) = 1/(1 + \exp(-T(\exp(A) - c))) + \varepsilon, using T = 1 and c = 1; b) the same activation function as in (a) with different parameter values; c) the activation function resulting from converting a Pi-Pi interconnection into a Sigma-Sigma interconnection.

Figure 4. a) A high-order neural network with mixed Pi and Sigma neurons in each layer; b) conversion of the network in (a) to an equivalent ordinary neural network using the method described in this work, where \{\psi_1, ..., \psi_5\} is the set of activation functions that depends on \{f_1, ..., f_5\}.