Gradient Descent Optimization of Fuzzy Functions
Dan Simon
Innovatia Software
dan@innovatia.com
Copyright 1998–2007 Innovatia Software. All Rights Reserved.


Preliminaries

A previous white paper by Innovatia Software discussed the topic of fuzzy filtering in general terms. This paper discusses how the fuzzy membership functions of a fuzzy filter can be optimized using gradient descent. As before, the application is fuzzy filtering for phase-locked loops (PLLs) in a Global Positioning System (GPS) receiver, although the approach is general enough to be applied to most estimation problems. See the Hybrid Kalman / H-infinity filtering paper for some general information about GPS and PLLs.


Gradient Descent

Gradient descent is a function optimization method which uses the derivative of the function and the idea of steepest descent. The derivative of a function is simply the slope. So if we know the slope of a function, then it stands to reason that all we have to do is somehow move the function in the negative direction of the slope, and that will reduce the value of the function. Gradient descent is an interative method, so the idea is as follows:

§        Compute the derivative of the function with respect to its independent variables. We can denote this derivative as dF(x), where F(x)) is the function to be minimized, and x is the vector of independent variables.

§        Change the value of x as follows: xn+1 = xn – h dF(xn), where the subscript n refers to the iteration number, and h is a step size which must be chosen so that we don't take too big or too small of a step. Too big of a step will overshoot the function minimum, and too small of a step will result in a long convergence time.

§        Repeat the above two steps until we converge to a minimum of the function F(x).

Gradient descent is an attractive optimization method in that it is conceptually straightforward and often converges quickly. Its drawbacks include the fact that the derivative of the function must be available, and it converges to a local minimum rather than a global minimum.


Gradient Descent Optimization of a Fuzzy Filter

Consider that problem of PLL phase estimation using fuzzy logic as discussed in a previous white paper. Now say that we have N training samples with which we can "train" our fuzzy filter. Our training samples might include a set of simulations which typify situations in which we want good performance from our filter. Since we are running the simulations, we know the "true" GPS phase, so we can run a fuzzy filter and figure out the error between our phase estimate and the "true" phase for each training sample.

Eq = eq - tq (q = 1, . . . , N)

where eq is our phase estimate and tq is the true phase. Then we can compute a measure of the goodness of our fuzzy filter by combining all of the Eq's.

E = (E12 + E22 + . . . + EN2)

Now we want to minimize E using gradient descent. If we minimize E, then we have a fuzzy filter which gives the "smallest" possible error between the true phase and the estimated phase. The trick is to figure out the derivative of E with respect to the parameters that characterize the fuzzy membership functions.

As it turns out, if the fuzzy membership functions are triangular (as shown in the previous fuzzy filtering white paper), then we can derive an analytic expression for the partial derivative of E with respect to: (a) the centroids of the input fuzzy membership functions; (b) the widths of the input fuzzy membership functions; and (c) the centroids of the output fuzzy membership functions. The mathematics are too messy to repeat here, but they can be found in the papers referred to at the end of this white paper.

So the bottom line is that we can perform the following steps to find the "best" fuzzy PLL filter.

§        Compute the derivative of E with respect to a, b, and c. Vector a represents the vector of all of the centroids of the output fuzzy membership funtions, b represents the vector of all the widths of the input fuzzy membership functions, and c represents the vector of all the centroids of the input fuzzy membership functions. We denote these derivatives as dE(a), dE(b), and dE(c).

§        Change the values of a, b and c as follows:
an+1 = an – h dE(an)
bn+1 = bn – h dE(bn)
cn+1 = cn – h dE(cn)
where the subscript n refers to the iteration number, and the h's are step sizes which must be chosen so that we don't take too big or too small of steps.

§        Repeat the above two steps until we converge to a minimum of the function E.


Simulation Results

So how does it work? The method outlined above was simulated for a PLL in a GPS receiver located on a missile flying a test flight from California to the South Pacific. The behavior of the PLL was investigated by examining its ability to track the phase of the GPS carrier between the missile and one GPS satellite during the first 60 seconds of boost. The filter rate was fixed at 50 Hz. The relative trajectory between the GPS receiver and the GPS satellite is depicted in the figure below.

Trajectory

The fuzzy PLL consisted of seven membership functions for the first input (the Error parameter in the previous Fuzzy Filtering paper), seven membership functions for the second input (the Change in Error parameter in the previous Fuzzy Filtering paper), and seven membership functions for the output.

The N training samples discussed above consisted of the time between 45 and 55 seconds following launch (the most dynamic 10-second interval of the flight, corresponding to burnout of the first rocket stage). The GPS carrier-to-noise ratio was fixed at 18 dB-Hz. The gradient descent step size parameters h were all fixed at 0.3. The fuzzy membership functions were constrained to be symmetric triangles, and the error function E was minimized with respect to 35 parameters: the centroids of each of the membership functions (21 total), and the widths of each of the input membership functions (14 total). As training progressed and PLL performance improved, more and more simulations were used in the training procedure. For the first 200 iterations, a single simulation was used for each update of the parameters, so N in the equations above was 500 (10 seconds * 50 Hz = 500). After a while, the training size was increased to 10 simulations, then 30 simulations, and finally 100 simulations. So N gradually increased from 500 to 5,000 to 15,000 to 50,000. The figure below shows the decrease of the error function E as training progressed.

Training Progress

The sudden jumps in the value of the error function at iterations 200, 300, and 500 reflect the increase in the number of simulations in the training sample.

So how does the optimized fuzzy PLL perform? The table below shows the performance of the optimized PLL filter using the trajectory shown above. The Nominal Fuzzy PLL is based on using common-sense triangular membership functions for the inputs and output of the PLL. The Gradient Fuzzy PLL is the filter described in this white paper. The Gradient PLL was trained for a carrier-to-noise ratio (CNR) of 18 db-Hz, but the performance of the filters is shown in the table for higher CNRs also.

 

Probability of Loss of Lock for Various PLL Methods (Percent)

PLL METHOD

CNR = 18

CNR = 19

CNR = 20

Nominal Fuzzy

73

45

24

Gradient Fuzzy

38

18

9

The goal of the PLL is to maintain lock on the GPS carrier phase. In view of this goal, the Gradient Fuzzy filter performs admirably well. When compared with the Nominal Fuzzy filter, we can see how much improvement results from the optimization of the membership functions.

Conclusion

Gradient descent is an optimization method which can be applied to any differentiable function. It is an attractive approach to optimization due to its conceptual simplicity and its robustness. However, it converges to a local minimum rather than the global minimum. This can be compared to finding the bottom of a valley even though a much lower valley lies just over the next hill. This paper has shown the feasibility of fuzzy filtering and the benefits which can accrue from gradient descent optimization of the membership functions. Because of the differentiability condition, this approach only works for triangular fuzzy membership functions.
 

References

Innovatia Software's previous white paper on fuzzy filtering has several references (both hardcopy and web-based) on the general topic of fuzzy logic. As far as gradient descent goes, any good numerical methods text discusses the subject. (Look up Optimization in the Table of Contents or the Index of the text.) In particular, the following text gives a good overview of the method.

·         K. Atkinson, "An Introduction to Numerical Analysis," John Wiley & Sons, 1989.

The specific application of gradient descent to fuzzy logic as discussed in this paper can be explored further in the following papers, which delve into the topic in more detail.

·         D. Simon and H. El-Sherief, "Fuzzy Logic for Digital Phase-Locked Loop Filter Design," IEEE Transactions on Fuzzy Systems, May 1995.

·         D. Simon and H. El-Sherief, "Fuzzy Phase-Locked Loops," IEEE Position, Location and Navigation Symposium, April 1994.


Home         Credentials         Publications       White Papers

Copyright 1998–2007 Innovatia. All Rights Reserved.
Email Address:
dan@innovatia.com
Phone Number: (216)687-5407


Last Revised: May 29, 2007