Gradient Descent Optimization of Fuzzy Functions
Dan Simon
Innovatia Software
dan@innovatia.com
Copyright 1998–2007 Innovatia Software. All Rights Reserved.
Preliminaries
A previous white paper by Innovatia
Software discussed the topic of fuzzy filtering in general terms. This paper discusses how the fuzzy
membership functions of a fuzzy filter can be optimized using gradient descent.
As before, the application is fuzzy filtering for phase-locked loops (PLLs) in
a Global Positioning System (GPS) receiver, although the approach is general
enough to be applied to most estimation problems. See the Hybrid Kalman / H-infinity filtering paper for some general information
about GPS and PLLs.
Gradient Descent
Gradient descent is a
function optimization method which uses the derivative of the function and the
idea of steepest descent. The derivative of a function is simply the slope. So
if we know the slope of a function, then it stands to reason that all we have
to do is somehow move the function in the negative direction of the slope, and
that will reduce the value of the function. Gradient descent is an interative
method, so the idea is as follows:
§
Compute
the derivative of the function with respect to its independent variables. We
can denote this derivative as dF(x), where F(x)) is the function to be minimized, and x is the vector of
independent variables.
§
Change
the value of x as follows: xn+1 = xn
– h dF(xn), where the subscript n refers to the iteration
number, and h is a step size which must be chosen so that we don't take too big
or too small of a step. Too big of a step will overshoot the function minimum,
and too small of a step will result in a long convergence time.
§
Repeat
the above two steps until we converge to a minimum of the function F(x).
Gradient descent is an
attractive optimization method in that it is conceptually straightforward and
often converges quickly. Its drawbacks include the fact that the derivative of
the function must be available, and it converges to a local minimum rather than
a global minimum.
Gradient Descent
Optimization of a Fuzzy Filter
Consider that problem of
PLL phase estimation using fuzzy logic as discussed in a previous white paper. Now say that we have N training samples with
which we can "train" our fuzzy filter. Our training samples might
include a set of simulations which typify situations in which we want good
performance from our filter. Since we are running the simulations, we know the
"true" GPS phase, so we can run a fuzzy filter and figure out the error
between our phase estimate and the "true" phase for each training
sample.
Eq = eq
- tq (q = 1, . . . , N)
where eq is our
phase estimate and tq is the true phase. Then we can compute a
measure of the goodness of our fuzzy filter by combining all of the Eq's.
E = (E12
+ E22 + . . . + EN2)
Now we want to minimize E
using gradient descent. If we minimize E, then we have a fuzzy filter which
gives the "smallest" possible error between the true phase and the
estimated phase. The trick is to figure out the derivative of E with respect to
the parameters that characterize the fuzzy membership functions.
As it turns out, if the
fuzzy membership functions are triangular (as shown in the previous fuzzy filtering white paper), then we can derive an analytic
expression for the partial derivative of E with respect to: (a) the
centroids of the input fuzzy membership functions; (b) the widths of the input
fuzzy membership functions; and (c) the centroids of the output fuzzy
membership functions. The mathematics are too messy to repeat here, but they
can be found in the papers referred to at the end of this white paper.
So the bottom line is that
we can perform the following steps to find the "best" fuzzy PLL
filter.
§
Compute
the derivative of E with respect to a, b,
and c. Vector a represents the vector of all of the
centroids of the output fuzzy membership funtions, b represents
the vector of all the widths of the input fuzzy membership functions, and c
represents the vector of all the centroids of the input fuzzy membership
functions. We denote these derivatives as dE(a), dE(b),
and dE(c).
§
Change
the values of a, b and c as follows:
an+1 = an – h dE(an)
bn+1 = bn – h dE(bn)
cn+1 = cn – h dE(cn)
where the subscript n refers to the iteration number, and the h's are step
sizes which must be chosen so that we don't take too big or too small of steps.
§
Repeat
the above two steps until we converge to a minimum of the function E.
Simulation Results
So how does it work? The
method outlined above was simulated for a PLL in a GPS receiver located on a
missile flying a test flight from California to the South Pacific. The behavior
of the PLL was investigated by examining its ability to track the phase of the
GPS carrier between the missile and one GPS satellite during the first 60
seconds of boost. The filter rate was fixed at 50 Hz. The relative trajectory
between the GPS receiver and the GPS satellite is depicted in the figure below.

The fuzzy PLL consisted of
seven membership functions for the first input (the Error parameter in
the previous Fuzzy Filtering paper), seven membership functions for
the second input (the Change in Error parameter in the previous Fuzzy Filtering paper), and seven membership functions
for the output.
The N training
samples discussed above consisted of the time between 45 and 55 seconds
following launch (the most dynamic 10-second interval of the flight,
corresponding to burnout of the first rocket stage). The GPS carrier-to-noise
ratio was fixed at 18 dB-Hz. The gradient descent step size parameters h were
all fixed at 0.3. The fuzzy membership functions were constrained to be
symmetric triangles, and the error function E was minimized with respect
to 35 parameters: the centroids of each of the membership functions (21 total),
and the widths of each of the input membership functions (14 total). As
training progressed and PLL performance improved, more and more simulations
were used in the training procedure. For the first 200 iterations, a single
simulation was used for each update of the parameters, so N in the
equations above was 500 (10 seconds *
50 Hz = 500). After a while, the training size was increased to 10 simulations,
then 30 simulations, and finally 100 simulations. So N gradually
increased from 500 to 5,000 to 15,000 to 50,000. The figure below shows the
decrease of the error function E as training progressed.

The sudden jumps in the
value of the error function at iterations 200, 300, and 500 reflect the
increase in the number of simulations in the training sample.
So how does the optimized
fuzzy PLL perform? The table below shows the performance of the optimized PLL
filter using the trajectory shown above. The Nominal Fuzzy PLL is based on
using common-sense triangular membership functions for the inputs and output of
the PLL. The Gradient Fuzzy PLL is the filter described in this white paper.
The Gradient PLL was trained for a carrier-to-noise ratio (CNR) of 18 db-Hz,
but the performance of the filters is shown in the table for higher CNRs also.
Probability of Loss of Lock for Various PLL Methods
(Percent)
|
PLL METHOD |
CNR = 18 |
CNR = 19 |
CNR = 20 |
|
Nominal Fuzzy |
73 |
45 |
24 |
|
Gradient Fuzzy |
38 |
18 |
9 |
The goal of the PLL is to
maintain lock on the GPS carrier phase. In view of this goal, the Gradient
Fuzzy filter performs admirably well. When compared with the Nominal Fuzzy
filter, we can see how much improvement results from the optimization of the
membership functions.
Conclusion
Gradient descent is an
optimization method which can be applied to any differentiable function. It is
an attractive approach to optimization due to its conceptual simplicity and its
robustness. However, it converges to a local minimum rather than the global
minimum. This can be compared to finding the bottom of a valley even though a
much lower valley lies just over the next hill. This paper has shown the feasibility
of fuzzy filtering and the benefits which can accrue from gradient descent
optimization of the membership functions. Because of the differentiability
condition, this approach only works for triangular fuzzy membership functions.
References
Innovatia Software's
previous white paper on fuzzy filtering has several references (both hardcopy and web-based) on the
general topic of fuzzy logic. As far as gradient descent goes, any good
numerical methods text discusses the subject. (Look up Optimization in
the Table of Contents or the Index of the text.) In particular, the following
text gives a good overview of the method.
·
K.
Atkinson, "An Introduction to Numerical Analysis," John Wiley &
Sons, 1989.
The specific application of
gradient descent to fuzzy logic as discussed in this paper can be explored
further in the following papers, which delve into the topic in more detail.
·
D.
Simon and H. El-Sherief, "Fuzzy Logic for Digital Phase-Locked Loop Filter
Design," IEEE Transactions on Fuzzy Systems, May 1995.
·
D.
Simon and H. El-Sherief, "Fuzzy Phase-Locked Loops," IEEE
Position, Location and Navigation Symposium, April 1994.
Home
Credentials
Publications White Papers
Copyright 1998–2007 Innovatia. All Rights
Reserved.
Email Address: dan@innovatia.com
Phone Number: (216)687-5407
Last Revised: May 29, 2007