NLREG Special Applications
Omitted Dependent Variable
Some nonlinear regression problems are best expressed
by omitting the dependent variable (i.e., the variable on the
left of the equal sign). To understand
what this means, first consider the normal regression case with a dependent
variable. For each observation the
function is evaluated and the computed value is subtracted from the
corresponding value of the dependent variable for that observation. This residual value is then squared and
added to the other squared residual values.
The goal is to minimize the total sum of squared residuals. In the case where the dependent variable is
omitted, the function is computed for each observation and the value of the
function is squared (i.e., it is treated as the residual) and added to the
other squared values. The goal is to
minimize the sum of the squared values of the function. Thus, for a perfect fit the computed value
of the function for every observation would be zero.
To perform this type of analysis, omit the dependent
variable and equal sign from the left side of the function specification.
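The difference between the two objectives can be sketched in a few lines of Python. This is an illustration only, not NLREG code; the function names, the data and the model are hypothetical and chosen just to show that the omitted form treats the function value itself as the residual.

```python
def sum_sq_with_dependent(f, rows):
    """Ordinary case: rows are (x, y) pairs and the residual is y - f(x)."""
    return sum((y - f(x)) ** 2 for x, y in rows)


def sum_sq_omitted(g, rows):
    """Omitted-dependent-variable case: g(row) itself is the residual."""
    return sum(g(row) ** 2 for row in rows)


# Hypothetical observations and model, for illustration only.
rows = [(1.0, 2.1), (2.0, 3.9), (3.0, 6.2)]


def model(x):
    return 2.0 * x


ordinary = sum_sq_with_dependent(model, rows)

# Moving y to the right-hand side expresses the same problem in omitted form.
omitted = sum_sq_omitted(lambda row: row[1] - model(row[0]), rows)
```

Both calls compute the same sum of squares, which is why any ordinary regression can be rewritten in the omitted form, while the reverse is not always convenient.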
As an example of this type of analysis consider the problem
of fitting a circle to a set of points that form a roughly circular pattern
(i.e., a "circular regression").
Our goal is to determine the center point of the circle (Xc,Yc) and the radius (R) which will make the circle best fit
the points so that the sum of the squared distances between the points and the
perimeter of the circle is minimized (the points are as close to the perimeter
of the circle as possible).
For this problem, we have three parameters whose values are
to be determined: Xc, Yc, and R. There will be one data
observation for each point to which the circle is being fitted. For each point there are two variables, Xp and Yp, the X and Y coordinates of the point's position.
Since our goal is to minimize the sum of the squared
distances from the points to the perimeter of the circle, we need a function
that will compute this distance for each point. If the center of the circle is at (Xc,Yc) and the position of a point is (Xp,Yp) then, from the theorem of Pythagoras, we know the distance
from the center to the point is
sqrt((Xp-Xc)^2 + (Yp-Yc)^2)
But we are interested in the distance from the perimeter to
the point. Since the radius of the
circle is R, the distance from the
perimeter to the point (along a straight line from the center to the point) is
sqrt((Xp-Xc)^2 + (Yp-Yc)^2) - R
That is, the distance from the perimeter to the point is
equal to the distance from the center to the point less the distance from the
center to the perimeter (the radius).
The distance will be positive or negative depending on whether the point
is outside or inside the circle, but this does not matter because the value is
squared as part of the minimization process.
The NLREG statements for this analysis are as follows:
sqrt((Xp-Xc)^2 + (Yp-Yc)^2) - R;
Note that there is no dependent variable or equal sign to
the left of the function. NLREG will
determine the values of the parameters Xc,
Yc, and R such that the sum of the squared values of the function (i.e.,
the sum of the squared distances) is minimized. The CIRCLE.NLR file contains a full example of this analysis.
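For readers who want to experiment outside NLREG, the same circle fit can be sketched in plain Python. The sketch below minimizes the sum of squared values of sqrt((Xp-Xc)^2 + (Yp-Yc)^2) - R with a Gauss-Newton iteration. This is an assumption made for illustration and not necessarily the algorithm NLREG uses. The names fit_circle and solve3, and the starting values (centroid and mean distance), are hypothetical choices.

```python
import math


def solve3(a, b):
    """Solve a 3x3 linear system by Gaussian elimination with partial pivoting.

    Raises ZeroDivisionError for degenerate data (e.g., collinear points).
    """
    m = [row[:] + [rhs] for row, rhs in zip(a, b)]
    for col in range(3):
        pivot = max(range(col, 3), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, 3):
            factor = m[r][col] / m[col][col]
            for c in range(col, 4):
                m[r][c] -= factor * m[col][c]
    x = [0.0, 0.0, 0.0]
    for r in reversed(range(3)):
        s = sum(m[r][c] * x[c] for c in range(r + 1, 3))
        x[r] = (m[r][3] - s) / m[r][r]
    return x


def fit_circle(points, iterations=50):
    """Return (xc, yc, r) minimizing the sum of squared (distance - r)."""
    n = len(points)
    # Starting values (an assumption): centroid and mean distance to it.
    xc = sum(x for x, _ in points) / n
    yc = sum(y for _, y in points) / n
    r = sum(math.hypot(x - xc, y - yc) for x, y in points) / n
    for _ in range(iterations):
        jtj = [[0.0] * 3 for _ in range(3)]
        jtr = [0.0] * 3
        for x, y in points:
            d = math.hypot(x - xc, y - yc)
            residual = d - r  # the value of the fitted function
            # Partial derivatives of the residual w.r.t. xc, yc and r.
            row = [-(x - xc) / d, -(y - yc) / d, -1.0]
            for i in range(3):
                jtr[i] += row[i] * residual
                for j in range(3):
                    jtj[i][j] += row[i] * row[j]
        step = solve3(jtj, [-v for v in jtr])
        xc += step[0]
        yc += step[1]
        r += step[2]
        if max(abs(s) for s in step) < 1e-12:
            break
    return xc, yc, r


# Hypothetical points lying on a circle with center (3, -2) and radius 5.
angles = (0.1, 0.9, 1.7, 2.4, 3.0, 4.0, 5.2)
points = [(3 + 5 * math.cos(t), -2 + 5 * math.sin(t)) for t in angles]
xc, yc, r = fit_circle(points)
```

With noisy data the recovered values are the least-squares estimates rather than exact ones, and a poor starting guess may need more iterations or a damped step.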
As a second example similar to the first one, consider a
town that is trying to decide where to place a fire station. The location should be central such that the
sum of the squared distances from the station to each house is minimized. NLREG can be used to determine the
coordinates of the station (Xc,Yc)
given a set of coordinates for each house location (Xh,Yh) by using a slightly simpler function than the first example:
sqrt((Xh-Xc)^2 + (Yh-Yc)^2);
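For this particular function the answer can also be checked by hand, because the point that minimizes the sum of squared straight-line distances to a set of points is their centroid (the mean of the X and the mean of the Y coordinates). The short Python sketch below uses hypothetical house coordinates and hypothetical function names to illustrate that check. It is not NLREG code.

```python
import math


def station_location(houses):
    """Centroid of the house coordinates (closed-form least-squares answer)."""
    n = len(houses)
    return (sum(x for x, _ in houses) / n, sum(y for _, y in houses) / n)


def sum_squared_distances(houses, xc, yc):
    """Objective from the text: sum of squared house-to-station distances."""
    return sum(math.hypot(xh - xc, yh - yc) ** 2 for xh, yh in houses)


# Hypothetical house positions, for illustration only.
houses = [(0.0, 0.0), (4.0, 0.0), (4.0, 3.0), (0.0, 3.0), (2.0, 5.0)]
xc, yc = station_location(houses)
best = sum_squared_distances(houses, xc, yc)
```

Moving the station away from (xc, yc) in any direction increases the objective, which is a quick way to sanity-check the parameter values a fitting program reports.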
Root Finding and Expression Minimization
Although it is designed for nonlinear regression
analysis, NLREG can also be used to find the root (zero point) or minimum
absolute value of a nonlinear expression. To use NLREG in this fashion follow these steps:
- Do not use any variable statements.
- Use parameter statements to specify the names and optional starting values for the
parameters whose values are to be determined as the roots or minimum value of the expression.
- Use the function statement to specify the expression whose roots or minimum value
is to be found; do not specify a dependent variable and equal sign -- specify
only the expression that is to be minimized.
- Do not include any data records after the data statement; it simply signals the
end of the program file and causes the analysis to begin.
The following is an example program file to find the root
of the expression sin(x)-log(x):
Function sin(x) - log(x);
Notice that the "variable" in the expression, X, is
not declared to be a variable but rather a parameter. This example is included in the file MINSL.NLR that you can run.
For this type of analysis, NLREG determines the values of
the parameters that minimize the absolute value of the expression. If the expression has a zero value (i.e., a
root), that value is found since that is the smallest possible absolute value. If the expression does not have a zero
point, NLREG determines the values of the parameters that produce the smallest
absolute value of the expression. For
example, the expression 2*X^2-3*X+10 does not have a root but reaches a minimum
value of 8.875 when X is 0.75. The
MINPAROB.NLR program file contains this example.
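The idea of minimizing the absolute value of an expression from a starting value can be sketched in Python. The sketch uses a simple step-halving search, which is an assumption made for illustration and not a description of NLREG's internal algorithm. The name minimize_abs and the starting values are hypothetical, and log is assumed to mean the natural logarithm.

```python
import math


def minimize_abs(expr, start, step=1.0, tol=1e-10, max_iter=10000):
    """Search from `start` for an x that locally minimizes abs(expr(x)).

    Tries x + step and x - step, moves when the absolute value decreases,
    and halves the step otherwise. Returns (x, abs(expr(x))).
    """
    x = start
    best = abs(expr(x))
    for _ in range(max_iter):
        if step < tol:
            break
        moved = False
        for candidate in (x + step, x - step):
            try:
                value = abs(expr(candidate))
            except ValueError:
                continue  # outside the domain, e.g. log of a non-positive number
            if value < best:
                x, best, moved = candidate, value, True
                break
        if not moved:
            step /= 2
    return x, best


# Expression with a root: the minimum absolute value is (close to) zero.
root, root_value = minimize_abs(lambda x: math.sin(x) - math.log(x), start=1.0)

# Expression without a real root: the search stops at its minimum instead.
xmin, min_value = minimize_abs(lambda x: 2 * x**2 - 3 * x + 10, start=0.0)
```

For the second expression the search settles at X = 0.75 with a value of 8.875, matching the figures quoted above.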
There are a number of cautions that you should keep in mind
when using NLREG to find roots or minimum values:
- NLREG will find only one root or minimum value per analysis. For example, the
expression 9-X^2 has two roots: -3 and +3. NLREG will find one of the
roots; which one it finds depends on the starting value specified for X.
- NLREG will find only real roots, not complex.
- If the expression contains a local minimum, NLREG may find it rather than the
global minimum or root. Of course, if you are looking for a local minimum in a
certain region this could be considered a feature. For example, the expression
0.5*X^3+5*(X-2)^2+15 has a local minimum at X=1.61 and a
root at X=-13.38. If the starting value of X is less than -8.3 the root is
found; if the starting value is greater than -8.3, the local minimum is found.
The sweep statement can be used to try multiple starting values when searching
for a global minimum.
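The effect of the starting value, and the benefit of trying several of them, can be illustrated with a small Python sketch. It applies a simple step-halving local search to the expression 0.5*X^3+5*(X-2)^2+15 and then repeats the search over a grid of starting values, which is the idea behind a sweep. The function names, the search method and the grid are assumptions made for illustration. They do not describe how NLREG's sweep statement is implemented.

```python
def expr(x):
    return 0.5 * x**3 + 5 * (x - 2) ** 2 + 15


def local_search(f, start, step=1.0, tol=1e-10, max_iter=10000):
    """Step-halving descent on abs(f(x)); returns (x, abs(f(x)))."""
    x, best = start, abs(f(start))
    for _ in range(max_iter):
        if step < tol:
            break
        for candidate in (x + step, x - step):
            value = abs(f(candidate))
            if value < best:
                x, best = candidate, value
                break
        else:
            step /= 2  # neither neighbor improved, so refine the step
    return x, best


def sweep(f, starts):
    """Run the local search from each start and keep the best result."""
    return min((local_search(f, s) for s in starts), key=lambda r: r[1])


x_left, v_left = local_search(expr, start=-10.0)   # starts below -8.3
x_right, v_right = local_search(expr, start=0.0)   # starts above -8.3
x_best, v_best = sweep(expr, starts=[-15.0, -10.0, -5.0, 0.0, 5.0])
```

Starting at -10 the search reaches the root near X=-13.38, starting at 0 it stops at the local minimum near X=1.61, and the sweep keeps the root because its absolute value is the smallest.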