
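A general nonlinear model can be expressed as follows:
$$Y = f(X, \theta) + \varepsilon \tag{1}$$
where $X = (x_1, x_2, \ldots, x_k)'$ is the vector of independent variables and $\theta = (\theta_1, \theta_2, \ldots, \theta_p)'$ is the vector of parameters.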
The aim of nonlinear fitting is to estimate the parameter values which best describe the data. The standard way of finding the best fit is to choose the parameters that would minimize the deviations of the theoretical curve(s) from the experimental points. This method is also called chi-square minimization, defined as follows:
$$\chi^2 = \sum_{i=1}^{n} \left[ \frac{Y_i - f(x_i', \hat\theta)}{\sigma_i} \right]^2 \tag{2}$$
where $x_i'$ is the row vector for the $i$th ($i = 1, 2, \ldots, n$) observation.
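To estimate the $\hat\theta$ value with the least-squares method, we need to solve the normal equations, which are obtained by setting the partial derivatives of $\chi^2$ with respect to each $\hat\theta_j$ ($j = 1, 2, \ldots, p$) to zero:
$$\frac{\partial \chi^2}{\partial \hat\theta_j} = -2 \sum_{i=1}^{n} \frac{1}{\sigma_i^2} \left[ Y_i - f(x_i', \hat\theta) \right] \frac{\partial f(x_i', \hat\theta)}{\partial \hat\theta_j} = 0 \tag{3}$$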
Since there are no explicit solutions to the normal equations, we employ an iterative strategy to estimate the parameter values. This process starts with some initial values, $\theta_0$. With each iteration, a $\chi^2$ value is computed and then the parameter values are adjusted to reduce the $\chi^2$. When the difference between the $\chi^2$ values computed in two successive iterations is small enough (compared with the tolerance), we can say that the fitting procedure has converged. In the NLFit output messages, you can see the reduced chi-square, which is the chi-square divided by the degrees of freedom (a mean deviation over all data points), as shown below:
$$\frac{\chi^2}{DOF} = \frac{\chi^2}{n - p} \tag{4}$$
Origin uses the Levenberg-Marquardt (L-M) algorithm to adjust the parameter values in the iterative procedure. This algorithm, which combines the Gauss-Newton method and the steepest descent method, works for most cases. You may wish to consult other sources for details of the L-M algorithm. Origin's fitter additionally offers the Simplex method.
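The iterative procedure described above can be sketched in a few lines of NumPy. This is a minimal illustration of a damped Gauss-Newton (Levenberg-Marquardt style) loop, not Origin's implementation: the exponential model, the function names (`model`, `jacobian`, `lm_fit`), the damping schedule (factor of 10), and the absolute tolerance are all assumptions chosen for the example.

```python
import numpy as np

def model(x, theta):
    """Hypothetical example model: y = a * exp(-b * x)."""
    a, b = theta
    return a * np.exp(-b * x)

def jacobian(x, theta):
    """Partial derivatives of the model with respect to a and b."""
    a, b = theta
    e = np.exp(-b * x)
    return np.column_stack([e, -a * x * e])

def chi_square(x, y, theta, sigma):
    """Chi-square: sum of squared, sigma-scaled residuals."""
    return float(np.sum(((y - model(x, theta)) / sigma) ** 2))

def lm_fit(x, y, theta0, sigma=None, lam=1e-3, tol=1e-12, max_iter=200):
    sigma = np.ones_like(y) if sigma is None else np.asarray(sigma, dtype=float)
    theta = np.asarray(theta0, dtype=float)
    chi2 = chi_square(x, y, theta, sigma)
    for _ in range(max_iter):
        r = (y - model(x, theta)) / sigma        # scaled residuals
        F = jacobian(x, theta) / sigma[:, None]  # scaled partial derivatives
        A = F.T @ F
        # Damped normal equations: small lam behaves like Gauss-Newton,
        # large lam behaves like a short steepest-descent step.
        step = np.linalg.solve(A + lam * np.diag(np.diag(A)), F.T @ r)
        trial = theta + step
        chi2_trial = chi_square(x, y, trial, sigma)
        if chi2_trial < chi2:
            improvement = chi2 - chi2_trial
            theta, chi2, lam = trial, chi2_trial, lam / 10.0
            if improvement < tol:   # successive chi-square values agree
                break
        else:
            lam *= 10.0             # reject the step and damp harder
    return theta, chi2

# Noise-free synthetic data generated with a = 2.0, b = 0.5
x = np.linspace(0.0, 5.0, 20)
y = model(x, (2.0, 0.5))
theta_hat, chi2_min = lm_fit(x, y, theta0=(1.0, 1.0))
```

A step is accepted only when it lowers the chi-square, so the chi-square sequence never increases; the loop stops once an accepted step improves it by less than the tolerance.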
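When the measurement errors are unknown, $\sigma_i$ are set to 1 for all $i$, and the curve fitting is performed without weighting. However, when the experimental errors are known, we can treat these errors as weights and use weighted fitting. In this case, the chi-square can be written as:
$$\chi^2 = \sum_{i=1}^{n} w_i \left[ Y_i - f(x_i', \hat\theta) \right]^2 = \sum_{i=1}^{n} \left[ \frac{Y_i - f(x_i', \hat\theta)}{\sigma_i} \right]^2 \tag{5}$$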
There are a number of weighting methods available in Origin. Please read Fitting with Errors and Weighting in the Origin Help file for more details.
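As a small numeric illustration of the weighted chi-square, the sketch below assumes the weights are $w_i = 1/\sigma_i^2$, which is only one of the weighting methods Origin offers. The function name `chi_square` and the sample numbers are made up for the example.

```python
import numpy as np

def chi_square(y, fitted, sigma=None):
    """Chi-square with weights w_i = 1 / sigma_i**2 (sigma_i = 1 if errors are unknown)."""
    y = np.asarray(y, dtype=float)
    fitted = np.asarray(fitted, dtype=float)
    sigma = np.ones_like(y) if sigma is None else np.asarray(sigma, dtype=float)
    w = 1.0 / sigma ** 2
    return float(np.sum(w * (y - fitted) ** 2))

y = [1.0, 2.0, 3.0]
fitted = [1.1, 1.8, 3.3]
sigma = [0.1, 0.2, 0.3]

chi2_unweighted = chi_square(y, fitted)        # every point counts equally
chi2_weighted = chi_square(y, fitted, sigma)   # each residual is scaled by its error
```

In this example each residual equals its own error bar in magnitude, so every point contributes the same amount to the weighted chi-square even though the raw residuals differ.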
The fit-related formulas are summarized here:
Computing the fitted values in nonlinear regression is an iterative procedure. You can read a brief introduction in the above section (How Origin Fits the Curve), or see the below-referenced material for more detailed information.
During L-M iteration, we need to calculate the partial derivatives matrix $F$, whose element in the $i$th row and $j$th column is:
$$F_{ij} = \frac{\partial f(x_i', \theta)}{\sigma_i \, \partial \theta_j} \tag{6}$$
Then we can get the Variance-Covariance Matrix $C$ for the parameters by:
$$C = (F'F)^{-1} s^2 \tag{7}$$
where $s^2$ is the mean residual variance, or the Deviation of the Model, and can be calculated as follows:
$$s^2 = \frac{RSS}{n - p} \tag{8}$$
The square root of a main diagonal value of this matrix is the Standard Error of the corresponding parameter:
$$s.e.(\theta_i) = \sqrt{C_{ii}} \tag{9}$$
where $C_{ii}$ is the element in the $i$th row and $i$th column of the matrix $C$. $C_{ij}$ is the covariance between $\theta_i$ and $\theta_j$.
You can choose whether to exclude $s^2$ when calculating the covariance matrix. This will affect the Standard Error values. To exclude $s^2$, clear the Use Reduced Chi-Sqr check box on the Advanced page. The covariance is then calculated by:
$$C = (F'F)^{-1} \tag{10}$$
So the Standard Error now becomes:
$$s.e.(\theta_i) = \sqrt{C_{ii}} = \sqrt{\left[ (F'F)^{-1} \right]_{ii}} \tag{11}$$
The parameter standard errors can give us an idea of the precision of the fitted values. Typically, the magnitude of the standard error values should be lower than the fitted values. If the standard error values are much greater than the fitted values, the fitting model may be overparameterized.
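The covariance and standard error formulas can be checked numerically. The sketch below uses a straight line as a stand-in model, because its partial derivatives matrix is simply a column of ones and a column of $x$ values (with $\sigma_i = 1$); the function name `parameter_covariance` and its `use_reduced_chi_sqr` flag are hypothetical names that mirror the check box described above.

```python
import numpy as np

def parameter_covariance(F, rss, use_reduced_chi_sqr=True):
    """C = (F'F)^-1 * s^2, or C = (F'F)^-1 when s^2 is excluded."""
    n, p = F.shape
    C = np.linalg.inv(F.T @ F)
    if use_reduced_chi_sqr:
        C = C * (rss / (n - p))   # s^2 = RSS / (n - p)
    return C

x = np.array([0.0, 1.0, 2.0, 3.0])
y = np.array([0.1, 0.9, 2.1, 2.9])

# Straight line y = theta_1 + theta_2 * x: columns are df/dtheta_1 and df/dtheta_2
F = np.column_stack([np.ones_like(x), x])
theta_hat, *_ = np.linalg.lstsq(F, y, rcond=None)
rss = float(np.sum((y - F @ theta_hat) ** 2))

C_scaled = parameter_covariance(F, rss, use_reduced_chi_sqr=True)
C_plain = parameter_covariance(F, rss, use_reduced_chi_sqr=False)
se_scaled = np.sqrt(np.diag(C_scaled))   # standard errors including s^2
se_plain = np.sqrt(np.diag(C_plain))     # standard errors excluding s^2
```

The two sets of standard errors differ only by the factor $\sqrt{s^2}$, so excluding $s^2$ is appropriate mainly when the supplied weights already reflect the true measurement errors.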
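Origin estimates the standard errors for the derived parameters according to the Error Propagation formula, which is an approximate formula.
Let $z = f(\theta_1, \theta_2, \ldots, \theta_p)$ be a function (linear or nonlinear) of $p$ variables $\theta_1, \theta_2, \ldots, \theta_p$.
The general law of error propagation is:
$$\sigma_z^2 = \sum_{i=1}^{p} \sum_{j=1}^{p} \frac{\partial f}{\partial \theta_i} \, COV_{\theta_i \theta_j} \, \frac{\partial f}{\partial \theta_j}$$
where $COV_{\theta_i \theta_j}$ is the covariance value for $(\theta_i, \theta_j)$, and $COV_{\theta_i \theta_i} = \sigma_{\theta_i}^2$.
For example, using three variables $\theta_1, \theta_2, \theta_3$ we get:
$$\sigma_z^2 = \left( \frac{\partial f}{\partial \theta_1} \right)^2 \sigma_{\theta_1}^2 + \left( \frac{\partial f}{\partial \theta_2} \right)^2 \sigma_{\theta_2}^2 + \left( \frac{\partial f}{\partial \theta_3} \right)^2 \sigma_{\theta_3}^2 + 2 \frac{\partial f}{\partial \theta_1} \frac{\partial f}{\partial \theta_2} COV_{\theta_1 \theta_2} + 2 \frac{\partial f}{\partial \theta_1} \frac{\partial f}{\partial \theta_3} COV_{\theta_1 \theta_3} + 2 \frac{\partial f}{\partial \theta_2} \frac{\partial f}{\partial \theta_3} COV_{\theta_2 \theta_3}$$
Now, let the derived parameter be $z$, and let the fitting parameters be $\theta_1, \theta_2, \ldots, \theta_p$. The standard error for the derived parameter $z$ is $\sigma_z$.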
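In matrix form the error propagation law is the quadratic form of the gradient of the derived parameter with the parameter covariance matrix. The sketch below assumes a hypothetical derived parameter $z = \theta_1 / \theta_2$ and a made-up covariance matrix; `propagated_se` is an illustrative name.

```python
import numpy as np

def propagated_se(grad, cov):
    """Standard error of a derived parameter: sqrt(grad' * COV * grad)."""
    grad = np.asarray(grad, dtype=float)
    cov = np.asarray(cov, dtype=float)
    return float(np.sqrt(grad @ cov @ grad))

# Hypothetical fitted values and covariance matrix of (theta_1, theta_2)
theta = np.array([2.0, 4.0])
cov = np.array([[0.04, 0.01],
                [0.01, 0.09]])

# Derived parameter z = theta_1 / theta_2 and its analytic partial derivatives
grad = np.array([1.0 / theta[1], -theta[0] / theta[1] ** 2])
se_z = propagated_se(grad, cov)
```

Because the formula is a first-order approximation, it is most reliable when the derived parameter is close to linear in the fitting parameters over the range of their uncertainties.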
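One assumption in regression analysis is that the data are normally distributed, so we can use the standard error values to construct the Parameter Confidence Intervals. For a given significance level, $\alpha$, the $(1-\alpha) \times 100\%$ confidence interval for the parameter is:
$$\hat\theta_j - t_{(\alpha/2,\, n-p)} \, s.e.(\hat\theta_j) \le \theta_j \le \hat\theta_j + t_{(\alpha/2,\, n-p)} \, s.e.(\hat\theta_j) \tag{12}$$
where $t_{(\alpha/2,\, n-p)}$ is the upper $\alpha/2$ critical value of Student's t distribution with $n-p$ degrees of freedom.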
The parameter confidence interval indicates how likely the interval is to contain the true value.
The confidence interval illustrated above is Asymptotic, which is the most frequently used method to calculate the confidence interval. The "Asymptotic" here means it is an approximate value. If you need more accurate values, you can use the Model Comparison Based method to estimate the confidence interval in the Advanced page.
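A minimal sketch of the asymptotic interval, using SciPy's Student's t distribution for the critical value. The function name `asymptotic_ci` and the input numbers are assumptions made for illustration.

```python
from scipy.stats import t as t_dist

def asymptotic_ci(theta_hat, se, n, p, alpha=0.05):
    """(1 - alpha) * 100% asymptotic confidence limits for one parameter."""
    # Upper alpha/2 critical value with n - p degrees of freedom
    t_crit = t_dist.ppf(1.0 - alpha / 2.0, n - p)
    return theta_hat - t_crit * se, theta_hat + t_crit * se

# Hypothetical fitted value and standard error from a fit with n = 20, p = 2
lcl, ucl = asymptotic_ci(theta_hat=2.0, se=0.1, n=20, p=2)
half_width = (ucl - lcl) / 2.0
```

The interval is symmetric about the fitted value by construction, which is one reason it is only approximate for strongly nonlinear models.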
If the Model Comparison method is used, the upper and lower confidence limits are calculated by searching for the values of each parameter $\theta_j$ that make $RSS(\theta_j)$ (minimized over the remaining parameters) greater than $RSS$ by a factor of $(1 + F/(n-p))$:
$$RSS(\theta_j) = RSS \left( 1 + \frac{F}{n-p} \right) \tag{13}$$
where $F = Ftable(\alpha, 1, n-p)$ and $RSS$ is the minimum residual sum of squares found during the fitting session.
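The search described above can be sketched for a two-parameter model. The example assumes the hypothetical model $y = a\,e^{-bx}$, for which the remaining parameter $a$ can be minimized in closed form at any fixed $b$; the synthetic data, the search brackets, and the use of SciPy's `brentq` root finder are choices made for this illustration and say nothing about how Origin performs the search.

```python
import numpy as np
from scipy.optimize import brentq, minimize_scalar
from scipy.stats import f as f_dist

# Synthetic data from y = 2.0 * exp(-0.5 * x) plus small Gaussian noise
rng = np.random.default_rng(0)
x = np.linspace(0.0, 5.0, 20)
y = 2.0 * np.exp(-0.5 * x) + rng.normal(0.0, 0.05, x.size)

def profile_rss(b):
    """RSS at fixed b, minimized over the remaining parameter a (closed form)."""
    e = np.exp(-b * x)
    a = (y @ e) / (e @ e)
    return float(np.sum((y - a * e) ** 2))

# Best-fit b and the minimum RSS
best = minimize_scalar(profile_rss, bounds=(0.05, 3.0), method="bounded")
b_hat, rss_min = best.x, best.fun

n, p, alpha = x.size, 2, 0.05
F = f_dist.ppf(1.0 - alpha, 1, n - p)          # F table value
target = rss_min * (1.0 + F / (n - p))         # threshold RSS

# Confidence limits: values of b on each side of b_hat where the profile RSS hits the target
lcl = brentq(lambda b: profile_rss(b) - target, 0.05, b_hat)
ucl = brentq(lambda b: profile_rss(b) - target, b_hat, 3.0)
```

Unlike the asymptotic interval, the two limits need not be equidistant from the fitted value, because they follow the actual shape of the RSS surface.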
You can choose to perform a t-test on each parameter to see whether its value is equal to 0. The null hypothesis of the t-test on the $j$th parameter is:
$$H_0: \theta_j = 0$$
And the alternative hypothesis is:
$$H_a: \theta_j \ne 0$$
The t-value can be computed as:
$$t = \frac{\hat\theta_j - 0}{s.e.(\hat\theta_j)} \tag{14}$$
The p-value of the t-test above, that is, the probability of obtaining a $|t|$ at least this large when $H_0$ is true, is:
$$prob = 2 \left( 1 - tcdf(|t|, df_{Error}) \right) \tag{15}$$
where $tcdf(t, df)$ computes the lower tail probability for Student's t distribution with $df$ degrees of freedom.
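The t-value and its two-sided probability can be computed as follows, with SciPy's `t.cdf` playing the role of `tcdf`. The function name `parameter_t_test` and the sample inputs are hypothetical.

```python
from scipy.stats import t as t_dist

def parameter_t_test(theta_hat, se, df_error):
    """t-value and two-sided probability for H0: theta = 0."""
    t_value = (theta_hat - 0.0) / se
    prob = 2.0 * (1.0 - t_dist.cdf(abs(t_value), df_error))
    return t_value, prob

# Hypothetical parameter estimate with n - p = 18 error degrees of freedom
t_value, prob = parameter_t_test(theta_hat=2.0, se=0.5, df_error=18)
```

A small probability suggests the parameter differs from zero; a value near 1 means the data give no evidence that the parameter is needed.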
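If the equation is overparameterized, there will be mutual dependency between parameters. The dependency for the $i$th parameter is defined as:
$$Dependency_i = 1 - \frac{1}{C_{ii} \, (C^{-1})_{ii}} \tag{16}$$
where $C_{ii}$ is the $i$th diagonal element of the variance-covariance matrix $C$, and $(C^{-1})_{ii}$ is the $(i, i)$th diagonal element of the inverse of matrix $C$. If this value is close to 1, there is strong dependency.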
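A short numeric check of the dependency formula, using two made-up covariance matrices: one with strongly correlated parameters and one with independent parameters. The function name `dependency` is illustrative.

```python
import numpy as np

def dependency(C):
    """Dependency_i = 1 - 1 / (C_ii * (C^-1)_ii) for each parameter."""
    C = np.asarray(C, dtype=float)
    return 1.0 - 1.0 / (np.diag(C) * np.diag(np.linalg.inv(C)))

# Correlation of 0.9 between the two parameters
dep_correlated = dependency([[1.0, 0.9],
                             [0.9, 1.0]])

# Uncorrelated parameters
dep_independent = dependency([[2.0, 0.0],
                              [0.0, 5.0]])
```

For two parameters the dependency equals the squared correlation coefficient, so a correlation of 0.9 gives 0.81 and uncorrelated parameters give 0.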
The Confidence Interval Half Width is:
$$CI\ Half\ Width = \frac{UCL - LCL}{2} \tag{17}$$
where UCL and LCL are the upper and lower confidence limits, respectively.
Several fit statistics formulas are summarized below:
The Error degree of freedom. Please refer to the ANOVA Table for more details.
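The residual sum of squares:
$$RSS = \sum_{i=1}^{n} w_i \left[ Y_i - f(x_i', \hat\theta) \right]^2 \tag{18}$$
The Reduced Chi-square value, which equals the residual sum of squares divided by the degrees of freedom:
$$Reduced\ \chi^2 = \frac{RSS}{df_{Error}} = \frac{RSS}{n - p} \tag{19}$$
The $R^2$ value shows the goodness of a fit, and can be computed by:
$$R^2 = 1 - \frac{RSS}{TSS} \tag{20}$$
where TSS is the total sum of squares, and RSS is the residual sum of squares.
The adjusted $R^2$ value:
$$\bar{R}^2 = 1 - \frac{RSS / df_{Error}}{TSS / df_{Total}} \tag{21}$$
The R value is the square root of $R^2$:
$$R = \sqrt{R^2} \tag{22}$$
For more information on $R^2$, adjusted $R^2$ and R, please see Goodness of Fit.
Root mean square of the error, or the Standard Deviation of the model, equal to the square root of the reduced $\chi^2$:
$$Root\ MSE = \sqrt{Reduced\ \chi^2} \tag{23}$$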
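The fit statistics above can be collected in one small function. This sketch assumes the corrected total sum of squares (deviations from the weighted mean of $Y$) is used for $R^2$, with $df_{Total} = n - 1$; the function name `fit_statistics` and the sample data are made up.

```python
import numpy as np

def fit_statistics(y, fitted, p, w=None):
    """RSS, reduced chi-square, R^2, adjusted R^2, R and root MSE for a fit with p parameters."""
    y = np.asarray(y, dtype=float)
    fitted = np.asarray(fitted, dtype=float)
    w = np.ones_like(y) if w is None else np.asarray(w, dtype=float)
    n = y.size
    df_error, df_total = n - p, n - 1
    rss = float(np.sum(w * (y - fitted) ** 2))
    y_bar = np.sum(w * y) / np.sum(w)                  # weighted mean (assumption)
    tss = float(np.sum(w * (y - y_bar) ** 2))          # corrected total sum of squares
    r2 = 1.0 - rss / tss
    return {
        "rss": rss,
        "reduced_chi2": rss / df_error,
        "r2": r2,
        "adj_r2": 1.0 - (rss / df_error) / (tss / df_total),
        "r": float(np.sqrt(r2)),
        "root_mse": float(np.sqrt(rss / df_error)),
    }

stats = fit_statistics(y=[1.0, 2.0, 3.0, 4.0, 5.0],
                       fitted=[1.1, 1.9, 3.2, 3.9, 4.9],
                       p=2)
```

The adjusted $R^2$ is always at most $R^2$, because it penalizes the fit for the degrees of freedom consumed by the parameters.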
The ANOVA Table:
| | df | Sum of Squares | Mean Square | F Value | Prob > F |
|---|---|---|---|---|---|
| Model | p | SSreg = TSS - RSS | MSreg = SSreg / p | MSreg / MSE | p-value |
| Error | n - p | RSS | MSE = RSS / (n - p) | | |
| Uncorrected Total | n | TSS | | | |
| Corrected Total | n - 1 | TSScorrected | | | |
Note: In nonlinear fitting, Origin outputs both corrected and uncorrected total sum of squares:
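Corrected model:
$$TSS_{corrected} = \sum_{i=1}^{n} w_i \left( Y_i - \bar{Y} \right)^2, \quad \bar{Y} = \frac{\sum_{i=1}^{n} w_i Y_i}{\sum_{i=1}^{n} w_i} \tag{24}$$
Uncorrected model:
$$TSS = \sum_{i=1}^{n} w_i Y_i^2 \tag{25}$$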
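The Model row of the ANOVA table can be reproduced numerically. The sketch below follows the table as written, taking TSS as the uncorrected total sum of squares with $p$ model degrees of freedom, and uses SciPy's F distribution for the upper-tail probability. The function name `anova_f_test` and the sample data are assumptions.

```python
import numpy as np
from scipy.stats import f as f_dist

def anova_f_test(y, fitted, p, w=None):
    """F value and Prob > F for the Model row, using the uncorrected TSS."""
    y = np.asarray(y, dtype=float)
    fitted = np.asarray(fitted, dtype=float)
    w = np.ones_like(y) if w is None else np.asarray(w, dtype=float)
    n = y.size
    rss = float(np.sum(w * (y - fitted) ** 2))
    tss = float(np.sum(w * y ** 2))          # uncorrected total sum of squares
    ms_reg = (tss - rss) / p                 # MSreg = SSreg / p
    mse = rss / (n - p)                      # MSE = RSS / (n - p)
    f_value = ms_reg / mse
    prob = float(f_dist.sf(f_value, p, n - p))   # upper-tail probability
    return f_value, prob

f_value, prob = anova_f_test(y=[1.0, 2.0, 3.0, 4.0, 5.0],
                             fitted=[1.1, 1.9, 3.2, 3.9, 4.9],
                             p=2)
```

A large F value with a small probability indicates that the model explains far more variation than would be expected from the residual noise alone.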
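The confidence interval for the fitting function says how good your estimate of the value of the fitting function is at particular values of the independent variables. You can claim with $100(1-\alpha)\%$ confidence that the correct value for the fitting function lies within the confidence interval, where $\alpha$ is the significance level. This defined confidence interval for the fitting function is computed as:
$$f(x_1, x_2, \ldots; \hat\theta) \pm t_{(\alpha/2,\, n-p)} \left[ s^2 \, \mathbf{f} \, (F'F)^{-1} \, \mathbf{f}' \right]^{1/2} \tag{26}$$
where:
$$\mathbf{f} \, (F'F)^{-1} \, \mathbf{f}' = \sum_{j=1}^{p} \sum_{k=1}^{p} \frac{\partial f}{\partial \theta_j} \left[ (F'F)^{-1} \right]_{jk} \frac{\partial f}{\partial \theta_k} \tag{27}$$
Here $\mathbf{f} = [\partial f / \partial \theta_1, \ldots, \partial f / \partial \theta_p]$ is the row vector of partial derivatives evaluated at the given values of the independent variables, $F$ is the partial derivatives matrix of Eq. (6), and $s^2$ is the mean residual variance of Eq. (8).
The prediction interval for significance level $\alpha$ is the interval within which $100(1-\alpha)\%$ of all the experimental points in a series of repeated measurements are expected to fall at particular values of the independent variables. This defined prediction interval for the fitting function is computed as:
$$f(x_1, x_2, \ldots; \hat\theta) \pm t_{(\alpha/2,\, n-p)} \left[ s^2 \left( 1 + \mathbf{f} \, (F'F)^{-1} \, \mathbf{f}' \right) \right]^{1/2} \tag{28}$$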
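Both bands can be sketched with SciPy. The example assumes an unweighted fit of the hypothetical model $y = a\,e^{-bx}$ to synthetic data; `scipy.optimize.curve_fit` with its default settings returns a parameter covariance that is already scaled by the reduced chi-square, so the quadratic form of the gradient with that matrix gives the variance of the fitted curve directly. This is an illustration under those assumptions, not a description of Origin's internals.

```python
import numpy as np
from scipy.optimize import curve_fit
from scipy.stats import t as t_dist

def model(x, a, b):
    """Hypothetical example model."""
    return a * np.exp(-b * x)

def gradient(x, a, b):
    """Row vectors of partial derivatives [df/da, df/db] at each x."""
    e = np.exp(-b * x)
    return np.column_stack([e, -a * x * e])

# Synthetic data from a = 2.0, b = 0.5 with small Gaussian noise
rng = np.random.default_rng(1)
x = np.linspace(0.0, 5.0, 25)
y = model(x, 2.0, 0.5) + rng.normal(0.0, 0.05, x.size)

popt, pcov = curve_fit(model, x, y, p0=[1.0, 1.0])   # pcov includes the s^2 factor
n, p, alpha = x.size, popt.size, 0.05
s2 = float(np.sum((y - model(x, *popt)) ** 2) / (n - p))
t_crit = t_dist.ppf(1.0 - alpha / 2.0, n - p)

x_new = np.linspace(0.0, 5.0, 6)
g = gradient(x_new, *popt)
var_fit = np.einsum("ij,jk,ik->i", g, pcov, g)   # variance of the fitted curve at each x_new
conf_half = t_crit * np.sqrt(var_fit)            # confidence band half-width
pred_half = t_crit * np.sqrt(s2 + var_fit)       # prediction band half-width
```

The prediction band is always wider than the confidence band because it adds the scatter of an individual measurement to the uncertainty of the fitted curve.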