
Polynomial fit for LabTalk usage
1. fitpoly (1,2) 2 3 (4,5); // 2nd order, put coeff into col(3) and fit into col(4) (5)
| Display Name | Variable Name | I/O and Type | Default Value | Description |
|---|---|---|---|---|
| Input | iy | Input XYRange | | Specifies the input data range. |
| Polynomial Order | polyorder | Input int | | Specifies the order of the polynomial to be fitted. |
| Polynomial Coefficients | coef | Output vector | | Specifies where to output the polynomial coefficients. For example, coef:=3 outputs the coefficients to column 3. |
| Output | oy | Output XYRange | | Specifies the output range. |
| Number of Points | N | Output int | | Outputs the number of points used in fitting the polynomial. |
| Adjusted residual sum of squares | AdjRSq | Output double | | Outputs the adjusted residual sum of squares. |
| Coefficient of determination (R^2) | RSqCOD | Output double | | Outputs the coefficient of determination (R^2). |
| Polynomial Coefficients Errors | err | Output vector | | Outputs the errors of the polynomial coefficients. |
Polynomial regression fits a given data set to the following model:
$$y = \beta_0 + \beta_1 x + \beta_2 x^2 + \cdots + \beta_k x^k + \varepsilon$$
where $\beta_i$ are the coefficients and $\varepsilon$ is the error term. The error term represents the unexpected or unexplained variation in the dependent variable. It is assumed that the mean of the random variable $\varepsilon$ is equal to zero.
Parameters are estimated using a weighted least-squares method. This method minimizes the sum of the squares of the deviations between the theoretical curve and the experimental points over the range of the independent variable. After fitting, the model can be evaluated using hypothesis tests and by plotting residuals.
Note that the higher-order terms in a polynomial equation have the greatest effect on the dependent variable. Consequently, models with high-order terms (higher than 4) are extremely sensitive to the precision of the coefficient values: small differences in the coefficients can result in large differences in the computed y value. We mention this because, by default, the polynomial fitting results are rounded to 5 decimal places. If you manually plug these reported worksheet values back into the fitted curve, the slight loss of precision that occurs in rounding will have a marked effect on the higher-order terms, possibly leading you to conclude, wrongly, that your model is faulty. If you wish to perform manual calculations using your best-fit parameter estimates, make sure that you use full-precision values, not rounded values. Although Origin may round reported values to 5 decimal places (or another setting), this rounding is for display purposes only. Origin always uses full precision (double(8)) in mathematical calculations unless you have specified otherwise. For more information, see Numbers in Origin.
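The effect of rounding can be illustrated outside Origin with a short Python sketch. The coefficient values below are made up for illustration and do not come from any real fit; the helper name `polyval` is likewise hypothetical. The sketch evaluates a 5th-order polynomial once with full-precision coefficients and once with the same coefficients rounded to 5 decimal places.

```python
def polyval(coefs, x):
    """Evaluate a polynomial with Horner's rule; coefs run from the constant term up."""
    result = 0.0
    for c in reversed(coefs):
        result = result * x + c
    return result

# Hypothetical full-precision coefficients of a 5th-order fit (constant term first).
full = [1.0, -0.5, 0.123456789, -0.0123456789, 0.00123456789, -0.000123456789]

# The same coefficients as a 5-decimal display would show them.
rounded = [round(c, 5) for c in full]

x = 50.0
y_full = polyval(full, x)
y_rounded = polyval(rounded, x)
diff = abs(y_full - y_rounded)

print("full precision:", y_full)
print("rounded       :", y_rounded)
print("difference    :", diff)
```

In this made-up case the two results at x = 50 differ by an amount on the order of 10^3, almost all of it contributed by the rounded x^5 coefficient, even though each coefficient changed by less than 0.00001.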
Generally speaking, any continuous function can be fitted to a higher order polynomial model. However, higher order terms may not have much practical significance.
For more examples, please refer to XF Script Dialog (press F11).
Regression model:
For a given dataset $(x_i, y_i)$, $i = 1, 2, \ldots, n$, where X is the independent variable and Y is the dependent variable, a polynomial regression fits the data to a model of the following form:
$$y = \beta_0 + \beta_1 x + \beta_2 x^2 + \cdots + \beta_k x^k + \varepsilon$$
where k is the degree of the polynomial; in Origin, it is a positive integer less than 10. The error term $\varepsilon$ is assumed to be independent and normally distributed, $N(0, \sigma^2)$.
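To fit the model, assume that the residuals
$$res_i = y_i - (\beta_0 + \beta_1 x_i + \beta_2 x_i^2 + \cdots + \beta_k x_i^k)$$
are normally distributed with mean equal to 0 and variance equal to $\sigma_i^2$. Then the maximum likelihood estimates for the parameters $\beta_j$ can be obtained by minimizing the Chi-square, which is defined as:
$$\chi^2 = \sum_{i=1}^{n} \frac{\left[ y_i - (\beta_0 + \beta_1 x_i + \beta_2 x_i^2 + \cdots + \beta_k x_i^k) \right]^2}{\sigma_i^2}$$
If the error is treated as weight, the Chi-square to be minimized can be written as:
$$\chi^2 = \sum_{i=1}^{n} w_i \left[ y_i - (\beta_0 + \beta_1 x_i + \beta_2 x_i^2 + \cdots + \beta_k x_i^k) \right]^2$$
and:
$$w_i = \frac{1}{\sigma_i^2}$$
where $\sigma_i$ are the measurement errors. If they are unknown, they should all be set to 1.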
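As a rough illustration of the weighted Chi-square described above, the following Python sketch evaluates it for a trial set of coefficients. The function name `chi_square` and the sample data are assumptions made for this example; this is not Origin's implementation.

```python
def chi_square(x, y, sigma, beta):
    """Weighted chi-square of a polynomial with coefficients beta (constant term first).

    Each point is weighted by w_i = 1 / sigma_i**2, where sigma_i is its measurement error.
    """
    total = 0.0
    for xi, yi, si in zip(x, y, sigma):
        fitted = sum(b * xi**j for j, b in enumerate(beta))
        weight = 1.0 / si**2
        total += weight * (yi - fitted) ** 2
    return total

# Hypothetical data: the last point misses the curve 1 + x + x^2 by 1
# and has a measurement error of 2, so it contributes (1/4) * 1^2 = 0.25.
x = [0.0, 1.0, 2.0]
y = [1.0, 3.0, 8.0]
sigma = [1.0, 1.0, 2.0]
chi2 = chi_square(x, y, sigma, [1.0, 1.0, 1.0])
print(chi2)
```

Setting every entry of `sigma` to 1 reduces this to the ordinary (unweighted) residual sum of squares.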
Coefficient estimation by matrix calculation:
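The estimated coefficients are obtained by matrix calculation. First, we can rewrite the regression model in matrix form:
$$Y = XB + E$$
where:
$$Y = \begin{bmatrix} y_1 \\ y_2 \\ \vdots \\ y_n \end{bmatrix}, \quad X = \begin{bmatrix} 1 & x_1 & x_1^2 & \cdots & x_1^k \\ 1 & x_2 & x_2^2 & \cdots & x_2^k \\ \vdots & \vdots & \vdots & \ddots & \vdots \\ 1 & x_n & x_n^2 & \cdots & x_n^k \end{bmatrix}, \quad B = \begin{bmatrix} \beta_0 \\ \beta_1 \\ \vdots \\ \beta_k \end{bmatrix}, \quad E = \begin{bmatrix} \varepsilon_1 \\ \varepsilon_2 \\ \vdots \\ \varepsilon_n \end{bmatrix}$$
The estimate of the vector B is the solution to the linear equations, and can be expressed as:
$$\hat{B} = (X'X)^{-1} X'Y$$
where X' is the transpose of X.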
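The matrix estimate can be sketched in plain Python as follows. This is a minimal, unweighted illustration that builds X'X and X'Y and solves the resulting linear system by Gaussian elimination; the function names (`design_matrix`, `solve`, `fit_poly`) and the sample data are hypothetical and are not part of Origin.

```python
def design_matrix(x, k):
    """Rows (1, x_i, x_i^2, ..., x_i^k) for each data point."""
    return [[xi**j for j in range(k + 1)] for xi in x]

def solve(a, b):
    """Solve a z = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [bi] for row, bi in zip(a, b)]  # augmented matrix
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            f = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= f * m[col][c]
    z = [0.0] * n
    for r in range(n - 1, -1, -1):  # back substitution
        s = sum(m[r][c] * z[c] for c in range(r + 1, n))
        z[r] = (m[r][n] - s) / m[r][r]
    return z

def fit_poly(x, y, k):
    """Least-squares coefficients (constant term first) from (X'X) B = X'Y."""
    X = design_matrix(x, k)
    p = k + 1
    xtx = [[sum(row[i] * row[j] for row in X) for j in range(p)] for i in range(p)]
    xty = [sum(row[i] * yi for row, yi in zip(X, y)) for i in range(p)]
    return solve(xtx, xty)

# Hypothetical noise-free data generated from y = 1 + 2x + 3x^2.
x = [0.0, 1.0, 2.0, 3.0, 4.0]
y = [1.0 + 2.0 * xi + 3.0 * xi**2 for xi in x]
beta = fit_poly(x, y, 2)
print(beta)
```

Solving the normal equations directly keeps the sketch close to the formula, but X'X becomes poorly conditioned as the order grows, so numerical libraries usually rely on an orthogonal decomposition such as QR instead.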
Inference in polynomial regression:
The ANOVA for the polynomial regression is summarized in the following table:
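| | df | Sum of Squares | Mean Square | F Value | Prob > F |
|---|---|---|---|---|---|
| Model | $k$ | $SS_{reg} = TSS - RSS$ | $MS_{reg} = SS_{reg} / k$ | $MS_{reg} / MSE$ | p-value |
| Error | $n^* - k$ | $RSS$ | $MSE = RSS / (n^* - k)$ | | |
| Total | $n^*$ | $TSS$ | | | |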
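(Note: If the intercept is included in the model, n*=n-1. Otherwise, n*=n and the total sum of squares is uncorrected.)
Here the total sum of squares, TSS, is
$$TSS = \sum_{i=1}^{n} w_i (y_i - \bar{y})^2$$
where $\bar{y}$ is the mean of the $y_i$ and $w_i$ are the weights (all equal to 1 in an unweighted fit). In the uncorrected case, the sum is $\sum_{i=1}^{n} w_i y_i^2$.
The residual sum of squares (RSS), also called the sum of squared errors (SSE), is the sum of the squares of the vertical deviations from each data point to the fitted curve. It can be computed as:
$$RSS = \sum_{i=1}^{n} w_i \left[ y_i - (\hat{\beta}_0 + \hat{\beta}_1 x_i + \hat{\beta}_2 x_i^2 + \cdots + \hat{\beta}_k x_i^k) \right]^2$$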
The result of the F-test is presented in the ANOVA table. The null hypothesis of the F-test is that all of the partial coefficients are equal to zero, i.e.
$$H_0: \beta_1 = \beta_2 = \cdots = \beta_k = 0$$
Thus, the alternative hypothesis is:
$$H_\alpha: \text{at least one } \beta_j \neq 0$$
With the computed F-value, we can decide whether or not to reject the null hypothesis. Usually, for a given significance level $\alpha$, we can reject $H_0$ when $F > F_\alpha$, or when the significance of F (the computed p-value) is less than $\alpha$.
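The ANOVA quantities and the F-value can be sketched in Python as below. The sketch assumes an unweighted fit (all weights equal to 1) and takes the fitted values as given; the function name `anova` and the sample numbers are illustrative assumptions. Converting the F-value to a p-value requires the F distribution, which is outside the Python standard library and is therefore omitted.

```python
def anova(y, fitted, k, intercept=True):
    """ANOVA quantities for a polynomial fit of order k (unweighted)."""
    n = len(y)
    n_star = n - 1 if intercept else n
    if intercept:
        y_bar = sum(y) / n
        tss = sum((yi - y_bar) ** 2 for yi in y)   # corrected total sum of squares
    else:
        tss = sum(yi**2 for yi in y)               # uncorrected total sum of squares
    rss = sum((yi - fi) ** 2 for yi, fi in zip(y, fitted))
    ss_reg = tss - rss
    df_error = n_star - k
    ms_reg = ss_reg / k
    mse = rss / df_error
    return {"TSS": tss, "RSS": rss, "SSreg": ss_reg,
            "df_model": k, "df_error": df_error,
            "MSreg": ms_reg, "MSE": mse, "F": ms_reg / mse}

# Hypothetical data with the fitted values of the straight line y = 0.6 + 0.8x.
y = [1.0, 3.0, 2.0, 5.0, 4.0]
fitted = [1.4, 2.2, 3.0, 3.8, 4.6]
table = anova(y, fitted, k=1)
print(table)
```

The resulting F-value is then compared with the critical value $F_\alpha$ for `df_model` and `df_error` degrees of freedom.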
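For the inference, we need to know the standard errors of the partial slopes, which may be computed as:
$$s_{\hat{\beta}_j} = s_\varepsilon \sqrt{c_{jj}}$$
where $c_{jj}$ is the jth diagonal element of $(X'X)^{-1}$, and $s_\varepsilon$ is the residual standard deviation (also called std dev, standard error of estimate, or root MSE), computed as:
$$s_\varepsilon = \sqrt{\frac{RSS}{n^* - k}} = \sqrt{MSE}$$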
If the regression assumptions hold, we can perform t-tests for the regression coefficients with the null hypotheses and the alternative hypotheses:
$$H_0: \beta_j = 0$$
$$H_\alpha: \beta_j \neq 0$$
The t-value can be computed as:
$$t = \frac{\hat{\beta}_j}{s_{\hat{\beta}_j}}$$
With the t-values, we can decide whether or not to reject the null hypotheses. Usually, for a given significance level $\alpha$, we can reject $H_0$ when $|t| > t_{\alpha/2}$, or when the p-value is less than $\alpha$.
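A minimal Python sketch of the standard errors and t-values follows. It assumes an unweighted fit with an intercept and inverts X'X by Gauss-Jordan elimination, which is adequate for small, well-conditioned problems only; the names `invert` and `coefficient_t_values` and the data are hypothetical. Critical values $t_{\alpha/2}$ must come from a t table or a statistics library.

```python
import math

def invert(a):
    """Invert a small square matrix by Gauss-Jordan elimination with partial pivoting."""
    n = len(a)
    m = [row[:] + [1.0 if i == j else 0.0 for j in range(n)] for i, row in enumerate(a)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        p = m[col][col]
        m[col] = [v / p for v in m[col]]
        for r in range(n):
            if r != col:
                f = m[r][col]
                m[r] = [vr - f * vc for vr, vc in zip(m[r], m[col])]
    return [row[n:] for row in m]

def coefficient_t_values(x, y, k):
    """Return (beta, standard errors, t-values) for an order-k polynomial fit."""
    X = [[xi**j for j in range(k + 1)] for xi in x]
    p = k + 1
    xtx = [[sum(r[i] * r[j] for r in X) for j in range(p)] for i in range(p)]
    xty = [sum(r[i] * yi for r, yi in zip(X, y)) for i in range(p)]
    c = invert(xtx)                                   # (X'X)^-1
    beta = [sum(c[i][j] * xty[j] for j in range(p)) for i in range(p)]
    rss = sum((yi - sum(b * xi**j for j, b in enumerate(beta))) ** 2
              for xi, yi in zip(x, y))
    df_error = len(y) - p                             # n* - k with an intercept
    s_eps = math.sqrt(rss / df_error)                 # residual standard deviation
    se = [s_eps * math.sqrt(c[j][j]) for j in range(p)]
    t = [b / s for b, s in zip(beta, se)]
    return beta, se, t

# Hypothetical data, fitted with a first-order polynomial.
x = [1.0, 2.0, 3.0, 4.0, 5.0]
y = [1.0, 3.0, 2.0, 5.0, 4.0]
beta, se, t = coefficient_t_values(x, y, 1)
print(beta, se, t)
```

For a first-order fit such as this one, the square of the slope's t-value equals the F-value of the ANOVA table, which gives a quick consistency check.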
Confidence and Prediction interval:
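For a particular value $x_p$, the 100(1-α)% confidence interval for the mean value of y at $x = x_p$ is:
$$\hat{y} \pm t_{\alpha/2}\, s_\varepsilon \sqrt{\mathbf{x}_p (X'X)^{-1} \mathbf{x}_p'}$$
And the 100(1-α)% prediction interval for an individual value of y at $x = x_p$ is:
$$\hat{y} \pm t_{\alpha/2}\, s_\varepsilon \sqrt{1 + \mathbf{x}_p (X'X)^{-1} \mathbf{x}_p'}$$
where $\hat{y}$ is the fitted value at $x_p$, $\mathbf{x}_p = (1, x_p, x_p^2, \ldots, x_p^k)$ is the corresponding row of the design matrix, and $t_{\alpha/2}$ is the critical value of the t distribution with $n^* - k$ degrees of freedom.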
Coefficient of Determination:
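The goodness of fit can be evaluated by the coefficient of determination, $R^2$, which is given by:
$$R^2 = \frac{SS_{reg}}{TSS} = 1 - \frac{RSS}{TSS}$$
The adjusted $R^2$ is used to adjust the $R^2$ value for the degrees of freedom. It can be computed as:
$$\bar{R}^2 = 1 - \frac{RSS / (n^* - k)}{TSS / n^*}$$
Then we can compute the R-value, which is simply the square root of $R^2$:
$$R = \sqrt{R^2}$$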
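These goodness-of-fit measures can be computed from the sums of squares with a few lines of Python. The sketch assumes that TSS, RSS, n* and k are already known; the function name `r_square_stats` and the input numbers are illustrative only.

```python
import math

def r_square_stats(tss, rss, n_star, k):
    """Return (R^2, adjusted R^2, R) from the total and residual sums of squares."""
    r2 = 1.0 - rss / tss
    adj_r2 = 1.0 - (rss / (n_star - k)) / (tss / n_star)
    r = math.sqrt(r2)
    return r2, adj_r2, r

# Hypothetical values: five points, first-order fit with intercept (n* = 4, k = 1).
r2, adj_r2, r = r_square_stats(tss=10.0, rss=3.6, n_star=4, k=1)
print(r2, adj_r2, r)
```

The adjusted value is never larger than $R^2$, because it charges the fit for every degree of freedom the model consumes.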
Covariance and Correlation matrix:
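The covariance matrix of the polynomial regression can be calculated as:
$$Cov(\hat{B}) = s_\varepsilon^2 (X'X)^{-1}$$
whose (i, j) element is $Cov(\hat{\beta}_i, \hat{\beta}_j)$. And the correlation between any two parameters is:
$$\rho(\hat{\beta}_i, \hat{\beta}_j) = \frac{Cov(\hat{\beta}_i, \hat{\beta}_j)}{\sqrt{Cov(\hat{\beta}_i, \hat{\beta}_i)} \sqrt{Cov(\hat{\beta}_j, \hat{\beta}_j)}}$$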
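Converting a covariance matrix of the parameters into a correlation matrix can be sketched in Python as follows. The covariance values are made up for the example (they correspond to a hypothetical straight-line fit), and the function name `correlation_from_covariance` is an assumption.

```python
import math

def correlation_from_covariance(cov):
    """Divide each covariance by the product of the two parameters' standard deviations."""
    p = len(cov)
    return [[cov[i][j] / math.sqrt(cov[i][i] * cov[j][j]) for j in range(p)]
            for i in range(p)]

# Hypothetical covariance matrix of (intercept, slope).
cov = [[1.32, -0.36],
       [-0.36, 0.12]]
corr = correlation_from_covariance(cov)
print(corr)
```

A strongly negative off-diagonal entry, as in this example, indicates that an overestimate of one parameter tends to be compensated by an underestimate of the other.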