Statistical Functions Tags
<?SMAX([ValueField])?>
<?SMIN([ValueField])?>
<?SCNT([ValueField],[FrequencyField])?>
<?SSUM([ValueField],[FrequencyField])?>
<?SAVG([ValueField],[FrequencyField])?>
<?SSDS([ValueField],[FrequencyField])?>
<?SSDP([ValueField],[FrequencyField])?>
<?LRCT([InputFieldX],[ResponseFieldY])?>
<?LRRC([InputFieldX],[ResponseFieldY])?>
<?LRCC([InputFieldX],[ResponseFieldY])?>
The statistical functions may be placed in your template as tags, or used as expression operands without the tag markers <? and ?>. All field attributes inside the functions are ignored, but you may specify a number format for the result, as in <?SAVG([ValueField],[FrequencyField])@NumFormat?>.
The functions used in a template are calculated at the start of a stream, and their results remain the same for the corresponding page set whether the fields specified are in-loop or out-loop. The "Outloop fields data loop back" option is not applicable to statistical functions. If a field involved in a function runs out of data streams, zero is returned.
Basic Statistics
The first set of five functions applies to any numerical data column in the current stream. SMAX and SMIN return the maximum and minimum, respectively, of all data values for the specified field. SCNT returns the total count of data values, assuming the frequency of each to be 1 if FrequencyField is not specified. SSUM multiplies each data value by its corresponding frequency and returns the sum of the products. SAVG first calculates SCNT and SSUM if they have not been computed, and then returns the average value, SSUM divided by SCNT.
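As a rough illustration of the arithmetic described above, and not Mergemill's actual implementation, the following Python sketch computes the five results from a column of values and an optional column of frequencies. The function name basic_stats and the sample data are made up for this example, and the value column is assumed to be non-empty.

```python
def basic_stats(values, freqs=None):
    """Return (max, min, count, sum, average) for a value column.

    freqs is an optional frequency column; when it is omitted,
    every value is assumed to occur once.
    """
    if freqs is None:
        freqs = [1] * len(values)
    count = sum(freqs)                                 # comparable to SCNT
    total = sum(v * f for v, f in zip(values, freqs))  # comparable to SSUM
    average = total / count if count else 0            # comparable to SAVG
    return max(values), min(values), count, total, average


# Hypothetical data: 10 occurs twice, 20 three times, 40 once.
print(basic_stats([10, 20, 40], [2, 3, 1]))  # (40, 10, 6, 120, 20.0)
```

Leaving out the second argument treats every value as occurring once, which mirrors omitting FrequencyField in the tag.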
The second set of two functions is for normally distributed numerical data. SSDS returns the sample standard deviation of the values in the current data stream, and SSDP returns the population standard deviation.
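The difference between the two results can be sketched in Python as follows. This assumes the standard textbook definitions (the sample form divides the sum of squared deviations by n - 1, the population form by n) and frequencies that are whole-number counts; the function name std_devs and the data are hypothetical, and Mergemill's internal method may differ.

```python
import math


def std_devs(values, freqs=None):
    """Return (sample_sd, population_sd) for frequency-weighted values.

    Assumes the total count is at least 2, otherwise the sample
    form would divide by zero.
    """
    if freqs is None:
        freqs = [1] * len(values)
    n = sum(freqs)
    mean = sum(v * f for v, f in zip(values, freqs)) / n
    # Sum of squared deviations from the mean, weighted by frequency.
    ss = sum(f * (v - mean) ** 2 for v, f in zip(values, freqs))
    sample_sd = math.sqrt(ss / (n - 1))  # divides by n - 1, comparable to SSDS
    population_sd = math.sqrt(ss / n)    # divides by n, comparable to SSDP
    return sample_sd, population_sd


# Hypothetical data equivalent to the values 2, 4, 4, 6.
sample_sd, population_sd = std_devs([2, 4, 6], [1, 2, 1])
print(round(sample_sd, 4), round(population_sd, 4))  # 1.633 1.4142
```

The sample result is always the larger of the two, and the gap narrows as the count grows.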
If the count of each data value is 1, you don't need to provide a frequency field and may skip the comma as well in the tags. So <?SSUM([ValueField])?> can be used to insert the sum into the output.
Linear Regression
Linear regression is an important approach to modeling the relationship between a variable y and a number of variables x1, x2, and so on, such that there is a linear dependence of y on the x parameters. If all x parameters except one can be kept constant, the linear regression model gives you a simple relationship between y and x: y = a + bx, where a is the constant term and b is the regression coefficient. The model also gives you a correlation coefficient that indicates how well the supplied data fit the linear relationship.
There are two important uses of linear regression. One is prediction. After developing such a model, you may forecast the value of y when given any valid value of x, using y = a + bx. Or, using x = (y - a) / b, you may have a desired response y and use the model to estimate the value of x that would most likely produce it.
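Both directions of prediction can be sketched in a few lines of Python. The function names and the coefficient values below are invented for illustration; in practice a and b would come from a fitted model.

```python
def predict_y(a, b, x):
    """Forecast the response y for a given x using y = a + bx."""
    return a + b * x


def estimate_x(a, b, y):
    """Estimate the x most likely to produce y using x = (y - a) / b."""
    if b == 0:
        raise ValueError("b is zero, so y does not depend on x")
    return (y - a) / b


# Hypothetical coefficients: constant term 1.5, regression coefficient 0.5.
print(predict_y(1.5, 0.5, 4))     # 3.5
print(estimate_x(1.5, 0.5, 3.5))  # 4.0
```

The inverse form is undefined when b is zero, which is why the sketch raises an error in that case.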
Another important use of linear regression is to quantify and compare the strength of dependence of y on each of the x parameters. You may end up finding that some x parameters can be ignored for certain practical applications because they are relatively insignificant.
Linear regression has a wide range of practical applications, especially in the areas of environmental, biological and social sciences. You may also apply it to study the simple trend of some quantity over a period of time.
Mergemill provides three linear regression functions. LRCT returns the constant term a, LRRC returns the regression coefficient b, and LRCC returns the correlation coefficient r. A data pair is included in the calculations only when both of its values are non-empty.
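To show what the three results represent, here is a minimal Python sketch of an ordinary least-squares fit that skips incomplete pairs. It assumes the standard least-squares formulas and treats None or an empty string as an empty value; the function name linear_regression and the data are hypothetical, and Mergemill's own calculation and definition of "empty" may differ.

```python
import math


def linear_regression(xs, ys):
    """Return (a, b, r) for the model y = a + bx.

    Assumes at least two complete pairs with differing x values
    and differing y values.
    """
    # Keep only pairs in which both values are present.
    pairs = [(float(x), float(y)) for x, y in zip(xs, ys)
             if x not in (None, "") and y not in (None, "")]
    n = len(pairs)
    mean_x = sum(x for x, _ in pairs) / n
    mean_y = sum(y for _, y in pairs) / n
    sxx = sum((x - mean_x) ** 2 for x, _ in pairs)
    syy = sum((y - mean_y) ** 2 for _, y in pairs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in pairs)
    b = sxy / sxx                   # regression coefficient, comparable to LRRC
    a = mean_y - b * mean_x         # constant term, comparable to LRCT
    r = sxy / math.sqrt(sxx * syy)  # correlation coefficient, comparable to LRCC
    return a, b, r


# Hypothetical data: the last two pairs are incomplete and are skipped.
a, b, r = linear_regression([1, 2, 3, None, 4], [3, 5, 7, 9, ""])
print(a, b, r)  # 1.0 2.0 1.0
```

The three complete pairs lie exactly on y = 1 + 2x, so the correlation coefficient is 1.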