Linear regression algorithms

The following linear regression algorithms may be selected when computing U-Pb ages or fitting a regression line to arbitrary x-y data.

Classical

This routine emulates the default Isoplot line fitting routine [LUDWIG2012]. First, a linear regression is performed using the model 1 algorithm. By default, if the MSWD is within its one-sided 85% confidence limit for the given degrees of freedom (equivalent to a ‘probability of fit’ above 0.15), the fit is accepted as is. If the MSWD lies between the one-sided 85% and 95% confidence limits (equivalent to a ‘probability of fit’ between 0.15 and 0.05), the slope and y-intercept values are retained, but their uncertainties are expanded as per the model 1x fit. If the MSWD exceeds the one-sided 95% confidence limit, a linear regression is instead performed using the model 2 algorithm for concordia intercept datasets, or the model 3 algorithm for “classical” isochron datasets. Note that the model 3 algorithm parametrises ‘excess scatter’ as a Gaussian distributed component of scatter in the initial Pb isotope ratio. This assumption may not be applicable to all datasets and should be carefully considered.
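
The decision rule described above can be summarised in a few lines of Python. This is only an illustrative sketch: the function name, its arguments and the handling of values that fall exactly on a threshold are assumptions made here, not part of the software's interface. The probability of fit is taken as an input rather than computed from the MSWD.

```python
def select_classical_fit(p_fit, isochron=False, p_accept=0.15, p_reject=0.05):
    """Return the name of the fit used by the 'classical' routine.

    p_fit    : probability of fit of the initial model 1 regression
    isochron : True for a "classical" isochron dataset, False for a
               concordia intercept dataset (hypothetical flag)
    """
    if p_fit > p_accept:
        # MSWD within its one-sided 85% limit: accept model 1 as is.
        return "model 1"
    if p_fit >= p_reject:
        # MSWD between the 85% and 95% limits: keep slope and intercept,
        # expand uncertainties as in model 1x.
        return "model 1x"
    # MSWD beyond the 95% limit: switch algorithm.
    return "model 3" if isochron else "model 2"
```

For example, `select_classical_fit(0.10)` returns `"model 1x"`, since 0.10 lies between the two default thresholds.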

Spine

This is the robust line fitting algorithm described in [POWELL2020]. It converges to the classical model 1 for ‘well-behaved’ datasets, but for more scattered datasets it down-weights data points lying away from the central ‘spine’ of data according to the Huber loss function. The spine-width parameter, s, indicates how well resolved the central linear ‘spine’ of data is, while accounting for assigned uncertainties. Comparing s with its upper one-sided 95% confidence limit, derived via simulation of Gaussian distributed datasets, provides a means of assessing whether the ‘spine’ of data is sufficiently well-defined to obtain accurate results with this algorithm. The spine algorithm may yield unreliable results for datasets where s clearly exceeds this upper limit.
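
To illustrate the down-weighting idea, the standard Huber loss and its associated weight can be written as below. This is a generic sketch of the Huber function only, not the spine algorithm itself. The function names are hypothetical, and the default tuning constant `k=1.4` is an assumption chosen for illustration.

```python
def huber_loss(r, k=1.4):
    """Huber loss for a weighted residual r: quadratic near zero, linear in the tails."""
    a = abs(r)
    if a <= k:
        return 0.5 * r * r
    return k * a - 0.5 * k * k


def huber_weight(r, k=1.4):
    """Relative weight implied by the Huber loss.

    Residuals within +/- k keep full weight; larger residuals are
    down-weighted in proportion to k / |r|.
    """
    a = abs(r)
    return 1.0 if a <= k else k / a
```

A point whose weighted residual is twice the tuning constant therefore receives half the weight it would have in an ordinary least-squares fit.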

Robust model 2

A robust version of the Isoplot model 2 (details provided in Appendix C of the manuscript).

Model 1

Equivalent to the Isoplot model 1. Regression parameters and analytical errors are calculated via the algorithm of [YORK2004], which yields equivalent results to the original algorithm of [YORK1969] with errors calculated according to [TITT1979]. Confidence intervals on the slope and y-intercept are computed based on assigned analytical errors alone and are not inflated according to the observed scatter, since any apparent excess scatter is not deemed statistically significant.
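
For readers unfamiliar with the York approach, a minimal pure-Python sketch of the iteration is given below. It is intended only to illustrate the structure of the calculation and is not the implementation used by the software. The function name, argument names, starting slope and convergence settings are all assumptions. Inputs are 1σ errors on x and y and optional error correlations.

```python
import math


def york_fit(x, y, sx, sy, r=None, tol=1e-15, max_iter=100):
    """Errors-in-variables line fit; returns (a, b, sigma_a, sigma_b, mswd)."""
    n = len(x)
    r = [0.0] * n if r is None else r
    wx = [1.0 / s ** 2 for s in sx]            # weights on x
    wy = [1.0 / s ** 2 for s in sy]            # weights on y
    alpha = [math.sqrt(p * q) for p, q in zip(wx, wy)]

    def terms(b):
        # Combined weights, weighted centroid and beta terms for slope b.
        W = [wx[i] * wy[i] / (wx[i] + b * b * wy[i] - 2.0 * b * r[i] * alpha[i])
             for i in range(n)]
        sW = sum(W)
        xbar = sum(W[i] * x[i] for i in range(n)) / sW
        ybar = sum(W[i] * y[i] for i in range(n)) / sW
        U = [xi - xbar for xi in x]
        V = [yi - ybar for yi in y]
        beta = [W[i] * (U[i] / wy[i] + b * V[i] / wx[i]
                        - (b * U[i] + V[i]) * r[i] / alpha[i])
                for i in range(n)]
        return W, sW, xbar, ybar, U, V, beta

    # Starting slope from an ordinary least-squares fit of y on x (assumption).
    mx, my = sum(x) / n, sum(y) / n
    b = (sum((xi - mx) * (yi - my) for xi, yi in zip(x, y))
         / sum((xi - mx) ** 2 for xi in x))

    for _ in range(max_iter):
        W, sW, xbar, ybar, U, V, beta = terms(b)
        b_new = (sum(W[i] * beta[i] * V[i] for i in range(n))
                 / sum(W[i] * beta[i] * U[i] for i in range(n)))
        done = abs(b_new - b) <= tol * max(1.0, abs(b_new))
        b = b_new
        if done:
            break

    W, sW, xbar, ybar, U, V, beta = terms(b)
    a = ybar - b * xbar
    # Analytical errors from the least-squares adjusted x values.
    x_adj = [xbar + bi for bi in beta]
    x_adj_bar = sum(W[i] * x_adj[i] for i in range(n)) / sW
    sigma_b = math.sqrt(1.0 / sum(W[i] * (x_adj[i] - x_adj_bar) ** 2
                                  for i in range(n)))
    sigma_a = math.sqrt(1.0 / sW + x_adj_bar ** 2 * sigma_b ** 2)
    # MSWD: weighted sum of squared residuals over n - 2 degrees of freedom.
    S = sum(W[i] * (y[i] - b * x[i] - a) ** 2 for i in range(n))
    return a, b, sigma_a, sigma_b, S / (n - 2)
```

The returned `sigma_a` and `sigma_b` are 1σ analytical errors based on the assigned uncertainties only, which is the quantity model 1 reports without inflation.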

Model 1x

Equivalent to the Isoplot model 1 with “excess scatter”. Regression parameters and analytical errors are calculated via the York algorithm, as for model 1. These analytical errors are then multiplied by \(\sqrt{\mathrm{MSWD}}\) to account for excess scatter, and further multiplied by the 95th percentile of a Student’s t distribution (with n – 2 degrees of freedom) to obtain 95% confidence limits following [BROOKS1972].
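
The error expansion amounts to a simple scaling, sketched below. The function and argument names are hypothetical. The Student’s t multiplier is passed in by the caller (for instance from statistical tables or a statistics library) rather than computed here.

```python
import math


def expand_model_1x(sigma, mswd, t_mult):
    """Scale a 1-sigma analytical error into a model 1x confidence limit.

    sigma  : analytical error on the slope or y-intercept from the York fit
    mswd   : MSWD of the fit
    t_mult : Student's t multiplier for n - 2 degrees of freedom
    """
    return sigma * math.sqrt(mswd) * t_mult
```

For example, an analytical error of 0.1 with MSWD = 4 and a t multiplier of 2 expands to 0.1 × 2 × 2 = 0.4.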

Model 2

Equivalent to Isoplot model 2. The regression line slope is computed as the geometric mean of the slope from an ordinary least-squares regression of y on x and the slope from a regression of x on y (see [POWELL2020]). Uncertainties are calculated following McSaveney in [FAURE1977], and these are then multiplied by \(\sqrt{\mathrm{MSWD}}\) and the 95th percentile of a Student’s t distribution (with n – 2 degrees of freedom) to obtain 95% confidence limits.
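
The slope calculation can be sketched as follows. This illustrates the geometric-mean slope only. The uncertainty calculation is not shown, the function name is hypothetical, and placing the line through the centroid of the data to obtain the intercept is an assumption made for this example. The x-on-y slope is expressed in the same dy/dx sense as the y-on-x slope before the mean is taken.

```python
import math


def geometric_mean_slope(x, y):
    """Return (intercept, slope) of the geometric-mean regression line."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxx = sum((xi - mx) ** 2 for xi in x)
    syy = sum((yi - my) ** 2 for yi in y)
    sxy = sum((xi - mx) * (yi - my) for xi, yi in zip(x, y))

    b_yx = sxy / sxx   # slope from regressing y on x
    b_xy = syy / sxy   # slope from regressing x on y, expressed as dy/dx
    # Geometric mean of the two slopes, carrying the sign of the covariance.
    b = math.copysign(math.sqrt(b_yx * b_xy), sxy)
    a = my - b * mx    # line through the centroid (assumption)
    return a, b
```

Because the two slopes multiply to Syy/Sxx, the result does not depend on the assigned analytical errors, only on the scatter of the data themselves.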

Model 3

Equivalent to Isoplot model 3. This algorithm iteratively adds a uniform component of Gaussian distributed scatter in y to each data point until the MSWD converges to 1. This component of excess scatter is returned as an additional model parameter and may have physical significance in some cases. Once a solution is found, slope and y-intercept uncertainties are calculated as per the York algorithm, but including the additional component of scatter, and then multiplied by the 95th percentile of a Student’s t distribution (with n – 2 degrees of freedom) to obtain 95% confidence limits.
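
The idea of adding scatter until the MSWD reaches 1 can be illustrated with a deliberately simplified sketch. To keep the example short it assumes errors in y only (a weighted least-squares line rather than the York fit) and finds the excess scatter by bisection. Both choices, and all names below, are assumptions made for illustration and do not describe the software's implementation.

```python
import math


def wls_line(x, y, sy):
    """Weighted least-squares line with errors in y only; returns (a, b, mswd)."""
    w = [1.0 / s ** 2 for s in sy]
    sw = sum(w)
    xbar = sum(wi * xi for wi, xi in zip(w, x)) / sw
    ybar = sum(wi * yi for wi, yi in zip(w, y)) / sw
    b = (sum(wi * (xi - xbar) * (yi - ybar) for wi, xi, yi in zip(w, x, y))
         / sum(wi * (xi - xbar) ** 2 for wi, xi in zip(w, x)))
    a = ybar - b * xbar
    s = sum(wi * (yi - a - b * xi) ** 2 for wi, xi, yi in zip(w, x, y))
    return a, b, s / (len(x) - 2)


def excess_scatter_fit(x, y, sy, tol=1e-12, max_iter=200):
    """Find the uniform extra 1-sigma scatter in y that brings the MSWD to 1.

    Returns (a, b, extra, mswd).
    """
    def fit(extra):
        # Add the extra scatter in quadrature to every assigned y error.
        return wls_line(x, y, [math.hypot(s, extra) for s in sy])

    a, b, mswd = fit(0.0)
    if mswd <= 1.0:
        return a, b, 0.0, mswd        # no excess scatter to account for

    # Bracket the solution: the MSWD decreases as extra scatter is added.
    lo, hi = 0.0, max(sy)
    while fit(hi)[2] > 1.0:
        lo, hi = hi, 2.0 * hi

    # Bisect until the bracket is narrower than tol.
    for _ in range(max_iter):
        mid = 0.5 * (lo + hi)
        if fit(mid)[2] > 1.0:
            lo = mid
        else:
            hi = mid
        if hi - lo <= tol:
            break

    extra = 0.5 * (lo + hi)
    a, b, mswd = fit(extra)
    return a, b, extra, mswd
```

The returned `extra` plays the role of the additional model parameter described above: the standard deviation of the scatter component that must be added to the assigned errors for the data to be consistent with a single line.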