Linear regression algorithms

The following linear regression algorithms may be selected when computing U-Pb ages or fitting a regression line to arbitrary x-y data.

Classical

This routine emulates the default Isoplot line-fitting routine [LUDWIG2012]. First, a linear regression is performed using the model 1 algorithm. By default, if the MSWD is within its one-sided 85% confidence limit for the given degrees of freedom (equivalent to a ‘probability of fit’ above 0.15), the fit is accepted as is. If the MSWD lies between the one-sided 85% and 95% confidence limits (equivalent to a ‘probability of fit’ between 0.15 and 0.05), the slope and y-intercept values are retained, but the uncertainties are expanded as per the model 1x fit. If the MSWD exceeds the one-sided 95% confidence limit, a linear regression is instead performed using the model 2 algorithm for concordia intercept datasets, or the model 3 algorithm for “classical” isochron datasets. Note that the model 3 algorithm parametrises ‘excess scatter’ as a Gaussian-distributed component of scatter in the initial Pb isotope ratio. This assumption may not be applicable to all datasets and should be carefully considered.
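The selection logic described above can be sketched as a small decision function. This is only an illustration: the function name, the `dataset_type` labels and the handling of values exactly equal to a threshold are assumptions, not part of the software's interface.

```python
def choose_classical_fit(prob_fit, dataset_type):
    """Return the fit type the classical routine would report.

    prob_fit     : 'probability of fit' of the initial model 1 regression
    dataset_type : "concordia" or "isochron" (hypothetical labels)
    """
    if dataset_type not in ("concordia", "isochron"):
        raise ValueError("dataset_type must be 'concordia' or 'isochron'")
    if prob_fit > 0.15:
        # MSWD within its one-sided 85% limit: accept model 1 as is.
        return "model 1"
    if prob_fit >= 0.05:
        # Between the 85% and 95% limits: keep the line, expand the errors.
        # (Treating exactly 0.05 as belonging here is an assumption.)
        return "model 1x"
    # MSWD beyond the 95% limit: refit with a different model.
    return "model 2" if dataset_type == "concordia" else "model 3"
```

For example, `choose_classical_fit(0.10, "isochron")` falls in the intermediate band and selects the model 1x treatment.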

Spine

The robust line-fitting algorithm described in [POWELL2020]. This algorithm converges to the classical model 1 for ‘well-behaved’ datasets, but for more scattered datasets it down-weights data points lying away from the central ‘spine’ of data according to the Huber loss function. The spine-width parameter, s, gives an indication of how well resolved the central linear ‘spine’ of data is, while accounting for assigned uncertainties. Comparing s with its upper one-sided 95% confidence limit, derived via simulation of Gaussian-distributed datasets, provides a means of assessing whether the ‘spine’ of data is sufficiently well defined to obtain accurate results with this algorithm. The spine algorithm may yield unreliable results for datasets where s clearly exceeds this upper limit.
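The down-weighting implied by a Huber loss can be illustrated with the standard Huber weight function. This is a minimal sketch of that weight function alone, not the spine algorithm itself; the function name and the default threshold `h=1.4` are assumptions made for illustration.

```python
def huber_weight(r, h=1.4):
    """Relative weight given to a point with uncertainty-normalised residual r.

    Points with |r| <= h keep full weight, as in ordinary least squares.
    Points further out are down-weighted in proportion to h / |r|.
    The threshold h is an illustrative assumption.
    """
    r = abs(r)
    return 1.0 if r <= h else h / r


# Residuals near the 'spine' keep weight 1; distant points count for less.
weights = [huber_weight(r) for r in (0.3, -1.0, 2.8, -7.0)]
```

When every residual lies within the threshold, all weights equal 1 and the fit reduces to an ordinary weighted least-squares fit, which is consistent with the convergence to model 1 noted above.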

Robust model 2

A robust version of the Isoplot model 2 (details provided in Appendix C of the manuscript).

Model 1

Equivalent to the Isoplot model 1. Regression parameters and analytical errors are calculated via the algorithm of [YORK2004], which yields equivalent results to the original algorithm of [YORK1969] with errors calculated according to [TITT1979]. Confidence intervals on the slope and y-intercept are computed based on assigned analytical errors alone and are not inflated according to the observed scatter, since any apparent excess scatter is not deemed statistically significant.
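For orientation, the iteration in [YORK2004] can be sketched in pure Python as follows. This is a simplified illustration under stated assumptions, not the implementation used by the software: the function names, the starting slope (unweighted least squares) and the convergence settings are all hypothetical.

```python
import math


def _york_terms(b, x, y, wx, wy, r, alpha):
    """Weights, weighted centroid, deviations and beta terms for slope b."""
    W = [wxi * wyi / (wxi + b * b * wyi - 2.0 * b * ri * ai)
         for wxi, wyi, ri, ai in zip(wx, wy, r, alpha)]
    sw = sum(W)
    xbar = sum(Wi * xi for Wi, xi in zip(W, x)) / sw
    ybar = sum(Wi * yi for Wi, yi in zip(W, y)) / sw
    U = [xi - xbar for xi in x]
    V = [yi - ybar for yi in y]
    beta = [Wi * (Ui / wyi + b * Vi / wxi - (b * Ui + Vi) * ri / ai)
            for Wi, Ui, Vi, wxi, wyi, ri, ai
            in zip(W, U, V, wx, wy, r, alpha)]
    return W, xbar, ybar, U, V, beta


def york_fit(x, y, sx, sy, r=None, tol=1e-12, max_iter=100):
    """Return (slope, intercept, slope error, intercept error, MSWD).

    sx, sy are 1-sigma uncertainties (all non-zero); r holds the error
    correlation of each point and defaults to zero.
    """
    n = len(x)
    r = [0.0] * n if r is None else r
    wx = [1.0 / s ** 2 for s in sx]
    wy = [1.0 / s ** 2 for s in sy]
    alpha = [math.sqrt(a * c) for a, c in zip(wx, wy)]

    # Starting slope from an unweighted y-on-x least-squares fit (assumed).
    mx, my = sum(x) / n, sum(y) / n
    b = (sum((xi - mx) * (yi - my) for xi, yi in zip(x, y))
         / sum((xi - mx) ** 2 for xi in x))

    for _ in range(max_iter):
        W, xbar, ybar, U, V, beta = _york_terms(b, x, y, wx, wy, r, alpha)
        b_new = (sum(Wi * bi * Vi for Wi, bi, Vi in zip(W, beta, V))
                 / sum(Wi * bi * Ui for Wi, bi, Ui in zip(W, beta, U)))
        converged = abs(b_new - b) <= tol * abs(b_new)
        b = b_new
        if converged:
            break

    # Recompute all terms at the final slope before deriving the outputs.
    W, xbar, ybar, U, V, beta = _york_terms(b, x, y, wx, wy, r, alpha)
    a = ybar - b * xbar
    sw = sum(W)
    x_adj = [xbar + bi for bi in beta]          # least-squares adjusted x
    x_adj_bar = sum(Wi * xi for Wi, xi in zip(W, x_adj)) / sw
    sb = math.sqrt(1.0 / sum(Wi * (xi - x_adj_bar) ** 2
                             for Wi, xi in zip(W, x_adj)))
    sa = math.sqrt(1.0 / sw + x_adj_bar ** 2 * sb ** 2)
    S = sum(Wi * (yi - b * xi - a) ** 2 for Wi, xi, yi in zip(W, x, y))
    return b, a, sb, sa, S / (n - 2)
```

The returned errors are the analytical 1-sigma values only; they are not scaled by the observed scatter.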

Model 1x

Equivalent to the Isoplot model 1 with “excess scatter”. Regression parameters and analytical errors are calculated via the York algorithm, as for model 1 above. These analytical errors are then multiplied by \(\sqrt{\mathrm{MSWD}}\) to account for excess scatter, and further multiplied by the 95th percentile of a Student’s t distribution (with n – 2 degrees of freedom) to obtain 95% confidence limits following [BROOKS1972].
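The error expansion amounts to two multiplications, sketched below. The function name is hypothetical, and the Student’s t multiplier is passed in as an argument (taken from tables or a statistics library) rather than computed here.

```python
import math


def expand_uncertainty(sigma, mswd, t_mult):
    """Scale an analytical 1-sigma error for excess scatter.

    sigma  : analytical error on the slope or y-intercept
    mswd   : MSWD of the regression
    t_mult : Student's t multiplier for n - 2 degrees of freedom,
             supplied by the caller (an assumption of this sketch)
    """
    return sigma * math.sqrt(mswd) * t_mult
```

For instance, an analytical error of 0.1 with an MSWD of 4 and a t multiplier of 2 would expand to approximately 0.4.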

Model 2

Equivalent to the Isoplot model 2. The regression line slope is computed as the geometric mean of the slope of a y-on-x ordinary least-squares regression and that of an x-on-y regression (see [POWELL2020]). Uncertainties are calculated following McSaveney in [FAURE1977], and these are then multiplied by \(\sqrt{\mathrm{MSWD}}\) and the 95th percentile of a Student’s t distribution (with n – 2 degrees of freedom) to obtain 95% confidence limits.
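The slope calculation can be sketched as follows. The function name is hypothetical, and the intercept is assumed to follow from the line passing through the centroid of the data; the uncertainty calculation is not reproduced here.

```python
import math


def model2_line(x, y):
    """Geometric-mean slope of the y-on-x and x-on-y least-squares lines.

    Assumes the data are not perfectly uncorrelated (sxy != 0).
    """
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxx = sum((xi - mx) ** 2 for xi in x)
    syy = sum((yi - my) ** 2 for yi in y)
    sxy = sum((xi - mx) * (yi - my) for xi, yi in zip(x, y))

    b_yx = sxy / sxx   # slope of the y-on-x regression
    b_xy = syy / sxy   # x-on-y regression, expressed as dy/dx
    # Both slopes share the sign of sxy, so their product is non-negative.
    slope = math.copysign(math.sqrt(b_yx * b_xy), sxy)
    intercept = my - slope * mx   # line through the centroid (assumed)
    return slope, intercept
```

Because the analytical uncertainties of the data points do not enter this calculation, every point contributes equally to the slope.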

Model 3

Equivalent to the Isoplot model 3. This algorithm iteratively adds a uniform component of Gaussian-distributed scatter in y to each data point until the MSWD converges to 1. This component of excess scatter is returned as an additional model parameter and may have physical significance in some cases. Once a solution is found, slope and y-intercept uncertainties are calculated as per the York algorithm, but including the additional component of scatter, and then multiplied by the 95th percentile of a Student’s t distribution (with n – 2 degrees of freedom) to obtain 95% confidence limits.
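The idea of adding scatter until the MSWD reaches 1 can be illustrated with a deliberately simplified sketch. It assumes uncertainties in y only, so that an ordinary weighted least-squares fit stands in for the York algorithm, and it finds the extra scatter by bisection. The function names and the search strategy are assumptions for illustration, and uncertainties on the fitted parameters are omitted.

```python
def wls_line(x, y, var):
    """Weighted least-squares line for y variances var; returns (b, a, MSWD)."""
    w = [1.0 / v for v in var]
    sw = sum(w)
    xbar = sum(wi * xi for wi, xi in zip(w, x)) / sw
    ybar = sum(wi * yi for wi, yi in zip(w, y)) / sw
    b = (sum(wi * (xi - xbar) * (yi - ybar) for wi, xi, yi in zip(w, x, y))
         / sum(wi * (xi - xbar) ** 2 for wi, xi in zip(w, x)))
    a = ybar - b * xbar
    s = sum(wi * (yi - a - b * xi) ** 2 for wi, xi, yi in zip(w, x, y))
    return b, a, s / (len(x) - 2)


def model3_sketch(x, y, sy, tol=1e-10):
    """Return (slope, intercept, MSWD, extra) with MSWD driven towards 1.

    extra is the 1-sigma scatter in y added in quadrature to every point.
    """
    var = [s ** 2 for s in sy]

    def fit(extra):
        return wls_line(x, y, [v + extra ** 2 for v in var])

    if fit(0.0)[2] <= 1.0:
        # No excess scatter to account for.
        return fit(0.0) + (0.0,)

    # Bracket the solution: MSWD falls as the added scatter grows.
    lo, hi = 0.0, max(y) - min(y)
    while fit(hi)[2] > 1.0:
        hi *= 2.0
    # Bisect until the bracket is narrow relative to its upper bound.
    while hi - lo > tol * hi:
        mid = 0.5 * (lo + hi)
        if fit(mid)[2] > 1.0:
            lo = mid
        else:
            hi = mid
    extra = 0.5 * (lo + hi)
    return fit(extra) + (extra,)
```

The returned `extra` value plays the role of the additional model parameter described above.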