Linear regression algorithms
==============================

The following linear regression algorithms may be selected when computing U-Pb ages or fitting a regression line to arbitrary x-y data.

.. _classical-regression:

Classical
-----------

This routine emulates the default Isoplot line-fitting routine [LUDWIG2012]_. Firstly, a linear regression is performed using the :ref:`model 1 <model-1>` algorithm. If the MSWD is within its one-sided 85% confidence limit (by default) for the given degrees of freedom (equivalent to a ‘probability of fit’ value above 0.15), then the fit is accepted as is. If the MSWD is between the 85% and 95% one-sided confidence limits (equivalent to a ‘probability of fit’ value between 0.15 and 0.05), then the slope and y-intercept values are retained, but uncertainties are expanded as per the :ref:`model 1x fit <model-1x>`. If the MSWD exceeds the one-sided 95% confidence limit, then a linear regression is instead performed using the :ref:`model 2 <model-2>` algorithm for concordia intercept datasets, or the :ref:`model 3 <model-3>` algorithm for "classical" isochron datasets.

Note that the model 3 algorithm parametrises ‘excess scatter’ as a Gaussian distributed component of scatter in the initial Pb isotope ratio. This assumption may not be applicable to all datasets and should be carefully considered.

Spine
------

The robust line-fitting algorithm described in [POWELL2020]_. This algorithm converges to the classical model 1 for ‘well-behaved’ datasets, but for more scattered datasets it down-weights data points lying away from the central ‘spine’ of data according to the Huber loss function.

The spine-width parameter, *s*, gives an indication of how well resolved the central linear “spine” of data is, while accounting for assigned uncertainties. Comparing *s* with the upper one-sided 95% confidence interval, derived via simulation of Gaussian distributed datasets, provides a means of assessing whether the ‘spine’ of data is sufficiently well-defined to obtain accurate results with this algorithm. The spine algorithm may yield unreliable results for datasets where *s* clearly exceeds this upper limit.

Robust model 2
---------------

A robust version of the Isoplot model 2 (details provided in Appendix C of the |manuscript|_).

.. _model-1:

Model 1
---------

Equivalent to the Isoplot model 1. Regression parameters and analytical errors are calculated via the algorithm of [YORK2004]_, which yields equivalent results to the original algorithm of [YORK1969]_ with errors calculated according to [TITT1979]_. Confidence intervals on the slope and y-intercept are computed based on assigned analytical errors alone and are not inflated according to observed scatter, since any apparent excess scatter is not deemed statistically significant.

.. _model-1x:

Model 1x
----------

Equivalent to the Isoplot model 1 with "excess scatter". Regression parameters and analytical errors are calculated via the York algorithm as above for the model 1. These analytical errors are then multiplied by :math:`\sqrt{\mathrm{MSWD}}` to account for excess scatter, and further multiplied by the 95th percentile of a Student’s t distribution (with n – 2 degrees of freedom) to obtain 95% confidence limits following [BROOKS1972]_.

.. _model-2:

Model 2
--------

Equivalent to the Isoplot model 2. The regression line slope is computed as the geometric mean of the slope of a y on x ordinary least-squares regression and that of an x on y regression (see [POWELL2020]_). Uncertainties are calculated following McSaveney in [FAURE1977]_, and these are then multiplied by :math:`\sqrt{\mathrm{MSWD}}` and the 95th percentile of a Student’s t distribution (with n – 2 degrees of freedom) to obtain 95% confidence limits.
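As an illustration only, the sketch below shows how a model 2 style slope can be formed as the geometric mean of the two ordinary least-squares slopes, with the intercept taken through the unweighted data centroid. The function name, variable names and the centroid choice are assumptions of this example and do not necessarily mirror the implementation used here.

.. code-block:: python

    import numpy as np

    def model2_line(x, y):
        """Illustrative model-2 style slope and intercept.

        The slope is the geometric mean of the y-on-x OLS slope and the
        x-on-y OLS slope (expressed as dy/dx), signed by the covariance.
        """
        x, y = np.asarray(x, dtype=float), np.asarray(y, dtype=float)
        xbar, ybar = x.mean(), y.mean()
        sxx = np.sum((x - xbar) ** 2)
        syy = np.sum((y - ybar) ** 2)
        sxy = np.sum((x - xbar) * (y - ybar))
        b_yx = sxy / sxx    # y-on-x OLS slope
        b_xy = syy / sxy    # x-on-y OLS slope, written as dy/dx
        slope = np.sign(sxy) * np.sqrt(b_yx * b_xy)   # magnitude equals sqrt(syy / sxx)
        intercept = ybar - slope * xbar               # line through the centroid (assumed here)
        return slope, intercept

Because the magnitude of this slope reduces to :math:`\sqrt{S_{yy}/S_{xx}}`, the fitted line is symmetric with respect to swapping x and y, which is the motivation for using the geometric mean of the two OLS slopes.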
.. _model-3:

Model 3
--------

Equivalent to the Isoplot model 3. This algorithm iteratively adds a uniform component of Gaussian distributed scatter in y to each data point until the MSWD converges to 1. This component of excess scatter is returned as an additional model parameter and may have physical significance in some cases. Once a solution is found, slope and y-intercept uncertainties are calculated as per the York algorithm, but including the additional component of scatter, and then multiplied by the 95th percentile of a Student’s t distribution (with n – 2 degrees of freedom) to obtain 95% confidence limits.
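To make the model 3 idea concrete, the sketch below adds a uniform excess-scatter term in y and solves for the value that brings the MSWD to 1. For brevity it uses a weighted least-squares fit with y uncertainties only, whereas the actual routine re-fits with the York algorithm (including x uncertainties and error correlations). The function names and the bracketing choice for the root search are assumptions of this example.

.. code-block:: python

    import numpy as np
    from scipy.optimize import brentq

    def wls_line(x, y, sy):
        """Weighted least-squares line using y uncertainties only;
        returns slope, intercept and MSWD."""
        w = 1.0 / sy ** 2
        xbar, ybar = np.sum(w * x) / np.sum(w), np.sum(w * y) / np.sum(w)
        slope = np.sum(w * (x - xbar) * (y - ybar)) / np.sum(w * (x - xbar) ** 2)
        intercept = ybar - slope * xbar
        mswd = np.sum(w * (y - slope * x - intercept) ** 2) / (len(x) - 2)
        return slope, intercept, mswd

    def model3_sketch(x, y, sy):
        """Find the uniform 1-sigma excess scatter in y that drives MSWD
        to 1, then return the corresponding fit (illustration only)."""
        x, y, sy = (np.asarray(v, dtype=float) for v in (x, y, sy))
        slope, intercept, mswd = wls_line(x, y, sy)
        if mswd <= 1.0:
            return slope, intercept, mswd, 0.0   # no excess scatter required
        # Upper bracket assumed generous enough that MSWD falls below 1 there.
        w_upper = 10.0 * np.std(y)
        f = lambda w: wls_line(x, y, np.sqrt(sy ** 2 + w ** 2))[2] - 1.0
        w_excess = brentq(f, 0.0, w_upper)
        slope, intercept, mswd = wls_line(x, y, np.sqrt(sy ** 2 + w_excess ** 2))
        return slope, intercept, mswd, w_excess

In the full routine, as described above, the same excess-scatter term is folded into the fit weights and into the final slope and y-intercept uncertainties before the Student’s t multiplier is applied.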