What limit should we place on the maximum % of left-censored data when using Tool #1? I have found a reference indicating that using >80% left-censored data is not appropriate… Is this correct?
Huynh T, Ramachandran G, Banerjee S, Monteiro J, Stenzel M, Sandler DP, Engel LS, Kwok RK, Blair A, Stewart PA. Comparison of methods for analyzing left-censored occupational exposure data. Ann Occup Hyg. 2014 Nov;58(9):1126-42. doi: 10.1093/annhyg/meu067. Epub 2014 Sep 26. PMID: 25261453; PMCID: PMC4271092.
In theory, the Bayesian models can work with whatever information is provided to them. For example, you can have expostats calculate the various metrics without any observations. In that case, the results will only reflect the prior distributions set up in our models, with extremely wide credible intervals.
So you can indeed have expostats calculate your metrics with 100% ND. However, it makes intuitive sense that even the best model cannot perform very well at estimating an entire distribution with only information of this type.
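To make that intuition concrete, here is a minimal sketch of the likelihood for a left-censored lognormal model. It is not expostats code: the function name `censored_lognormal_loglik`, the parameter names and the example values are all hypothetical, and the lognormal assumption is the usual one for exposure data rather than something taken from the tool itself.

```python
import math
from statistics import NormalDist


def censored_lognormal_loglik(mu, sigma, detects, lods):
    """Log-likelihood of a lognormal model with left-censored data.

    mu, sigma : mean and SD on the log scale (hypothetical parameter names).
    detects   : measured values above their detection limit.
    lods      : detection limit of each non-detect, one entry per sample.
    """
    std_normal = NormalDist()
    total = 0.0
    for x in detects:
        z = (math.log(x) - mu) / sigma
        # A detected value contributes the lognormal density at x.
        total += -math.log(x * sigma * math.sqrt(2 * math.pi)) - 0.5 * z * z
    for lod in lods:
        z = (math.log(lod) - mu) / sigma
        # A non-detect only says "the value is below this LOD".
        total += math.log(max(std_normal.cdf(z), 1e-300))
    return total


# Illustrative dataset with 100% non-detects (made-up LODs).
all_nd = [1.0, 1.0, 1.0]
for mu in (0.0, -2.0, -5.0):
    print(mu, censored_lognormal_loglik(mu, 1.0, [], all_nd))
```

With only non-detects, the log-likelihood keeps increasing towards zero as `mu` decreases, so the data provide an upper bound but nothing to pin down the lower end. In a Bayesian fit, the prior has to do that job, which is why the intervals become so wide.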
Huynh et al.’s paper is one of the most interesting papers on censoring, as it compares Bayesian models with more traditional ones. Their recommendation makes sense to me; others have been a little more cautious and proposed lower thresholds. In some of our tools we require at least 3 detects in the dataset.
Sorry for the non-definitive answer.
PS: oops, I mistook Huynh’s paper for another one from the same group: Huynh et al. (2016) A comparison of the beta substitution method and a Bayesian method for analyzing left censored data. Ann Occup Hyg 60(1):56-73.
The Huynh et al. (2014) paper you mention did not examine any Bayesian models. It compared three methods: maximum likelihood estimation, Kaplan-Meier and beta substitution. As you point out, these showed unacceptable bias at extreme levels of censoring (or sample sizes <5). The paper Jerome noted does compare a Bayesian method with beta substitution. One point to note, however: the authors point out that with beta substitution multiple detection limits are averaged, whereas the Bayesian model uses the individual LODs.
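As a rough illustration of why keeping individual LODs can matter, the sketch below compares the non-detect part of a lognormal likelihood computed with each sample's own LOD against the same quantity with every LOD replaced by their average. This is not the beta substitution algorithm, nor the model from either paper; the helper `nondetect_loglik` and the LOD values are hypothetical and chosen only to make the effect visible.

```python
import math
from statistics import NormalDist


def nondetect_loglik(mu, sigma, lods):
    """Sum of log P(X < LOD) over non-detects for a lognormal(mu, sigma) model."""
    std_normal = NormalDist()
    return sum(
        math.log(std_normal.cdf((math.log(lod) - mu) / sigma)) for lod in lods
    )


lods = [0.1, 1.0, 10.0]           # made-up, deliberately spread-out LODs
mean_lod = sum(lods) / len(lods)  # single averaged LOD

# Same candidate parameters, two ways of describing the censoring.
individual = nondetect_loglik(0.0, 1.0, lods)
averaged = nondetect_loglik(0.0, 1.0, [mean_lod] * len(lods))
print(individual, averaged)
```

The two values differ because "below 0.1" is a much stronger statement than "below 3.7". Averaging the LODs discards that information, whereas a model that uses each LOD separately keeps it.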