Parameter Confidence Intervals
Usage on several models can be seen in the examples section, such as for the Logistic Model.
LikelihoodBasedProfileWiseAnalysis.check_univariate_parameter_coverage — Function
check_univariate_parameter_coverage(data_generator::Function,
    generator_args::Union{Tuple, NamedTuple},
    model::LikelihoodModel,
    N::Int,
    θtrue::AbstractVector{<:Real},
    θs::AbstractVector{<:Int64},
    θinitialguess::AbstractVector{<:Real}=θtrue;
    <keyword arguments>)
Performs a simulation to estimate the coverage of univariate confidence intervals for parameters in θs given a model by:
- Repeatedly drawing new observed data using data_generator for fixed true parameter values, θtrue.
- Fitting the model and univariate confidence intervals.
- Checking whether the confidence interval for each of the parameters of interest contains the true parameter value in θtrue.
The estimated coverage is returned with a default 95% confidence interval within a DataFrame.
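The three steps above can be sketched on a much simpler problem. The example below is an illustration only and does not use this package: it assumes normally distributed data with known standard deviation and swaps the profile likelihood interval for a standard z-interval on the mean. The function name simulate_coverage and all of its arguments are hypothetical.

```julia
using Random, Statistics

# Hypothetical helper (not part of the package): estimates the coverage of a
# 95% z-interval for the mean of normal data with known standard deviation σ.
function simulate_coverage(rng, μtrue, σ, n, N; z=1.959964)
    hits = 0
    for _ in 1:N
        # 1. Draw new observed data for the fixed true parameter value.
        y = μtrue .+ σ .* randn(rng, n)
        # 2. Fit the model and construct the confidence interval.
        μhat = mean(y)
        halfwidth = z * σ / sqrt(n)
        # 3. Check whether the interval contains the true value.
        hits += (μhat - halfwidth <= μtrue <= μhat + halfwidth)
    end
    return hits / N
end

rng = MersenneTwister(1)
coverage = simulate_coverage(rng, 2.0, 1.0, 20, 2000)
println("estimated coverage: ", coverage)
```

For a well-calibrated interval the printed estimate should be close to the nominal 0.95, with the remaining gap shrinking as N grows.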
Arguments
- data_generator: a function with two arguments which generates data for fixed time points and true model parameters corresponding to the log-likelihood function contained in model. The two arguments must be the vector of true model parameters, θtrue, and a Tuple or NamedTuple, generator_args. Outputs a data Tuple or NamedTuple that corresponds to the log-likelihood function contained in model.
- generator_args: a Tuple or NamedTuple containing any additional information required by both the log-likelihood function and data_generator, such as the time points to be evaluated at. If evaluating the log-likelihood function requires more than just the simulated data, arguments for the data output of data_generator should be passed in via generator_args.
- model: a LikelihoodModel containing model information, saved profiles and predictions.
- N: a positive number of coverage simulations.
- θtrue: a vector of true parameter values of the model, used to simulate data.
- θs: a vector of parameters to profile, as a vector of model parameter indexes.
- θinitialguess: a vector containing the initial guess for the values of each parameter. Used to find the MLE point in each iteration of the simulation. Default is θtrue.
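As a rough illustration of the two-argument form described for data_generator, the sketch below assumes a logistic growth model observed with additive normal noise. The model function logistic, the field names t, σ and y_obs, and the parameter values are all assumptions made for this example; the log-likelihood function stored in model would need to read the same fields from the returned data.

```julia
# Hypothetical logistic growth solution with θ = [λ, K, C0]
# (growth rate, carrying capacity, initial size).
logistic(t, θ) = θ[2] * θ[3] / ((θ[2] - θ[3]) * exp(-θ[1] * t) + θ[3])

# Two arguments: the true parameters first, then the extra information.
function data_generator(θtrue, generator_args::NamedTuple)
    y_true = logistic.(generator_args.t, Ref(θtrue))
    # Additive normal noise with assumed standard deviation σ.
    y_obs = y_true .+ generator_args.σ .* randn(length(generator_args.t))
    # Return the simulated observations together with everything else the
    # log-likelihood function is assumed to need (here t and σ).
    return merge((y_obs=y_obs,), generator_args)
end

generator_args = (t=collect(0.0:100.0:1000.0), σ=10.0)
data = data_generator([0.01, 100.0, 10.0], generator_args)
```

Merging generator_args into the output is one way to satisfy the requirement that any non-simulated inputs of the log-likelihood function are passed in via generator_args.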
Keyword Arguments
- confidence_level: a number ∈ (0.0, 1.0) for the confidence level to evaluate the confidence interval coverage at. Default is 0.95 (95%).
- profile_type: whether to use the true log-likelihood function or an ellipse approximation of the log-likelihood function centred at the MLE (with optional use of parameter bounds). Available profile types are LogLikelihood, EllipseApprox and EllipseApproxAnalytical. Default is LogLikelihood() (LogLikelihood).
- θlb_nuisance: a vector of lower bounds on nuisance parameters; requires θlb_nuisance .≤ model.core.θmle. Default is model.core.θlb.
- θub_nuisance: a vector of upper bounds on nuisance parameters; requires θub_nuisance .≥ model.core.θmle. Default is model.core.θub.
- coverage_estimate_confidence_level: a number ∈ (0.0, 1.0) for the level of a confidence interval of the estimated coverage. Default is 0.95 (95%).
- optimizationsettings: an OptimizationSettings containing the optimisation settings used to find optimal values of nuisance parameters for a given interest parameter value. Default is missing (will use default_OptimizationSettings(); see default_OptimizationSettings).
- show_progress: boolean variable specifying whether to display progress bars on the percentage of simulation iterations completed and estimated time of completion. Default is model.show_progress.
- distributed_over_parameters: boolean variable specifying whether to distribute the workload of the simulation across simulation iterations (false) or across the individual confidence interval calculations within each iteration (true). Default is false.
Details
This simulated coverage check is used to estimate the performance of parameter confidence intervals. The simulation uses Distributed.jl to parallelise the workload.
For a 95% confidence interval of an interest parameter θi, it is expected that, under repeated experiments from an underlying true model (data generation), each of which is used to construct a confidence interval for θi using the method used in univariate_confidenceintervals!, 95% of the intervals constructed would contain the true value for θi. In the simulation, where the values of the true parameters, θtrue, are known, this amounts to checking whether the confidence interval for θi contains the value θtrue[θi].
The uncertainty in estimates of the coverage under the simulated model will decrease as the number of simulations, N, is increased. Confidence intervals for the coverage estimate are provided to quantify this uncertainty. The confidence interval for the estimated coverage is a Clopper-Pearson interval on a binomial test generated using HypothesisTests.jl.
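For reference, the Clopper-Pearson interval can be written in terms of Beta distribution quantiles. The symbols here are introduced for illustration and do not appear in the package: x is the number of the N simulated confidence intervals that contain the true value, α is one minus coverage_estimate_confidence_level, and B^{-1}(q; a, b) is the q-quantile of a Beta(a, b) distribution.

```latex
\hat{p} = \frac{x}{N}, \qquad
p_{\mathrm{lower}} = B^{-1}\!\left(\tfrac{\alpha}{2};\; x,\; N - x + 1\right), \qquad
p_{\mathrm{upper}} = B^{-1}\!\left(1 - \tfrac{\alpha}{2};\; x + 1,\; N - x\right),
```

with the conventions p_lower = 0 when x = 0 and p_upper = 1 when x = N. The interval narrows as N increases, which is the decrease in uncertainty described above.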
Calculating the coverage of simultaneous confidence intervals is not currently supported (i.e. for dof ≠ 1).
Choosing a value for distributed_over_parameters:
- If the number of processes available to use is significantly greater than the number of model parameters, or only a few model parameters are being checked for coverage, false is recommended.
- If system memory or model size in system memory is a concern, or the number of processes available is similar to or less than the number of model parameters being checked, true will likely be more appropriate.
- When set to false, a separate LikelihoodModel struct will be used by each process, as opposed to only one when set to true, which could cause a memory issue for larger models.
The current implementation only considers two extremes of the log-likelihood and whether the truth is between these two points. If the profile likelihood function is bimodal, it is possible that the method has found only one set of correct confidence intervals (estimated coverage will be correct, but less than expected) or has found one extremum on each of two distinct sets (estimated coverage may be incorrect and will be either larger than expected or much lower than expected).