convexLSE {convexHaz}R Documentation

Compute the nonparamateric LSE of a convex hazard

Description

This function computes the nonparametric LSE of a convex hazard function on [0,TT].

Usage

convexLSE(x, TT, M = 100, GRIDLESS = 0, tol = 1e-05, tol.SR = 1e-08, type = 1, 
max.loop = 100, max.loop.SR = 250, range = c(0, TT), solve.tol = .Machine$double.eps)

Arguments

x vector containing the data.
TT value specifying TT in the interval [0,TT] over which the LSE is found.
M size of the grid used in the implementation of the (support reduction) algorithm. The default value is M=100.
GRIDLESS boolean variable. If GRIDLESS=1, then a 'gridless' implementation of the algorithm is used. The default is GRIDLESS=0.
tol tolerance value of the bisection algorithm. The default value is tol=1e-05.
tol.SR tolerance value of the support reduction algorithm. The default value is tol.SR=1e-08.
type integer between 1 and 3 indicating the exit criterion function used by the bisection algorithm. The default is type=1.
max.loop maximum number of iterations in the bisection algorithm. The default is max.loop=100. To be used for troubleshooting.
max.loop.SR maximum number of iteration in the support reduction algorithm. The default is max.loop.SR=250. To be used for troubleshooting.
range vector of length two describing the range of antimodes over which the (bisection) algorithm searches. The default is range=[0,TT]. The range should always be contained within [0,TT]. To be used in troubleshooting.
solve.tol tolerance value used in solve() invoked by the algorithm. The default is solve.tol=.Machine$double.eps, which is also the default of the solve() function. To be used in troubleshooting.

Details

Let HHn be the empirical cumulative hazard function for the data x. Then the LSE is the convex hazard function which is closest to the hazard function whose integral is given by HHn, in the least squares sense. See Jankowski and Wellner (2007) for further details.

The function consists of an outer, bisection algorithm loop and an inner, support reduction algorithm loop. The goal is to minimize the least squares criterion function over the space of positive convex hazards. This is done in two steps: the support reduction algorithm minimizes the least squares criterion function (phi) over the space of positive convex hazard with minimum (antimode) at a using the function srLSE; the bisection algorithm searches over the possible values of a in [0,TT].

The bisection algorithm ends once the exit criterion function is smaller than tol. The exit criterion function may be specified via the option type:

type=1 sum((cur.phi-min(old.phi))^2)

type=2 sqrt(sum((cur.phi - old.phi)^2))

type=3 sort(cur.phi)[2]+sort(cur.phi)[3]-2*sort(cur.phi)[1]

where cur.phi is the vector of the least squares criterion function phi along different values of antimode a, and old.phi is the same but in the previous iteration.

The support reduction algorithm was first described in Groeneboom, Jongbloed and Wellner (2007). It is an iterative optimization procedure which exits when a sufficient tolerance level tol.SR is reached. The algorithm uses a gridded implementation (see Jankowski and Wellner (2008) for the details) described by the parameter M. The idea is that any convex function may be written as a mixture of elbow functions, and the grid describes the possible support of the mixing distribution. (The documentation for the function h provides a little more information; see also Jankowski and Wellner (2007, 2008).) The larger M is, the more accurate, but slower, the algorithm. Setting GRIDLESS=1 invokes an implementation which tries to overcome the grid, and increases the accuracy again. Decreasing tol and/or tol.SR may also increase the accuracy of the algorithm, but not as much as manipulating the grid size.

Value

A list which includes
lse list containing the support supp and mixing measure mix for the computed LSE. The function h can translate these into a hazard function.
ls value of the least squares criterion function at the LSE. See Jankowski and Wellner (2007) for its exact form.
antimode value of the antimode a at which the LSE of the hazard has its minimum.
conv boolean indicating if the (bisection) algorithm has converged.
iter number of iterations of the bisection algorithm.
"convexLSE.out" The algorithm also creates a file called convexLSE.out in R's current working directory. The file contains three variables: (1) phi the value of the smallest least squares criterion function at the antimode found by the support reduction algorithm, (2) antimode the value of the antimode, and (3) conv a boolean indicating if the support reduction algorithm converged. If convexLSE.out already exists, then the algorithm will overwrite it.

Diagnostics

The file "convexLSE.out" contains information which may be used to determine how well the algorithm has worked. Plotting phi against antimode should result in a u-shaped (first decreasing, then increasing) function, with minimum lying at the antimode of the LSE. This is only true though for the values of phi (and antimode) where the srLSE algorithm converged (ie. conv=TRUE). Plotting the output in "convexLSE.out" can be used to check this (see 'Examples'), and to check where convergence of srLSE failed.

Troubleshooting

If the bisection algorithm fails to converge, then max.loop may be increased, or changing the range may be used temporarily to diagnose the issue.

Alternatively, the support reduction algorithm may fail to converge. The file "convexLSE.out" shows which of the calls of the support reduction algorithm by the bisection loop converged. If too many of these failed, then changing the values of max.loop, max.loop.SR, or solve.tol may fix this.

At each iteration, the support reduction algorithm does a finite dimensional quadratic optimization using the function solve(). Occasionally (quite rarely for the LSE), the matrix passed to solve() is computationally singular. Several built-in catches exist in the code to handle this, but if the problem occurs too frequently, the algorithm will abort. In fact, this is by far the most common cause of nonconvergence. The larger the value of M, the more likely the problem is, but this also depends on the data set. Setting solve.tol to a smaller value may fix the problem.

Author(s)

Hanna Jankowski: hkj@mathstat.yorku.ca, Ivy Wang, Hugh McCague, Jon A. Wellner

References

Groeneboom, Jongbloed and Wellner (2008). The support reduction algorithm for computing nonparametric function estimates in mixture models. Scan. J. Statist. 35, 385–399.

Jankowski and Wellner (2007). Nonparametric estimation of a convex bathtub-shaped hazard function. Technical Report 521, Department of Statistics, University of Washington.

Jankowski and Wellner (2008). Computation of nonparametric convex hazard estimators via profile methods. Technical Report 542, Department of Statistics, University of Washington.

See Also

convexMLE srLSE srMLE h

Examples

# Generate sample data:
set.seed(1111, kind="default")
x 	<- rweibull(50, 3) 

# Set the value of TT:
TT	<- 1

# Find the lse	
lse 	<- convexLSE(x, TT=TT)

# And the lse has minimum at
lse$antimode

# Check the bisection algorithm
diag	<- read.table("convexLSE.out", header=TRUE)
diag

plot(diag$antimode, diag$phi, pch=16, col="blue", cex=1, main="diagnostics", ylab="phi", xlab="antimode")


# plot the true hazard function and the lse
h.true 	<- function(x) 3*x^2
tt	<- c(0,sort(lse$lse$supp$tau), sort(lse$lse$supp$eta), TT)
yy	<- sapply(tt, h, supp=lse$lse$supp, hpar=lse$lse$mix)


plot(h.true, xlim=c(0,TT), lwd=2, col="blue", ylab="hazard", xlab="time", main="LSE of convex hazard")
lines(tt,yy, lwd=2, col="red")
legend("topleft", c("true","estimated"),col=c("blue", "red"), lwd=2, inset = .05)

[Package convexHaz version 0.2 Index]