Sensitivity Functions
Hyperparameter sensitivity linear approximation
-
class
vittles.sensitivity_lib.HyperparameterSensitivityLinearApproximation(objective_fun, opt_par_value, hyper_par_value, validate_optimum=False, hessian_at_opt=None, cross_hess_at_opt=None, hyper_par_objective_fun=None, grad_tol=1e-08)
Linearly approximate the dependence of an optimum on a hyperparameter.
Suppose we have an optimization problem in which the objective depends on a hyperparameter:
\[\hat{\theta} = \mathrm{argmin}_{\theta} f(\theta, \lambda).\]
The optimal parameter, \(\hat{\theta}\), is a function of \(\lambda\) through the optimization problem. In general, this dependence is complex and nonlinear. To approximate this dependence, this class uses the linear approximation:
\[\hat{\theta}(\lambda) \approx \hat{\theta}(\lambda_0) + \frac{d\hat{\theta}}{d\lambda}|_{\lambda_0} (\lambda - \lambda_0).\]
In terms of the arguments to this class, \(\theta\) corresponds to opt_par, \(\lambda\) corresponds to hyper_par, and \(f\) corresponds to objective_fun.
Methods
set_base_values: Set the base values, \(\lambda_0\) and \(\theta_0 := \hat\theta(\lambda_0)\), at which the linear approximation is evaluated.
get_dopt_dhyper: Return the Jacobian matrix \(\frac{d\hat{\theta}}{d\lambda}|_{\lambda_0}\) in flattened space.
get_hessian_at_opt: Return the Hessian of the objective function in the flattened space.
predict_opt_par_from_hyper_par: Use the linear approximation to predict the value of opt_par from a value of hyper_par.
-
__init__(self, objective_fun, opt_par_value, hyper_par_value, validate_optimum=False, hessian_at_opt=None, cross_hess_at_opt=None, hyper_par_objective_fun=None, grad_tol=1e-08)[source]¶ Parameters: - objective_fun : callable
The objective function, taking two positional arguments,
- opt_par: The parameter to be optimized (numpy.ndarray (N,))
- hyper_par: A hyperparameter (numpy.ndarray (M,))
and returning a real value to be minimized.
- opt_par_value : numpy.ndarray (N,)
The value of opt_par at which objective_fun is optimized for the given value of hyper_par_value.
- hyper_par_value : numpy.ndarray (M,)
The value of hyper_par at which opt_par optimizes objective_fun.
- validate_optimum : bool, optional
When setting the values of opt_par and hyper_par, check that opt_par is, in fact, a critical point of objective_fun.
- hessian_at_opt : numpy.ndarray (N,N), optional
The Hessian of objective_fun at the optimum. If not specified, it is calculated using automatic differentiation.
- cross_hess_at_opt : numpy.ndarray (N, M), optional
The second derivative of the objective with respect to opt_par then hyper_par. If not specified, it is calculated at initialization.
- hyper_par_objective_fun : callable, optional
The part of objective_fun depending on both opt_par and hyper_par. The arguments must be the same as for objective_fun:
- opt_par: The parameter to be optimized (numpy.ndarray (N,))
- hyper_par: A hyperparameter (numpy.ndarray (M,))
This can be useful if only a small part of the objective function depends on both opt_par and hyper_par. If not specified, objective_fun is used.
- grad_tol : float, optional
The tolerance used to check that the gradient is approximately zero at the optimum.
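The linear approximation can be illustrated without the library. The following is a minimal NumPy sketch, assuming a hypothetical ridge-style objective \(f(\theta, \lambda) = \frac{1}{2}\theta^T A \theta - b^T \theta + \frac{\lambda}{2}\|\theta\|^2\) with a scalar hyperparameter (M = 1). The matrix A, the vector b, and the helper names optimum and predict are illustrative assumptions and are not part of vittles. The sensitivity is obtained from the implicit function theorem, \(\frac{d\hat{\theta}}{d\lambda} = -H^{-1} f_{\theta\lambda}\), with both derivatives written out by hand rather than computed by automatic differentiation.

```python
import numpy as np

# Hypothetical objective (not from vittles):
#   f(theta, lam) = 0.5 theta' A theta - b' theta + 0.5 * lam * ||theta||^2
A = np.array([[2.0, 0.5], [0.5, 1.0]])
b = np.array([1.0, -1.0])


def optimum(lam):
    """Closed-form minimizer of f(., lam), used here as ground truth."""
    return np.linalg.solve(A + lam * np.eye(2), b)


# Base values lambda_0 and theta_0 = theta_hat(lambda_0).
lam0 = 0.1
theta0 = optimum(lam0)

# Hessian with respect to theta (N x N) and cross Hessian with respect to
# theta then lam (N x M, with M = 1), both derived by hand for this f.
hessian = A + lam0 * np.eye(2)
cross_hess = theta0.reshape(-1, 1)

# Implicit function theorem: d theta_hat / d lam = -H^{-1} f_{theta, lam}.
dopt_dhyper = -np.linalg.solve(hessian, cross_hess)


def predict(lam):
    """Linear approximation to theta_hat(lam) around lam0."""
    return theta0 + dopt_dhyper @ np.atleast_1d(lam - lam0)


lam1 = 0.15
approx_error = np.max(np.abs(predict(lam1) - optimum(lam1)))
print("linear prediction:", predict(lam1))
print("exact optimum:    ", optimum(lam1))
print("max abs error:    ", approx_error)
```

For this toy problem the prediction error shrinks quadratically as lam1 approaches lam0, which is the behavior expected of a first-order approximation.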
-
Hyperparameter sensitivity Taylor series approximation
-
class
vittles.sensitivity_lib.ParametricSensitivityTaylorExpansion(estimating_equation, input_val0, hyper_val0, order, hess_solver, forward_mode=True, max_input_order=None, max_hyper_order=None, force=False)
Evaluate the Taylor series of the dependence of an optimum on a hyperparameter.
This is a class for computing the Taylor series of eta(eps) = argmax_eta objective(eta, eps) using automatic differentiation (forward mode by default).
Note
This class is experimental and should be used with caution.
-
__init__(self, estimating_equation, input_val0, hyper_val0, order, hess_solver, forward_mode=True, max_input_order=None, max_hyper_order=None, force=False)[source]¶ Parameters: - estimating_equation : callable
A vector-valued function of two arguments, (input, hyper), where the length of the returned vector is the same as the length of input, and which is (approximately) the zero vector when evaluated at (input_val0, hyper_val0).
- input_val0 : numpy.ndarray (N,)
The value of input_par at the optimum.
- hyper_val0 : numpy.ndarray (M,)
The value of hyper_par at which input_val0 was found.
- order : int
The maximum order of the Taylor series to be calculated.
- hess_solver : function
A function that takes a single argument, v, and returns
\[\left(\frac{\partial G}{\partial \eta}\right)^{-1} v,\]
where \(G(\eta, \epsilon)\) is the estimating equation, and the partial derivative is evaluated at \((\eta, \epsilon) =\) (input_val0, hyper_val0).
- forward_mode : bool
Optional. If True (the default), use forward-mode automatic differentiation. Otherwise, use reverse-mode.
- max_input_order : int
Optional. The maximum number of nonzero partial derivatives of the objective function gradient with respect to the input parameter. If None, calculate partial derivatives of all orders.
- max_hyper_order : int
Optional. The maximum number of nonzero partial derivatives of the objective function gradient with respect to the hyperparameter. If None, calculate partial derivatives of all orders.
- force: `bool`
Optional. If True, force the instantiation of potentially expensive reverse mode derivative arrays. Default is False.
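To make the roles of estimating_equation and hess_solver concrete, here is a minimal NumPy sketch under stated assumptions. It uses a hypothetical linear estimating equation \(G(\eta, \epsilon) = A\eta - B\epsilon\), for which \(\partial G / \partial \eta = A\) is constant; the matrices A and B are illustrative only. It does not call vittles. It only shows what a solver of this kind computes and how it yields the first-order sensitivity \(\frac{d\eta}{d\epsilon} = -\left(\frac{\partial G}{\partial \eta}\right)^{-1} \frac{\partial G}{\partial \epsilon}\).

```python
import numpy as np

# Hypothetical estimating equation G(eta, eps) = A eta - B eps, the gradient
# in eta of the quadratic objective 0.5 eta' A eta - eta' B eps.
A = np.array([[3.0, 1.0], [1.0, 2.0]])
B = np.array([[1.0], [0.5]])


def estimating_equation(eta, eps):
    """Vector of the same length as eta; zero at the optimum."""
    return A @ eta - B @ eps


hyper_val0 = np.array([0.2])
input_val0 = np.linalg.solve(A, B @ hyper_val0)  # root of G(., hyper_val0)


def hess_solver(v):
    """Return (dG/d eta)^{-1} v. For this G the Jacobian is the matrix A."""
    return np.linalg.solve(A, v)


# First-order sensitivity: d eta / d eps = -(dG/d eta)^{-1} dG/d eps,
# where dG/d eps = -B for this particular G.
dG_deps = -B
deta_deps = -hess_solver(dG_deps)

print("G at base point:", estimating_equation(input_val0, hyper_val0))
print("d eta / d eps:  ", deta_deps.ravel())
```

In a larger problem the closure would typically capture a precomputed factorization of the Jacobian so that repeated solves are cheap.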
-
evaluate_input_derivs(self, dhyper, max_order=None)
Return a list of the derivatives d^k input / d hyper^k dhyper^k, that is, the k-th order derivatives of the input in the direction dhyper.
-
evaluate_taylor_series(self, new_hyper_val, add_offset=True, max_order=None, sum_terms=True)
Evaluate the Taylor series approximation to the optimum at new_hyper_val.
Parameters: - new_hyper_val: `numpy.ndarray` (M, )
The new hyperparameter value at which to evaluate the Taylor series.
- add_offset: `bool`
Optional. Whether to add the initial constant input_val0 to the Taylor series.
- max_order: `int`
Optional. The order of the Taylor series. Defaults to the order argument to __init__.
- sum_terms: `bool`
If True, add the terms in the Taylor series. If False, return the terms as a list.
Returns: - The Taylor series approximation to input_val(new_hyper_val) if add_offset is True, or to input_val(new_hyper_val) - input_val0 if False. If sum_terms is True, then a vector of the same length as input_val is returned. Otherwise, an array of shape (max_order + 1, len(input_val)) is returned containing the terms of the Taylor series approximation.
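The meaning of add_offset and sum_terms can be seen in a scalar toy problem. The sketch below is an illustration in plain Python, not the library's implementation. It assumes the hypothetical objective \(\frac{1}{2}(1 + \epsilon)\eta^2 - \eta\), whose optimum is \(\hat\eta(\epsilon) = 1 / (1 + \epsilon)\) and whose derivatives at \(\epsilon = 0\) are known in closed form as \((-1)^k k!\). The function names are made up for the example, and storing the offset as term 0 (zero when add_offset is False) is an assumption chosen to match the max_order + 1 terms described above.

```python
import math


def eta_hat(eps):
    """Exact optimum of the toy objective 0.5 * (1 + eps) * eta**2 - eta."""
    return 1.0 / (1.0 + eps)


input_val0, hyper_val0 = 1.0, 0.0  # eta_hat(0) = 1


def input_derivs(order):
    """d^k eta / d eps^k at eps = 0 for k = 1..order, i.e. (-1)^k * k!."""
    return [(-1) ** k * math.factorial(k) for k in range(1, order + 1)]


def taylor_terms(new_hyper_val, order, add_offset=True):
    """Return order + 1 terms; term 0 is the offset (zero if add_offset is False)."""
    dhyper = new_hyper_val - hyper_val0
    terms = [input_val0 if add_offset else 0.0]
    for k, deriv in enumerate(input_derivs(order), start=1):
        terms.append(deriv * dhyper ** k / math.factorial(k))
    return terms


def taylor_series(new_hyper_val, order, add_offset=True, sum_terms=True):
    """Sum the terms, or return them as a list when sum_terms is False."""
    terms = taylor_terms(new_hyper_val, order, add_offset)
    return sum(terms) if sum_terms else terms


approx = taylor_series(0.1, order=3)
print("third-order approximation:", approx)
print("exact optimum:            ", eta_hat(0.1))
```

With add_offset=False the summed series approximates the change eta_hat(0.1) - input_val0 rather than the optimum itself.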
-
evaluate_taylor_series_terms(self, new_hyper_val, add_offset=True, max_order=None)[source]¶ Return the terms in a Taylor series approximation.
-
classmethod
optimization_objective(objective_function, input_val0, hyper_val0, order, hess0=None, forward_mode=True, max_input_order=None, max_hyper_order=None, force=False)[source]¶ Parameters: - objective_function : callable
The optimization objective as a function of two arguments (eta, eps), where eta is the parameter that is optimized and eps is a hyperparameter.
- hess0 : numpy.ndarray (N, N)
Optional. The Hessian of the objective at (input_val0, hyper_val0). If not specified, it is calculated at initialization.
- The remaining arguments are the same as for the `__init__` method.
-
Optimum checking
-
class
vittles.bivariate_sensitivity_lib.OptimumChecker(estimating_equation, solver, input_base, hyper_base)[source]¶ -
__init__(self, estimating_equation, solver, input_base, hyper_base)[source]¶ Estimate the error in sensitivity due to incomplete optimization.
Parameters: - estimating_equation : callable
A function taking arguments (input, hyper) and returning a vector, typically the same length as the input. The idea is that estimating_equation(input_base, hyper_base) = [0, …, 0].
- solver : callable
A function of a single vector variable v returning \(H^{-1} v\), where H is the Jacobian of the estimating equation with respect to the input variable, evaluated at (input_base, hyper_base). When the estimating equation is the gradient of an objective, H is the Hessian of that objective.
- input_base : numpy.ndarray
The base value of the parameter to be optimized.
- hyper_base : numpy.ndarray
The base value of the hyperparameter.
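The notion of incomplete optimization can be illustrated with a single Newton step. The following NumPy sketch is not the OptimumChecker implementation. It assumes a hypothetical linear estimating equation \(G(\text{input}, \text{hyper}) = A \cdot \text{input} - \text{hyper}\) and a deliberately perturbed input_base, and shows how a solver of the kind described above turns the residual of the estimating equation into a Newton step toward the exact optimum.

```python
import numpy as np

# Hypothetical estimating equation: the gradient in x of 0.5 x' A x - hyper' x.
A = np.array([[4.0, 1.0], [1.0, 3.0]])


def estimating_equation(input_par, hyper_par):
    """Zero exactly at the optimum for the given hyperparameter."""
    return A @ input_par - hyper_par


def solver(v):
    """Return H^{-1} v, where H = A is the Jacobian of G in the input."""
    return np.linalg.solve(A, v)


hyper_base = np.array([1.0, 2.0])
exact_opt = solver(hyper_base)

# Pretend the optimizer stopped slightly short of the optimum.
optimization_error = np.array([1e-3, -2e-3])
input_base = exact_opt + optimization_error

# The residual is nonzero because input_base is not an exact root.
residual = estimating_equation(input_base, hyper_base)

# Newton step: the first-order estimate of how far input_base is from the root.
newton_step = -solver(residual)
input_refined = input_base + newton_step

print("residual norm before:", np.linalg.norm(residual))
print("Newton step:         ", newton_step)
print("residual norm after: ",
      np.linalg.norm(estimating_equation(input_refined, hyper_base)))
```

Because this estimating equation is linear, one Newton step recovers the exact root; for a nonlinear equation the step is only a first-order estimate of the remaining optimization error.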
-
correction(self, hyper_new, dinput_dhyper=None, newton_step=None)[source]¶ Return the first-order correction to the change in dinput_dhyper as you take a Newton step.
-
evaluate(self, hyper_new, dinput_dhyper=None, newton_step=None)[source]¶ Return the first-order approximation to the change in dinput_dhyper as you take a Newton step.
-
-
class
vittles.bivariate_sensitivity_lib.CrossSensitivity(estimating_equation, solver, input_base, hyper1_base, hyper2_base, term_ii=True, term_i1=True, term_i2=True, term_12=True)
Calculate a second-order derivative of an optimum with respect to two hyperparameters.
Given an estimating equation \(G(\theta, \epsilon_1, \epsilon_2)\), with \(G(\hat\theta(\epsilon_1, \epsilon_2), \epsilon_1, \epsilon_2) = 0\), this class evaluates the directional derivative
\[\frac{d^2\hat{\theta}}{d\epsilon_1 d\epsilon_2} \Delta \epsilon_1 \Delta \epsilon_2.\]
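Differentiating \(G(\hat\theta(\epsilon_1, \epsilon_2), \epsilon_1, \epsilon_2) = 0\) once in each hyperparameter gives, in the scalar case, \(\hat\theta_{12} = -G_\theta^{-1} (G_{\theta\theta} \hat\theta_1 \hat\theta_2 + G_{\theta 1} \hat\theta_2 + G_{\theta 2} \hat\theta_1 + G_{12})\), where subscripts denote partial derivatives. The plain-Python sketch below checks this identity on a hypothetical scalar estimating equation with a closed-form root. It does not use vittles. The four summands are presumably what the term_ii, term_i1, term_i2, and term_12 flags toggle, but that mapping is an assumption and is not stated in the text above.

```python
import math


def theta_hat(e1, e2):
    """Root near 1 of the hypothetical estimating equation
    G(theta, e1, e2) = theta**2 + (e1 + e2) * theta + e1 * e2 - 1."""
    return (-(e1 + e2) + math.sqrt((e1 - e2) ** 2 + 4.0)) / 2.0


theta0 = theta_hat(0.0, 0.0)  # equals 1 at the base point e1 = e2 = 0

# Partial derivatives of G at (theta0, 0, 0), derived by hand.
G_t = 2.0 * theta0   # dG / d theta
G_1 = theta0         # dG / d e1
G_2 = theta0         # dG / d e2
G_tt = 2.0           # d^2 G / d theta^2
G_t1 = 1.0           # d^2 G / d theta d e1
G_t2 = 1.0           # d^2 G / d theta d e2
G_12 = 1.0           # d^2 G / d e1 d e2

# First-order sensitivities from the implicit function theorem.
theta_1 = -G_1 / G_t
theta_2 = -G_2 / G_t

# Second-order cross sensitivity: the sum of four terms, solved against G_t.
theta_12 = -(G_tt * theta_1 * theta_2
             + G_t1 * theta_2
             + G_t2 * theta_1
             + G_12) / G_t

# Finite-difference check of the mixed partial derivative.
h = 1e-3
fd_theta_12 = (theta_hat(h, h) - theta_hat(h, -h)
               - theta_hat(-h, h) + theta_hat(-h, -h)) / (4.0 * h * h)

print("implicit differentiation:", theta_12)
print("finite differences:      ", fd_theta_12)
```

The directional derivative in the displayed formula is then theta_12 multiplied by the two increments \(\Delta\epsilon_1\) and \(\Delta\epsilon_2\).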
Linear response covariances
-
class
vittles.lr_cov_lib.LinearResponseCovariances(objective_fun, opt_par_value, validate_optimum=False, hessian_at_opt=None, factorize_hessian=True, grad_tol=1e-08)[source]¶ Calculate linear response covariances of a variational distribution.
Let \(q(\theta | \eta)\) be a class of probability distributions on \(\theta\) where the class is parameterized by the real-valued vector \(\eta\). Suppose that we wish to approximate a distribution \(q(\theta | \eta^*) \approx p(\theta)\) by solving an optimization problem \(\eta^* = \mathrm{argmin} f(\eta)\). For example, \(f\) might be a measure of distance between \(q(\theta | \eta)\) and \(p(\theta)\). This class uses the sensitivity of the optimal \(\eta^*\) to estimate the covariance \(\mathrm{Cov}_p(g(\theta))\). This covariance estimate is called the “linear response covariance”.
In this notation, the arguments to the class methods are as follows. \(f\) is objective_fun, \(\eta^*\) is opt_par_value, and the function calculate_moments evaluates \(\mathbb{E}_{q(\theta | \eta)}[g(\theta)]\) as a function of \(\eta\).
Methods
set_base_values: Set the base value, \(\eta^*\), that optimizes the objective function.
get_hessian_at_opt: Return the Hessian of the objective function evaluated at the optimum.
get_hessian_cholesky_at_opt: Return the Cholesky decomposition of the Hessian of the objective function evaluated at the optimum.
get_lr_covariance: Return the linear response covariance of a given moment.
-
__init__(self, objective_fun, opt_par_value, validate_optimum=False, hessian_at_opt=None, factorize_hessian=True, grad_tol=1e-08)[source]¶ Parameters: - objective_fun: Callable function
A callable function whose optimum parameterizes an approximate Bayesian posterior. The function must take as a single argument a numeric vector, opt_par.
- opt_par_value:
The value of opt_par at which objective_fun is optimized.
- validate_optimum: Boolean
When setting the values of opt_par, check that opt_par is, in fact, a critical point of objective_fun.
- hessian_at_opt: Numeric matrix (optional)
The Hessian of objective_fun at the optimum. If not specified, it is calculated using automatic differentiation.
- factorize_hessian: Boolean
If True, solve the required linear system using a Cholesky factorization. If False, use the conjugate gradient algorithm to avoid forming or inverting the Hessian.
- grad_tol: Float
The tolerance used to check that the gradient is approximately zero at the optimum.
-
get_lr_covariance(self, calculate_moments)[source]¶ Get the linear response covariance of a vector of moments.
Parameters: - calculate_moments: Callable function
A function that takes the folded opt_par as a single argument and returns a numeric vector containing posterior moments of interest.
Returns: - Numeric matrix
If calculate_moments(opt_par) returns \(\mathbb{E}_q[g(\theta)]\) then this returns the linear response estimate of \(\mathrm{Cov}_p(g(\theta))\).
-
get_lr_covariance_from_jacobians(self, moment_jacobian1, moment_jacobian2)[source]¶ Get the linear response covariance between two vectors of moments.
Parameters: - moment_jacobian1: 2d numeric array.
The Jacobian matrix of a map from a value of opt_par to a vector of moments of interest. Following standard notation for Jacobian matrices, the rows should correspond to moments and the columns to elements of a flattened opt_par.
- moment_jacobian2: 2d numeric array.
Like moment_jacobian1 but for the second vector of moments.
Returns: - Numeric matrix
If moment_jacobian1(opt_par) is the Jacobian of \(\mathbb{E}_q[g_1(\theta)]\) and moment_jacobian2(opt_par) is the Jacobian of \(\mathbb{E}_q[g_2(\theta)]\) then this returns the linear response estimate of \(\mathrm{Cov}_p(g_1(\theta), g_2(\theta))\).
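A small worked example may help show what a linear response covariance looks like. The sketch below assumes the standard linear response formula \(J_1 H^{-1} J_2^T\), where H is the Hessian of the objective at the optimum and \(J_1, J_2\) are the moment Jacobians; the text above does not state this formula, so treat it as an assumption about what the method computes. The setting is a mean-field Gaussian \(q\) fitted to a correlated Gaussian \(p\) by minimizing \(\mathrm{KL}(q \| p)\), with the Hessian written out by hand instead of computed by vittles or automatic differentiation.

```python
import numpy as np

# Target p = N(mu, Sigma); approximation q = N(m, diag(v)), eta = (m, v).
# Up to a constant,
#   KL(q || p) = 0.5 * [(m - mu)' Lambda (m - mu) + sum_i Lambda_ii v_i
#                       - sum_i log v_i],   with Lambda = inv(Sigma).
Sigma = np.array([[1.0, 0.6], [0.6, 2.0]])
Lambda = np.linalg.inv(Sigma)
dim = Sigma.shape[0]

# Mean-field optimum: m = mu and v_i = 1 / Lambda_ii.
v_opt = 1.0 / np.diag(Lambda)

# Hessian of the KL objective in eta = (m, v) at the optimum, derived by hand:
# Lambda for the mean block, diag(0.5 / v^2) for the variance block, and zero
# cross terms between m and v.
hessian = np.zeros((2 * dim, 2 * dim))
hessian[:dim, :dim] = Lambda
hessian[dim:, dim:] = np.diag(0.5 / v_opt ** 2)

# Moments of interest: E_q[theta] = m, so the Jacobian with respect to
# eta = (m, v) is [I, 0]. Rows index moments, columns index elements of eta.
moment_jacobian = np.hstack([np.eye(dim), np.zeros((dim, dim))])

# Assumed linear response formula: J1 H^{-1} J2'.
lr_cov = moment_jacobian @ np.linalg.solve(hessian, moment_jacobian.T)

print("mean-field variances:", v_opt)
print("linear response covariance:\n", lr_cov)
print("true covariance:\n", Sigma)
```

In this Gaussian case the linear response covariance matches Sigma, including the off-diagonal entries that the mean-field approximation itself cannot represent, while the mean-field variances v_opt underestimate the true marginal variances.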
-
get_moment_jacobian(self, calculate_moments)
Return the Jacobian matrix of a map from opt_par to a vector of moments of interest.
Parameters: - calculate_moments: Callable function
A function that takes the folded opt_par as a single argument and returns a numeric vector containing posterior moments of interest.
Returns: - Numeric matrix
The Jacobian of the moments.
-