Sensitivity Functions¶
Hyperparameter sensitivity linear approximation¶
class vittles.sensitivity_lib.HyperparameterSensitivityLinearApproximation(objective_fun, opt_par_value, hyper_par_value, validate_optimum=False, hessian_at_opt=None, cross_hess_at_opt=None, hyper_par_objective_fun=None, grad_tol=1e-08)¶

Linearly approximate the dependence of an optimum on a hyperparameter.
Suppose we have an optimization problem in which the objective depends on a hyperparameter:
\[\hat{\theta} = \mathrm{argmin}_{\theta} f(\theta, \lambda).\]

The optimal parameter, \(\hat{\theta}\), is a function of \(\lambda\) through the optimization problem. In general, this dependence is complex and nonlinear. To approximate this dependence, this class uses the linear approximation:

\[\hat{\theta}(\lambda) \approx \hat{\theta}(\lambda_0) + \frac{d\hat{\theta}}{d\lambda}\Big|_{\lambda_0} (\lambda - \lambda_0).\]

In terms of the arguments to this class, \(\theta\) corresponds to `opt_par`, \(\lambda\) corresponds to `hyper_par`, and \(f\) corresponds to `objective_fun`.

Methods

- set_base_values: Set the base values, \(\lambda_0\) and \(\theta_0 := \hat\theta(\lambda_0)\), at which the linear approximation is evaluated.
- get_dopt_dhyper: Return the Jacobian matrix \(\frac{d\hat{\theta}}{d\lambda}\big|_{\lambda_0}\) in flattened space.
- get_hessian_at_opt: Return the Hessian of the objective function in the flattened space.
- predict_opt_par_from_hyper_par: Use the linear approximation to predict the value of `opt_par` from a value of `hyper_par`.
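As a concrete illustration of the linear approximation above (a hand-worked scalar example, independent of the vittles API; the objective here is purely illustrative), consider \(f(\theta, \lambda) = \theta^2/2 - \theta \lambda^2\), whose exact optimum is \(\hat\theta(\lambda) = \lambda^2\):

```python
# Worked scalar example of the linear approximation (independent of vittles).
# Objective: f(theta, lam) = 0.5 * theta**2 - theta * lam**2,
# so the exact optimum is theta_hat(lam) = lam**2.

lam0 = 1.0
theta0 = lam0 ** 2  # theta_hat(lam0)

# Implicit differentiation: d(theta_hat)/d(lam) = -f_theta_theta^{-1} f_theta_lam.
f_theta_theta = 1.0        # second derivative of f in theta
f_theta_lam = -2.0 * lam0  # cross derivative of f in theta, then lam
dtheta_dlam = -f_theta_lam / f_theta_theta  # = 2.0 at lam0 = 1

def predict_theta(lam_new):
    """Linear approximation theta_hat(lam) ~= theta0 + dtheta_dlam * (lam - lam0)."""
    return theta0 + dtheta_dlam * (lam_new - lam0)

print(predict_theta(1.1))  # approximately 1.2, versus the exact value 1.1**2 = 1.21
```

The gap between 1.2 and 1.21 is the second-order error this linear approximation ignores.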
__init__(self, objective_fun, opt_par_value, hyper_par_value, validate_optimum=False, hessian_at_opt=None, cross_hess_at_opt=None, hyper_par_objective_fun=None, grad_tol=1e-08)¶

Parameters:

- objective_fun : callable
The objective function. It takes two positional arguments,

  - opt_par: The parameter to be optimized (numpy.ndarray (N,))
  - hyper_par: A hyperparameter (numpy.ndarray (M,))

and returns a real value to be minimized.

- opt_par_value : numpy.ndarray (N,)
The value of `opt_par` at which `objective_fun` is optimized for the given value of `hyper_par_value`.

- hyper_par_value : numpy.ndarray (M,)
The value of `hyper_par` at which `opt_par` optimizes `objective_fun`.

- validate_optimum : bool, optional
When setting the values of `opt_par` and `hyper_par`, check that `opt_par` is, in fact, a critical point of `objective_fun`.

- hessian_at_opt : numpy.ndarray (N,N), optional
The Hessian of `objective_fun` at the optimum. If not specified, it is calculated using automatic differentiation.

- cross_hess_at_opt : numpy.ndarray (N, M)
Optional. The second derivative of the objective with respect to `opt_par`, then `hyper_par`. If not specified, it is calculated at initialization.

- hyper_par_objective_fun : callable, optional
The part of `objective_fun` that depends on both `opt_par` and `hyper_par`. The arguments must be the same as for `objective_fun`:

  - opt_par: The parameter to be optimized (numpy.ndarray (N,))
  - hyper_par: A hyperparameter (numpy.ndarray (M,))

This can be useful if only a small part of the objective function depends on both `opt_par` and `hyper_par`. If not specified, `objective_fun` is used.

- grad_tol : float, optional
The tolerance used to check that the gradient is approximately zero at the optimum.
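The pieces above fit together as follows. This is a plain-NumPy sketch of the computation the class performs when `hessian_at_opt` and `cross_hess_at_opt` are supplied explicitly (when they are not, vittles obtains them by automatic differentiation); the quadratic objective is purely illustrative:

```python
import numpy as np

# Sketch of the linear-approximation computation with user-supplied derivatives.
# Objective: f(theta, lam) = 0.5 * theta @ H @ theta - theta @ lam,
# with H positive definite, so theta_hat(lam) = H^{-1} lam exactly.

H = np.array([[2.0, 0.5],
              [0.5, 1.0]])             # hessian_at_opt (N x N)
lam0 = np.array([1.0, -1.0])           # hyper_par_value (M = N here)
theta0 = np.linalg.solve(H, lam0)      # opt_par_value

# Cross Hessian d^2 f / (d theta d lam) = -I for this objective (N x M).
cross_hess = -np.eye(2)

# d theta_hat / d lam = -H^{-1} cross_hess (what get_dopt_dhyper returns).
dopt_dhyper = -np.linalg.solve(H, cross_hess)

def predict_opt_par(lam_new):
    """Linear prediction of the optimum at a new hyperparameter value."""
    return theta0 + dopt_dhyper @ (lam_new - lam0)

lam_new = np.array([1.2, -0.9])
# For this quadratic objective the linear approximation is exact:
print(np.allclose(predict_opt_par(lam_new), np.linalg.solve(H, lam_new)))  # True
```

For a quadratic objective the optimum is linear in the hyperparameter, so the prediction is exact; for general objectives it is a first-order approximation.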
Hyperparameter sensitivity Taylor series approximation¶
class vittles.sensitivity_lib.ParametricSensitivityTaylorExpansion(estimating_equation, input_val0, hyper_val0, order, hess_solver, forward_mode=True, max_input_order=None, max_hyper_order=None, force=False)¶

Evaluate the Taylor series of the dependence of an optimum on a hyperparameter.

This class computes the Taylor series of eta(eps) = argmax_eta objective(eta, eps) using forward-mode automatic differentiation.
Note
This class is experimental and should be used with caution.
__init__(self, estimating_equation, input_val0, hyper_val0, order, hess_solver, forward_mode=True, max_input_order=None, max_hyper_order=None, force=False)¶

Parameters:

- estimating_equation : callable
A vector-valued function of two arguments, (input, hyper), where the length of the returned vector is the same as the length of input, and which is (approximately) the zero vector when evaluated at (input_val0, hyper_val0).
- input_val0 : numpy.ndarray (N,)
The value of the input parameter at the optimum.

- hyper_val0 : numpy.ndarray (M,)
The value of the hyperparameter at which `input_val0` was found.

- order : int
The maximum order of the Taylor series to be calculated.
- hess_solver : function
A function that takes a single argument, v, and returns

\[\frac{\partial G}{\partial \eta}^{-1} v,\]

where \(G(\eta, \epsilon)\) is the estimating equation, and the partial derivative is evaluated at \((\eta, \epsilon) =\) (input_val0, hyper_val0).
- forward_mode : bool
Optional. If True (the default), use forward-mode automatic differentiation. Otherwise, use reverse-mode.
- max_input_order : int
Optional. The maximum number of nonzero partial derivatives of the objective function gradient with respect to the input parameter. If None, calculate partial derivatives of all orders.
- max_hyper_order : int
Optional. The maximum number of nonzero partial derivatives of the objective function gradient with respect to the hyperparameter. If None, calculate partial derivatives of all orders.
- force : bool
Optional. If True, force the instantiation of potentially expensive reverse-mode derivative arrays. Default is False.
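As a sketch of what a `hess_solver` argument might look like when the derivative \(\partial G / \partial \eta\) is available as a dense matrix (the values below are illustrative placeholders; for large problems a factorization or an iterative solver would be preferable):

```python
import numpy as np

# An illustrative hess_solver built from a dense derivative dG/deta of the
# estimating equation, evaluated at (input_val0, hyper_val0).
dg_deta = np.array([[4.0, 1.0],
                    [1.0, 3.0]])

def hess_solver(v):
    # Return (dG/deta)^{-1} v without forming the explicit inverse.
    return np.linalg.solve(dg_deta, v)

v = np.array([1.0, 2.0])
# Sanity check: applying dG/deta to the result recovers v.
print(np.allclose(dg_deta @ hess_solver(v), v))  # True
```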
evaluate_input_derivs(self, dhyper, max_order=None)¶

Return a list of the directional derivatives \(\frac{d^k \mathrm{input}}{d\,\mathrm{hyper}^k} \mathrm{dhyper}^k\).
evaluate_taylor_series(self, new_hyper_val, add_offset=True, max_order=None, sum_terms=True)¶

Evaluate the Taylor series approximation to `input_val(new_hyper_val)`.

Parameters:

- new_hyper_val: `numpy.ndarray` (M, )
The new hyperparameter value at which to evaluate the Taylor series.
- add_offset: `bool`
Optional. Whether to add the initial constant input_val0 to the Taylor series.
- max_order: `int`
Optional. The order of the Taylor series. Defaults to the `order` argument to `__init__`.

- sum_terms: `bool`
If `True`, add the terms in the Taylor series. If `False`, return the terms as a list.
Returns:

- The Taylor series approximation to `input_val(new_hyper_val)` if `add_offset` is `True`, or to `input_val(new_hyper_val) - input_val0` if `False`. If `sum_terms` is `True`, then a vector of the same length as `input_val` is returned. Otherwise, an array of shape `(max_order + 1, len(input_val))` is returned containing the terms of the Taylor series approximation.
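To make the return value concrete, here is a hand-worked scalar analogue of a second-order Taylor evaluation (independent of vittles; the estimating equation is purely illustrative), using \(G(\eta, \epsilon) = \eta^3 - \epsilon\), so \(\hat\eta(\epsilon) = \epsilon^{1/3}\):

```python
# Scalar illustration of the Taylor series of an optimum (independent of vittles).
# Estimating equation: G(eta, eps) = eta**3 - eps, so eta_hat(eps) = eps**(1/3).

eps0, eta0 = 1.0, 1.0  # G(eta0, eps0) = 0

# Implicit differentiation of G(eta_hat(eps), eps) = 0:
#   3 eta^2 eta' - 1 = 0            =>  eta'  = 1 / (3 eta^2)
#   6 eta (eta')^2 + 3 eta^2 eta'' = 0  =>  eta'' = -2 (eta')^2 / eta
d1 = 1.0 / (3.0 * eta0 ** 2)  # = 1/3
d2 = -2.0 * d1 ** 2 / eta0    # = -2/9

def taylor2(eps_new):
    """Second-order Taylor approximation to eta_hat(eps_new) around eps0."""
    deps = eps_new - eps0
    return eta0 + d1 * deps + 0.5 * d2 * deps ** 2

approx = taylor2(1.1)
exact = 1.1 ** (1.0 / 3.0)
print(abs(approx - exact))  # small (well below 1e-3)
```

The class automates exactly this kind of higher-order implicit differentiation, to the order requested at construction.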
evaluate_taylor_series_terms(self, new_hyper_val, add_offset=True, max_order=None)¶

Return the terms in a Taylor series approximation.
classmethod optimization_objective(objective_function, input_val0, hyper_val0, order, hess0=None, forward_mode=True, max_input_order=None, max_hyper_order=None, force=False)¶

Parameters:

- objective_function : callable
The optimization objective as a function of two arguments (eta, eps), where eta is the parameter that is optimized and eps is a hyperparameter.
- hess0 : numpy.ndarray (N, N)
Optional. The Hessian of the objective at (`input_val0`, `hyper_val0`). If not specified, it is calculated at initialization.

The remaining arguments are the same as for the `__init__` method.
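The relationship between this constructor and `__init__` can be sketched as follows: the estimating equation corresponding to an optimization objective is the objective's gradient with respect to eta, and `hess0` is that gradient's derivative (a hand-written scalar illustration, not the vittles internals):

```python
def objective_function(eta, eps):
    # Illustrative objective: f(eta, eps) = 0.5 * eta**2 - eta * eps.
    return 0.5 * eta ** 2 - eta * eps

def estimating_equation(eta, eps):
    # The gradient of the objective with respect to eta: eta - eps.
    return eta - eps

hess0 = 1.0  # d^2 f / d eta^2: the derivative of the estimating equation in eta

# Finite-difference check that the estimating equation is the eta-gradient.
h = 1e-6
fd_grad = (objective_function(1.0 + h, 0.5) - objective_function(1.0 - h, 0.5)) / (2 * h)
print(abs(fd_grad - estimating_equation(1.0, 0.5)))  # ~0

# At the optimum eta_hat(eps) = eps, the estimating equation vanishes.
print(estimating_equation(2.0, 2.0))  # 0.0
```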
Optimum checking¶
class vittles.bivariate_sensitivity_lib.OptimumChecker(estimating_equation, solver, input_base, hyper_base)¶

__init__(self, estimating_equation, solver, input_base, hyper_base)¶

Estimate the error in sensitivity due to incomplete optimization.

Parameters:

- estimating_equation : callable
A function taking arguments (input, hyper) and returning a vector, typically the same length as the input. The idea is that estimating_equation(input_base, hyper_base) = [0, …, 0].
- solver : callable
A function of a single vector variable v returning \(H^{-1} v\), where \(H\) is the derivative (Hessian) of the estimating equation with respect to the input variable, evaluated at (input_base, hyper_base).
- input_base : numpy.ndarray
The base value of the parameter to be optimized.
- hyper_base : numpy.ndarray
The base value of the hyperparameter.
correction(self, hyper_new, dinput_dhyper=None, newton_step=None)¶

Return the first-order correction to the change in dinput_dhyper as you take a Newton step.
evaluate(self, hyper_new, dinput_dhyper=None, newton_step=None)¶

Return the first-order approximation to the change in dinput_dhyper as you take a Newton step.
class vittles.bivariate_sensitivity_lib.CrossSensitivity(estimating_equation, solver, input_base, hyper1_base, hyper2_base, term_ii=True, term_i1=True, term_i2=True, term_12=True)¶

Calculate a second-order derivative of an optimum with respect to two hyperparameters.

Given an estimating equation \(G(\theta, \epsilon_1, \epsilon_2)\), with \(G(\hat\theta(\epsilon_1, \epsilon_2), \epsilon_1, \epsilon_2) = 0\), this class evaluates the directional derivative
\[\frac{d^2\hat{\theta}}{d\epsilon_1 d\epsilon_2} \Delta \epsilon_1 \Delta \epsilon_2.\]
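A scalar illustration of this directional derivative (worked by hand, independent of vittles; the estimating equation is purely illustrative): for \(G(\theta, \epsilon_1, \epsilon_2) = \theta - \epsilon_1 \epsilon_2\), the optimum is \(\hat\theta = \epsilon_1 \epsilon_2\), the cross derivative is 1, and the directional derivative is \(\Delta\epsilon_1 \Delta\epsilon_2\):

```python
# Scalar illustration of the second-order cross derivative of an optimum.
# Estimating equation: G(theta, eps1, eps2) = theta - eps1 * eps2,
# so theta_hat(eps1, eps2) = eps1 * eps2 and d^2 theta_hat / (deps1 deps2) = 1.

d2_cross = 1.0  # the cross derivative for this G

def directional_derivative(delta_eps1, delta_eps2):
    """The quantity this class evaluates: (d^2 theta / deps1 deps2) * deltas."""
    return d2_cross * delta_eps1 * delta_eps2

# Finite-difference check of the cross derivative of theta_hat = eps1 * eps2.
def theta_hat(e1, e2):
    return e1 * e2

h = 1e-4
fd = (theta_hat(1 + h, 1 + h) - theta_hat(1 + h, 1 - h)
      - theta_hat(1 - h, 1 + h) + theta_hat(1 - h, 1 - h)) / (4 * h ** 2)
print(abs(fd - d2_cross))  # ~0
```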
Linear response covariances¶
class vittles.lr_cov_lib.LinearResponseCovariances(objective_fun, opt_par_value, validate_optimum=False, hessian_at_opt=None, factorize_hessian=True, grad_tol=1e-08)¶

Calculate linear response covariances of a variational distribution.
Let \(q(\theta | \eta)\) be a class of probability distributions on \(\theta\), where the class is parameterized by the real-valued vector \(\eta\). Suppose that we wish to approximate a distribution \(q(\theta | \eta^*) \approx p(\theta)\) by solving an optimization problem \(\eta^* = \mathrm{argmin} f(\eta)\). For example, \(f\) might be a measure of distance between \(q(\theta | \eta)\) and \(p(\theta)\). This class uses the sensitivity of the optimal \(\eta^*\) to estimate the covariance \(\mathrm{Cov}_p(g(\theta))\). This covariance estimate is called the "linear response covariance".
In this notation, the arguments to the class methods are as follows. \(f\) is `objective_fun`, \(\eta^*\) is `opt_par_value`, and the function `calculate_moments` evaluates \(\mathbb{E}_{q(\theta | \eta)}[g(\theta)]\) as a function of \(\eta\).

Methods

- set_base_values: Set the base value, \(\eta^*\), that optimizes the objective function.
- get_hessian_at_opt: Return the Hessian of the objective function evaluated at the optimum.
- get_hessian_cholesky_at_opt: Return the Cholesky decomposition of the Hessian of the objective function evaluated at the optimum.
- get_lr_covariance: Return the linear response covariance of a given moment.

__init__(self, objective_fun, opt_par_value, validate_optimum=False, hessian_at_opt=None, factorize_hessian=True, grad_tol=1e-08)¶

Parameters:

- objective_fun: Callable function
A callable function whose optimum parameterizes an approximate Bayesian posterior. The function must take as a single argument a numeric vector, `opt_par`.

- opt_par_value:
The value of `opt_par` at which `objective_fun` is optimized.

- validate_optimum: Boolean
When setting the values of `opt_par`, check that `opt_par` is, in fact, a critical point of `objective_fun`.

- hessian_at_opt: Numeric matrix (optional)
The Hessian of `objective_fun` at the optimum. If not specified, it is calculated using automatic differentiation.

- factorize_hessian: Boolean
If `True`, solve the required linear system using a Cholesky factorization. If `False`, use the conjugate gradient algorithm to avoid forming or inverting the Hessian.

- grad_tol: Float
The tolerance used to check that the gradient is approximately zero at the optimum.
get_lr_covariance(self, calculate_moments)¶

Get the linear response covariance of a vector of moments.
Parameters:

- calculate_moments: Callable function
A function that takes the folded `opt_par` as a single argument and returns a numeric vector containing posterior moments of interest.
Returns:

- Numeric matrix
If `calculate_moments(opt_par)` returns \(\mathbb{E}_q[g(\theta)]\), then this returns the linear response estimate of \(\mathrm{Cov}_p(g(\theta))\).
get_lr_covariance_from_jacobians(self, moment_jacobian1, moment_jacobian2)¶

Get the linear response covariance between two vectors of moments.
Parameters:

- moment_jacobian1: 2d numeric array
The Jacobian matrix of a map from a value of `opt_par` to a vector of moments of interest. Following standard notation for Jacobian matrices, the rows should correspond to moments and the columns to elements of a flattened `opt_par`.

- moment_jacobian2: 2d numeric array
Like `moment_jacobian1`, but for the second vector of moments.
Returns:

- Numeric matrix
If `moment_jacobian1(opt_par)` is the Jacobian of \(\mathbb{E}_q[g_1(\theta)]\) and `moment_jacobian2(opt_par)` is the Jacobian of \(\mathbb{E}_q[g_2(\theta)]\), then this returns the linear response estimate of \(\mathrm{Cov}_p(g_1(\theta), g_2(\theta))\).
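The computation described here can be sketched in plain NumPy, assuming the standard linear-response formula \(J_1 H^{-1} J_2^T\) (an assumption of this sketch; the Jacobians and Hessian below are illustrative placeholders that vittles would instead obtain by automatic differentiation):

```python
import numpy as np

# Sketch of the linear-response covariance computation, assuming the formula
# Cov ~= J1 @ H^{-1} @ J2.T, where J1 and J2 are moment Jacobians and H is
# the Hessian of objective_fun at the optimum.

hess = np.array([[2.0, 0.3],
                 [0.3, 1.0]])            # Hessian at the optimum (N x N)
moment_jacobian1 = np.array([[1.0, 0.0]])  # one moment, flattened opt_par of length 2
moment_jacobian2 = np.array([[0.0, 1.0]])

# Solve against the Hessian rather than forming its explicit inverse.
lr_cov = moment_jacobian1 @ np.linalg.solve(hess, moment_jacobian2.T)
print(lr_cov.shape)  # (1, 1)
```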
get_moment_jacobian(self, calculate_moments)¶

The Jacobian matrix of a map from `opt_par` to a vector of moments of interest.

Parameters:

- calculate_moments: Callable function
A function that takes the folded `opt_par` as a single argument and returns a numeric vector containing posterior moments of interest.

Returns:

- Numeric matrix
The Jacobian of the moments.