diff --git a/previews/PR118/.documenter-siteinfo.json b/previews/PR118/.documenter-siteinfo.json
index 376004d..c56d1d5 100644
--- a/previews/PR118/.documenter-siteinfo.json
+++ b/previews/PR118/.documenter-siteinfo.json
@@ -1 +1 @@
-{"documenter":{"julia_version":"1.9.3","generation_timestamp":"2023-10-13T09:23:39","documenter_version":"1.1.1"}}
\ No newline at end of file
+{"documenter":{"julia_version":"1.9.3","generation_timestamp":"2023-10-15T17:42:45","documenter_version":"1.1.1"}}
\ No newline at end of file
diff --git a/previews/PR118/implementer_guide/index.html b/previews/PR118/implementer_guide/index.html
index d60daa0..501d208 100644
--- a/previews/PR118/implementer_guide/index.html
+++ b/previews/PR118/implementer_guide/index.html
@@ -1,4 +1,4 @@
Implementer guide · AbstractDifferentiation.jl

Implementer guide

Work in progress

Come back later!

The macro @primitive

To implement the AbstractDifferentiation interface for your backend, you only need to provide a "primitive" from which the rest of the functions can be deduced. However, for performance reasons, you can implement more of the interface to make certain calls faster.

At the moment, the only primitives supported are AD.pushforward_function and AD.value_and_pullback_function. The AD.@primitive macro uses the provided function to implement AD.jacobian, and all the other functions follow.

AD.@primitive function AD.myprimitive(ab::MyBackend, f, xs...)
     # write your code here
-end

See the backend-specific extensions in the ext/ folder of the repository for example implementations.

Function dependency graph

These details are not part of the public API and are expected to change. They are just listed here to help readers figure out the code structure:

  • jacobian has no default implementation
  • derivative calls jacobian
  • gradient calls jacobian
  • hessian calls jacobian and gradient
  • value_and_jacobian calls jacobian
  • value_and_derivative calls value_and_jacobian
  • value_and_gradient calls value_and_jacobian
  • value_and_hessian calls jacobian and gradient
  • value_gradient_and_hessian calls value_and_jacobian and gradient
  • pushforward_function calls jacobian
  • value_and_pushforward_function calls pushforward_function
  • pullback_function calls value_and_pullback_function
  • value_and_pullback_function calls gradient
+end

See the backend-specific extensions in the ext/ folder of the repository for example implementations.

Function dependency graph

These details are not part of the public API and are expected to change. They are just listed here to help readers figure out the code structure:

diff --git a/previews/PR118/index.html b/previews/PR118/index.html
index b42edb1..f485e1d 100644
--- a/previews/PR118/index.html
+++ b/previews/PR118/index.html
@@ -4,4 +4,4 @@
 author={Sch{\"a}fer, Frank and Tarek, Mohamed and White, Lyndon and Rackauckas, Chris},
 journal={NeurIPS 2021 Differentiable Programming Workshop},
 year={2021}
-}
+}
diff --git a/previews/PR118/user_guide/index.html b/previews/PR118/user_guide/index.html
index b592412..5c6fe52 100644
--- a/previews/PR118/user_guide/index.html
+++ b/previews/PR118/user_guide/index.html
@@ -6,4 +6,4 @@
 julia> f(x) = log(sum(exp, x));
 julia> AD.gradient(backend, f, collect(1:3))
-([0.09003057317038046, 0.2447284710547977, 0.665240955774822],)

The following backends are temporarily made available by AbstractDifferentiation as soon as their corresponding package is loaded (thanks to weak dependencies on Julia ≥ 1.9 and Requires.jl on older Julia versions):

AbstractDifferentiation.ReverseDiffBackendType
ReverseDiffBackend

AD backend that uses reverse mode with ReverseDiff.jl.

Note

To be able to use this backend, you have to load ReverseDiff.

source
AbstractDifferentiation.ReverseRuleConfigBackendType
ReverseRuleConfigBackend

AD backend that uses reverse mode with any ChainRulesCore.jl-compatible reverse-mode AD package.

Constructed with a RuleConfig object:

backend = AD.ReverseRuleConfigBackend(rc)
Note

On Julia >= 1.9, you have to load ChainRulesCore (possibly implicitly by loading a ChainRules-compatible AD package) to be able to use this backend.

source
AbstractDifferentiation.FiniteDifferencesBackendType
FiniteDifferencesBackend{M}

AD backend that uses forward mode with FiniteDifferences.jl.

The type parameter M is the type of the method used to perform finite differences.

Note

To be able to use this backend, you have to load FiniteDifferences.

source
AbstractDifferentiation.ZygoteBackendFunction
ZygoteBackend()

Create an AD backend that uses reverse mode with Zygote.jl.

It is a special case of ReverseRuleConfigBackend.

Note

To be able to use this backend, you have to load Zygote.

source
AbstractDifferentiation.ForwardDiffBackendType
ForwardDiffBackend{CS}

AD backend that uses forward mode with ForwardDiff.jl.

The type parameter CS denotes the chunk size of the differentiation algorithm. If it is Nothing, then ForwardDiff uses a heuristic to set the chunk size based on the input.

See also: ForwardDiff.jl: Configuring Chunk Size

Note

To be able to use this backend, you have to load ForwardDiff.

source
AbstractDifferentiation.TrackerBackendType
TrackerBackend

AD backend that uses reverse mode with Tracker.jl.

Note

To be able to use this backend, you have to load Tracker.

source

In the long term, these backend objects (and many more) will be defined within their respective packages to enforce the AbstractDifferentiation interface. This is already the case for:

For higher order derivatives, you can build higher order backends using AD.HigherOrderBackend.

AbstractDifferentiation.HigherOrderBackendType
AD.HigherOrderBackend{B}

Let ab_f be a forward-mode automatic differentiation backend and let ab_r be a reverse-mode automatic differentiation backend. To construct a higher order backend for doing forward-over-reverse-mode automatic differentiation, use AD.HigherOrderBackend((ab_f, ab_r)). To construct a higher order backend for doing reverse-over-forward-mode automatic differentiation, use AD.HigherOrderBackend((ab_r, ab_f)).

Fields

  • backends::B
source

Derivatives

The following list of functions can be used to request the derivative, gradient, Jacobian or Hessian without the function value.

AbstractDifferentiation.derivativeFunction
AD.derivative(ab::AD.AbstractBackend, f, xs::Number...)

Compute the derivatives of f with respect to the numbers xs using the backend ab.

The function returns a Tuple of derivatives, one for each element in xs.

source
AbstractDifferentiation.gradientFunction
AD.gradient(ab::AD.AbstractBackend, f, xs...)

Compute the gradients of f with respect to the inputs xs using the backend ab.

The function returns a Tuple of gradients, one for each element in xs.

source
AbstractDifferentiation.jacobianFunction
AD.jacobian(ab::AD.AbstractBackend, f, xs...)

Compute the Jacobians of f with respect to the inputs xs using the backend ab.

The function returns a Tuple of Jacobians, one for each element in xs.

source
AbstractDifferentiation.hessianFunction
AD.hessian(ab::AD.AbstractBackend, f, x)

Compute the Hessian of f with respect to the input x using the backend ab.

The function returns a single matrix because hessian currently only supports a single input.

source

Value and derivatives

The following list of functions can be used to request the function value along with its derivative, gradient, Jacobian or Hessian. You can also request the function value, its gradient and Hessian for single-input functions.

AbstractDifferentiation.value_and_derivativeFunction
AD.value_and_derivative(ab::AD.AbstractBackend, f, xs::Number...)

Return the tuple (v, ds) of the function value v = f(xs...) and the derivatives ds = AD.derivative(ab, f, xs...).

See also AbstractDifferentiation.derivative.

source
AbstractDifferentiation.value_and_gradientFunction
AD.value_and_gradient(ab::AD.AbstractBackend, f, xs...)

Return the tuple (v, gs) of the function value v = f(xs...) and the gradients gs = AD.gradient(ab, f, xs...).

See also AbstractDifferentiation.gradient.

source
AbstractDifferentiation.value_and_jacobianFunction
AD.value_and_jacobian(ab::AD.AbstractBackend, f, xs...)

Return the tuple (v, Js) of the function value v = f(xs...) and the Jacobians Js = AD.jacobian(ab, f, xs...).

See also AbstractDifferentiation.jacobian.

source
AbstractDifferentiation.value_and_hessianFunction
AD.value_and_hessian(ab::AD.AbstractBackend, f, x)

Return the tuple (v, H) of the function value v = f(x) and the Hessian H = AD.hessian(ab, f, x).

See also AbstractDifferentiation.hessian.

source
AbstractDifferentiation.value_gradient_and_hessianFunction
AD.value_gradient_and_hessian(ab::AD.AbstractBackend, f, x)

Return the tuple (v, g, H) of the function value v = f(x), the gradient g = AD.gradient(ab, f, x), and the Hessian H = AD.hessian(ab, f, x).

See also AbstractDifferentiation.gradient and AbstractDifferentiation.hessian.

source

Jacobian-vector products

This operation goes by a few names, like "pushforward". Refer to the ChainRules documentation for more on terminology. For a single input, single output function f with a Jacobian J, the pushforward operator pf_f is equivalent to applying the function v -> J * v on a (tangent) vector v.

The following functions can be used to request a function that returns the pushforward operator/function. In order to request the pushforward function pf_f of a function f at the inputs xs, you can use either of:

AbstractDifferentiation.pushforward_functionFunction
AD.pushforward_function(ab::AD.AbstractBackend, f, xs...)

Return the pushforward function pf of the function f at the inputs xs using backend ab.

The pushforward function pf accepts as input a Tuple of tangents, one for each element in xs. If xs consists of a single element, pf can also accept a single tangent instead of a 1-tuple.

source
AbstractDifferentiation.value_and_pushforward_functionFunction
AD.value_and_pushforward_function(ab::AD.AbstractBackend, f, xs...)

Return a function that, given tangents ts, computes the tuple (v, p) of the function value v = f(xs...) and the output p of the pushforward function AD.pushforward_function(ab, f, xs...) applied to ts.

See also AbstractDifferentiation.pushforward_function.

source

Vector-Jacobian products

This operation goes by a few names, like "pullback". Refer to the ChainRules documentation for more on terminology. For a single input, single output function f with a Jacobian J, the pullback operator pb_f is equivalent to applying the function v -> v' * J on a (co-tangent) vector v.

The following functions can be used to request the pullback operator/function with or without the function value. In order to request the pullback function pb_f of a function f at the inputs xs, you can use either of:

AbstractDifferentiation.pullback_functionFunction
AD.pullback_function(ab::AD.AbstractBackend, f, xs...)

Return the pullback function pb of the function f at the inputs xs using backend ab.

The pullback function pb accepts as input a Tuple of cotangents, one for each output of f. If f has a single output, pb can also accept a single input instead of a 1-tuple.

source
AbstractDifferentiation.value_and_pullback_functionFunction
AD.value_and_pullback_function(ab::AD.AbstractBackend, f, xs...)

Return a function that, given cotangents ts, computes the tuple (v, p) of the function value v = f(xs...) and the output p of the pullback function AD.pullback_function(ab, f, xs...) applied to ts.

See also AbstractDifferentiation.pullback_function.

source

Lazy operators

You can also get a struct for the lazy derivative/gradient/Jacobian/Hessian of a function. You can then use the * operator to apply the lazy operator on a value or tuple of the correct shape. To get a lazy derivative/gradient/Jacobian/Hessian use any one of:

AbstractDifferentiation.lazy_derivativeFunction
AD.lazy_derivative(ab::AbstractBackend, f, xs::Number...)

Return an operator ld for multiplying by the derivative of f at xs.

You can apply the operator by multiplication e.g. ld * y where y is a number if f has a single input, a tuple of the same length as xs if f has multiple inputs, or an array of numbers/tuples.

source
AbstractDifferentiation.lazy_gradientFunction
AD.lazy_gradient(ab::AbstractBackend, f, xs...)

Return an operator lg for multiplying by the gradient of f at xs.

You can apply the operator by multiplication e.g. lg * y where y is a number if f has a single input or a tuple of the same length as xs if f has multiple inputs.

source
AbstractDifferentiation.lazy_jacobianFunction
AD.lazy_jacobian(ab::AbstractBackend, f, xs...)

Return an operator lj for multiplying by the Jacobian of f at xs.

You can apply the operator by multiplication e.g. lj * y or y' * lj where y is a number, vector or tuple of numbers and/or vectors. If f has multiple inputs, y in lj * y should be a tuple. If f has multiple outputs, y in y' * lj should be a tuple. Otherwise, it should be a scalar or a vector of the appropriate length.

source
AbstractDifferentiation.lazy_hessianFunction
AD.lazy_hessian(ab::AbstractBackend, f, x)

Return an operator lh for multiplying by the Hessian of the scalar-valued function f at x.

You can apply the operator by multiplication e.g. lh * y or y' * lh where y is a number or a vector of the appropriate length.

source

Index

+([0.09003057317038046, 0.2447284710547977, 0.665240955774822],)

The following backends are temporarily made available by AbstractDifferentiation as soon as their corresponding package is loaded (thanks to weak dependencies on Julia ≥ 1.9 and Requires.jl on older Julia versions):

AbstractDifferentiation.ReverseDiffBackendType
ReverseDiffBackend

AD backend that uses reverse mode with ReverseDiff.jl.

Note

To be able to use this backend, you have to load ReverseDiff.

source
AbstractDifferentiation.ReverseRuleConfigBackendType
ReverseRuleConfigBackend

AD backend that uses reverse mode with any ChainRulesCore.jl-compatible reverse-mode AD package.

Constructed with a RuleConfig object:

backend = AD.ReverseRuleConfigBackend(rc)
Note

On Julia >= 1.9, you have to load ChainRulesCore (possibly implicitly by loading a ChainRules-compatible AD package) to be able to use this backend.

source
AbstractDifferentiation.FiniteDifferencesBackendType
FiniteDifferencesBackend{M}

AD backend that uses forward mode with FiniteDifferences.jl.

The type parameter M is the type of the method used to perform finite differences.

Note

To be able to use this backend, you have to load FiniteDifferences.

source
AbstractDifferentiation.ZygoteBackendFunction
ZygoteBackend()

Create an AD backend that uses reverse mode with Zygote.jl.

It is a special case of ReverseRuleConfigBackend.

Note

To be able to use this backend, you have to load Zygote.

source
AbstractDifferentiation.ForwardDiffBackendType
ForwardDiffBackend{CS}

AD backend that uses forward mode with ForwardDiff.jl.

The type parameter CS denotes the chunk size of the differentiation algorithm. If it is Nothing, then ForwardDiff uses a heuristic to set the chunk size based on the input.

See also: ForwardDiff.jl: Configuring Chunk Size

Note

To be able to use this backend, you have to load ForwardDiff.

source
AbstractDifferentiation.TrackerBackendType
TrackerBackend

AD backend that uses reverse mode with Tracker.jl.

Note

To be able to use this backend, you have to load Tracker.

source

In the long term, these backend objects (and many more) will be defined within their respective packages to enforce the AbstractDifferentiation interface. This is already the case for:

For higher order derivatives, you can build higher order backends using AD.HigherOrderBackend.

AbstractDifferentiation.HigherOrderBackendType
AD.HigherOrderBackend{B}

Let ab_f be a forward-mode automatic differentiation backend and let ab_r be a reverse-mode automatic differentiation backend. To construct a higher order backend for doing forward-over-reverse-mode automatic differentiation, use AD.HigherOrderBackend((ab_f, ab_r)). To construct a higher order backend for doing reverse-over-forward-mode automatic differentiation, use AD.HigherOrderBackend((ab_r, ab_f)).

Fields

  • backends::B
source

Derivatives

The following list of functions can be used to request the derivative, gradient, Jacobian or Hessian without the function value.

AbstractDifferentiation.derivativeFunction
AD.derivative(ab::AD.AbstractBackend, f, xs::Number...)

Compute the derivatives of f with respect to the numbers xs using the backend ab.

The function returns a Tuple of derivatives, one for each element in xs.

source
AbstractDifferentiation.gradientFunction
AD.gradient(ab::AD.AbstractBackend, f, xs...)

Compute the gradients of f with respect to the inputs xs using the backend ab.

The function returns a Tuple of gradients, one for each element in xs.

source
AbstractDifferentiation.jacobianFunction
AD.jacobian(ab::AD.AbstractBackend, f, xs...)

Compute the Jacobians of f with respect to the inputs xs using the backend ab.

The function returns a Tuple of Jacobians, one for each element in xs.

source
AbstractDifferentiation.hessianFunction
AD.hessian(ab::AD.AbstractBackend, f, x)

Compute the Hessian of f with respect to the input x using the backend ab.

The function returns a single matrix because hessian currently only supports a single input.

source
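
As a rough illustration of the functions above, the following sketch assumes that ForwardDiff is installed and uses AD.ForwardDiffBackend. The test functions and input values are made up for this example; only the AD calls come from the docstrings above.

```julia
import AbstractDifferentiation as AD
using ForwardDiff  # loading ForwardDiff makes AD.ForwardDiffBackend usable

backend = AD.ForwardDiffBackend()

# Scalar input: the result is a tuple with one derivative per input.
ds = AD.derivative(backend, sin, 1.0)

# Array input, scalar output: a tuple with one gradient per input.
f(x) = log(sum(exp, x))
gs = AD.gradient(backend, f, [1.0, 2.0, 3.0])

# Array input, array output: a tuple with one Jacobian per input.
Js = AD.jacobian(backend, x -> x .^ 2, [1.0, 2.0, 3.0])

println(ds[1])
println(gs[1])
println(Js[1])
```

Because every result is a tuple, even a single-input call is indexed with [1] (or destructured) to reach the derivative itself.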

Value and derivatives

The following list of functions can be used to request the function value along with its derivative, gradient, Jacobian or Hessian. You can also request the function value, its gradient and Hessian for single-input functions.

AbstractDifferentiation.value_and_derivativeFunction
AD.value_and_derivative(ab::AD.AbstractBackend, f, xs::Number...)

Return the tuple (v, ds) of the function value v = f(xs...) and the derivatives ds = AD.derivative(ab, f, xs...).

See also AbstractDifferentiation.derivative.

source
AbstractDifferentiation.value_and_gradientFunction
AD.value_and_gradient(ab::AD.AbstractBackend, f, xs...)

Return the tuple (v, gs) of the function value v = f(xs...) and the gradients gs = AD.gradient(ab, f, xs...).

See also AbstractDifferentiation.gradient.

source
AbstractDifferentiation.value_and_jacobianFunction
AD.value_and_jacobian(ab::AD.AbstractBackend, f, xs...)

Return the tuple (v, Js) of the function value v = f(xs...) and the Jacobians Js = AD.jacobian(ab, f, xs...).

See also AbstractDifferentiation.jacobian.

source
AbstractDifferentiation.value_and_hessianFunction
AD.value_and_hessian(ab::AD.AbstractBackend, f, x)

Return the tuple (v, H) of the function value v = f(x) and the Hessian H = AD.hessian(ab, f, x).

See also AbstractDifferentiation.hessian.

source
AbstractDifferentiation.value_gradient_and_hessianFunction
AD.value_gradient_and_hessian(ab::AD.AbstractBackend, f, x)

Return the tuple (v, g, H) of the function value v = f(x), the gradient g = AD.gradient(ab, f, x), and the Hessian H = AD.hessian(ab, f, x).

See also AbstractDifferentiation.gradient and AbstractDifferentiation.hessian.

source
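
A minimal sketch of requesting the value together with the gradient, again assuming ForwardDiff is installed. The function f and the input x are illustrative choices, not part of the package.

```julia
import AbstractDifferentiation as AD
using ForwardDiff  # needed for AD.ForwardDiffBackend

backend = AD.ForwardDiffBackend()
f(x) = log(sum(exp, x))
x = [1.0, 2.0, 3.0]

# v is the function value; gs is a tuple with one gradient per input.
v, gs = AD.value_and_gradient(backend, f, x)

println(v)
println(gs[1])
```

Requesting both at once lets a backend reuse work from the primal evaluation instead of calling f a second time, although whether it actually does so depends on the backend.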

Jacobian-vector products

This operation goes by a few names, like "pushforward". Refer to the ChainRules documentation for more on terminology. For a single input, single output function f with a Jacobian J, the pushforward operator pf_f is equivalent to applying the function v -> J * v on a (tangent) vector v.

The following functions can be used to request a function that returns the pushforward operator/function. In order to request the pushforward function pf_f of a function f at the inputs xs, you can use either of:

AbstractDifferentiation.pushforward_functionFunction
AD.pushforward_function(ab::AD.AbstractBackend, f, xs...)

Return the pushforward function pf of the function f at the inputs xs using backend ab.

The pushforward function pf accepts as input a Tuple of tangents, one for each element in xs. If xs consists of a single element, pf can also accept a single tangent instead of a 1-tuple.

source
AbstractDifferentiation.value_and_pushforward_functionFunction
AD.value_and_pushforward_function(ab::AD.AbstractBackend, f, xs...)

Return a function that, given tangents ts, computes the tuple (v, p) of the function value v = f(xs...) and the output p of the pushforward function AD.pushforward_function(ab, f, xs...) applied to ts.

See also AbstractDifferentiation.pushforward_function.

source
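
The relation between the pushforward and v -> J * v can be checked with a small sketch. It assumes ForwardDiff is installed; the function g, the point x and the tangent v are hypothetical. The shape of the returned value (bare array or 1-tuple) is treated as an assumption here, so the example unwraps it defensively.

```julia
import AbstractDifferentiation as AD
using ForwardDiff  # needed for AD.ForwardDiffBackend

backend = AD.ForwardDiffBackend()
g(x) = x .^ 2          # the Jacobian at x is diagonal with entries 2x
x = [1.0, 2.0, 3.0]
v = [1.0, 0.0, 0.0]    # tangent direction

pf = AD.pushforward_function(backend, g, x)
out = pf((v,))         # tangents are passed as a tuple, one per input

# Assumption: the result may be wrapped in a 1-tuple, so unwrap if needed.
jvp = out isa Tuple ? only(out) : out

# Reference value from the explicit Jacobian.
J = AD.jacobian(backend, g, x)[1]
println(jvp ≈ J * v)
```

The pushforward computes the product without materializing J, which matters when the input dimension is large.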

Vector-Jacobian products

This operation goes by a few names, like "pullback". Refer to the ChainRules documentation for more on terminology. For a single input, single output function f with a Jacobian J, the pullback operator pb_f is equivalent to applying the function v -> v' * J on a (co-tangent) vector v.

The following functions can be used to request the pullback operator/function with or without the function value. In order to request the pullback function pb_f of a function f at the inputs xs, you can use either of:

AbstractDifferentiation.pullback_functionFunction
AD.pullback_function(ab::AD.AbstractBackend, f, xs...)

Return the pullback function pb of the function f at the inputs xs using backend ab.

The pullback function pb accepts as input a Tuple of cotangents, one for each output of f. If f has a single output, pb can also accept a single input instead of a 1-tuple.

source
AbstractDifferentiation.value_and_pullback_functionFunction
AD.value_and_pullback_function(ab::AD.AbstractBackend, f, xs...)

Return a function that, given cotangents ts, computes the tuple (v, p) of the function value v = f(xs...) and the output p of the pullback function AD.pullback_function(ab, f, xs...) applied to ts.

See also AbstractDifferentiation.pullback_function.

source
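
Similarly, the pullback can be compared against v' * J. The sketch below assumes ForwardDiff is installed and reuses it only to keep the example to one dependency; a reverse-mode backend is usually the more natural choice for vector-Jacobian products. The names g, x and w are hypothetical, and the return shape is again unwrapped defensively as an assumption.

```julia
import AbstractDifferentiation as AD
using ForwardDiff  # needed for AD.ForwardDiffBackend

backend = AD.ForwardDiffBackend()
g(x) = x .^ 2          # single array output
x = [1.0, 2.0, 3.0]
w = [1.0, 1.0, 1.0]    # cotangent for the single output of g

pb = AD.pullback_function(backend, g, x)
out = pb(w)            # g has one output, so a single cotangent is passed

# Assumption: the result holds one entry per input, so unwrap a 1-tuple.
vjp = out isa Tuple ? only(out) : out

# Reference value from the explicit Jacobian.
J = AD.jacobian(backend, g, x)[1]
println(vjp ≈ J' * w)
```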

Lazy operators

You can also get a struct for the lazy derivative/gradient/Jacobian/Hessian of a function. You can then use the * operator to apply the lazy operator on a value or tuple of the correct shape. To get a lazy derivative/gradient/Jacobian/Hessian use any one of:

AbstractDifferentiation.lazy_derivativeFunction
AD.lazy_derivative(ab::AbstractBackend, f, xs::Number...)

Return an operator ld for multiplying by the derivative of f at xs.

You can apply the operator by multiplication e.g. ld * y where y is a number if f has a single input, a tuple of the same length as xs if f has multiple inputs, or an array of numbers/tuples.

source
AbstractDifferentiation.lazy_gradientFunction
AD.lazy_gradient(ab::AbstractBackend, f, xs...)

Return an operator lg for multiplying by the gradient of f at xs.

You can apply the operator by multiplication e.g. lg * y where y is a number if f has a single input or a tuple of the same length as xs if f has multiple inputs.

source
AbstractDifferentiation.lazy_jacobianFunction
AD.lazy_jacobian(ab::AbstractBackend, f, xs...)

Return an operator lj for multiplying by the Jacobian of f at xs.

You can apply the operator by multiplication e.g. lj * y or y' * lj where y is a number, vector or tuple of numbers and/or vectors. If f has multiple inputs, y in lj * y should be a tuple. If f has multiple outputs, y in y' * lj should be a tuple. Otherwise, it should be a scalar or a vector of the appropriate length.

source
AbstractDifferentiation.lazy_hessianFunction
AD.lazy_hessian(ab::AbstractBackend, f, x)

Return an operator lh for multiplying by the Hessian of the scalar-valued function f at x.

You can apply the operator by multiplication e.g. lh * y or y' * lh where y is a number or a vector of the appropriate length.

source

Index