TuringLang
diff --git a/.JuliaFormatter.toml b/.JuliaFormatter.toml
Lines changed: 1 addition & 1 deletion
diff --git a/HISTORY.md b/HISTORY.md
Lines changed: 6 additions & 0 deletions
diff --git a/Project.toml b/Project.toml
Lines changed: 1 addition & 1 deletion
diff --git a/README.md b/README.md
Lines changed: 28 additions & 13 deletions
diff --git a/docs/Project.toml b/docs/Project.toml
Lines changed: 1 addition & 0 deletions
diff --git a/docs/make.jl b/docs/make.jl
Lines changed: 7 additions & 4 deletions
diff --git a/docs/src/advi.md b/docs/src/advi.md
Lines changed: 94 additions & 0 deletions
diff --git a/docs/src/defining.md b/docs/src/defining.md
Lines changed: 94 additions & 0 deletions
diff --git a/docs/src/distributions.md b/docs/src/distributions.md
Lines changed: 55 additions & 32 deletions
@@ -1,2 +1,2 @@
 style="blue"
-format_markdown=true
+format_markdown=true
@@ -1,3 +1,9 @@
+# 0.15.13
+
+Exports additional functionality from Bijectors.jl that should probably have been exported already, namely `ordered`, `isinvertible`, and `columnwise`.
+
+The docs have been thoroughly restructured.
+
 # 0.15.12
 
 Improved implementation of the Enzyme rule for `Bijectors.find_alpha`.
 
@@ -1,6 +1,6 @@
 name = "Bijectors"
 uuid = "76274a88-744f-5084-9051-94815aaf08c4"
-version = "0.15.12"
+version = "0.15.13"
 
 [deps]
 ArgCheck = "dce04be8-c92d-5529-be00-80e4d2c0e197"
 
@@ -1,20 +1,35 @@
 # Bijectors.jl
 
-[![Docs - Stable](https://img.shields.io/badge/docs-stable-blue.svg)](https://turinglang.github.io/Bijectors.jl/stable)
-[![Docs - Dev](https://img.shields.io/badge/docs-dev-blue.svg)](https://turinglang.github.io/Bijectors.jl/dev)
-[![Interface tests](https://github.com/TuringLang/Bijectors.jl/workflows/Interface%20tests/badge.svg?branch=main)](https://github.com/TuringLang/Bijectors.jl/actions?query=workflow%3A%22Interface+tests%22+branch%3Amain)
-[![AD tests](https://github.com/TuringLang/Bijectors.jl/workflows/AD%20tests/badge.svg?branch=main)](https://github.com/TuringLang/Bijectors.jl/actions?query=workflow%3A%22AD+tests%22+branch%3Amain)
+[![Documentation for latest stable release](https://img.shields.io/badge/docs-stable-blue.svg)](https://turinglang.github.io/Bijectors.jl)
+[![Documentation for development version](https://img.shields.io/badge/docs-dev-blue.svg)](https://turinglang.github.io/Bijectors.jl/dev)
+[![CI](https://github.com/TuringLang/Bijectors.jl/actions/workflows/CI.yml/badge.svg)](https://github.com/TuringLang/Bijectors.jl/actions/workflows/CI.yml)
 
-*A package for transforming distributions, used by [Turing.jl](https://github.com/TuringLang/Turing.jl).*
+Bijectors.jl implements functions for transforming random variables and probability distributions.
 
-Bijectors.jl implements both an interface for transforming distributions from Distributions.jl and many transformations needed in this context.
-This package is used heavily in the probabilistic programming language Turing.jl.
+A quick overview of some of the key functionality is provided below:
 
-See the [documentation](https://turinglang.github.io/Bijectors.jl) for more.
+```julia
+julia> using Bijectors;
+       dist = LogNormal()
+LogNormal{Float64}(μ=0.0, σ=1.0)
 
-## Do you want to contribute?
+julia> x = rand(dist)      # Constrained to (0, ∞)
+0.6471106974390148
 
-If you feel you have some relevant skills and are interested in contributing, please get in touch!
-You can find us in the #turing channel on the [Julia Slack](https://julialang.org/slack/) or [Discourse](https://discourse.julialang.org).
-If you're having any problems, please open a Github issue, even if the problem seems small (like help figuring out an error message).
-Every issue you open helps us to improve the library!
+julia> b = bijector(dist)  # This maps from (0, ∞) to ℝ
+(::Base.Fix1{typeof(broadcast), typeof(log)}) (generic function with 1 method)
+
+julia> y = b(x)            # Unconstrained value in ℝ
+-0.43523790570180304
+
+julia> # Returns the tuple (b(x), log-absolute determinant of the Jacobian at x).
+       with_logabsdet_jacobian(b, x)
+(-0.43523790570180304, 0.43523790570180304)
+```
+
+Please see the [documentation](https://turinglang.github.io/Bijectors.jl) for more information.
+
+## Get in touch
+
+If you have any questions, please feel free to [post on Julia Slack](https://julialang.slack.com/archives/CCYDC34A0) or [Discourse](https://discourse.julialang.org/).
+We also very much welcome GitHub issues or pull requests!
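Note on the README hunk above: the second element of the tuple returned by `with_logabsdet_jacobian(b, x)` is the negative of the first because, for the log map, the derivative is `1/x`, so the log-absolute Jacobian is `-log(x)`. A minimal plain-Julia sketch that checks this with a finite difference is given here. It assumes only Base, does not call Bijectors.jl, and all variable names are illustrative.

```julia
# Sample value copied from the README example; b is taken to be the plain log map.
x = 0.6471106974390148
y = log(x)                  # transformed value b(x)
ladj = -log(x)              # analytic log|d/dx log(x)| = log(1/x)

# Central finite-difference estimate of the derivative of log at x.
h = 1e-6
fd = (log(x + h) - log(x - h)) / (2h)

println((y, ladj))
println(log(abs(fd)))       # should agree with ladj up to finite-difference error
```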
@@ -1,6 +1,7 @@
 [deps]
 Bijectors = "76274a88-744f-5084-9051-94815aaf08c4"
 Documenter = "e30172f5-a6a5-5a46-863b-614d45cd2de4"
+ForwardDiff = "f6369f11-7733-5829-9624-2563aa707210"
 Functors = "d9f16b24-f501-4c13-a1f2-28368ffc5196"
 StableRNGs = "860ef19b-820b-49d6-a774-d7a799459cd3"
 
 
@@ -9,10 +9,13 @@ makedocs(;
     format=Documenter.HTML(),
     modules=[Bijectors],
     pages=[
-        "Home" => "index.md",
-        "Transforms" => "transforms.md",
-        "Distributions.jl integration" => "distributions.md",
-        "Examples" => "examples.md",
+        "index.md",
+        "interface.md",
+        "defining.md",
+        "distributions.md",
+        "types.md",
+        "advi.md",
+        "flows.md",
     ],
     checkdocs=:exports,
     doctest=false,
 
@@ -0,0 +1,94 @@
+# Example: Variational inference
+
+The real utility of `TransformedDistribution` becomes more apparent when using `transformed(dist, b)` for any bijector `b`.
+To get the transformed distribution corresponding to the `Beta(2, 2)`, we called `transformed(dist)` before.
+This is an alias for `transformed(dist, bijector(dist))`.
+Remember that `bijector(dist)` returns the constrained-to-unconstrained bijector for that particular `Distribution`.
+But we can of course construct a `TransformedDistribution` using different bijectors with the same `dist`.
+
+This is particularly useful in _Automatic Differentiation Variational Inference (ADVI)_.
+
+## Univariate ADVI
+
+An important part of ADVI is to approximate a constrained distribution, e.g. `Beta`, as follows:
+
+ 1. Sample `x` from a `Normal` with parameters `μ` and `σ`, i.e. `x ~ Normal(μ, σ)`.
+ 2. Transform `x` to `y` s.t. `y ∈ support(Beta)`, with the transform being a differentiable bijection with a differentiable inverse (a "bijector").
+
+This then defines a probability density with the same _support_ as `Beta`!
+Of course, it's unlikely that it will be the same density, but it's an _approximation_.
+
+Creating such a distribution can be done with `Bijector` and `TransformedDistribution`:
+
+```@example advi
+using Bijectors
+using StableRNGs: StableRNG
+rng = StableRNG(42)
+
+dist = Beta(2, 2)
+b = bijector(dist)                # (0, 1) → ℝ
+b⁻¹ = inverse(b)                  # ℝ → (0, 1)
+td = transformed(Normal(), b⁻¹)   # x ∼ 𝓝(0, 1) then b⁻¹(x) ∈ (0, 1)
+x = rand(rng, td)                 # ∈ (0, 1)
+```
+
+It's worth noting that `support(Beta)` is the _closed_ interval `[0, 1]`, while the constrained-to-unconstrained bijection, `Logit` in this case, is only well-defined as a map `(0, 1) → ℝ` for the _open_ interval `(0, 1)`.
+This is a mathematical fact rather than an implementation detail:
+`ℝ` is itself open, so no continuous bijection exists from a _closed_ interval to `ℝ`.
+But since the boundary of a closed interval has measure zero, this does not affect the resulting density, which has support on the entire real line.
+In practice, this means that
+
+```@example advi
+td = transformed(Beta())
+inverse(td.transform)(rand(rng, td))
+```
+
+will never result in `0` or `1` though any sample arbitrarily close to either `0` or `1` is possible.
+_Disclaimer: numerical accuracy is limited, so you might still see `0` and `1` if you're 'lucky'._
+
+## Multivariate ADVI example
+
+We can also do _multivariate_ ADVI using the `Stacked` bijector.
+`Stacked` gives us a way to combine univariate and/or multivariate bijectors into a single multivariate bijector.
+Say you have a vector `x` of length 2 and you want to transform the first entry using `Exp` and the second entry using `Log`.
+`Stacked` gives you an easy and efficient way of representing such a bijector.
+
+```@example advi
+using Bijectors: SimplexBijector
+
+# Original distributions
+dists = (Beta(), InverseGamma(), Dirichlet(2, 3))
+
+# Construct the corresponding ranges
+function make_ranges(dists)
+    ranges = []
+    idx = 1
+    for i in 1:length(dists)
+        d = dists[i]
+        push!(ranges, idx:(idx + length(d) - 1))
+        idx += length(d)
+    end
+    return ranges
+end
+
+ranges = make_ranges(dists)
+ranges
+```
+
+```@example advi
+# Base distribution; mean-field normal
+num_params = ranges[end][end]
+
+d = MvNormal(zeros(num_params), ones(num_params));
+
+# Construct the transform
+bs = bijector.(dists)       # constrained-to-unconstrained bijectors for dists
+ibs = inverse.(bs)          # invert, so we get unconstrained-to-constrained
+sb = Stacked(ibs, ranges)   # => Stacked <: Bijector
+
+# Mean-field normal with unconstrained-to-constrained stacked bijector
+td = transformed(d, sb)
+y = rand(td)
+```
+
+As can be seen from this, we now have a `y` for which `0.0 ≤ y[1] ≤ 1.0`, `0.0 < y[2]`, and `sum(y[3:4]) ≈ 1.0`.
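Note on the univariate ADVI construction in the hunk above: sampling `x ~ Normal(0, 1)` and mapping it through the inverse logit induces a proper density on `(0, 1)`. The sketch below writes that density out by hand via the change-of-variables formula and checks numerically that it integrates to one. It is plain Julia under stated assumptions: the helper names (`logit`, `normal_logpdf`, `induced_logpdf`) are illustrative and are not Bijectors.jl functions.

```julia
# Constrained-to-unconstrained map for the open unit interval.
logit(y) = log(y) - log1p(-y)                     # (0, 1) → ℝ

# Log-density of the standard normal base distribution.
normal_logpdf(x) = -0.5 * x^2 - 0.5 * log(2π)

# Change of variables: log p_Y(y) = log p_X(logit(y)) + log|d logit / dy|,
# where d logit / dy = 1 / (y * (1 - y)).
induced_logpdf(y) = normal_logpdf(logit(y)) - log(y) - log1p(-y)

# Midpoint-rule check that the induced density integrates to about 1 on (0, 1).
n = 100_000
h = 1 / n
total = sum(exp(induced_logpdf((i - 0.5) * h)) for i in 1:n) * h
println(total)
```

The density vanishes at both endpoints, which matches the remark in the hunk that exact `0` or `1` are not in the range of the transform.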
@@ -0,0 +1,94 @@
+# Defining a bijector
+
+This page describes the minimum expected interface to implement a bijector.
+
+In general, there are two pieces of information needed to define a bijector:
+
+ 1. The transformation itself, i.e., the map $b: \mathbb{R}^d \to \mathbb{R}^d$.
+
+ 2. The log-absolute determinant of the Jacobian of that transformation.
+    For a transformation $b: \mathbb{R}^d \to \mathbb{R}^d$, the Jacobian at point $x \in \mathbb{R}^d$ is defined as:
+    
+    $$J_{b}(x) = \begin{bmatrix}
+    \partial y_1/\partial x_1 & \partial y_1/\partial x_2 & \cdots & \partial y_1/\partial x_d \\
+    \partial y_2/\partial x_1 & \partial y_2/\partial x_2 & \cdots & \partial y_2/\partial x_d \\
+    \vdots & \vdots & \ddots & \vdots \\
+    \partial y_d/\partial x_1 & \partial y_d/\partial x_2 & \cdots & \partial y_d/\partial x_d
+    \end{bmatrix}$$
+    
+    where $y = b(x)$.
+
+## The transform itself
+
+The most efficient way to implement a bijector is to provide an implementation of:
+
+```@docs; canonical=false
+Bijectors.with_logabsdet_jacobian
+```
+
+If you define `with_logabsdet_jacobian(b, x)`, then you will automatically get default implementations of both `transform(b, x)` and `logabsdetjac(b, x)`, which respectively return the first and second value of that tuple.
+So, in fact, you can implement a bijector by defining only `with_logabsdet_jacobian`.
+
+If you prefer, you can implement `transform` and `logabsdetjac` separately, as described below.
+Having manual implementations of these may also be useful if you expect either to be used heavily without the other.
+
+### Transformation
+
+```@docs; canonical=false
+transform
+```
+
+If `transform(b, x)` is defined, then you will automatically get a default implementation of `b(x)` which calls that.
+
+### Log-absolute determinant of the Jacobian
+
+```@docs; canonical=false
+Bijectors.logabsdetjac
+```
+
+## Inverse
+
+Often you will want to define an inverse bijector as well.
+To do so, you will have to implement:
+
+```@docs; canonical=false
+Bijectors.inverse
+```
+
+If `b` is a bijector, then `inverse(b)` should return the inverse bijector $b^{-1}$.
+
+If your bijector subtypes `Bijectors.Bijector`, then you will get a default implementation of `inverse` which constructs `Bijectors.Inverse(b)`.
+This may be easier than creating a second type for the inverse bijector.
+Note that you will also need to implement the methods for `with_logabsdet_jacobian` (and/or `transform` and `logabsdetjac`) for the inverse bijector type.
+
+If your bijector is not invertible, you can specify this here:
+
+```@docs; canonical=false
+Bijectors.isinvertible
+```
+
+## Distributions
+
+If your bijector is intended for use with a distribution, i.e., it transforms random variables drawn from that distribution to Euclidean space, then you should also implement:
+
+```@docs; canonical=false
+Bijectors.bijector
+```
+
+which should return your bijector.
+
+On top of that, you should also implement a method for `Bijectors.output_size(b, dist::Distribution)`:
+
+```@docs; canonical=false
+Bijectors.output_size
+```
+
+## Closed-form
+
+If your bijector does _not_ have a closed-form expression (e.g. if it uses an iterative procedure), then this should be set to false:
+
+```@docs; canonical=false
+Bijectors.isclosedform
+```
+
+The default is `true` so you only need to set this if your bijector is not closed-form.
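Note on the Jacobian definition in the hunk above: for an elementwise affine map the Jacobian is diagonal, so its log-absolute determinant is a sum of logs. The sketch below returns a `(value, logabsdet)` pair in the same shape the page describes for `with_logabsdet_jacobian`, then cross-checks against the explicit matrix. It is a standalone illustration using only the `LinearAlgebra` standard library; `affine_with_logabsdet`, `a`, and `c` are hypothetical names and nothing here extends Bijectors.jl.

```julia
using LinearAlgebra: Diagonal, logabsdet

# Hypothetical example map b(x) = a .* x .+ c on ℝ³.
a = [2.0, -0.5, 3.0]
c = [1.0, 0.0, -1.0]

# Returns (b(x), log|det J_b(x)|). Here J_b(x) = Diagonal(a) for every x,
# so the log-absolute determinant is the sum of log|aᵢ|.
function affine_with_logabsdet(x)
    y = a .* x .+ c
    return y, sum(log.(abs.(a)))
end

x = [0.3, 1.2, -0.7]
y, ladj = affine_with_logabsdet(x)

# Cross-check against the explicit Jacobian matrix.
J = Matrix(Diagonal(a))
ladj_explicit, _ = logabsdet(J)
println((y, ladj, ladj_explicit))
```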
@@ -1,55 +1,78 @@
-## Basic usage
+# Usage with distributions
 
-Other than the `logpdf_with_trans` methods, the package also provides a more composable interface through the `Bijector` types. Consider for example the one from above with `Beta(2, 2)`.
+Bijectors provides many utilities for working with probability distributions.
 
-```julia
-julia> using Random;
-       Random.seed!(42);
+```@example distributions
+using Bijectors
+
+dist = LogNormal()
+x = rand(dist)
+b = bijector(dist)  # bijection (0, ∞) → ℝ
+
+y = b(x)
+```
 
-julia> using Bijectors;
-       using Bijectors: Logit;
+Here, `bijector(d::Distribution)` returns the corresponding constrained-to-unconstrained bijection for `LogNormal`, which is a log function.
+The resulting bijector can be called, just like any other function, to transform samples from the distribution to the unconstrained space.
 
-julia> dist = Beta(2, 2)
-Beta{Float64}(α=2.0, β=2.0)
+The function [`link`](@ref) provides a short way of doing the above:
 
-julia> x = rand(dist)
-0.36888689965963756
+```@example distributions
+link(dist, x) ≈ b(x)
+```
+
+See [the Turing.jl docs](https://turinglang.org/docs/developers/transforms/distributions/) for more information about how this is used in probabilistic programming.
+
+## Transforming distributions
 
-julia> b = bijector(dist) # bijection (0, 1) → ℝ
-Logit{Float64}(0.0, 1.0)
+We can also couple a distribution together with its bijector to create a _transformed_ `Distribution`, i.e. a `Distribution` defined by sampling from a given `Distribution` and then transforming using a given transformation:
 
-julia> y = b(x)
--0.5369949942509267
+```@example distributions
+dist = LogNormal()          # support on (0, ∞)
+tdist = transformed(dist)   # support on ℝ
 ```
 
-In this case we see that `bijector(d::Distribution)` returns the corresponding constrained-to-unconstrained bijection for `Beta`, which indeed is a `Logit` with `a = 0.0` and `b = 1.0`. The resulting `Logit <: Bijector` has a method `(b::Logit)(x)` defined, allowing us to call it just like any other function. Comparing with the above example, `b(x) ≈ link(dist, x)`. Just to convince ourselves:
+We can then sample from, and compute the `logpdf` for, the resulting distribution:
+
+```@example distributions
+y = rand(tdist)
+```
+
+```@example distributions
+logpdf(tdist, y)
+```
+
+We should expect here that
 
 ```julia
-julia> b(x) ≈ link(dist, x)
-true
+logpdf(tdist, y) ≈ logpdf(dist, x) - logabsdetjac(b, x)
 ```
 
-## Transforming distributions
+where `b = bijector(dist)` and `y = b(x)`.
 
-```@setup transformed-dist-simple
-using Bijectors
+To verify this, we can calculate the value of `x` using the inverse bijector:
+
+```@example distributions
+b = bijector(dist)
+binv = inverse(b)
+
+x = binv(y)
 ```
 
-We can create a _transformed_ `Distribution`, i.e. a `Distribution` defined by sampling from a given `Distribution` and then transforming using a given transformation:
+(Because `b` is just a log function, `binv` is an exponential function, i.e. `x = exp(y)`.)
 
-```@repl transformed-dist-simple
-dist = Beta(2, 2)      # support on (0, 1)
-tdist = transformed(dist) # support on ℝ
+Then we can check the equality:
 
-tdist isa UnivariateDistribution
+```@example distributions
+logpdf(tdist, y) ≈ logpdf(dist, x) - logabsdetjac(b, x)
 ```
 
-We can the then compute the `logpdf` for the resulting distribution:
+You can also use [`Bijectors.logpdf_with_trans`](@ref) with the original distribution:
 
-```@repl transformed-dist-simple
-# Some example values
-x = rand(dist)
-y = tdist.transform(x)
+```@example distributions
+logpdf_with_trans(dist, x, false) ≈ logpdf(dist, x)
+```
 
-logpdf(tdist, y)
+```@example distributions
+logpdf_with_trans(dist, x, true) ≈ logpdf(tdist, y)
 ```
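Note on the identity `logpdf(tdist, y) ≈ logpdf(dist, x) - logabsdetjac(b, x)` in the hunk above: for `LogNormal(0, 1)` with `b = log`, the transformed variable is standard normal, so the identity can be checked by hand. The sketch below does this in plain Julia. The helper names are illustrative stand-ins written out from the textbook densities, not calls into Bijectors.jl or Distributions.jl.

```julia
# Standalone log-densities (illustrative helpers).
normal_logpdf(y) = -0.5 * y^2 - 0.5 * log(2π)            # Normal(0, 1)
lognormal_logpdf(x) = normal_logpdf(log(x)) - log(x)     # LogNormal(0, 1) on (0, ∞)

x = 0.6471106974390148
y = log(x)                   # b(x) with b = log
logabsdetjac_b = -log(x)     # log|d/dx log(x)| = -log(x)

lhs = normal_logpdf(y)                        # log-density of the transformed variable at y
rhs = lognormal_logpdf(x) - logabsdetjac_b    # right-hand side of the identity
println((lhs, rhs))
```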