[ENH] `TransformedDistribution` handling of index passing in subsetting #617

I noticed a few problems with `TransformedDistribution` subsetting. They look like issues in principle, not just bugs in the current implementation.

Comments on how to handle this would be appreciated.

Namely, consider a univariate `td = TransformedDistribution(distr, trafo, inv_trafo)`.

### Problem 1 - columns forgotten in scalar subsetting

Let's say `trafo` requires a specific column name to work.

If we do `td_subset = td.iat[0, 0]`, this produces a column-less scalar distribution.
However, whenever we pass data to `trafo` or `inv_trafo`, they require the column name, which the current implementation no longer stores in `td_subset`.

It seems we need to remember the column name when we subset to scalar.
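
To make the failure mode concrete, here is a minimal sketch using plain `pandas` as a stand-in. It does not use `TransformedDistribution` itself; `trafo`, `rewrap`, and the column name `"y"` are hypothetical names chosen for illustration. It shows that scalar subsetting discards the column label, and that remembering the label lets us rebuild an input the transformation accepts.

```python
import numpy as np
import pandas as pd


def trafo(X: pd.DataFrame) -> pd.DataFrame:
    """Hypothetical transformation that looks up its input by column name."""
    return pd.DataFrame({"y": np.exp(X["y"])}, index=X.index)


samples = pd.DataFrame({"y": [0.0, 1.0, 2.0]})

# Scalar subsetting returns a bare number: index and column labels are gone.
scalar = samples.iat[0, 0]

# Passing the bare scalar to trafo fails, since there is no "y" column to look up.
failure = None
try:
    trafo(scalar)
except Exception as err:  # the exact exception type is not important here
    failure = type(err).__name__


def rewrap(value, column, index_label):
    """Rebuild a 1x1 frame from a scalar plus the remembered labels (assumed helper)."""
    return pd.DataFrame({column: [value]}, index=[index_label])


# If the subset remembers its labels, the transformation works again.
remembered_column = samples.columns[0]
remembered_index = samples.index[0]
restored = trafo(rewrap(scalar, remembered_column, remembered_index))
result = restored.iat[0, 0]
```

In this sketch the labels are kept alongside the scalar; whether the real fix should store them on the subsetted distribution or pass them some other way is exactly the open design question.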

### Problem 2 - multivariate transformations

Let's now consider `td = TransformedDistribution(distr, trafo, inv_trafo)` of `shape` (3, 2), i.e., two columns.

If we subset to a single column, applying `trafo` or `inv_trafo` to (3, 1) objects no longer works, since both require an (n, 2) object as per the `sklearn` contract.

So the multivariate transformation case is even worse: we seem to need to keep a copy of the entire original distribution in memory.
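
The shape mismatch can be reproduced with `sklearn` alone. The sketch below uses `StandardScaler` as an example transformer and a hypothetical helper `transform_column`; neither is taken from the `TransformedDistribution` code. The padding workaround shown is only valid for transformers that act on each column independently, which is an assumption that does not hold for general multivariate transformations.

```python
import numpy as np
from sklearn.preprocessing import StandardScaler

# Fit on a (3, 2) array, so the transformer expects two features from now on.
X = np.array([[1.0, 10.0], [2.0, 20.0], [3.0, 30.0]])
scaler = StandardScaler().fit(X)

# Subset to the second column only: shape (3, 1).
col = X[:, [1]]

# The fitted transformer rejects input with the wrong number of features.
error = None
try:
    scaler.transform(col)
except ValueError as err:
    error = str(err)


def transform_column(fitted, column_values, col_idx, n_features):
    """Embed one column in a full-width array, transform, then slice it back out.

    Assumed helper; only correct if the transformer treats columns independently.
    """
    padded = np.zeros((column_values.shape[0], n_features))
    padded[:, [col_idx]] = column_values
    return fitted.transform(padded)[:, [col_idx]]


out = transform_column(scaler, col, col_idx=1, n_features=X.shape[1])
```

For a transformation that mixes columns, zero padding would give wrong values, which is why the subset appears to need access to the full original distribution.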
