Refactor how skipped tests are defined #654

Complement was designed to test the Matrix Specification. However, the spec isn't static, and MSCs need tests too. A long time ago we introduced a build tag system so that users can opt in to running the tests for a given MSC. Such tests are defined like this:
```go
//go:build msc3391
// +build msc3391

package tests

// ... rest of code
```
and used like this:
```bash
go test -tags "msc3083 msc3787 msc3874" ./...
```
The set of tests that runs is therefore controlled by the user, which can be frustrating when the user's tag list gets out of sync with the tests: https://github.com/matrix-org/complement/issues/185

This wasn't the end of it though. Some APIs are specific to a single homeserver, and some servers don't implement core features whose tests are enabled by default. To handle this, we (ab)use the same build tag system. Tests that a server can opt out of are defined like this:
```go
//go:build !dendrite_blacklist
// +build !dendrite_blacklist

package tests

// ... rest of code
```
and used like this:
```bash
go test -tags "dendrite_blacklist" ./...
```

This was the status quo for a long time, until there was a need to blacklist _some_ tests in a file but not others. This led to the `runtime` package, which lets you skip a test in code rather than via conditional compilation. Such tests are defined like this:
```go
func TestRoomImageRoundtrip(t *testing.T) {
	runtime.SkipIf(t, runtime.Dendrite)
	// .. rest of test
}
```
The runtime knows which homeserver it is by virtue of the blacklist tag. E.g. running with `-tags "dendrite_blacklist"` would skip this test, even if the homeservers were Synapse! The code for this is:
```go
//go:build dendrite_blacklist
// +build dendrite_blacklist

package runtime

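// Imports and unrelated code from this file are elided: nothing in this
// excerpt uses them. Homeserver and Dendrite are declared elsewhere in
// package runtime.
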
func init() {
	Homeserver = Dendrite
}
```
This is clearly at odds with testing between heterogeneous homeservers. It has also been questioned whether it is the _test's_ job to say which servers it can and cannot run against. Surely it would be better if this were more tag-like, so that the user opts in to these tests?
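
To make the problem concrete, here is a minimal, self-contained sketch of the mechanism described above. It is not the actual `runtime` package: the lower-case `homeserver` variable, the `skipIf` helper and the string values of the constants are stand-ins chosen for illustration. The point it tries to show is that the skip decision depends only on a process-wide value fixed at build time, never on the servers actually deployed.

```go
package main

import "fmt"

// Stand-ins for the runtime package's homeserver identifiers.
// The string values are assumptions made for this sketch.
const (
	Dendrite = "dendrite"
	Synapse  = "synapse"
)

// homeserver plays the role of the package-level variable that init()
// sets when the binary is built with -tags "dendrite_blacklist".
var homeserver = Dendrite

// skipIf reports whether a test listing the given homeservers would be
// skipped. It consults only the build-time global above.
func skipIf(hses ...string) bool {
	for _, hs := range hses {
		if hs == homeserver {
			return true
		}
	}
	return false
}

func main() {
	// Hypothetical deployment: both servers under test are Synapse.
	deployed := []string{Synapse, Synapse}

	// The decision ignores `deployed` entirely, so a test marked
	// skipIf(Dendrite) is skipped even though no Dendrite is running.
	fmt.Printf("deployed=%v skip=%v\n", deployed, skipIf(Dendrite))
}
```

With a single global there is also no way to express "skip only when the second homeserver is Dendrite", which is what a heterogeneous deployment would need.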

We also have some other ideas for conditional execution of tests based on `/versions` output: https://github.com/matrix-org/complement/issues/549. Our forebear, sytest, would automatically skip some tests if an earlier test with a special `can_` property did not pass. This is problematic if the `can_` test is flaky!

From these use cases, there are some properties we want the test execution API to have:
 - Granular, but not too granular: MSC level is about right. Skipping entire files isn't great, as files are mostly arbitrary collections of tests.
 - The user should specify which tests to execute. It's a bit of a smell that Complement currently dictates this via `runtime.SkipIf`, and it means this repository gets touched far too often just to remove these lines as servers get updated.
 - The set of tests executed must not be nondeterministic (dependent on a server response or on an earlier test passing or failing), as that is far too unstable and flaky.

A high-level approach here could be:
 - One file per group of tests. Similar to `mscXXXX_test.go` files.
 - Make test names part of the public testing API.
 - Allow users to specify a set of groups / tests to run / not run.

This would allow us to remove `runtime.SkipIf` from the codebase, along with runtime detection, which is flaky and doesn't work with multi-HS support. It also removes the smell of having tests dictate what can and cannot run.
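
If groups map to files and test names become public API, the list of valid names could be generated from source rather than maintained by hand. The sketch below is one possible way to do that with the standard `go/parser` package; the helper names `listTests` and `groupOf`, and the convention that the group is the file name minus `_test.go`, are assumptions for illustration. Because the parser does not evaluate build constraints, files guarded by build tags are listed as well.

```go
package main

import (
	"fmt"
	"go/ast"
	"go/parser"
	"go/token"
	"path/filepath"
	"strings"
)

// groupOf derives a group name from a test file name, assuming the
// hypothetical convention "<group>_test.go".
func groupOf(filename string) string {
	return strings.TrimSuffix(filepath.Base(filename), "_test.go")
}

// listTests parses Go source and returns the names of top-level
// functions (not methods) whose names start with "Test".
func listTests(filename, src string) ([]string, error) {
	fset := token.NewFileSet()
	file, err := parser.ParseFile(fset, filename, src, 0)
	if err != nil {
		return nil, err
	}
	var names []string
	for _, decl := range file.Decls {
		fn, ok := decl.(*ast.FuncDecl)
		if !ok || fn.Recv != nil || !strings.HasPrefix(fn.Name.Name, "Test") {
			continue
		}
		names = append(names, fn.Name.Name)
	}
	return names, nil
}

// sample is a made-up test file; the test names in it are placeholders.
const sample = `//go:build msc3391

package tests

import "testing"

func TestFirstPlaceholder(t *testing.T)  {}
func helper()                            {}
func TestSecondPlaceholder(t *testing.T) {}
`

func main() {
	name := "tests/msc3391_test.go"
	tests, err := listTests(name, sample)
	if err != nil {
		panic(err)
	}
	fmt.Printf("group %s: %v\n", groupOf(name), tests)
}
```

A real generator would walk the test directories and write the result somewhere users can read it, in the same spirit as the generated env var documentation.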

### Proposal
 - Define a configuration format for specifying the list of groups / test names to run. The list of valid groups/tests must be easily discoverable (autogen from code, like we do with env vars?)
 - Read the configuration and skip accordingly. Always use `t.Skipf` instead of conditional compilation? This would be beneficial, as it is not uncommon for conditionally compiled code to drift from the core API (e.g. you alter a function signature in core, then patch up the tests that fail to compile; the `mscXXXX` files don't complain because the IDE sets no build tags by default).
 - This implies some kind of test helper/executor that runs before every test, which isn't ideal as it starts to obfuscate what code runs when you hit test. Could just add it as boilerplate though?
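
As a starting point for discussion, the proposal above could be sketched roughly as follows. Everything here is hypothetical: the `Selection` type, the comma-separated format with a leading `-` for exclusions, the rule that an empty include list means "run everything", and the rule that exclusions win over inclusions are all assumptions, not an agreed design. The `skipper` interface lists only methods that `*testing.T` already provides, so a real test could pass `t` directly.

```go
package main

import (
	"fmt"
	"strings"
)

// Selection is a hypothetical user-supplied choice of what to run.
type Selection struct {
	Include []string // group or test names to run; empty means everything
	Exclude []string // group or test names to skip; wins over Include
}

// ParseSelection parses an assumed format such as
// "msc3391,msc3787,-TestRoomImageRoundtrip".
func ParseSelection(spec string) Selection {
	var sel Selection
	for _, item := range strings.Split(spec, ",") {
		item = strings.TrimSpace(item)
		switch {
		case item == "" || item == "-":
			continue // ignore empty entries
		case strings.HasPrefix(item, "-"):
			sel.Exclude = append(sel.Exclude, item[1:])
		default:
			sel.Include = append(sel.Include, item)
		}
	}
	return sel
}

// ShouldSkip decides deterministically, from configuration alone,
// whether the test `name` in `group` is skipped, and gives a reason.
func (s Selection) ShouldSkip(group, name string) (bool, string) {
	for _, e := range s.Exclude {
		if e == group || e == name {
			return true, fmt.Sprintf("%q is excluded by configuration", e)
		}
	}
	if len(s.Include) == 0 {
		return false, ""
	}
	for _, i := range s.Include {
		if i == group || i == name {
			return false, ""
		}
	}
	return true, fmt.Sprintf("neither group %q nor test %q is included", group, name)
}

// skipper is the subset of *testing.T that Apply needs.
type skipper interface {
	Helper()
	Name() string
	Skipf(format string, args ...interface{})
}

// Apply is the one line of boilerplate a test would call first.
func (s Selection) Apply(t skipper, group string) {
	t.Helper()
	if skip, reason := s.ShouldSkip(group, t.Name()); skip {
		t.Skipf("skipping %s: %s", t.Name(), reason)
	}
}

func main() {
	sel := ParseSelection("msc3391, msc3787, -TestRoomImageRoundtrip")
	// The group names and the first test name are placeholders.
	for _, tc := range [][2]string{
		{"msc3391", "TestPlaceholder"},
		{"media", "TestRoomImageRoundtrip"},
	} {
		skip, reason := sel.ShouldSkip(tc[0], tc[1])
		fmt.Printf("%s/%s skip=%v %s\n", tc[0], tc[1], skip, reason)
	}
}
```

In a test, the boilerplate would be a single call such as `selection.Apply(t, "msc3391")` at the top of the function, where `selection` is loaded once from whatever configuration source is chosen. Because the decision uses only the configuration and the test's own identity, it stays independent of server responses and of earlier test results.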

