
Allow hooks to retry a single test case multiple times with fresh fixtures #12939

Open
bcmills opened this issue Nov 4, 2024 · 4 comments
Labels: topic: fixtures (anything involving fixtures directly or indirectly), type: proposal (proposal for a new feature, often to gather opinions or design the API around the new feature)


bcmills commented Nov 4, 2024

What's the problem this feature will solve?

Tests of APIs that rely on timer / timeout behaviors currently have to choose one (or both!) of {slow, flaky}:

  • If the test uses a short duration for the timeout, then sometimes — due to scheduling delays on the host OS, for example — something that needs to happen before the timeout fires doesn't happen, and the test has a flaky failure.
  • If the test uses a long duration for the timeout, then it ends up needing to sleep for some multiple of that long duration, and the test runs reliably but is extremely slow — say, 10s for a test function that could normally complete in <10ms.

I would like not to have to choose between those two: I want the test to run quickly, but to be retried automatically if the timeout turns out to be too short.

Describe the solution you'd like

Ideally, I would like to implement a pytest fixture that provides the current timeout value.
Each other test fixture that depends on it can then configure its own objects based on that timeout, and the test is run with those fixtures.
If the test passes, it passes overall and is done. If it fails, the fixtures are torn down, a new (longer) timeout is selected, and a fresh set of fixtures is created with the new timeout value.

This process should be iterated until either the test passes, or the selected timeout exceeds a configured maximum.

In particular:

  • Warnings and errors for runs on short timeouts should not be logged.
  • Different timeout values should not be considered as separate test case parameters, since they fundamentally represent only one underlying test case: “run with an appropriate dynamic timeout”.
  • Test fixtures must be recreated with each new timeout value, since a fixture may create an object that uses the timeout internally. (For example: a connection timeout on a networking library; a batch-delay timeout on an asynchronous-batching mechanism; a sleep timeout on a polling-based API.)
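
Roughly, the shape I have in mind is sketched below. The names (`dynamic_timeout`, `delayed_producer`) and the toy queue/timer setup are made up for illustration; the retry-with-fresh-fixtures step is exactly the part pytest has no hook for today.

```python
import queue
import threading

import pytest

TIMEOUTS = [0.01, 0.1, 1.0]  # candidate timeouts, shortest first


@pytest.fixture
def dynamic_timeout():
    # The missing piece: pytest (or a plugin hook) would re-run the test
    # with the next value from TIMEOUTS whenever an attempt fails, tearing
    # down and recreating every dependent fixture in between.  Today this
    # can only return a fixed value.
    return TIMEOUTS[0]


@pytest.fixture
def delayed_producer(dynamic_timeout):
    # Stand-in for a fixture whose object bakes the timeout in at
    # construction time (a connect timeout, a batch delay, ...); this is
    # why fixtures must be recreated rather than reused on each retry.
    q = queue.Queue()
    threading.Timer(dynamic_timeout / 2, q.put, args=("item",)).start()
    return q


def test_receives_before_timeout(delayed_producer, dynamic_timeout):
    # Flaky with a short timeout (OS scheduling may delay the producer),
    # reliable but slow with a long one: exactly the trade-off described
    # in the problem statement above.
    assert delayed_producer.get(timeout=dynamic_timeout) == "item"
```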

Examples of this pattern (in Go rather than Python) can be found in the Go project's net package:
https://github.com/search?q=repo%3Agolang%2Fgo+%2FrunTimeSensitiveTest%5C%28%2F&type=code

Unfortunately, I don't see a way to run a pytest test a variable number of times with fresh fixtures.

Alternative Solutions

One alternative is to move all objects that depend on the configured timeout outside of pytest fixtures and into the test function itself. That works, but it severely diminishes the value of pytest fixtures for the affected test.
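
For concreteness, here is a sketch of that workaround with the same toy example as above; the retry loop lives in the test body, so none of the timeout-dependent setup can come from fixtures.

```python
import queue
import threading


def test_receives_before_timeout_manual_retry():
    # All timeout-dependent objects are built inside the loop, bypassing
    # fixtures entirely: workable, but it forfeits fixture reuse, sharing,
    # and teardown handling for this test.
    last_error = None
    for timeout in (0.01, 0.1, 1.0):
        q = queue.Queue()
        threading.Timer(timeout / 2, q.put, args=("item",)).start()
        try:
            assert q.get(timeout=timeout) == "item"
            return  # passed on this attempt
        except queue.Empty as exc:
            last_error = exc  # timeout was too short; retry with a longer one
    raise last_error
```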

Another alternative is to design all objects in the hierarchy so that their timeouts can be reconfigured on the fly, and use a single set of fixtures for all attempts. Unfortunately, if I use any third-party libraries, that may force me to rely on implementation details to monkey-patch their timeout configuration, and even that isn't always possible.

bcmills changed the title from "Allow hooks to yield" to "Allow hooks to retry a single test case multiple times" on Nov 4, 2024
bcmills changed the title from "Allow hooks to retry a single test case multiple times" to "Allow hooks to retry a single test case multiple times with fresh fixtures" on Nov 4, 2024
The-Compiler (Member) commented

  • If the test uses a long duration for the timeout, then it ends up needing to sleep for some multiple of that long duration, and the test runs reliably but is extremely slow — say, 10s for a test function that could normally complete in <10ms.

You lost me there. Why does it need to sleep? That's not how timeouts usually work, no? I don't see the difference between running a test once with a 10s timeout, vs. running it with a 1s + 2s + 3s + 4s timeout. For at least your "a connection timeout on a networking library" example, the test will finish as soon as the server answers, and I'd argue that for many other cases the first thing to attempt is to make it work that way as well (e.g. with a polling-based API, you might still want to poll every 0.1s or something, but time out after, say, 50 attempts).
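
For illustration, the polling shape I mean looks something like this (hypothetical helper, not a pytest API; `condition` is whatever the test is waiting for):

```python
import time


def wait_for(condition, interval=0.1, max_attempts=50):
    # Poll at a fixed cadence but bound the number of attempts: the test
    # returns as soon as the condition holds, and only the worst case
    # (a genuine failure) pays the full interval * max_attempts wait.
    for _ in range(max_attempts):
        if condition():
            return True
        time.sleep(interval)
    return False
```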

FWIW, there's pytest-rerunfailures that recreates fixtures, and seems to have a way to access the .execution_count on the test item.
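
A rough sketch of how that might be combined with a timeout ladder (assuming pytest-rerunfailures' undocumented `execution_count` attribute behaves as I think it does; the `TIMEOUTS` list is made up):

```python
import pytest

TIMEOUTS = [0.01, 0.1, 1.0]  # shortest first


@pytest.fixture
def dynamic_timeout(request):
    # pytest-rerunfailures sets execution_count on the test item (starting
    # at 1 and incremented on each rerun, as far as I can tell), so a rerun
    # can pick the next, longer timeout.  Run with e.g. `pytest --reruns 2`.
    attempt = getattr(request.node, "execution_count", 1)
    return TIMEOUTS[min(attempt - 1, len(TIMEOUTS) - 1)]
```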

There's various open issues around exposing an API around fixtures (#12630, #12376, ...), and what you describe in particular sounds a lot like a duplicate of #12596 to me.

bcmills (Author) commented Nov 4, 2024

You lost me there. Why does it need to sleep? That's not how timeouts usually work, no?

This is for testing the cases where a call internal to the test intentionally does time out, not the case where the test itself exceeds its intended running time.

For at least your "a connection timeout on a networking library" example, the test will finish as soon as the server answers

No, you have it backwards. This is for the cases where we want the server not to answer in time.

Failure modes also need to be tested!

bcmills (Author) commented Nov 4, 2024

FWIW, there's pytest-rerunfailures that recreates fixtures, and seems to have a way to access the .execution_count on the test item.

Looks like that one also relies on undocumented implementation details:
https://github.com/pytest-dev/pytest-rerunfailures/blob/a53b9344c0d7a491a3cc53d91c7319696651d21b/src/pytest_rerunfailures.py#L499

bcmills (Author) commented Nov 4, 2024

what you describe in particular sounds a lot like a duplicate of #12596 to me.

Yep, that does seem similar! The key difference there, I think, is that they want to run the test until it fails, whereas I want to run it until it succeeds and discard the failure logs — but those parts might already be possible if the fixture-reset problem is addressed.

Zac-HD added the type: proposal and topic: fixtures labels on Dec 24, 2024