mbell|4 years ago
Just last week I opened a PR to fix some tests that should not have been passing but were, due to an issue along these lines. The tests were making assertions about id columns from different tables, and despite the code being incorrect the tests passed because the sequence generators were clean and thus in sync: the order in which the test records were created happened to line up so that an id in one table matched the id in another.
So, I get the pain. But I'm not yet convinced it's worth a change.
Another option that I think isn't a bad approach is the default testing setup that Rails uses. Every test runs in a transaction, but the test database is also initially seeded with a bunch of realistic data (fixtures, in Rails lingo). This makes it impossible to write a test that assumes a clean database, while still starting every test from a known state.
baskethead|4 years ago
Truncating a table is extremely fast. Rolling back a transaction is very slow. If you're not seeing this then there's something wrong with your setup.
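The thread disagrees about which cleanup strategy is faster, so the honest answer is to measure on your own stack. A toy harness for doing that, using the stdlib sqlite3 module purely for self-containment (sqlite has no TRUNCATE, and its numbers will not match Postgres; treat this only as a template for the measurement, not as evidence either way):

```python
import sqlite3
import time

def timed(label, fn, repeats=200):
    # Run the cleanup strategy `repeats` times and report wall-clock time.
    start = time.perf_counter()
    for _ in range(repeats):
        fn()
    elapsed = time.perf_counter() - start
    print(f"{label}: {elapsed:.4f}s for {repeats} iterations")
    return elapsed

conn = sqlite3.connect(":memory:", isolation_level=None)  # autocommit mode
conn.execute("CREATE TABLE t (x INTEGER)")

def truncate_style():
    # Commit a write, then clear the table; DELETE without WHERE is
    # sqlite's closest equivalent of TRUNCATE.
    conn.execute("INSERT INTO t VALUES (1)")
    conn.execute("DELETE FROM t")

def rollback_style():
    # Do the same write inside a transaction and roll it back instead.
    conn.execute("BEGIN")
    conn.execute("INSERT INTO t VALUES (1)")
    conn.rollback()

timed("truncate-style", truncate_style)
timed("rollback-style", rollback_style)
```

Pointing the same two functions at a real Postgres test database (via a driver of your choice) is what would actually settle the question for your setup.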
amirkdv|4 years ago
This idea of "transaction rollback in test teardown, because performance" has a life of its own. The recommended base class for Django unit tests (informally recommended, via code comments, not actual docs) uses transaction rollbacks instead of table truncation [0].
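The rollback-teardown idea can be sketched without Django at all. A minimal version using the stdlib sqlite3 module (the class and method names here are my own, not Django's): open a transaction in setup, let the test write freely, and roll everything back in teardown so the next test sees the original state.

```python
import sqlite3

class RollbackTestCase:
    """Each test runs inside a transaction that teardown rolls back."""

    def __init__(self, conn):
        self.conn = conn

    def setup(self):
        # Open an explicit transaction; everything the test writes
        # stays uncommitted.
        self.conn.execute("BEGIN")

    def teardown(self):
        # Undo every write the test made, restoring the seeded state.
        self.conn.rollback()

# Seed a tiny schema outside any test transaction.
conn = sqlite3.connect(":memory:", isolation_level=None)  # autocommit mode
conn.execute("CREATE TABLE users (id INTEGER PRIMARY KEY, email TEXT)")

tc = RollbackTestCase(conn)
tc.setup()
conn.execute("INSERT INTO users (email) VALUES ('a@example.com')")
assert conn.execute("SELECT COUNT(*) FROM users").fetchone()[0] == 1
tc.teardown()
# The insert is gone; the next test starts from the seeded state.
assert conn.execute("SELECT COUNT(*) FROM users").fetchone()[0] == 0
```

The catch, as other comments note, is that the code under test can never exercise its own commit behavior this way.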
On top of this, I think whole-database truncation sometimes gets mixed up with table truncation too. For example, from OP:
> The time taken to clean the database is usually proportional to the number of tables
... only if you're truncating the whole db and re-initializing the schema, no?
And people sometimes actually do clear the whole db between tests! One unfortunate reason is functionally necessary data migrations mixed in with schema-producing migrations, which means truncating tables doesn't take you back to "zero".
[0]: https://docs.djangoproject.com/en/2.2/_modules/django/test/t...
jeltz|4 years ago
mbell|4 years ago
Truncation-based cleaning is extremely slow, not only because the cleaning itself is slower, but because you actually have to commit everything your test code does.
pydry|4 years ago
I had a few other ideas to speed it up, also.
JoshTriplett|4 years ago
commandlinefan|4 years ago
Using a database at all in unit tests is horrifically slow - one of the (many) reasons you shouldn’t.
vlovich123|4 years ago
Empty databases should generally start quickly unless there’s some distributed consensus that’s happening (and even then, it’s all on a local machine…). You also don’t even need to tear it down all the way - just drop all tables.
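A minimal sketch of the "just drop all tables" reset, again using sqlite3 for self-containment (against Postgres you would enumerate tables from `information_schema.tables` or `pg_tables` instead of `sqlite_master`):

```python
import sqlite3

def drop_all_tables(conn):
    # Enumerate user tables from sqlite's catalog, then drop each one.
    # Materialize the list first so we don't mutate while iterating.
    tables = [row[0] for row in conn.execute(
        "SELECT name FROM sqlite_master WHERE type = 'table'")]
    for name in tables:
        conn.execute(f'DROP TABLE "{name}"')
    conn.commit()

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE a (x INTEGER)")
conn.execute("CREATE TABLE b (y INTEGER)")
drop_all_tables(conn)
remaining = conn.execute(
    "SELECT COUNT(*) FROM sqlite_master WHERE type = 'table'").fetchone()[0]
assert remaining == 0
```

The same connection can then re-run the schema migrations, which is typically much cheaper than tearing down and restarting the database server itself.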
emptysea|4 years ago
And tests hitting the database can be fast: https://www.brandur.org/nanoglyphs/029-path-of-madness
ramchip|4 years ago
An important trick when doing this is to respect unique constraints in fixtures. For instance if you have a users table with an email column as primary key, make the user fixture/factory generate a unique email each time ("user-1@example.com", "user-2@example.com", ...) Then you don't get slowdowns or deadlocks when many tests run in parallel.
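The unique-value trick can be as simple as a shared counter in the factory. A hypothetical `build_user` factory illustrating it (the names and fields here are invented for the example):

```python
import itertools

# A process-wide counter; every call to the factory gets a fresh number.
_counter = itertools.count(1)

def build_user(**overrides):
    # Generate unique values for any column with a unique constraint,
    # so parallel tests never collide on the same email.
    n = next(_counter)
    user = {"email": f"user-{n}@example.com", "name": f"User {n}"}
    user.update(overrides)
    return user

u1 = build_user()
u2 = build_user(name="Alice")
assert u1["email"] != u2["email"]  # no unique-constraint collision
```

For tests running in separate processes rather than threads, the counter would need a per-process component (e.g. mixing in the pid), since each process gets its own copy of `_counter`.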
srer|4 years ago
I notice in a VM on my laptop establishing the initial connection to postgres seems to take 2-3ms, and running a trivial query takes 300-1000us.
I routinely involve the database in unit tests. It is certainly slower, but my primary concern is the correct behavior of production code, which uses real databases.
ryanbrunner|4 years ago
goto11|4 years ago