Property based testing is fantastic.

Why is it not more popular?

My theory is that only code written in functional languages has complex properties you can actually test.

In imperative programs, you might have a few utils that are appropriate for property testing - things like to_title_case(str) - but the bulk of program logic can only be tested imperatively with extensive mocking.

IanCal|3 months ago

I strongly disagree, but I think there are a few problems:

1. Lots of devs just don't know about it.

2. It's easy to do it wrong and try and reimplement the thing you're testing.

3. It's easy to try and redo all your testing like that and fail and then give up.

Here's some I've used:

Tabbing changes focus for UIs with more than 1 element. Shift tabbing the same number of times takes you back to where you came from.

This one on TVs with u/d/l/r nav -> if pressing a direction changes focus, pressing the opposite direction takes you back to where you came from.

An extension of the last -> regardless of the set of API calls used to make the user interface, the same is true.
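A minimal sketch of the d-pad rule above, using only Python's standard library `random` as a stand-in for a real property-testing library such as Hypothesis. The `move` function and the rectangular grid layout are assumptions made for illustration, not the actual UI code.

```python
import random

OPPOSITE = {"u": "d", "d": "u", "l": "r", "r": "l"}
DELTA = {"u": (-1, 0), "d": (1, 0), "l": (0, -1), "r": (0, 1)}


def move(pos, direction, rows, cols):
    """Hypothetical focus model: move one cell, or stay put at an edge."""
    r, c = pos
    dr, dc = DELTA[direction]
    nr, nc = r + dr, c + dc
    if 0 <= nr < rows and 0 <= nc < cols:
        return (nr, nc)
    return pos


def check_round_trip(trials=1000, seed=0):
    """If a press changes focus, the opposite press must restore it."""
    rng = random.Random(seed)
    for _ in range(trials):
        rows, cols = rng.randint(1, 6), rng.randint(1, 6)
        start = (rng.randrange(rows), rng.randrange(cols))
        direction = rng.choice("udlr")
        after = move(start, direction, rows, cols)
        if after != start:  # the rule only applies when focus actually moved
            back = move(after, OPPOSITE[direction], rows, cols)
            assert back == start, (rows, cols, start, direction)
    return True
```

A plain grid passes trivially; the rule gets interesting once the layout has elements of different sizes, where "nearest element in that direction" is no longer symmetric.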

When finding text ABC in a larger string and getting back `Match(x, start, end)`, if I take the string and chop out string[start:end] then I get back exactly ABC. This failed because of a dotted I that when lowercased during a normalisation step resulted in two characters - so all the positions were shifted. Hypothesis found this and was able to give me a test like "find x in 'İx' -> fail".
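Roughly how that bug can arise, as a sketch rather than the original code. `find_naive` and `slice_matches` are hypothetical names, and comparing after lowercasing is an assumption about what "exactly ABC" means for a case-insensitive search.

```python
from typing import NamedTuple, Optional


class Match(NamedTuple):
    text: str
    start: int
    end: int


def find_naive(needle: str, haystack: str) -> Optional[Match]:
    """Buggy on purpose: positions are computed on the normalised string
    but reported as if they were positions in the original string."""
    normalised = haystack.lower()
    target = needle.lower()
    index = normalised.find(target)
    if index < 0:
        return None
    return Match(needle, index, index + len(target))


def slice_matches(needle: str, haystack: str) -> bool:
    """The property: slicing the original string at the reported
    positions gives back the needle (up to case)."""
    match = find_naive(needle, haystack)
    if match is None:
        return True  # nothing was found, so there is nothing to check
    return haystack[match.start:match.end].lower() == needle.lower()
```

`slice_matches("x", "abcx")` holds, but `slice_matches("x", "İx")` does not: in Python, `"İ".lower()` is two code points long, so every later position is off by one.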

No input to the API should result in a 500 error. N PUT requests, where N > 0, result in one item created.
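A sketch of the PUT rule, assuming the N requests all target the same resource id (my reading of the rule). `InMemoryApi` is a hypothetical stand-in for the real service, and the standard library's `random` plays the role of a property-testing library.

```python
import random


class InMemoryApi:
    """Hypothetical stand-in for the API: PUT creates or replaces by id."""

    def __init__(self):
        self.items = {}

    def put(self, item_id, body):
        created = item_id not in self.items
        self.items[item_id] = body
        return 201 if created else 200


def check_put_idempotent(trials=500, seed=0):
    rng = random.Random(seed)
    for _ in range(trials):
        api = InMemoryApi()
        n = rng.randint(1, 20)  # N > 0
        item_id = rng.randint(0, 10**6)
        body = {"value": rng.random()}
        statuses = [api.put(item_id, body) for _ in range(n)]
        assert all(status < 500 for status in statuses)  # never a server error
        assert len(api.items) == 1  # N PUTs, one item
    return True
```

Against a real service the same loop would issue HTTP requests and then count items through a list endpoint.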

Clicking around the application should not result in a 404 page or error page.

Overall I think there's lots of wider things you can check, because we should have UIs and tools that give simple rules and guarantees to users.

chamomeal|3 months ago

I think testing culture in general is suffering because the most popular styles/runtimes don’t support it easily.

Most apps (at least in my part of the world) these days are absolutely peppered with side effects. At work our code is mostly just endpoints that trigger tons of side effects, then return some glob of data returned from some of those effects. The joys of micro services!!

If you’re designing from the ground up with testing in mind, you can make things super testable. Defer the actual execution of side effects. Group them together and move local biz logic to a pure function. But when you have a service that’s just a 10,000 line tangle of reading and writing to queues, databases and other services, it’s really hard to do ANY kind of testing.
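One way the "defer the side effects" idea can look, as a rough sketch. All names here (`SaveOrder`, `PublishEvent`, `plan_checkout`, `run_effects`) are made up for illustration, and the dict-backed `db` and `queues` stand in for real infrastructure.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class SaveOrder:
    order_id: str
    total_cents: int


@dataclass(frozen=True)
class PublishEvent:
    queue: str
    payload: str


def plan_checkout(order_id, prices_cents, discount_percent):
    """Pure business logic: returns descriptions of effects, performs none."""
    subtotal = sum(prices_cents)
    total = subtotal * (100 - discount_percent) // 100
    return [
        SaveOrder(order_id, total),
        PublishEvent("orders", f"{order_id}:{total}"),
    ]


def run_effects(effects, db, queues):
    """Thin imperative shell: the only place side effects actually happen."""
    for effect in effects:
        if isinstance(effect, SaveOrder):
            db[effect.order_id] = effect.total_cents
        elif isinstance(effect, PublishEvent):
            queues.setdefault(effect.queue, []).append(effect.payload)
```

Because `plan_checkout` only returns data, it can be tested with plain values or generated inputs and no mocks; only `run_effects` needs an integration test.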

I think that’s why unit testing and full on browser based E2E testing are popular these days. Unit testing pretends the complexity isn’t there via mocks, and lets you get high test coverage to pass your 85% coverage requirement. Then the E2E tests actually test user stories.

I’m really hoping there’s a shift. There are SO many interesting and comprehensive testing strategies available that can give you such high confidence in your program. But it mostly feels like an afterthought. My job has 90% coverage requirements, but not a single person writes useful tests. We have like 10,000 unit tests literally just mocking functions and then spying on the mocked return.

For anybody wanting to see a super duper interesting use of property based testing, check out “Breaking the Bank with test contract”, a talk by Allen Rohner. He pretty much uses property based testing to verify that mocks of services behave identically to the actual services (for the purpose of the program) so that you can develop and test against those mocks. I’ve started implementing a shitty version of this at work, and it’s wicked cool!!
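The general shape of that idea, as I understand it, is to drive the real thing and the mock with the same random operations and compare every result. This is a standard-library sketch, not the API from the talk; both service classes are hypothetical, and the "real" one is just a second in-memory implementation.

```python
import random


class RealCounterService:
    """Hypothetical 'real' service: keeps an append-only log of rows."""

    def __init__(self):
        self._rows = []

    def add(self, key, amount):
        self._rows.append((key, amount))
        return self.get(key)

    def get(self, key):
        return sum(amount for k, amount in self._rows if k == key)


class MockCounterService:
    """Hypothetical mock: keeps running totals instead of a log."""

    def __init__(self):
        self._totals = {}

    def add(self, key, amount):
        self._totals[key] = self._totals.get(key, 0) + amount
        return self._totals[key]

    def get(self, key):
        return self._totals.get(key, 0)


def check_mock_matches_real(trials=300, seed=0):
    rng = random.Random(seed)
    for _ in range(trials):
        real, mock = RealCounterService(), MockCounterService()
        for _ in range(rng.randint(0, 30)):
            key = rng.choice("abc")
            if rng.random() < 0.5:
                amount = rng.randint(-5, 5)
                assert real.add(key, amount) == mock.add(key, amount)
            else:
                assert real.get(key) == mock.get(key)
    return True
```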

ibizaman|3 months ago

I actually used property testing very successfully to test a DB driver and a migration to another DB driver in Go. I wrote it up here: https://blog.tiserbox.com/posts/2024-02-27-stateful-property...

imiric|3 months ago

Thanks for sharing! Your article illustrates well the benefits of this approach.

One drawback I see is that property-based tests inevitably need to be much more complex than example-based ones. This means that bugs are much more likely, they're more difficult to maintain, etc. You do mention that it's a lot of code, but I wonder if the complexity is worth it in the long run. I suppose that since testing these scenarios any other way would be even more tedious and error-prone, the answer is "yes". But it's something to keep in mind.

DRMacIver|3 months ago

How popular do you want it to be?

The Python survey data (https://lp.jetbrains.com/python-developers-survey-2024/) holds pretty consistently at 4% of Python users saying they use it. That isn't as large as I'd like, but given that only 64% of people in the survey say they use testing at all, it isn't doing too badly, and I think it certainly falsifies the claim that Python programs don't have properties you can test.

iLemming|3 months ago

> Why is it not more popular?

Probably because it's difficult to build simplistic intuition around them in languages that don't have robust built-in mechanisms that enable that kind of thinking.

For example, gen-testing in Clojure feels simple and intuitive, but if I had to do something similar in, hmmm... I dunno, say Java, I wouldn't even know where to start: the code would inevitably be too verbose, mutable state would obscure invariants, and the required boilerplate alone would cloud the core idea. In Clojure, meanwhile, immutability-by-default feels transformative for gen-testing; data generation is first-class (generators are just values or functions); composition is natural; debugging is easier; invariants are verifiable; the FP mindset is already baked in, so it's easier to reason about stateful sequences/event streams; and there's far less ceremony overall.

But if I'm forced to write Java and see a good case for prop-based testing, I'd still probably try doing it. But only because I already acquired some intuition for it. That's why it is immensely rewarding to spend a year or more learning an FP language, even if you don't see it used at work or every day.

So, the honest answer to your question would probably be: "because FP languages are not more popular"

vrnvu|3 months ago

>> Why is it not more popular?

Property, fuzz, snapshot testing. Great tools that make software more correct and reliable.

The challenge for most developers is that they need to change how they design code and think about testing.

I’ve always said the hardest part of programming isn’t learning, it’s unlearning what you already know…

maweki|3 months ago

I think the core of the problem in property-based testing is that the property/specification needs to be quite simple compared to the implementation.

I did some property-based testing in Haskell and in some cases the implementation was the specification verbatim. So what properties should I test? It was clearer in cases where my function should be symmetric in its arguments, or where there is a neutral element, etc.
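Those two kinds of property are short to write down. A sketch in Python rather than Haskell, with `math.gcd` on non-negative integers standing in for the function under test (an assumption for illustration) and `random` in place of a property-testing library:

```python
import math
import random


def check_gcd_properties(trials=1000, seed=0):
    rng = random.Random(seed)
    for _ in range(trials):
        a = rng.randint(0, 10**6)
        b = rng.randint(0, 10**6)
        # Symmetric in its arguments.
        assert math.gcd(a, b) == math.gcd(b, a)
        # 0 is a neutral element for gcd on non-negative integers.
        assert math.gcd(a, 0) == a
    return True
```

Neither assertion restates how gcd is computed, which is what keeps the property simpler than the implementation.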

If the property is basically your specification, which (as the language is very expressive) is your implementation, then you're just going in circles.

chriswarbo|3 months ago

Yeah, reimplementing the solution just to have something to check against is a bad idea.

I find that most tutorials talk about "properties of function `foo`", whereas I prefer to think about "how is function `foo` related to other functions". Those relationships can be expressed as code, by plugging outputs of one function into arguments of another, or by sequencing calls in a particular order, etc. and ultimately making assertions. However, there will usually be gaps; filling in those gaps is what a property's inputs are for.

Another good source of properties is trying to think of ways to change an expression/block which are irrelevant. For example, when we perform a deletion, any edits made beforehand should be irrelevant; boom, that's a property. If something would filter out negative values, then it's a property that sprinkling negative values all over the place has no effect. And so on.
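Both of those "irrelevant change" properties, sketched with the standard library's `random` standing in for a property-testing library. `sum_non_negative`, `sprinkle_negatives` and `apply_ops` are hypothetical examples, not code from any real project.

```python
import random


def sum_non_negative(xs):
    """Example function that filters out negative values."""
    return sum(x for x in xs if x >= 0)


def sprinkle_negatives(xs, rng):
    """Return a copy of xs with negative values inserted at random places."""
    out = list(xs)
    for _ in range(rng.randint(0, 10)):
        out.insert(rng.randint(0, len(out)), rng.randint(-100, -1))
    return out


def apply_ops(ops):
    """Tiny key-value store driven by ('set' | 'delete', key, value) tuples."""
    store = {}
    for op, key, value in ops:
        if op == "set":
            store[key] = value
        elif op == "delete":
            store.pop(key, None)
    return store


def check_irrelevant_changes(trials=500, seed=0):
    rng = random.Random(seed)
    for _ in range(trials):
        # Sprinkling negative values all over the place has no effect.
        xs = [rng.randint(-50, 50) for _ in range(rng.randint(0, 20))]
        assert sum_non_negative(sprinkle_negatives(xs, rng)) == sum_non_negative(xs)

        # Edits made to a key before deleting it are irrelevant.
        base = [("set", rng.choice("ab"), rng.randint(0, 9))
                for _ in range(rng.randint(0, 5))]
        edits = [("set", "k", rng.randint(0, 9))
                 for _ in range(rng.randint(0, 5))]
        delete = [("delete", "k", None)]
        assert apply_ops(base + edits + delete) == apply_ops(base + delete)
    return True
```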

eru|3 months ago

But wouldn't that apply just as much to example based testing?