top | item 30724159

plainnoodles | 4 years ago

Python type hints are such a huge boon to the language.

I worked in what I would imagine was a pretty typical mature medium-size python codebase.

Usually it wasn't too hard to figure out what a type was. But sometimes a value was passed through several layers of functions under the same short variable name, and quite a few times I had to click around a lot to figure out what "f" was. I could infer it was some kind of file-ish thing, but whether it was a file object, a string, or some composite type that held one of the two as a member was sometimes difficult to determine, especially if the actual use of the parameter wasn't immediately obvious either.
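
As a minimal sketch of the "f" problem (the function names here are hypothetical, not from the codebase described above), compare an unannotated signature with one that says what kind of file-ish thing it expects:

```python
import io
from typing import IO


def count_lines_untyped(f):
    # Is f a path string, an open file, or a wrapper object?
    # Nothing in the signature says, so the reader has to go hunting.
    return sum(1 for _ in f)


def count_lines(f: IO[str]) -> int:
    # The hint states that f is an open text stream, not a path string.
    return sum(1 for _ in f)


stream = io.StringIO("a\nb\nc\n")
line_count = count_lines(stream)
```

Both functions do the same thing at runtime; the hint only documents intent and lets a type checker or editor flag a call that passes a path string instead of a stream.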

When we updated to python 3, I started leaving type hints both in old code, as I answered these questions for myself, and in new code, where I would otherwise have been creating new questions for my future self. And I noticed two things happened:

* annoyance went down as I fixed the most well-trod of these problems

* I became less averse to using slightly more complicated types

I think python pushes you to keep things really simple and not use custom types unless really necessary. This is, overall, good? I do think Java developers, for instance (of which I consider myself one), are quick to create datatypes that are basically just a simple wrapper around, or a 2-tuple of, collections or other primitives, and it can make the code a bit annoying to grok.

But the trouble is, in un-type-hinted python, I already start getting nervous about things like: [('2022-03-18', "something"), ('2022-03-19', "something else")]. And if your data content doesn't make it obvious (or at least somewhat guessable) what it is, it can make it hard to grok in a slightly worse way than the Java code would be.

In python 2 I'd usually make a namedtuple in these situations, but it often felt a bit weird, because I was reaching for it in lightweight situations only out of fear that they were becoming more complex.

But finally, in python 3, I feel like I'm generally happy with, in this order:

1. just use plain primitive types. no type hints. we all know what's in dates = ['2022-03-18'].

2. just use a type hint. I feel better about a Tuple[str, Dict[int, int]] if it's type hinted than not.

3. use a namedtuple. This puts names onto the fields, so maybe my Tuple[str, Dict[int, int]] becomes a MyEntry(token: str, settings: Dict[int, int]) or something.

4. use a fully-fledged custom data type class.
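
A rough sketch of those four steps side by side. `MyEntry` comes from the example above; `EntryStore` and the sample values are made up for illustration:

```python
from typing import Dict, List, NamedTuple, Tuple

# Step 1: plain primitives, no hint needed.
dates = ["2022-03-18", "2022-03-19"]

# Step 2: a type hint on an otherwise anonymous structure.
entry: Tuple[str, Dict[int, int]] = ("abc", {1: 10})


# Step 3: a typed namedtuple puts names on the fields.
class MyEntry(NamedTuple):
    token: str
    settings: Dict[int, int]


named = MyEntry(token="abc", settings={1: 10})


# Step 4: a full class once behaviour is attached to the data
# (EntryStore is a hypothetical example, not from the comment).
class EntryStore:
    def __init__(self) -> None:
        self._entries: List[MyEntry] = []

    def add(self, entry: MyEntry) -> None:
        self._entries.append(entry)

    def tokens(self) -> List[str]:
        return [e.token for e in self._entries]


store = EntryStore()
store.add(named)
```

Note that `named` is still a tuple underneath, so it unpacks and indexes exactly like the step 2 value; only the field names and the annotation are new.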

packetlost|4 years ago

I would recommend using `dataclasses.dataclass` over namedtuple. `namedtuple` is a factory that creates a brand-new class every time you call it, and instances compare as plain tuples, so instances of two unrelated namedtuple types with the same field values compare equal, which is usually not what you expect. With dataclasses you can type-annotate fields and generate various special methods just by passing args to `@dataclass`, so you get better safety, immutable objects (when using `frozen=True`), `__eq__` generation, `asdict` to serialize complex objects (recursively!), and a bunch of other great stuff.
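
A small sketch of the difference, using made-up types (`PointNT`, `SizeNT`, `Point`, `Size`, `Entry` are illustrative names only). One concrete comparison surprise is that namedtuple instances compare as plain tuples, while the `__eq__` generated by `@dataclass` also checks the class:

```python
from collections import namedtuple
from dataclasses import FrozenInstanceError, asdict, dataclass
from typing import Dict

PointNT = namedtuple("PointNT", ["x", "y"])
SizeNT = namedtuple("SizeNT", ["width", "height"])

# Tuple comparison ignores the type, so unrelated namedtuples can be equal.
nt_equal = PointNT(1, 2) == SizeNT(1, 2)


@dataclass(frozen=True)
class Point:
    x: int
    y: int


@dataclass(frozen=True)
class Size:
    width: int
    height: int


# The generated __eq__ only compares fields when both sides share a class.
dc_equal = Point(1, 2) == Size(1, 2)


@dataclass(frozen=True)
class Entry:
    token: str
    origin: Point
    settings: Dict[int, int]


entry = Entry("abc", Point(1, 2), {1: 10})

# asdict converts nested dataclasses recursively into plain dicts.
as_dict = asdict(entry)

# frozen=True makes attribute assignment raise FrozenInstanceError.
try:
    entry.token = "changed"  # type: ignore[misc]
    mutated = True
except FrozenInstanceError:
    mutated = False
```

Here `nt_equal` ends up `True` and `dc_equal` ends up `False`. `frozen=True` only blocks rebinding attributes; the `settings` dict itself can still be mutated in place.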

I come from a massive hard-realtime system codebase that's mostly in Python, with a lot of moving parts and complex data types. Type hints, `typing.Protocol`, and `dataclass` are all godsends for having any sort of sane, human-parseable structure to the code. Being able to navigate to type definitions with `gd`/ctrl + click is massively helpful.
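
For anyone who hasn't used `typing.Protocol` (Python 3.8+), here is a minimal sketch of how it combines with dataclasses. `Sensor`, `Thermometer`, `Constant`, and `average` are hypothetical names, not anything from the codebase mentioned above:

```python
from dataclasses import dataclass
from typing import List, Protocol


class Sensor(Protocol):
    # Any object with a matching read() method satisfies this protocol;
    # no inheritance from Sensor is required.
    def read(self) -> float: ...


@dataclass
class Thermometer:
    celsius: float

    def read(self) -> float:
        return self.celsius


@dataclass
class Constant:
    value: float

    def read(self) -> float:
        return self.value


def average(sensors: List[Sensor]) -> float:
    # A type checker accepts both classes here because of their shape.
    return sum(s.read() for s in sensors) / len(sensors)


mean = average([Thermometer(20.0), Constant(22.0)])
```

The protocol gives "go to definition" a real target for what would otherwise be an implicit duck-typed interface, and a static checker such as mypy can flag a class that is missing `read()`.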