top | item 14891185

Falsehoods programmers believe about geography (2012)

62 points| AndreyKarpov | 8 years ago |wiesmann.codiferes.net | reply

57 comments

[+] kaikai|8 years ago|reply
I work on geocoding, so I have a ridiculous stash of these saved up for a talk some day. Here's a couple:

- Place names are listed smallest feature to largest feature (address, city, state, etc)

- There is one name (per language) for every place, and everyone agrees on it

- Place names will start with an alphabetic character (see: 's-Heer Arendskerke)

- Addresses start with a number, for each building

- Addresses include a number for each building

- Addresses exist for each building

- On streets with addresses, the addresses will be in order of location on street

- Speaking of streets: all streets have names

- Every country's place hierarchy is roughly similar, or at least has a similar number of layers

- A city cannot contain other cities

- Zipcodes have defined areas

- Zipcodes are areas

- Zipcodes are stationary areas

Here's some that are more Arg! Programming!:

- People will search for features in the language in their phone settings

- Two similar names separated by punctuation are likely to be synonyms (lookin' at you, Helena-West Helena)

- Governments use consistent projections

- Governments use consistent file formats (thanks for the SOSI, Norway) (no but really y'all make great maps, thank you)

I could go on. Even though there's a ton of inconsistencies and frustrating moments, it really is a joy to learn about the world through its idiosyncrasies.

(edited for formatting)

[+] kalleboo|8 years ago|reply
> - Every country's place hierarchy is roughly similar, or at least has a similar number of layers

It was always fun seeing packages arrive in Singapore with an address ending in "Singapore, Singapore, SINGAPORE" since every international shipping company assumes that city, state/prefecture, country are all different and required.

Ideally address input would just be one multiline text field, but unfortunately users are just too unreliable to be left to their own devices like that...

[+] macintux|8 years ago|reply

> - Zipcodes have defined areas

> - Zipcodes are areas

> - Zipcodes are stationary areas

Can you elaborate on these? I always assumed (thankfully not for any professional purposes) there was indeed an area assigned to each zip code.
[+] eastern|8 years ago|reply
Bangalore, a large city in Southern India, is probably the world leader in the proportion of layers that have embedded numbers. Here's an example:

ICICI BANK LTD, 273, 15TH CROSS, 20TH MAIN, JP NAGAR 5TH PHASE, BANGALORE - 560078

[+] sideshowb|8 years ago|reply
And that's missing all the coordinate stuff. Let's add

* FALSE: You don't need to know the difference between geographic and projected coordinate systems

* FALSE: There is only one geographical coordinate system and it's called WGS84

* FALSE: For any pair of coordinate systems there is one and only one way to transform data between them

(On the last one: https://blogs.esri.com/esri/arcgis/2009/05/06/about-geograph... )

[+] wcarey|8 years ago|reply
Another fun one:

* FALSE: A coordinate that looks like 15 45 38 is Degrees, Minutes, Seconds.

(Sometimes it's degrees, decimal minutes. Happily, the maps in question had some "seconds" values greater than 59, which tipped us off.)
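
A minimal sketch of the ambiguity, assuming the decimal-minutes variant writes the fraction of a minute as the third field (so `15 45 38` would mean 15° 45.38′). That reading and all function names here are assumptions for illustration, not details of the maps in question.

```python
def as_dms(text: str) -> float:
    """Read 'D M S' as degrees, minutes, seconds."""
    d, m, s = (int(part) for part in text.split())
    return d + m / 60 + s / 3600


def as_decimal_minutes(text: str) -> float:
    """Read 'D M F' as degrees and decimal minutes, with F as the fraction (assumed)."""
    d, m, frac = text.split()
    return int(d) + float(f"{m}.{frac}") / 60


def cannot_be_dms(values: list[str]) -> bool:
    """True if any third field exceeds 59, which rules out seconds."""
    return any(int(v.split()[2]) > 59 for v in values)
```

For `15 45 38` the two readings differ by about 0.004 degrees, which is a few hundred metres of latitude, so guessing wrong is not harmless.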

[+] nerdponx|8 years ago|reply
* FALSE: calculating the distance between points is straightforward and there is one way to do it.
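
A rough sketch of two of the many ways to do it, assuming a spherical Earth with a mean radius of 6371 km (itself a simplification; ellipsoidal methods give different answers again). The function names are made up for illustration.

```python
import math

EARTH_RADIUS_KM = 6371.0  # mean radius; the real Earth is not a sphere


def haversine_km(lat1: float, lon1: float, lat2: float, lon2: float) -> float:
    """Great-circle distance on a sphere."""
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    dphi = phi2 - phi1
    dlmb = math.radians(lon2 - lon1)
    a = math.sin(dphi / 2) ** 2 + math.cos(phi1) * math.cos(phi2) * math.sin(dlmb / 2) ** 2
    return 2 * EARTH_RADIUS_KM * math.asin(math.sqrt(a))


def naive_flat_km(lat1: float, lon1: float, lat2: float, lon2: float) -> float:
    """Pythagoras on raw degrees, ignoring that longitude lines converge."""
    km_per_degree = 2 * math.pi * EARTH_RADIUS_KM / 360
    return km_per_degree * math.hypot(lat2 - lat1, lon2 - lon1)
```

Along the equator the two agree. For two points 10 degrees of longitude apart at 60°N, the naive version reports roughly double the great-circle distance.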
[+] paganel|8 years ago|reply
> Buildings do not move

In communist Romania even churches moved, and quite by a long distance (200 meters or so). There's this Google Images link for moving churches (https://www.google.ro/search?client=firefox-b-ab&biw=1920&bi...) and this other one for moving buildings (https://www.google.ro/search?client=firefox-b-ab&biw=1920&bi...)

[+] freehunter|8 years ago|reply
I had a fast food chicken restaurant near me that moved about 30 miles once. Guess it was cheaper than building a new one.

It happens so rarely, though, I can't imagine putting in any specific code for a building having been moved. Idk.

[+] scaryclam|8 years ago|reply
Also, postcodes have a geographic area. Here in the UK, postcodes can be non-geographic: http://www.royalmail.com/sites/default/files/docs/pdf/11july...
[+] jmnicolas|8 years ago|reply
I remember trying to deliver an urgent parcel in London (I was coming from France) and discovering that the address numbers don't follow each other and are not organized by odd numbers on one side of the road and even numbers on the other side.

So you could have house number 200 between 15 and 42 ... of course I had to go the full length of the street before finding the right address. I remember feeling sympathy for English postmen back then ;-)

[+] taneq|8 years ago|reply
In Australia, postcodes have a many:many relationship with suburbs. A single postcode can cover multiple suburbs and a single suburb can have multiple postcodes.

I think postcodes are geographically contiguous but I'm probably wrong. :P
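
A small sketch of what a many:many relationship means for a data model: neither side can be a plain lookup key for the other. The postcodes and suburb names below are placeholders, not real Australian data.

```python
from collections import defaultdict

# Hypothetical (postcode, suburb) pairs, not real data.
PAIRS = [
    ("P1", "Suburb A"),
    ("P1", "Suburb B"),
    ("P2", "Suburb B"),
]

suburbs_by_postcode = defaultdict(set)
postcodes_by_suburb = defaultdict(set)
for postcode, suburb in PAIRS:
    suburbs_by_postcode[postcode].add(suburb)
    postcodes_by_suburb[suburb].add(postcode)

# Both directions return a set, never a single value.
```

A schema with a single `postcode` column on the suburb table (or the reverse) cannot represent this.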

[+] Sleeep|8 years ago|reply
I would add: All house numbers with the same street name in a town are on the same street.

My address is 11 XXX Street. The house next door to mine is 17 XXX Street. 13 and 15 XXX Street both exist, however they are on an identically named street on the other side of town with the same town name and zip code mailing address. Every single "in between" number on my entire street is on the doppelgänger street.

Not so much of a problem for programmers (yet!) but delivery/repair people sometimes get confused and end up on the wrong street.

[+] awkward|8 years ago|reply
FALSE: Place names are a neutral fact, and using one source without interrogating what it calls things and where it draws borders will never cause an incident.
[+] theBobBob|8 years ago|reply
Derry vs Londonderry being one. It's interesting how different publications have different style guide rules to handle it as best as possible. I can't remember which it was, but one had that it would always be called Londonderry first, then Derry for the rest of the article/piece.
[+] theBobBob|8 years ago|reply
All countries have a Postal Code / ZIP Code or equivalent. Ireland only very recently added them and called them Eircodes (because of course we did). It used to be slightly annoying having to find some combination of characters that the online form would accept, as they often varied site to site. Interestingly enough, several of these companies had their European headquarters in Ireland. I also had to talk to the bank manager because the cashier in a UK bank wouldn't let me open an account without giving a valid postcode for my Irish address.

EDIT: typos

[+] rmc|8 years ago|reply
And to increase confusion, Eircodes aren't areas. Each letter box gets its own Eircode. So Irish postcodes are points, and 2 Eircodes can have the same point. Also, neighbouring Eircodes aren't similar to each other: "D01 ABCD" can be beside "D01 87D4".
[+] bungie4|8 years ago|reply
We have some code here, not written by me!, which queries by Canadian Postal Code, of which there are several hundred thousand.

Apparently, dropping the last character and requerying is the equivalent of a proximity search!

This in itself is laughable. But the company sent the code back to the creators to have the ability to query by city added.

Guess what. Apparently if you drop the last letter of a city, it's the equivalent of doing a proximity search!

I must be totally turned around on this.
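
To illustrate why the trick does not carry over to city names: dropping trailing characters is a prefix search. A prefix of a hierarchical code can loosely correlate with location, but a prefix of a name only correlates with spelling. The helper and the city list below are made up for this sketch and are not the code described above.

```python
def prefix_matches(prefix: str, names: list[str]) -> list[str]:
    """Return the names that start with prefix, ignoring case."""
    p = prefix.casefold()
    return sorted(n for n in names if n.casefold().startswith(p))


# Illustrative list: Luton is close to London, Londrina is in Brazil.
CITIES = ["London", "Londrina", "Long Beach", "Luton"]

# "Proximity search" by truncation: drop two letters from "London".
nearby = prefix_matches("London"[:-2], CITIES)
```

Here `nearby` contains London and Londrina but not Luton, which is the opposite of a proximity result.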

[+] nailer|8 years ago|reply
Postal addresses have counties or states. In the UK counties haven't been a required part of postal addresses for many years.

https://personal.help.royalmail.com/app/answers/detail/a_id/...

[+] rvense|8 years ago|reply
I recently had to give up buying an item from a web shop. They were asking for my address and forcing me to include a state (my country, and thus my address, has nothing like it), but then complaining that my address wasn't the one associated with my credit card.
[+] mnw21cam|8 years ago|reply
And if you do give a county, do you mean the historic unchanging counties or the modern government administrative areas with the same names?
[+] ramshorns|8 years ago|reply
> One of the Kergelen islands (part of France) is called Île de Croÿ, most french persons have no clue how to type the “ÿ” character.

Meanwhile, programmers believing some falsehoods about text encoding (that one character is one byte and -1 is EOF) may find themselves with a surplus of the “ÿ” character.
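
A short sketch of where that surplus comes from, assuming text in Latin-1 and an end-of-file sentinel of -1 (the usual value of `EOF` in C, though the standard only requires it to be negative).

```python
char = "ÿ"  # U+00FF

latin1 = char.encode("latin-1")  # a single byte
utf8 = char.encode("utf-8")      # two bytes, so "one character is one byte" fails

# The Latin-1 byte read as a signed 8-bit integer collides with the sentinel.
as_signed = int.from_bytes(latin1, byteorder="big", signed=True)

# The reverse direction: -1 squeezed into one byte and decoded as Latin-1.
sentinel_as_text = bytes([-1 & 0xFF]).decode("latin-1")
```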

[+] BrandoElFollito|8 years ago|reply
> most french persons have no clue how to type the “ÿ” character

Of course we do. While ÿ is not common, you find it here and there.

Typing it on a keyboard is no different from typing ë (the tréma first, then y). It is true however that some mobile keyboards do not have it as part of the "y" family.

[+] losteverything|8 years ago|reply
Post office oddities in my universe:

Two identical number and street name combinations (e.g. 12 Aspen Lane) (*). As delivery people we (USPS, FedEx, UPS) make that mistake once. That's why knowing the name is critical.

Farmland made into houses: postal address in town A, property in town B. Have 4 houses with this in my area.

(*) The only reason this never changes is 'cause 911 knows the difference.

[+] Asooka|8 years ago|reply
Let me add one, pertaining to my corner of East Europe that drives me crazy whenever I try to look up places on Google maps:

- Residential buildings can be identified by the name of the street they're on and a number.

For whatever reason, we decided that naming streets and numbering the buildings on them is too passé, so lots of our cities use Section + Number (where the numbers carry no geographic meaning), in parallel with Street + Number, but every building is addressable using only one scheme.

Concrete example with the city of Sofia, Bulgaria - consider these two buildings [1] [2]. They're both next to Vasil Kalchev street. One is a kindergarten, the other is a block of flats. Let's see what the address for each is, if you want to send a letter to them. The kindergarten is, obviously, St. Pimen Zografski street No. 5... well OK, that's the street on the other side of the building, nothing too strange; while the block of flats is zh.k. Dianabad bl. 54. The abbreviations mean literally "residential quarter Dianabad, residential building number 54". No, the building is not addressable via the street, you cannot send post to that building or locate it on a map via "Vasil Kalchev street, No. X" for any X. There are, in fact, no numbers on Vasil Kalchev street. And the residential building numbers aren't geographically meaningful - directly east of said building 54 is building 53, but directly west of it are buildings 42 and 43. There is no building 44. There are, however, 33, 33A, and 33B. They are just ad-hoc numbers (maybe with letters) that you need to have in a database, like you have the locations of streets and where the numbers on the street are geographically.

So what does this have to do with Google? Maps doesn't understand the Section+Number address system. 90% of the residential buildings cannot be found on Google maps using their official address. If I need to send my address to a friend, I can only do it as coordinates, because entering my address, the one on which I receive mail, will result in either no results, or worse - Maps will try interpreting it as a place name, do a partial match, and send you to some completely unrelated building, maybe on the other end of the city.

They're getting kind of better, because people are adding buildings to the map as "missing places", but it's still much safer to just use our local maps site. What I do to plan routes using Google Maps is first locate where the place is using our maps, then match the location on Google's. At least it has pretty good road data.

[1] https://www.google.bg/maps/place/Kindergarten+49+Radost/@42....

[2] https://www.google.bg/maps/place/42%C2%B039'53.7%22N+23%C2%B...

Edit: P.S. I just checked and OSM understands my address, and even shows you building numbers when you scroll around the map.

[+] pyb|8 years ago|reply
> One of the Kergelen islands (part of France) is called Île de Croÿ

Surely they mean the Kerguelen Islands?

[+] mercer|8 years ago|reply
My office has two separate zip codes... Regularly causes confusion with the delivery people.
[+] torrent-of-ions|8 years ago|reply
While interesting, I can't see why any programmer would assume half of these ever. Why would anyone go out of their way to restrict place names to be in the "usual character set of the country"?
[+] vec|8 years ago|reply
No one would go out of their way to, but it's pretty common to have these cases break because no one's ever bothered to test them. On-screen keyboards don't have characters no one will ever type, fonts don't have characters no one will ever display, sorting and string manipulations may not bother to handle accented characters correctly if there will never be accented characters, etc.

No one actively thinks "I'm going to intentionally omit solid Unicode support"; we just don't bother with it until we feel there's a good reason to, and by then it's often too late.
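
A minimal sketch of two such untested cases, using only the standard library. The accent-stripping sort key is a crude stand-in, not real collation: the correct order depends on the language, and a locale-aware collation library would be needed for that.

```python
import unicodedata

names = ["Zagreb", "Évry", "Eton"]

# Default sorting compares code points, so "É" (U+00C9) lands after "Z".
by_code_point = sorted(names)


def accent_insensitive_key(text: str) -> str:
    """Crude key: decompose, drop combining marks, fold case."""
    decomposed = unicodedata.normalize("NFD", text)
    return "".join(c for c in decomposed if not unicodedata.combining(c)).casefold()


by_folded_key = sorted(names, key=accent_insensitive_key)

# The same visible string can have two encodings that compare unequal.
composed = unicodedata.normalize("NFC", "Croÿ")    # ÿ as one code point
decomposed = unicodedata.normalize("NFD", "Croÿ")  # y plus a combining diaeresis
```

Comparing or deduplicating place names without normalizing first will treat `composed` and `decomposed` as different places.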

[+] potatolicious|8 years ago|reply
Data validation - you want to make sure your users are entering addresses correctly, and catch errors early (say, at checkout) rather than result in a negative experience (say, a missed delivery or returned package).

You may also want to make sure your customers have entered their full address, rather than a short form that cannot be used (see: "123 Fake St", without any markers for city, county, country, etc) - and doing so necessarily requires some structured understanding of addresses... which comes with all the pitfalls of assumptions.

There are also uses for addresses that aren't necessarily about delivering a physical item to said address - for example determining the correct taxes to charge a customer based on zip code (some zip codes do not map to a physical area, therefore are not useful for determining taxation).

There are lots of perfectly understandable reasons why programmers would assume the format of an address.

[+] icebraining|8 years ago|reply
It's not that you'd restrict it, but you might not test it with "weird" characters, and it might break your application in some way (e.g. layout).
[+] awkwarddaturtle|8 years ago|reply
Would be nice if there was an ANSI/ISO standard for addresses like we have for dates.
[+] NikolaeVarius|8 years ago|reply
Why the hell do these things keep coming up?

Can we just concat "Falsehoods (Arbitrary group of people) believe about (Arbitrary Massive Topic)" and just post it directly to a wiki?

This seems like the equivalent of "10 ways you're doing sex wrong".

[+] Sleeep|8 years ago|reply
They keep coming up because people find them interesting and/or helpful in some way. If you don't, that's ok, you won't like everything posted to HN. Just don't click and move on.
[+] CodexArcanum|8 years ago|reply
Almost any topic in CS has a multitude of edge cases and potential pitfalls. I've learned quite a few tricky edges to things I had never considered just reading through RFCs and standards docs on topics like calendars, Unicode, floating point numbers, and lots more. Practical guides like these blog posts of "Things I've really encountered problems with" are just the sprinkles on top.

Such a wiki would be great, but it would basically amount to collecting all the practical knowledge of every domain to which computers have ever been applied!

[+] sp332|8 years ago|reply
It's not spitefully calling out programmers for being ignorant. It's an invitation for reflection and getting some perspective.