GDPR – A Practical Guide for Developers (2017)

[+] havkom|8 years ago|reply

I just want to point out that a lot of the article is the authors own opinion on how the regulation should be implemented into software and a lot of things are probably not needed normally and would be a burden for businesses.

My own take (and the take of most European data protection lawyers I meet) is that consent is not needed, and also possibly is inappropriate in 90% of the cases - instead the “legal basis” called “legitimate interest” should be used instead. This is where you do your own judgement if your data processing is reasonable. Imagine if you yourself always had to consent to all common sense use of your personal data, what a hassle!

If you use legitimate interest, also skip the “under 16” part, consent checkboxes and re-request consent part of the article (Of course there is more to it, but I would not get in to that unless you are doing data processing you do not think the data subjects in general would approve of).

Functions that allow the users to delete and automatically download/access their own data is good practice for legitimate interest but not needed. You are anyway in general allowed to deal with these types of request on a case by case basis if you provide your data subjects with an email address.

What you should do though is automatically delete data that you do not need anymore, such as old logs, customers contact details of customers long gone, old backups etc.

The parts of encyption, you should create an api etc are not required but may be good practice. Just make sure you have normal OK data and data access security.

As discussed above, you SHOULD use data for purposes that the customer has not agreed/consented to. However, never use personal data for purposes that are not compatible with the purposes you informed your customers of when collecting the data (normally stated in your privacy notice on your web site). If you did not have a privacy notice at the time of collection pre-GDPR, what is an compatible purpose will be a judgement call from the context of collection.

[+] bozho|8 years ago|reply

A few clarifications (author here)

1. yes, you are correct, most of the features don't need to be implemented in code and having documented procedures would be sufficient (and that is pointed out in a number of places in the article). However, if you are not a small business or have a lot of users, the time needed to implement the features will be negligible compared to the amount of time needed for handling manual requests.

2. The "legitimate interest" legal basis is harder than it seems and many regulators warn against its overuse. Lawyers in my country are skeptical that regulators will accept legitimate interest in many cases, so "to be on the safe side" they recommend relying on consent. Again, as pointed out in the article, this is up to the legal team to decide.

3. The right to be forgotten is valid even under legitimate interest. Article 17(1)(c) is clear about that - whenever a user objects to their data being processed on the basis of legitimate interest. It is a bit hidden, as Article 17 refers to Article 21 which in turn refers to Article 6, but you can piece the whole scenario anyway.

4. About the best practices - agreed, they are not mandatory under the regulation (as pointed out in the article), but having them in place will demonstrate a higher level of compliance.

[+] havkom|8 years ago|reply

> Functions that allow the users to delete and automatically download/access their own data is good practice for legitimate interest but not needed. You are anyway in general allowed to deal with these types of request on a case by case basis if you provide your data subjects with an email address.

I want to be very clear - you _almost always without exception_ have to provide access to/copy of personal data to the data subject no matter what legal basis is used (consent or not). “Data portability”, providing the data in a commonly used electronic format, such as JSON/XML download, is optional when using legitimate interest but mandatory when using consent.

You also _normally_ have to delete personal data of those data subjects who have requested it. You also normally have to stop using the data for the purposes the data subject requests you to stop with.

These processes do not normally need to be automated (but with consent it should be as easy to provide consent for the data subject as it is to revoke it).

[+] grabeh|8 years ago|reply

The main processing basis for many entities will be straightforward processing to provide service under Art 6.1.b. Legitimate interests like consent should generally be avoided wherever possible due to the additional burden it places on organisations to document the balancing test you've undertaken, and the potential for it to be questioned in future. Although I fully agree in many cases it will be entirely appropriate to use legitimate interests as the relevant processing basis.

[+] nawitus|8 years ago|reply

This is the biggest problem with GDPR, there's no agreement what it means, but it will go into effect in a few weeks.

[+] DangerousPie|8 years ago|reply

I really don't understand how this is going to work in practice for small side projects with a single part-time developer. How are they supposed to afford implementing all these changes, none of which seem trivial or even practical for your standard little PHP site?

So if I run a forum as a side project, what are my options?

1) Spend all free time over the next few months adding these features and neglect any other work on the project.

2) Ignore the GDPR and hope nobody complains.

3) Shut down the side project.

Of course if you're Facebook or Twitter you just assign a few developers to this and you'll be fine. But I don't understand how this will not end up killing small-time web companies, or at least make them a lot less feasible to create.

I suspect many people will go for (2) and hope this fizzles out the same way the cookie law did.

[+] stefan_|8 years ago|reply

Don't store a bunch of personally identifiable data and you don't have to do any of this.

We have seen what this laissez faire attitude to "capture everything, delete never" has done. Trust has been supremely squandered so at this point I don't think anyone is particularly inclined to believe when someones cries wolf.

[+] lucideer|8 years ago|reply

Smaller summarised guide for very small side projects:

1. ideally, don't store personal data

2. if you absolutely need to, store the bare minimum

3. if you're doing 2, don't give any of it to 3rd parties

The article splits its bullet points into three sections. The second section is basic security best practice: you should have this covered anyway regardless of the size of your project.

If you stick to my points above, the author's remaining bullets should either be null, or much much easier to implement.

[+] sovietmudkipz|8 years ago|reply

One man shop here...

Cloudfront forwards country information to your origin servers in AWS. My plan was to not do business or display content in European countries until an easy solution to GDPR enables me to quickly meet it's criteria. Certainly libraries will crop up helping to ease the burden of the regulation for smaller operations.

Though... I'm not quite sure what happens when a European citizen uses VPN to spoof a non-GDPR country to gain access to my site, provides me their personal data, then requests to be forgotten. Would it be relevant that I intended my site _not_ to be used in Europe and the user in question circumvented my attempts not to do business in Europe? My bet is that it wouldn't. * Shrug *

> 2) Ignore the GDPR and hope nobody complains.

> 3) Shut down the side project.

Basically, I'm implementing a hybrid of #2 and #3 by not letting European users into my system until I know I can comply with GDPR cheaply and easily.

[+] powvans|8 years ago|reply

It looks like the maximum fine is 4% of annual revenue... seems like the regulation has no teeth if you have no revenue. IANAL and could be totally wrong.

To your point about small companies, I agree, it feels onerous.

What irks me about the right to be forgotten is that it directly counters my right to remember things. Should a shop keeper be allowed to record their observations about who enters their store each day? If they maintain a physical guest book in their brick and mortar store, does a visitor have a right to be erased from that book?

[+] grabeh|8 years ago|reply

Option 4) - adopt a risk-based approach to compliance, and look to assess whether any aspect of your service, and the way it makes use of data in its current form is an egregious breach of GDPR. If that is the case, you're likely in breach of existing data protection law.

In terms of risk factors, most side projects will generally use data to provide a service to customers. This type of processing is unlikely to attract regulatory attention.

It's also unlikely that the vast majority of entities will be 100% compliant come May 25, but as the Belgian data protection authority has pointed out, it is important that people demonstrate a good faith approach to compliance.

For new requirements around restriction of processing/provision of data in a programmatic manner, again, dependent on risk, it is likely not necessary to implement these features and side project owners should focus on building their product instead.

Relevant risks for a side-project owner are around a) volume of data held, b) types of uses of data, c) likelihood of users making a requests around erasure/restriction (look at any historic requests received here), and d) regulatory focus on specific areas of the legislation.

This law is not going to fizzle out and on a general level, in my view it is advisable for any entity to look to respect user data regardless of legal obligations relating to that.

[+] kyriakos|8 years ago|reply

Small side project means small data too, easy to manually sort out gdpr requests as need arises. No need to automate everything.

[+] blowski|8 years ago|reply

Until Stripe, that seemed to be true of PCI compliance as well. Just keep saying your project is too small form tone to spend any time trying to hack it.

[+] dazc|8 years ago|reply

'I suspect many people will go for (2) and hope this fizzles out the same way the cookie law did.'

I suspect you may be right. http://nocookielaw.com/

[+] rmc|8 years ago|reply

Or just don't store personal data?

[+] pkaye|8 years ago|reply

4) If you are not living in the EU you are not obligated to implement any of the changes. Being a small time developer, what is the worst that can happen?

[+] devit|8 years ago|reply

Use forum software that already implements the required features.

[+] boggio|8 years ago|reply

God damn it EU, all these regulations make it impossible for small companies, indie developers to cope with all the bureaucracy.

The VAT for digital products, now the GDPR.

10 more years of regulation and you will spend 90% of the time working on implementing legal requirements and 10% on the actual product.

[+] lucideer|8 years ago|reply

GDPR—while vastly different to what has become the defacto standard practice in most companies—is largely simple, basic, common decency and common sense. My very tiny startup won't have any problems complying because we've actually given a smidgen of consideration to our users' privacy up until now.

In fact, I foresee it being a much greater tax on large corporations: the work in GDPR is not compliance—that's relatively easy once you have procedures in place—the real work is converting existing non-compliant systems to bring them into compliance. This is going to be much easier for those maintaining relatively small, simpler systems, and easiest of all for brand new startups.

[+] rectang|8 years ago|reply

It wasn't the company's data to begin with. Modern businesses have caused harm to countless individuals by treating data cavalierly.

The GDPR puts things right. It brings the externality into the market, and now the market can correct.

Businesses that rely upon slinging private information around irresponsibly need to adapt. If they can't, their failure in the marketplace is just.

[+] berkay|8 years ago|reply

"I think all of the above features can be implemented in a few weeks by a small team. " how to trust the rest of the article after reading this?

[+] marten-de-vries|8 years ago|reply

It's interesting to see how the GDPR seems to clash with some popular data models. For example, git.

Rewriting history of a shared branch is disastrous, but it's currently the only way to redact, say, an e-mail address someone committed with a couple of years ago. I'm curious how the various code hosting sides plan to handle that. Perhaps we'll see an extension of the data model that links commits to committer UUIDs, with the actual information being linked to that, making removal easier.

[+] andygcook|8 years ago|reply

Does this apply to internal software like Slack and Github provided by an employer to an employee?

e.g. An ex-employee requests that all their identifiable data be deleted from all communication and systems of their former employer. That seems like a problem for institutional knowledge transfer. Will the employer have to adhere to that request?

[+] pbhjpbhj|8 years ago|reply

It's "personal identifiable data" that covered. What you did at work doesn't count, your companies record of which days you worked is not personal within the company. If they shared it without anonymising it then it becomes personal.

"andygcook wrote this library" in internal company data isn't personal data.

[+] majewsky|8 years ago|reply

How is identifiable data important for institutional knowledge transfer?

[+] njl|8 years ago|reply

> Restrict processing – in your admin panel where there’s a list of users, there should be a button “restrict processing”. The user settings page should also have that button. When clicked (after reading the appropriate information), it should mark the profile as restricted. That means it should no longer be visible to the backoffice staff, or publicly. You can implement that with a simple “restricted” flag in the users table and a few if-clasues here and there.

The simple hubris in this statement is jaw-dropping. “Just a flag and a few if clauses! Easy peasy!”

[+] simonw|8 years ago|reply

This article is one of the best I've seen for describing actual features that you need to build.

I agree that the specific language here is poorly chosen ("simple" and "a few if-clauses" are perilously close to the word "just") but I don't think that should detract from the enormous value the article itself provides.

[+] lifeisstillgood|8 years ago|reply

I think the right to be forgotten is a serious flaw in what otherwise is a major step forward in Data handling law.

Data today has been compared by Schneier to pollution in the industrial revolution. The GDPR is probably the first anti-pollution law with real bite and with a real grasp of just how far this all goes (the extra-territoriality etc)

This does not make this perfect solution. I honestly don't think that "being forgotten" actually makes sense as a right - it seems to have sprung from some unusual case law in ECJ and could much more easily be dealt with by a "do not further process".

But we genuinely can always find ways to implement new laws - the most obvious is to encrypt user data, and then lose the key, but beyond that i think the best outcome of all this is to stop moving data around so much. moving data from system to system is a smell in my view - and one that a eu law is going to help architects the world over realise they are doing wrong

[+] deltron3030|8 years ago|reply

Like pollution laws, it's nonsense if not enforced worldwide. The web can't be contained to a specific locality anymore, it's against the core idea of the technology. The people in the EU who are responsible for this have no clue about the technology.

[+] goblin89|8 years ago|reply

I’ve been reading the EU’s General Data Protection Regulation, and it seems to contain certain loopholes that may be exploited by less than honest agents. The sad possibility is that the mere existence of such loopholes can push an otherwise law-obedient small companies towards mostly-ignoring GDPR in order to remain competitive.

For example, there’s this huge “if” concerning personal data removal, reiterated in multiple sections of GDPR. Quoting the very first section about data processing principles[0], personal data can be stored even after you’ve achieved the initial explicitly stated purpose, as long as it:

> will be processed solely for archiving purposes in the public interest, scientific or historical research purposes or statistical purposes in accordance with Article 89(1) subject to implementation of the appropriate technical and organisational measures required by this Regulation in order to safeguard the rights and freedoms of the data subject (‘storage limitation’)

How wide is the range of activities that can be reasonably claimed to be for scientific or statistical purposes, or for safeguarding the rights and freedoms of your user? How strictly would this be enforced in cases where scientific and statistical purposes are closely intertwined with commercial interests, as it often happens?

Meanwhile, the referenced Article 89(1)[1] doesn’t seem to take a hard stance except for requiring “data minimization”. Even pseudonymisation is explicitly optional, as long as you’ll have a convincing argument that pseudonymising the PII you’ve collected prevents you from fulfilling your “statistical purposes”.

I’m not a lawyer and I’m wondering if someone with more expertise can weigh in on this.

[0] https://gdpr-info.eu/art-5-gdpr/

[1] https://gdpr-info.eu/art-89-gdpr/

[+] grabeh|8 years ago|reply

Any exceptions to the regulation will inevitably be subject to a narrow interpretation particularly if it is clear that someone is looking to do something which is outside the spirit of the regulation.

[+] Sylos|8 years ago|reply

These sort of cases will be decided on by a judge. You'll at the very least have to make your decisions sound reasonable.

But yeah, these loopholes do exist nonetheless. The GDPR has been reported on as the most lobbied law in the history of the EU. It scares Google, Facebook, Microsoft et al.

[+] elnygren|8 years ago|reply

GDPR and Kappa/event sourcing/message queue based/you name it architecture goes together nicely as you get audit logs of everything and it should be quite doable to propagate "delete this person's data" events around the place.

It's a huge hassle compared to what many companies are doing with customer data now but I think it's for the best.

Most things about GDPR go like "Does it feel a bit shady? It probably is. Don't do that." (depending on your moral compass of course)

One thing is for sure: there's a lot of opportunities for consultants as all the big companies need help to resolve the mess of legacy systems storing customer data.

[+] espadrine|8 years ago|reply

I didn't realize it until reading this post, but certain very popular technologies break GDPR in a deep way.

Bitcoin, for instance, contains a wealth of personal information, which by design are both public, persisted forever, and immutable.

Are blockchain products all going to need a full rewrite or a complicated hard fork?

What about the Wayback Machine? Will they need to have an endpoint that every company will need to call for every “right to be forgotten” request worldwide?

[+] TobbenTM|8 years ago|reply

(assuming you meant to write Kafka) Being able to notify every internal service to delete a user's data is always nice, but in the case of event sourcing, the events are the data. Yet you can't delete Kafka events (not sure about other platforms). In my eyes, GDPR is the death of Kafka as an event sourcing store.

[+] hanoz|8 years ago|reply

As a freelance developer I'm quite sure that if I were to force my clients to comply with as strict an interpretation of GDPR as this, I would pretty shortly find myself replaced by a freelance developer with a more relaxed attitude to GDPR compliance.

[+] sb8244|8 years ago|reply

This is probably true. If I were in this situation, I would probably only make suggestions and not force compliance.

[+] nawitus|8 years ago|reply

> I think all of the above features can be implemented in a few weeks by a small team.

That's.. optimistic in the enterprise world.

[+] zavi|8 years ago|reply

Practical guide to developers - build your product in the US then expand to Indo-Pacific. Don't bother with rolling out to Europe. AI is the future of business & healthcare, which, due to inherent need for data, is incompatible with anti-data sharing laws such as GDPR. Population is rapidly aging in Europe (47.1 year old average in Germany, 42.9 in EU), so might as well set your business up for the long term by pivoting to the region where growth will take place (and where general population is more acceptive of emerging technologies that rely on easy access to data).

[+] scrollaway|8 years ago|reply

This is the worst advice in this thread. Not only do you lose the European market for no good reason and on logic you might hear from moon landing conspiracy theorists, but you don't even solve your issue as you will still have European users no matter what. People do travel.

[+] YetAnotherNick|8 years ago|reply

What about things explicitly designed in a way that there is no option to be forgotten. What about commits in version control sites? What about mailing lists?

From skimming over the spec, it seems that politicians haven't thought about any other sites than social networks or some other profit making sites. Even in that case, if some ML system is trained on the data of the customer, do they have to re-train after anyone invokes right to be forgotten.

[+] smarx007|8 years ago|reply

Well, if you model can tell guess my name from 100 browser history entries, then yes, I want the law to require you to retrain your model once I invoke that right.

An interesting matter is a use of blockchain-like scheme. I guess the law would mean you can't put GDPR protected information in a public distributed blockchain, but instead use identifiers to decouple GDPR info from that ID. And the invocation of that right to be forgotten would require us to permanently delete an entry linking that identifier to a user.

[+] muchbetterguy|8 years ago|reply

One of the clearest things I’ve read on The subject.

It is a lot of common sense. Questions over the right to be forgotten vs tax / legal issues come under the “legitimate interest” clause I think. You should delete their data except where you are required to keep it. And that may mean deleting preferences and browsing history, but not their name and address if you are required to keep it.

I intend to implement a “forget me” feature by anonymising any PID and potentially redacting things like messages between users on our system. That way we keep info for stats purposes but don’t have any way to id a person from the data we hold.

The restoring backups / storing preferences about deletion request etc in a separate DB solution is also a good idea. It shows willing to comply with the regulation as well even if it may not strictly be compliant (e.g. until the backup has synced up with the preferences DB, you still have the PID) I think so long as you show a lot of willing and progress towards being compliant and take all practical and reasonable steps to do so, then it shouldn’t be too much of a burden.

[+] 5_minutes|8 years ago|reply

This is basically developed to protect users from big abusive companies such as Facebook, Google, Twitter and big marketing agencies.

But it really is overkill for the local restaurant that wants to mail their customers.

Using a bazooka to kill some flies.

[+] cinquemb|8 years ago|reply

It's worse than that, because GDPR in of itself, will not technically stop useds from inadvertently blasting data to any service

Decentralized services that EU citizens use will be even less in compliance as data is shared and copied between nodes by default. Sure block a few servers by spending more resources to find/go through the legal moves than it will take for a dozen more to pop up… see torrent sites/software and how people are monetizing such, because that will be the future… laws like GDPR only make such even more attractive.

And lets just set aside that nation state actors that are routinely compromised will still collect this data that will leak on to the internet… lol

These laws are analogous to those that were against the printing press… fighting the tide of reality where it's easier to do nothing than to contort something to fit a luddites dream of personal privacy provided by the state mandates (on top of building a functional product), without having to do anything oneself to protect ones interest, in the age of deep packet inspection, 0day-exploit-exfil-as-a-service, and metadata drone strikes.

Would be more effective to just make it law that users have to plug a black box onto their devices/networks so it can just filter non GDPR colored bytes lol

[+] damontal|8 years ago|reply

What about Wikipedia? User accounts are linked with article edits/history. So if you delete the user how do you handle their edits?

[+] marksomnian|8 years ago|reply

Suppression[0] or similar tech. Delete the username associated with the edit, while keeping the diff around.

[0]: https://en.wikipedia.org/wiki/Wikipedia:Oversight

[+] nihonde|8 years ago|reply

This strikes me as all very pie-in-the-sky. I understand the law, and the policies that it serves, but the article assumes that a company has a single, centralized data source that you can just put some hand-waving “if then statements” around to limit access, and that supports perfect cascading of data from the user down so we can just implement a few checkboxes to configure, etc. It sounds like good stuff, but that’s not how things work in the real world, where half your users trade Excel output, and can’t be bothered to log their interactions with third parties. I’m not saying that they shouldn’t do it, but they won’t.

[+] tobr|8 years ago|reply

I wonder how to deal with data that is accidentally identifiable. For example, imagine that you are running an anonymous poll or survey. In the general case that would not identify an individual person, but in some circumstances a particular collected answer will be unique and could theoretically be connected to an individual.

In such cases it's not really possible to give individuals control over their data, because except for the special case the whole point is that it's not connected to an individual...

[+] antaviana|8 years ago|reply

For example, imagine you only collect an email at sign up (no name, no country) and you state in your EULA that you might use the email to send onboarding information or commercial communications (promotions, newsletter) that can be opted out.

If you do not have any means to know the country where the owner of the email is located, how do you ensure the right of non-EU citizens to receive the commercial communications they have agreed to receive in your EULA unless they opt out later?

If you do not collect your user country for privacy reasons (I would be wary to sign up for a trial of a service who wants to know my citizenship), how can you prevent EU citizens from using your product?

198 comments