Data Classes for Java

261 points| mfiguiere | 8 years ago |cr.openjdk.java.net | reply

205 comments

[+] HumanDrivenDev|8 years ago|reply
... To write such a class responsibly, one has to write a lot of low-value, repetitive code: constructors, accessors ...

If you're writing a plain data class, why on earth would you write getters and setters? Just make your fields public and be done with it.

I will be forever perplexed by the idea of getters and setters (and this extends to C#'s syntax sugar). I have no idea what problem they solve. If you're at the level of plain data, then just make a class with public fields and no methods. If you're trying to write a 'higher level' object - then don't refer to the names of fields in your method names. The getter/setter/property approach is just the worst of both worlds. They make explicit reference to an object's internal fields, while at the same time obfuscating what happens when you actually access them.

[+] derefr|8 years ago|reply
> obfuscating what happens when you actually access them

This is the point, AFAICT. In languages with both primitive "read/write field" operations that can't be interceded upon, but also with interfaces/protocols, "obfuscating" (encapsulating) a field access in getter/setter methods is how you allow for alternative implementations of the interface/protocol.

Specifying your object's interface in terms of getters/setters allows for objects that satisfy your interface but which might:

• read from another object (flyweight pattern)

• read from another object and return an altered value (decorator pattern)[1]

• read from an object in a remote VM over a distribution protocol (proxy pattern)

Etc.

Then a client can build these things and feed them to your library, and you won't know that they're any different than your "ordinary" data objects.
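A minimal Java sketch of the point above, using hypothetical names (`Reading`, `PlainReading`, `ScaledReading`, `ReadingReport`) that do not come from the thread: once the interface is specified as a getter, a decorator can stand in for the plain data object and the consuming code cannot tell the difference.

```java
// All names here are illustrative assumptions, not code from the thread.
interface Reading {
    double getValue();
}

// The "ordinary" data object: the getter simply returns a field.
final class PlainReading implements Reading {
    private final double value;

    PlainReading(double value) {
        this.value = value;
    }

    @Override
    public double getValue() {
        return value;
    }
}

// A decorator: reads from another object and returns an altered value.
final class ScaledReading implements Reading {
    private final Reading delegate;
    private final double factor;

    ScaledReading(Reading delegate, double factor) {
        this.delegate = delegate;
        this.factor = factor;
    }

    @Override
    public double getValue() {
        return delegate.getValue() * factor;
    }
}

// "Library" code written against the interface; it never sees which
// implementation it was handed.
final class ReadingReport {
    static String describe(Reading r) {
        return "value=" + r.getValue();
    }
}
```

With a public field instead of `getValue()`, `ScaledReading` would have no way to intercept the read.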

You don't see this in languages with "pure" message-passing OOP, like Ruby or Smalltalk, because there's no (exposed) primitive for field access. All field access—at the AST level—goes through an implicit getter/setter, and so all field access can be redefined. But in a language like Java, where field access has its own semantics divorced from function calls, you need getters/setters to achieve a similar effect.

And yes, this can cause problems, e.g. making expected O(1) accesses into O(N) accesses, or causing synchronous caller methods to block/yield. This is why languages like Elixir, despite having dynamic-dispatch features, have chosen to discourage dynamic dispatch: it allows someone reading the code to be sure, from the particular function being called, what its time-and-space-complexity is. You know when you call MapSet.fetch that you're getting O(1) behaviour, and when you call Enum.fetch that you're not, rather than just accessing MapSets through the Enum API and having them "magically" be O(1) instead of O(N).

---

[1] Which is a failure of the Open-Closed Principle. That doesn't stop people.

[+] geophile|8 years ago|reply
I think getters/setters are an abomination resulting from a sensible OO idea taken too far.

Having private fields and public methods has benefits that are so obvious that I won't belabor the point. Sometimes -- maybe even often -- you really do want to get and set some fields, so then you have getters and setters. But applying this pattern blindly, to all fields, is insane and, as many people on this thread have noted, defeats the point of private fields and public methods. Just make the fields public, because that's how much protection you have anyway. (Adding code around the getting and setting is, of course, easier with methods already in place.)

I think this trend really got started with Java Beans, in which getters/setters were required (unless you wanted to write yet more code to nominate other methods as the getters and setters). And then the stupidity set in. Well why wouldn't you want your object to be a bean? Beans are good! Be a bean.

I believe that things like corporate coding standards then kicked in and made this nonsense unavoidable.

[+] Tarean|8 years ago|reply
If you have a temperature class with a celsius field and change the internal representation to kelvin you have to update all client code to call a conversion method. C#'s syntax sugar allows you to slot in the conversion method without a breaking change but it screws with the cost model of field access when used irresponsibly. Java doesn't have it so you would have to hide everything behind getters if you don't want to tie yourself to a representation.

Whether a data class should touch enough code to make the transformation a burden is another matter, though.
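To illustrate, here is a rough Java sketch assuming a hypothetical `Temperature` class: the stored field has changed from Celsius to Kelvin, but callers of `getCelsius()` and `setCelsius()` are unaffected because the conversion sits inside the accessors.

```java
// Hypothetical class for illustration; the names are assumptions.
final class Temperature {
    private static final double KELVIN_OFFSET = 273.15;

    // Internal representation: originally Celsius, now Kelvin.
    private double kelvin;

    Temperature(double celsius) {
        setCelsius(celsius);
    }

    // Existing callers keep using the Celsius accessors unchanged.
    double getCelsius() {
        return kelvin - KELVIN_OFFSET;
    }

    void setCelsius(double celsius) {
        this.kelvin = celsius + KELVIN_OFFSET;
    }

    // New accessor exposing the new representation directly.
    double getKelvin() {
        return kelvin;
    }
}
```

Had `celsius` been a public field, every read and write in client code would need to be edited when the representation changed.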

[+] jt2190|8 years ago|reply
The pattern evolved from early Java’s lack of reflection. In order to give tools the ability to set/get data, we got the JavaBean. The culture internalized the pattern, and we’ve been mindlessly doing it ever since.
[+] bunderbunder|8 years ago|reply
I can't speak to Java, but in C#, the original goal was to reduce coupling. You can add behavior to getter and setter methods without breaking binary compatibility among packages. It absolutely is obfuscation, to the extent that encapsulation is just a special kind of obfuscation.

Possibly more importantly, public fields give you no way to create immutable types.

[+] vaskebjorn|8 years ago|reply
In my experience writing getters and setters has been boilerplate 99% of the time.

But I still write them because situations arise where you do need to change the nature of that property, sometimes dynamically, and then it's suddenly worth it. You can, for instance, change a getter and none of the class's clients need to know or care about the change.

Though I haven't used Groovy much I like their approach to this: "Although the compiler creates the usual getter/setter logic, if you wish to do anything additional or different in those getters/setters, you’re free to still provide them, and the compiler will use your logic, instead of the default generated one."

[+] BlackFly|8 years ago|reply
Short answer: to avoid side effects from other parts of the code rippling into the data class.

Some objects are inherently mutable and exposing a reference to them allows mutability. You might not have control of all the class definitions in your data domain in order to make them immutable. Instead, a getter and setter can create equal copies and return those.

This is assuming you are talking about getters and setters on immutable objects. If you are talking about mutable objects, then they really need getters and setters to ensure that mutations on the object they return or receive do not affect them.
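A small sketch of the defensive-copy idea in Java, assuming a hypothetical `Appointment` class that holds a mutable `java.util.Date` (a type whose definition you cannot change): the setter copies on the way in and the getter copies on the way out, so outside code never holds a reference to the internal object.

```java
import java.util.Date;

// Hypothetical class for illustration.
final class Appointment {
    private Date start;

    Appointment(Date start) {
        setStart(start);
    }

    Date getStart() {
        // Return an equal copy so callers cannot mutate our state.
        return new Date(start.getTime());
    }

    void setStart(Date start) {
        // Copy on the way in so later changes to the caller's Date
        // do not ripple into this object.
        this.start = new Date(start.getTime());
    }
}
```

With a public `Date` field, any holder of the reference could call `setTime` and silently change the appointment.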

[+] FollowSteph3|8 years ago|reply
They are useless until one day they become extremely useful. In other words, most of the time there are minimal advantages. However, by following good form you will eventually come across an instance where you have to change something, and you can do it in one place instead of having to change code in tons and tons of places, places you may not even have access to, such as a public API. Until that happens, you don’t realize the importance. Hopefully you never come across the need to do this, it’s pretty rare, but when it happens I can’t tell you how valuable it is.

The benefits are all in code maintenance. Again, most of the time it’s useless, but when you need it I can’t tell you how incredibly useful it can be. And once you come across such an instance you never again ask why or refuse to do it ;)

[+] aryehof|8 years ago|reply
Frameworks elevated the need for getters and setters, given the need to programmatically access the internals of objects.

Their overuse is also a consequence of the popularity of a programming style of code acting on data, typically within an application that has class data merely representing database entities, acted on by separate logic and constraints in an application/service layer. Behavior based object design seems to be an endangered species.

[+] aomix|8 years ago|reply
These kinds of classes are structs with busy work attached. At work I feel a little guilty when I delete them over the course of refactoring but if no processing on the incoming and outgoing data is necessary then public fields are better in every way by being easier and simpler.
[+] drizze|8 years ago|reply
To provide flexibility in the future. When your data comes from a method you can change its source without changing the caller. This can be very useful and should be leveraged whenever possible.
[+] mohaine|8 years ago|reply
Getters/Setters do have a couple of reasons to exist:

1) Allowing you to validate data/morph data on your objects on set. (example: Date is not a workday)

2) Allow others to do either 1 or other actions on set/get (examples: Hibernate entity beans that update db/set dirty on set, GUI can extend object to update on set)

That said it was a total PITA when this first became popular. I spent way too much time changing code over because that was now "best practice". And don't get me started on Boolean not being get but is or has.
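Reason 1 can be sketched in Java as follows. The `Delivery` class and its rule that weekends are not workdays are assumptions made for the example; the point is only that a setter gives validation a place to live.

```java
import java.time.DayOfWeek;
import java.time.LocalDate;

// Hypothetical class; "workday" here simply means Monday to Friday.
final class Delivery {
    private LocalDate date;

    LocalDate getDate() {
        return date;
    }

    void setDate(LocalDate date) {
        DayOfWeek day = date.getDayOfWeek();
        if (day == DayOfWeek.SATURDAY || day == DayOfWeek.SUNDAY) {
            // Reject the value before it ever reaches the field.
            throw new IllegalArgumentException("Not a workday: " + date);
        }
        this.date = date;
    }
}
```

A rejected call leaves the previously stored date untouched, which a bare public field could not guarantee.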

[+] wtetzner|8 years ago|reply
> If you're writing a plain data class, why on earth would you write getters and setters? Just make your fields public and be done with it.

I think it's because you can't specify fields in interfaces. If you have different data classes, but you want them all to be "Named", you need "getName()" on your "Named" interface, because you can't say that it should have a "name" field.

Of course, that's a language flaw, but it's one that people have to work around.
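In Java that workaround looks roughly like the sketch below. `Named` and `getName()` come from the comment; `Person`, `Company`, and `Greeter` are hypothetical additions for the example.

```java
// An interface can require a method, but not an instance field.
interface Named {
    String getName();
}

// Hypothetical data classes that both want to count as "Named".
final class Person implements Named {
    private final String name;

    Person(String name) {
        this.name = name;
    }

    @Override
    public String getName() {
        return name;
    }
}

final class Company implements Named {
    private final String legalName;

    Company(String legalName) {
        this.legalName = legalName;
    }

    // The backing field does not even need to be called "name".
    @Override
    public String getName() {
        return legalName;
    }
}

final class Greeter {
    static String greet(Named named) {
        return "Hello, " + named.getName();
    }
}
```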

[+] ScottBurson|8 years ago|reply
Setters, at least, I find to be frequently useful places to put breakpoints when debugging. This is quite a bit rarer with getters, but I guess it happens occasionally.
[+] pilif|8 years ago|reply
If you ever have to change a field's meaning, or add a new field and provide the old field's value for older clients, you can't just make it private and add public setters and getters without breaking compatibility for all your users - not just binary compatibility but also source-level compatibility.
[+] josteink|8 years ago|reply
> If you're writing a plain data class, why on earth would you write getters and setters? Just make your fields public and be done with it

I'm not going to completely dismiss the idea that such a practice may be plain old cargo-culting...

But as devil's advocate, properties are implemented as methods/functions, and as such (under the hood) accessed through a vtable.

This means they can be overridden in derived (generated) classes, which can then implement things like change-detection and other stuff which may be of use for a data-access/persistence class. And these derived classes can be returned in place, without the calling code knowing anything about it.
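A hand-written Java sketch of that idea, with hypothetical `Customer` and `TrackedCustomer` classes. In practice the subclass would typically be generated by a persistence tool rather than written by hand; this only shows the mechanism.

```java
import java.util.Objects;

// Plain data class with overridable accessors (hypothetical).
class Customer {
    private String email;

    public String getEmail() {
        return email;
    }

    public void setEmail(String email) {
        this.email = email;
    }
}

// Stand-in for a derived class a persistence layer might generate:
// it overrides the setter to record that something changed.
class TrackedCustomer extends Customer {
    private boolean dirty;

    @Override
    public void setEmail(String email) {
        if (!Objects.equals(getEmail(), email)) {
            dirty = true;
        }
        super.setEmail(email);
    }

    boolean isDirty() {
        return dirty;
    }
}
```

Code that only knows about `Customer` can be handed a `TrackedCustomer` and will trigger the change detection without being aware of it.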

> They make explicit reference to an object's internal fields

Not really. A property is supposed to be a public API, and definitely not "internal".

[+] nwatson|8 years ago|reply
Even for just plain data values, getters and setters can be decorated with annotations that modify the member-"property's" behavior depending on context, e.g., what should happen to this member when I serialize to JSON ... or when I read from a DB?

I've found these decorators real useful in the past ... and while they balloon the code for the "data class" itself, they eliminate a lot of code elsewhere.

E.g., this may violate some "separation of concerns", but often one will want to: (a) read some hierarchical data from JSON in a web service request; (b) persist these elements to DB, maintaining proper relations between entities read in from JSON but now persisted as rows to multiple tables in a DB; (c) at some later point read back some of those elements and reconstruct the hierarchy and perhaps transform it or merge it with other hierarchies; (d) perhaps re-persist it, or send it out as a JSON response ... so ...

... in such a situation I am dealing with the same small set of "data classes" everywhere -- but when persisting or reading from a DB I have data fields that represent "primary" or "foreign key" DB values -- I possibly don't want those field values to ever get written to JSON; other member fields deal with the semantic hierarchy of that data in the JSON-like or internal-Java view (e.g., lists, maps, pointers-to-parents-in-hierarchies) ... but I don't want those members ever to be written to the DB (since they're replaced by PK/FK references) ... >>> rather than maintaining two sets of data classes for the two alternative representations (and the ugly boilerplate required to translate between them), it's much easier to have just one set of semantic data classes that each knows how to live in each context (DB, JSON, in-memory, etc. ...). <<<

The key to telling these data classes and their members to behave differently, in Java, for each context (JSON deserialize vs JSON serialize vs DB read vs DB write), is to use decorators on their setters and getters.
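The mechanism can be sketched with nothing but the JDK. Everything named here (`HiddenFromJson`, `PurchaseOrder`, `TinyJson`) is made up for the example; real serialization and persistence libraries ship their own annotations and do far more than this toy reflection loop.

```java
import java.lang.annotation.ElementType;
import java.lang.annotation.Retention;
import java.lang.annotation.RetentionPolicy;
import java.lang.annotation.Target;
import java.lang.reflect.Method;
import java.util.Map;
import java.util.TreeMap;

// Hypothetical marker annotation: "do not emit this property as JSON".
@Retention(RetentionPolicy.RUNTIME)
@Target(ElementType.METHOD)
@interface HiddenFromJson {}

// Hypothetical data class that lives in both the DB and JSON contexts.
class PurchaseOrder {
    private final long rowId;   // database primary key
    private final String item;

    PurchaseOrder(long rowId, String item) {
        this.rowId = rowId;
        this.item = item;
    }

    @HiddenFromJson
    public long getRowId() {
        return rowId;
    }

    public String getItem() {
        return item;
    }
}

// Toy "serializer": collects getter values, skipping annotated getters.
final class TinyJson {
    static Map<String, Object> toMap(Object bean) {
        Map<String, Object> out = new TreeMap<>();
        for (Method m : bean.getClass().getDeclaredMethods()) {
            String n = m.getName();
            boolean getter = n.startsWith("get") && n.length() > 3
                    && m.getParameterCount() == 0;
            if (!getter || m.isAnnotationPresent(HiddenFromJson.class)) {
                continue;
            }
            // getItem -> item
            String key = Character.toLowerCase(n.charAt(3)) + n.substring(4);
            try {
                out.put(key, m.invoke(bean));
            } catch (ReflectiveOperationException e) {
                throw new IllegalStateException("Cannot read " + n, e);
            }
        }
        return out;
    }
}
```

The data class carries the per-context rule itself, so no second "JSON view" class is needed to leave the key out.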

EDIT: included more lengthy rationale.

[+] flukus|8 years ago|reply
You'll get yourself run out of town with that level of heresy in most java/c# shops. Getters and Setters are great when you want finer control over what can access certain fields; in a List class, for instance, I wouldn't want the length to be externally writable. But somewhere along the way it went from a useful feature in some circumstances to a mandatory feature in all circumstances.

In the c# world "public int Id;" will fail code review but "public int Id { get; set; }" will be just fine. I've never seen a justification for it though, if we make a change in future it's going to take just as much work and 99.99% of the time will never be required.

But we have to appease the encapsulation gods.

[+] virmundi|8 years ago|reply
In Java, they came with the bean standard. The goal was to codify properties to make reflection easier. That’s it.

If you don’t need reflection or the bean utilities that help with that, just use simple public final T values.
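For reference, the accessor-free style looks like this in Java; `Point` is a hypothetical example class.

```java
// Plain immutable data: public final fields, no getters or setters.
final class Point {
    public final int x;
    public final int y;

    Point(int x, int y) {
        this.x = x;
        this.y = y;
    }
}
```

Callers read `p.x` directly, and the compiler rejects any assignment to it after construction. Note that `final` only fixes the reference, so a field of a mutable type can still be changed through it.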

[+] camus2|8 years ago|reply
> If you're writing a plain data class, why on earth would you write getters and setters? Just make your fields public and be done with it.

When doing pure OOP, objects should only communicate via methods, period. It's called state encapsulation and is one of the core principles of OOP, the second being polymorphism.

> I will be forever perplexed by the idea of getters and setters (and this extends to C#'s syntax sugar). I have no idea what problem they solve.

Then you don't understand encapsulation.

[+] michaelgrosner2|8 years ago|reply
In C# 6, using properties is actually slightly less verbose, "public int Foo { get; }" vs "public readonly int Foo;". Also, you can easily slap an interface around your data class if you need to later. So my thinking has switched when writing new C# -- why not use a property?
[+] moron4hire|8 years ago|reply
You don't get to execute validation code on a bare field right away.
[+] pvg|8 years ago|reply
You're cutting out/ignoring a big chunk of the quote and then arguing a point the author didn't make. Simply accessing properties directly doesn't solve the problem.
[+] xapata|8 years ago|reply
If you have the option of a property later, there's no reason to start with accessor methods, but some languages don't support properties.
[+] iamandoni|8 years ago|reply
While I mostly agree, getters without setters are a nice way to expose that your data is immutable rather than relying on final fields
[+] danvasquez29|8 years ago|reply
I do it in PHP to make something immutable. data goes into the constructor and then only getters are provided.
[+] haglin|8 years ago|reply
Some reasons:

- more efficient representation

- thread safety

- security (defensive copy)

- validation

- logging

- debugging

- backward compatibility

[+] tobiasSoftware|8 years ago|reply
Getters and setters are confusing because there are no reasons to use them for small projects, like what you would build in a college course. Software becomes exponentially confusing the larger it becomes. In addition, one piece of software may be built by multiple teams, such as using third party libraries. The best way to handle the additional complexity is to split each section into two: the interface (public) and the implementation (private). Here are the reasons why:

First, change. Changes to code happen all of the time. If someone wants to build a library so that many other users can interact with it, they have to realize that one day changes to that library will need to happen. If code is separated into public and private properly, then any private functions and fields can be rewired, and as long as the rewiring makes the public functions and fields perform the same, then the change will not break anything. How important is not breaking things? Well anyone familiar with Python's 2 and 3 schism knows the pain of changes that break compatibility.

Second, visibility. When you are interacting with a large codebase, you need to know what functions and fields to interact with. IDEs have very useful tools that can list these for you, but if everything is public then they will list everything. If 90% of your codebase is private data, that is a lot of useless functions and fields to sort through!

Third, hackery. When you are building a large codebase, there may be some things you don't want programmers interacting with it to do. Users might take advantage of quirks in your code that were never meant to be used that way. XKCD's Spacebar Heating comic illustrates this nicely. Another reason may be security, as you might not want someone to have full access to functions and fields to poke around and understand your code better.

So these are the reasons why public and private are important concepts. The problem is then if something starts out as public, it won't have any of these advantages. If you want to change it to private later on, then you are screwed. So the Java solution is to make everything private from the beginning. Vanilla getters and setters are really just making private data appear public. Then if the data needs to be changed from "public" to private, the getters and setters can be modified accordingly.

Python's property approach is a similar concept, but infinitely better, as instead of relying on a programmer to build getters and setters, the language itself invisibly builds them under the hood. Then if you want to change something from public to private, instead of being screwed, you just add a property tag with the getter and setter code to change the getter and setter from an invisible default to a visible modification.

[+] geodel|8 years ago|reply
Generating getters and setters are day jobs for 'JavaBean Jerries' which comprise most if not all enterprise Java developers.
[+] tantalor|8 years ago|reply
AutoValue with Builders is nice:

  Animal dog = Animal.builder()
    .setName("dog")
    .setNumberOfLegs(4)
    .build();
https://github.com/google/auto/blob/master/value/userguide/b...
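For comparison, a hand-rolled builder supporting the same call chain might look roughly like the sketch below. This is an illustration of the boilerplate involved, not AutoValue's generated code; the accessor names and the check in `build()` are assumptions.

```java
// Hand-written immutable value class with a builder (illustrative only).
final class Animal {
    private final String name;
    private final int numberOfLegs;

    private Animal(Builder b) {
        this.name = b.name;
        this.numberOfLegs = b.numberOfLegs;
    }

    static Builder builder() {
        return new Builder();
    }

    String name() {
        return name;
    }

    int numberOfLegs() {
        return numberOfLegs;
    }

    static final class Builder {
        private String name;
        private Integer numberOfLegs; // boxed so "not set" is detectable

        Builder setName(String name) {
            this.name = name;
            return this;
        }

        Builder setNumberOfLegs(int numberOfLegs) {
            this.numberOfLegs = numberOfLegs;
            return this;
        }

        Animal build() {
            if (name == null || numberOfLegs == null) {
                throw new IllegalStateException("Missing required properties");
            }
            return new Animal(this);
        }
    }
}
```

Even this version omits `equals`, `hashCode`, and `toString`, which a complete value class would also need.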
[+] guelo|8 years ago|reply
Not that nice compared to kotlin:

    data class Animal(val name: String, val numberOfLegs: Int)

    val dog = Animal(
        name = "dog",
        numberOfLegs = 4
    )
The amount of boilerplate for AutoValue builders is so ridiculous that most people use IDE plugins to generate it.
[+] harryh|8 years ago|reply
I, for one, am enjoying the ever so slow scalaization of java.
[+] sushisource|8 years ago|reply
If the JVM magically got rid of nulls and Scala cleaned up some of the slightly wartier bits (implicits come to mind, although they are useful and no clue how I'd fix em...) then Scala would be my favorite language around by quite a bit.

Besides maybe Rust. Really different usecases though.

[+] bobbyi_settv|8 years ago|reply
But it also can be annoying as a Scala programmer because you are going to have to be aware of both the Java and the Scala versions of each thing and how they differ.

For example, under this proposal, you will need "new" when creating an instance of a Java data class, but you are probably in the habit of omitting it when creating an instance of a case class.

[+] icedchai|8 years ago|reply
Or just install Lombok.
[+] didibus|8 years ago|reply
Can't do java without it.
[+] bherrmann7|8 years ago|reply
Lombok @Value for the win!
[+] tybit|8 years ago|reply
As the article starts out with Algebraic Anny, Boilerplate Billy, etc. wanting different things, I wonder whether, rather than sidestepping this issue with a compromise, they could solve it with metaclasses as proposed for C++ https://herbsutter.com/2017/07/26/metaclasses-thoughts-on-ge...

I really like the idea of metaclasses and would love to see proposals for C# and Java too. As the proposal mentions Java and C# have already gone down a separate path with interfaces (and structs for C#) but would be cool to see if it could be resolved anyway.

[+] bognition|8 years ago|reply
Honestly immutable data classes that are created via builders is a must have in Java.
[+] emidln|8 years ago|reply
Make fields public final and your constructor is your "builder".
[+] pjmlp|8 years ago|reply
Interesting to see everyone discussing Java vs C# properties, when C# properties are actually based on how Eiffel and Delphi do them. :)
[+] nikolay|8 years ago|reply
Why underscores in "__data"? Is this just a temporary thing?
[+] kodablah|8 years ago|reply
> Can a data class extend an ordinary class? Can a data class extend another data class? Can a non-data class extend a data class?

No, no, and no. There are interfaces with default impls these days, don't allow base class state.
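One way to read that suggestion, sketched in Java with hypothetical `Shape` and `Rect` types: shared behaviour lives in an interface with default methods, while each data-carrying class stays final and owns all of its state.

```java
// Shared behaviour without shared state: default methods only.
interface Shape {
    double width();

    double height();

    default double area() {
        return width() * height();
    }
}

// Final class holding its own state; it extends nothing.
final class Rect implements Shape {
    private final double width;
    private final double height;

    Rect(double width, double height) {
        this.width = width;
        this.height = height;
    }

    @Override
    public double width() {
        return width;
    }

    @Override
    public double height() {
        return height;
    }
}
```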

And figure out how to make interfaces "sealed".

[+] virmundi|8 years ago|reply
Can’t we just use AOP to generate the boilerplate code if we are ok with Kotlin style data types? Toss an annotation on the class like @Data. After that everything should get generated at compile time.
[+] yeupyeupyeup|8 years ago|reply
Classic Brian Goetz bashing Java serialization.
[+] deepsun|8 years ago|reply
"data class" is already in Kotlin flavor.

Yes, I refuse to call Kotlin a different programming language like Scala, because it's just syntactic sugar over good-ol-Java, while all tools, approaches, stackoverflow are the same. But it has the data classes, implemented just like the article described.

[+] itronitron|8 years ago|reply
This seems unnecessary if the functional / lambda approach is fully adopted as the data can be understood by looking at the functions that operate on the data, so there would be less of a need for simplifying access to the data values when passing them into functions. As I have developed more data-intensive computational methods in Java I no longer implement data encapsulating classes but rather just use collections containing the basic data types.