TL;DR: Django models are the database, which makes them the wrong choice for presenting a service-layer interface to the persistence layer. They are inherently unable to hide, encapsulate, and protect implementation details from consumers that don't care about them or shouldn't be allowed to access them.
The Django model is a representation of the database state. It's an infrastructure-layer object. It is _very_ tightly coupled to the database.
Your business needs should not be so coupled to the database! While it is very helpful for an RDB to accurately model your data, a database is not an application. They have different jobs.
(The TL;DR of the following paragraph is "encapsulation and interfaces")
Your business logic belongs in the "service layer" or "use case layer". The service layer presents a consistent interface to the rest of the application - whether that is a Kafka producer, the HTTP API views, another service, whatever. Your service layer has sensible, human-understandable methods like "register user", "associate device with user", whatever. These methods contain business logic that often needs to be applied _before_ a database model ever exists, or that needs to be applied after existing models are retrieved in order to present a nice, usable, uncluttered return value. Your service layer hides ugly or unnecessary details of the database state from the rest of the application. Consumers shouldn't care about these details, they shouldn't rely on them (so you can fix or change things without breaking the interface), and they very probably should not be given direct access to edit whatever they want.
If you do not do this and instead choose the fat models method, all of the following will happen:
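To make the "business logic before a model ever exists" part concrete, here is a minimal sketch in plain Python. All of the names (`UserService`, `RegisteredUser`, `InMemoryUserStore`) are made up for illustration, and the in-memory store is only a stand-in for the Django models so the example runs by itself. Treat it as one possible shape, not the canonical one.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class RegisteredUser:
    """What consumers get back: no password hash, no database internals."""
    user_id: int
    email: str


class InMemoryUserStore:
    """Hypothetical stand-in for the persistence layer (Django models in a real app)."""

    def __init__(self):
        self._rows = {}

    def email_taken(self, email):
        return any(row["email"] == email for row in self._rows.values())

    def insert(self, email, password_hash):
        user_id = len(self._rows) + 1
        self._rows[user_id] = {"email": email, "password_hash": password_hash}
        return user_id


class UserService:
    def __init__(self, store):
        self._store = store

    def register_user(self, email, password):
        # Business rules run before any database row exists.
        email = email.strip().lower()
        if "@" not in email:
            raise ValueError("email address is not valid")
        if len(password) < 8:
            raise ValueError("password must be at least 8 characters")
        if self._store.email_taken(email):
            raise ValueError("email address is already registered")
        # Placeholder only; a real app would use a proper password hasher.
        user_id = self._store.insert(email, password_hash=f"hashed:{password}")
        # Consumers see a small immutable value, not the stored row.
        return RegisteredUser(user_id=user_id, email=email)
```

A view, a queue consumer, or another service would all call `register_user` and get the same rules applied, without ever touching the stored row.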
1. You will repeatedly write that business logic everywhere where you use the models. You'll write it in your serializers, your API views, your queue consumers/producers, etc. You'll never write it the same way twice and you damn sure won't test it everywhere.
2. You'll get tired of writing the same thing and you will add properties or methods on the model. This is the Fat Model! This might be appropriate for a convenience property or two that calculates something or derives a flag from the state of the model, but that's it. As soon as you start reaching across domains and implementing something like "register device for user" on the user model, or the device model, you are just reinventing a service layer in a crappy way that will eventually make your model definition 4000 lines long (not even remotely an exaggeration).
3. Every corner of your application will be updating the database - via the model - however it wants. They will rely on it! Whole features will be built on it! Now when it's time to deprecate that database field or implement a new approach, too bad. 20 different parts of your app are built on the assumption that any arbitrary database update allowed by the model is valid and a-ok.
Preferred approach:
1. Each domain gets a service layer, which contains business logic, but also presents a nice, reliable interface to anything else that might consume that domain. This interface includes raising errors that mean something in terms of our business logic. It does not expose a model's "DoesNotExist" or "MultipleObjectsReturned" exceptions. It returns an error that tells the service consumer what went wrong or what they did wrong.
2. The service layer is the only thing that accesses or sees the Django models, aka the database state. It completely hides the Django models for its domain from the rest of the application. It returns dataclasses or attrs classes, or whatever you want to use. The models are no longer running rampant all over the application getting updated and saved willy-nilly. The service layer controls what the consumers in the rest of the application can know and do.
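A rough sketch of both points together, again in plain Python so it runs without Django. `FakeTables` and `RowNotFound` are hypothetical stand-ins for a domain's models and for a persistence-level error like a model's `DoesNotExist`; the error and function names are likewise made up for illustration.

```python
from dataclasses import dataclass


class DeviceError(Exception):
    """Base class for errors the device domain exposes to its consumers."""


class UserNotFound(DeviceError):
    pass


class DeviceAlreadyRegistered(DeviceError):
    pass


@dataclass(frozen=True)
class Device:
    serial: str
    owner_id: int


class RowNotFound(Exception):
    """Stand-in for a persistence-level error such as a model's DoesNotExist."""


class FakeTables:
    """Hypothetical stand-in for the domain's Django models."""

    def __init__(self, user_ids):
        self.user_ids = set(user_ids)
        self.devices = {}  # serial -> owner_id

    def get_user(self, user_id):
        if user_id not in self.user_ids:
            raise RowNotFound(user_id)
        return user_id


def register_device_for_user(tables, user_id, serial):
    try:
        tables.get_user(user_id)
    except RowNotFound:
        # Translate the persistence error into one that means something to callers.
        raise UserNotFound(f"no user with id {user_id}") from None
    if serial in tables.devices:
        raise DeviceAlreadyRegistered(f"device {serial} already has an owner")
    tables.devices[serial] = user_id
    # Return an immutable value, not the row itself.
    return Device(serial=serial, owner_id=user_id)
```

Because `Device` is a frozen dataclass, a consumer that tries `device.owner_id = 2` gets an error instead of quietly changing state, which is the "controls what consumers can know and do" part.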
You will write more boilerplate. It will be boring. You will write more tests. It will be boring. But it will be reliable and modular and easier to reason about, and you can deliver features and changes faster and with much less fear of breakage.
Your business logic will live in one place, completely decoupled, and it can be tested alone with everything else mocked.
How your consumers (like API views) turn service responses and errors into external (like HTTP) responses and errors lives in one place, completely decoupled, and can be tested alone with everything else mocked.
Your models will not need to be tested because they are just Django models. They don't do anything that's not already 100% tested and promised by the Django framework.
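For the consumer side, a minimal sketch of "errors get mapped in one place and tested with the service mocked" could look like the following. The view function, the `(status, body)` return shape, and the error classes are all assumptions made up for the example; a real Django or DRF view would return a response object instead.

```python
from unittest import mock


class UserNotFound(Exception):
    pass


class DeviceAlreadyRegistered(Exception):
    pass


# The only place that knows how domain errors map to HTTP status codes.
ERROR_STATUS = {
    UserNotFound: 404,
    DeviceAlreadyRegistered: 409,
}


def register_device_view(service, user_id, serial):
    """Hypothetical view function: returns (status_code, body)."""
    try:
        device = service.register_device_for_user(user_id, serial)
    except tuple(ERROR_STATUS) as exc:
        # Exact-type lookup; subclasses of these errors would need their own entry.
        return ERROR_STATUS[type(exc)], {"error": str(exc)}
    return 201, {"serial": device.serial, "owner_id": device.owner_id}


def check_view_maps_errors():
    # The service is mocked, so no business logic or database is involved.
    service = mock.Mock()
    service.register_device_for_user.side_effect = UserNotFound("no user with id 99")
    return register_device_view(service, 99, "SN-2")
```

The business logic never sees a status code and the view never sees a model, so each side can change without the other side's tests moving.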
We started moving off "fat models" at my job and onto DDD (service methods, entities, etc.), and I have to say after a year I'm not a fan. Here are my beefs:
1. If you're not using models, it's a lot of work to stay fast.
If you've got a Customer instance and you want to get customer.orders, you've got a problem if it's not lazy. If it's a queryset, you get laziness for free; if it isn't, you have to build it yourself. God help you if you have anything even remotely complicated. You also need trapdoors everywhere if you want to use any Django feature like auth, or Django libraries.
2. You have to build auth/auth (authentication and authorization) yourself
Django provides really nice auth middleware and methods (user_passes_test).
3. Service methods only do things something else should be doing.
You might be doing deserialization, auth/auth checks, database interactions, etc. All of that stuff belongs at a different layer (preferably abstracted away like @user_passes_test or serializers).
4. The model exposed by Django and DRF is actually pretty good, and you'll probably reimplement it (not as well)
The core request lifecycle is:
request -> auth -> deserialize -> auth -> db (or other persistence stuff) -> business stuff -> db (or other persistence stuff) -> serialize -> response
We've reimplemented all of those layers, and since we built multiple domains we reimplemented some of them multiple times. It probably would've been better to just admit "get_queryset" and the like are good ideas.
5. Entities are a poor substitute for regular objects and interfaces.
We've mostly ended up wrapping our existing models in entities, but just not implementing most of the properties/fields/attributes/methods. But again, we have to trapdoor a lot, we have trouble with laziness and relationships in general, and we have a lot of duplicate code in our different domains.
6. We have way too many unit tests.
Changing very small things requires changing between 5 and 10 tests, each of which uses mocks and is around a dozen lines at least. Coupled with the level of duplication, this has really slowed us down. They also take _forever_ to run.
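To illustrate point 1 (having to build laziness yourself once you leave querysets), here is a minimal sketch of a hand-rolled lazy relationship. The `Customer` and `Order` entities and the `load_orders` callable are hypothetical; in a real app the loader would wrap a query.

```python
from dataclasses import dataclass
from functools import cached_property
from typing import Callable, List


@dataclass(frozen=True)
class Order:
    order_id: int
    total_cents: int


class Customer:
    """Hypothetical entity whose orders are fetched only on first access."""

    def __init__(self, customer_id: int, load_orders: Callable[[int], List[Order]]):
        self.customer_id = customer_id
        self._load_orders = load_orders

    @cached_property
    def orders(self) -> List[Order]:
        # Runs at most once per instance; later accesses reuse the cached list.
        return self._load_orders(self.customer_id)
```

This only covers "don't load until asked". It does nothing for filtering, slicing, or chaining, which a queryset folds into the SQL it eventually runs, and that gap is where the "anything even remotely complicated" pain comes from.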
FWIW I think you're right about jamming too much into models; I think that works at a small scale but really breaks down quickly. I think at this point, my preferences are:
1. Ideally, your business logic should be an entirely separate package. It shouldn't know about HTML, JSON, SQL, transactions, etc. This means all that stuff (serialization, persistence) is handled in a different layer. Interfaces are your friend here, i.e. you may be passing around something backed by models, but it implements an interface your business logic package defines.
2. The API of your business logic package is the set of interfaces you expose and document. The API of your application is your REST/GraphQL/whatever API--that you also document.
3. Models should be solely database-specific. If you're not dealing with the database and joins and whatever, it doesn't go in models and it doesn't go in managers.
4. Don't make a custom user model [1].
5. Serialization, auth, and persistence should be as declarative and DRY as possible. That means class-level configuration and decorators.
6. Bias strongly against unit tests, and rely more strongly on integration tests. Also consider using unit tests during development/debugging, and removing them when you're done.
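Preference 1 (the business logic package defines an interface, and whatever is passed in just has to satisfy it) could be sketched with `typing.Protocol`. The names `OrderLike`, `net_revenue_cents`, and `OrderRow` are invented for the example, and `OrderRow` is only a stand-in for something model-backed.

```python
from dataclasses import dataclass
from typing import Iterable, Protocol


class OrderLike(Protocol):
    """Interface defined by the business logic package; it knows nothing about the ORM."""

    @property
    def total_cents(self) -> int: ...

    @property
    def is_refunded(self) -> bool: ...


def net_revenue_cents(orders: Iterable[OrderLike]) -> int:
    """Pure business logic: no SQL, JSON, or transactions."""
    return sum(o.total_cents for o in orders if not o.is_refunded)


@dataclass
class OrderRow:
    """Stand-in for a model-backed object; at runtime anything with these two attributes works."""
    total_cents: int
    is_refunded: bool = False
```

For example, `net_revenue_cents([OrderRow(1000), OrderRow(500, is_refunded=True)])` only counts the first order. The business package never imports the persistence layer, so it can be exercised with plain objects.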
Does that seem reasonable to you? I spend a lot of time thinking about this stuff, and I would like my life to be less about it (haha) so, any insight you can give would be super appreciated.
skrtskrt|5 years ago
unknown|5 years ago
[deleted]
camgunz|5 years ago
[1]: https://docs.djangoproject.com/en/3.0/topics/auth/customizin...