I don't understand the attraction to GraphQL. (I do understand it if you actually want the things that gRPC or Thrift etc. give you.)
It seems like exactly the ORM solution/problem but even more abstract and less under control, since it pushes the ORM out to browser clients and the frontend devs.
ORMs suffer from being beyond arm's length from the query optimizer in the database server.
https://en.wikipedia.org/wiki/Query_optimization
A query optimizer that's been tuned over decades by pretty serious people.
Bad queries, overfetching, sudden performance cliffs everywhere.
GraphQL actually adds another query language on top of the normal ORM problem. (Maybe the answer is that GraphQL is so simple by design that it has no dark corners, but that seems like a matter of mathematical proof that I haven't seen alluded to.)
Why is GraphQL not going to have exactly this problem once people actually start to work seriously with it?
I looked at four or five implementations, in JavaScript, Haskell, and now Go. From what I could see, none of them mentioned query optimization as an aspiration.
GraphQL is quite similar to SQL. They’re both declarative languages, but GraphQL is declaring a desired data format, whereas SQL is declaring (roughly) a set of relational algebra operations to apply to a relational database. GraphQL is really nothing like an ORM beyond the fact that they are both software tools used to get data from a database. You might use an ORM to implement the GraphQL resolvers, but that’s certainly not required.
I wouldn’t expect the performance issues to be much more problematic than they would be for REST endpoints that offer similar functionality. If you’re offering a public API, then either way you’re going to need to solve for clients who are requesting too many expensive resources. If you control the client and the server, then you probably don’t need to worry about it beyond the testing of your client code you would need to do anyway.
As far as query optimization goes, that’s largely out of scope of GraphQL itself, although many server implementations offer interesting ways to fulfill GraphQL queries. Dataloader is neat, and beyond that, I believe you can do any inspection of the query request you want, so you could for example see the nested path “Publisher -> Book -> Author -> name” and decide to join all three of those tables together. I’m not aware of any tools that provide this optimization automatically, but it’s not difficult to imagine it existing for some ORMs like those in Django or Rails.
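That nested-path inspection can be sketched with nothing but the stdlib. Here the query is modeled as a plain nested dict standing in for a real parsed GraphQL document, and the table names are invented for illustration:

```python
# Sketch: walk a (simplified) GraphQL selection tree and collect the
# nested object paths, so a resolver could decide to JOIN up front
# instead of issuing one query per level. The dict-based query and the
# table names are invented; a real server would walk the parsed document.

def nested_paths(selection, prefix=()):
    """Yield every root-to-leaf path through the selection."""
    for field, sub in selection.items():
        path = prefix + (field,)
        if isinstance(sub, dict) and sub:   # nested object selection
            yield from nested_paths(sub, path)
        else:                               # leaf field
            yield path

# The query { publisher { book { author { name } } } } becomes:
query = {"publisher": {"book": {"author": {"name": None}}}}

paths = list(nested_paths(query))
# Seeing the three-level nesting, a backend could join
# publisher -> book -> author in a single query.
tables_to_join = paths[0][:-1]
```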
Seems like you're looking at this through the lens of a single system that could submit a query to a single database and get all the data it needs. From that perspective GraphQL is definitely an extra layer that probably doesn't make sense. But even then there's still some value in letting the client specify the shape of the data it needs and having client SDKs (there are definitely non-GraphQL ways to achieve these too).
My impression is GraphQL starts to shine when you have multiple backend systems, probably separated based on your org chart, and the frontend team needs to stitch them together for cohesive UX. The benchmark isn't absolute performance here, it's whether it performs better than the poor mobile app making a dozen separate API calls to different backends to stitch together a view.
The advantage of GraphQL is that the code for each API endpoint, which depends on frontend design (e.g. how many comments should be visible by default on a collapsed Facebook story), is now part of the frontend codebase (as a GraphQL query, that is then automatically extracted and moved to the backend), and thus frontend and backend development are no longer entangled.
Without it or a similar system frontend developers have to ask backend developers to create or modify an API endpoint every time the website is redesigned.
Also, it allows combining data fetching for components and subcomponents automatically, without having to do that manually in backend code, and it automatically supports fine-grained caching of items.
Having seen many product teams implement GraphQL, concerns were never around performance, and more around speed of development.
A typical product would require integrations with several existing APIs, and potentially some new ones. These would be aggregated (and normalised) into a single schema built on top of GraphQL. Then the team would build different client UIs and iterate on them.
By having a single queryable schema, it's very easy to build and rebuild interfaces as needed. Tools like Apollo and React are particularly well suited for this, as you can directly inject data into components. The team can also reason on the whole domain, rather than a collection of data sources (easier for trying out new things).
Of course, it would lead to performance issues, but why would you optimise something without validating it first with the user? Queries might be inefficient, but with just a bit of caching you can ensure acceptable user experience.
Where I'm at now is my first foray with GraphQL - Graphene on the Django backend and Apollo on the frontend.
I'm not sure if it is the implementation - and it could very well be - but there has been more overhead and complexity than with traditionally accessed REST APIs. I can't see much value-add.
This becomes a lot more apparent when you start to include TS in the mix.
Perhaps it just wasn't a good use case.
It's attractive primarily to frontend developers. Instead of juggling various APIs (often poorly designed or underdesigned due to conflicting requirements and time constraints) you have a single entry into the system with almost any view of the data you want.
Almost no one ever talks about what a nightmare it becomes on the server-side, and how inane the implementations are. And how you have to re-do so many things from scratch, inefficiently, because you really have no control of the queries coming into the system.
My takeaway from GraphQL so far has been:
- good for frontend
- usable only for internal projects where you have full control of who has access to your system, so nobody can bring it down because you forgot an authorisation on a field somewhere or a protection against unlimited nested queries.
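A minimal sketch of that protection against unlimited nesting, assuming the parsed query can be walked as a nested dict (the depth limit is an invented policy number):

```python
# Sketch: reject queries nested deeper than a limit before executing
# anything. The nested dict stands in for a parsed GraphQL document,
# and MAX_DEPTH is an invented policy number.

MAX_DEPTH = 4

def depth(selection):
    if not isinstance(selection, dict) or not selection:
        return 0                      # leaf field
    return 1 + max(depth(sub) for sub in selection.values())

def check_query(selection):
    d = depth(selection)
    if d > MAX_DEPTH:
        raise ValueError(f"query depth {d} exceeds limit {MAX_DEPTH}")
    return d

# { user { friends { friends { name } } } } is depth 4: allowed.
check_query({"user": {"friends": {"friends": {"name": None}}}})
```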
GraphQL was developed by Facebook to be used in conjunction with their frontend GraphQL client library called Relay. Most people opt for Apollo + Redux because those projects were more active early on in releasing open source, and people argue they have an easier learning curve. IMO Relay is a huge win for the frontend to deal with data dependencies, and it's a much better design than Apollo + Redux.
GraphQL formalizes the contract between front and back end in a very readable and maintainable way, so they can evolve in parallel and reconcile changes in a predictable, structured place (the GraphQL schema and resolvers). And it allows the frontend, with Relay, to deal with data dependencies in a very elegant and performant way.
That is up to the GraphQL framework and its consumers. GraphQL is just a query language.
You need a dataloader (batching) on the backend to avoid n+1 queries, and some similar caching tricks to improve performance.
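The dataloader idea can be sketched in plain Python. This is a simplified synchronous version (the real libraries batch per event-loop tick), and the in-memory author table and batch function are invented for illustration:

```python
# Sketch of the dataloader pattern: collect keys and fetch them in one
# batch instead of one "load author" call per book (the n+1 problem).

class DataLoader:
    def __init__(self, batch_fn):
        self.batch_fn = batch_fn   # takes a list of keys, returns values
        self.cache = {}

    def load_many(self, keys):
        # deduplicate keys not yet cached, preserving order
        missing = list(dict.fromkeys(k for k in keys if k not in self.cache))
        if missing:
            for k, v in zip(missing, self.batch_fn(missing)):
                self.cache[k] = v
        return [self.cache[k] for k in keys]

AUTHORS = {1: "Le Guin", 2: "Borges"}
calls = []

def fetch_authors(ids):
    calls.append(list(ids))        # record each trip to the "database"
    return [AUTHORS[i] for i in ids]

loader = DataLoader(fetch_authors)
# Resolving authors for five books triggers a single batched fetch:
names = loader.load_many([1, 2, 1, 2, 1])
```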
You also usually have caching and batching on the frontend. Apollo Client (the most popular GraphQL client in JS) uses a normalized caching strategy (overkill and a pain).
For rate/abuse limiting, GraphQL requires a completely different approach. It's either point-based on the number of nodes or edges you request, so you can calculate the burden of the query before you execute it, or deep introspection of the query to avoid crashing your database. Query whitelisting is another option.
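A rough sketch of the point-based approach, assuming a dict-shaped query where list fields carry a `first` page size (the query shape, costs, and limit are all invented for illustration):

```python
# Sketch of point-based cost limiting: charge per node requested,
# multiplying through list fields by their requested page size, and
# reject the query before executing it.

COST_LIMIT = 500

def query_cost(selection, multiplier=1):
    total = 0
    for field, opts in selection.items():
        first = opts.get("first", 1)       # page size for list fields
        total += multiplier * first        # cost of these nodes
        total += query_cost(opts.get("fields", {}), multiplier * first)
    return total

# { repos(first: 50) { issues(first: 20) { title } } }
query = {"repos": {"first": 50,
                   "fields": {"issues": {"first": 20,
                                         "fields": {"title": {}}}}}}

cost = query_cost(query)   # 50 repos + 1000 issues + 1000 titles = 2050
too_expensive = cost > COST_LIMIT
```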
There are a few other pain points you need to handle when you scale up. So yeah, definitely not needed if it's only a small project.
Honestly GraphQL is a fairly small step up from REST if you squint at it hard enough. You could get pretty much 90% of the effect of GraphQL with a REST framework and a couple of conventions:
- Have the client specify which fields to return, and return only those fields
- Use the above to allow for expanding nested objects when needed
- Specify an API schema somehow.
All GraphQL does is formalize these things into a specification. In my experience the conditional field inclusion is one of the most powerful features. I can simply create a query which contains all of the fields without paying for a performance penalty unless the client actually fetches all those fields simultaneously.
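The first convention, client-specified fields on a plain REST endpoint, can be sketched in a few lines; the `fields` parameter name and the record are invented for illustration:

```python
# Sketch: the "return only the requested fields" convention on a plain
# REST endpoint, driven by a ?fields=name,email query parameter.

def apply_fields_param(record, fields_param):
    """Return only the requested top-level fields (all, if none given)."""
    if not fields_param:
        return dict(record)
    wanted = {f.strip() for f in fields_param.split(",")}
    return {k: v for k, v in record.items() if k in wanted}

user = {"id": 7, "name": "ada", "email": "ada@example.org", "bio": "..."}

# GET /users/7?fields=name,email
slim = apply_fields_param(user, "name,email")
```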
GraphQL queries tend to map rather neatly onto ORM queries. Of course you run into the same sort of nonsense you get with ORMs, such as the n+1 problem and whatnot. The same sort of tools for fixing those issues are available, since your GraphQL query is just going to call the ORM in any case, with one large addition: introspecting GraphQL queries is much easier than introspecting ORM or SQL queries. I can avoid n+1 problems by seeing if the query is going to look up a nested object and prefetching it. I've yet to see an ORM which allows you to do that.
Lastly GraphQL allows you to break up your API very smartly. Just because some object is nested in another doesn't mean they are nested in source code. One object type simply refers to another object type. If an object has some nested objects that needs query optimizing you can stick that optimization in a single place and stop worrying about it. All the objects referring to it will benefit from the optimization without knowing about it.
GraphQL combines all of the above rather smartly by having your entire API declared as (more or less) a single object. That only works because queries only run if you actually ask for the relevant fields to be returned. It's very elegant if you ask me!
Long story short: yes, you run into the same sort of optimization issues you get with an ORM, but importantly they don't stack on top of the problems your ORM is causing already.
I couldn't agree more. While GraphQL does allow you to be explicit about what you want from your backend, I've yet to see an implementation/solution that gives you back your data efficiently. If anything, the boilerplate actually seems to introduce inefficiency, with some especially inefficient joins.
And when you are explicit about how you want to implement joins etc, you pretty much have to hand code the join anyway, so I don't see the point.
In almost all use cases that I've come across, a standard HTTP endpoint with properly selected parameters works just as well as a GraphQL endpoint, without the overhead of parsing/dealing with GraphQL.
Super useful for bandwidth sensitive situations where you need to piece together a small amount of data from several APIs that normally return a large amount of data.
I would say that you're also completely ignoring the benefits of typing. It's fair to say JS's lack of typing is a deep flaw, and so tools like TypeScript and GraphQL (which pair magically by the way; free type generation!) are ways to lift the typing from the backend to the frontend and give frontends stories around typing and mocking APIs that greatly improve the testability of the code.
I don't see the conflict? If the GraphQL query is translated into SQL on the server, then then the query optimizer would optimize that just as effectively as if the query had been written in SQL originally.
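A minimal sketch of that translation for a flat selection, with invented table and column names (a real implementation would check names against the schema rather than interpolating strings):

```python
# Sketch: translate a flat GraphQL selection into one ordinary SELECT,
# so the database's query optimizer sees a normal query.

def selection_to_sql(table, selection):
    columns = ", ".join(sorted(selection))
    return f"SELECT {columns} FROM {table}"

# { user { id name email } }  ->  a single SQL query
sql = selection_to_sql("users", {"id", "name", "email"})
```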
In my experience GraphQL can be much nicer to implement than REST, and it offers a good structure around things that many REST APIs implement in particular ways (like selecting which fields you want). The pain you'll experience depends heavily on your data model and the abuse potential that brings.
I think the biggest problem with GraphQL is the JavaScript ecosystem around it, and all of its implicit context. It seems to be built entirely on specific servers and clients, instead of on the general concepts.
Relay[1], a popular client-side library, adds all kinds of requirements in addition to the use of GraphQL. One of those is that until version 8, it required all mutation inputs and outputs to contain a "clientMutationId", which had to be round-tripped. It was an obvious hack for some client-side problem which added requirements to the backend. Somehow it had a specification written for it instead of being fixed before release. This hack is now in public APIs, like every single mutation in the GitHub API v4.
GraphQL also includes "subscriptions", which are described incredibly vaguely and frankly underspecified. There are all kinds of libraries and frameworks that "support subscriptions", but in practice they mean they just support the websocket transport[2] created by Apollo GraphQL.
If you just use it as a way to implement a well-structured API, and use the simplest tools possible to get you there, it's a pleasure to work with.
Tbh I'd expected a little better than framing this question as a "REST vs GraphQL" discussion coming from sourcehut.org. If you control your backend, you can aggregate whatever payloads you please into a single HTTP response, and don't have to subscribe to a (naive) "RESTful" way where you have network roundtrips for every single "resource", a practice criticized by Roy Fielding (who coined the term "REST") himself and rooted in a mindset I'd call based more on cultural beliefs than on engineering. That said, a recent discussion [1] convinced me there are practical benefits in using GraphQL if you're working with "modern" SPA frameworks and your backend team can't always deliver the ever-changing interfaces you need, so you're using a backend-for-frontend (an extra frontend-facing backend that wraps your actual backend) approach anyway, though it could be argued that organizational issues play a larger role here.
I like GraphQL, but if you're just serving a single SPA I wonder about all this busy work we still have to do. Why haven't we gone a step further and just abstracted all the networking and serialisation steps away, so our models are synced for us in the background? Maybe the Apollo team is heading in this direction, but their offline story isn't great yet.
Edit: I remember now that the Apollo team is made up of members of the former Meteor team which worked in a similar way using a client side database.
Whatever you do, don't even think that GraphQL will solve your problems. You were on the right track staying away from it till now.
I also can't advise strongly enough to stay away from a typed language (Go in this case) serving data in a different typed language (GraphQL). You will eventually be pulling your hair out jumping through hoops matching types.
After my last web project that required GraphQL and Go, I did some digging around, thinking there has to be a better alternative to this. I have worked with jQuery, React, and GraphQL.
Where are the GraphQL lessons learned? The author hasn't even implemented a solution with it yet, but that hasn't stopped him from declaring it to the world. I don't find an announcement useful.
Maybe GraphQL adopters aren't sharing their experiences with it in production because they're realizing its faults? People are quick to announce successes and very reluctant to own, let alone share, costly mistakes. Also, people change jobs so often that those who influence a roll-out won't even be around long enough for the post-mortem. GraphQL publicity is consequently positively biased. If the HN community were to follow up with posters who announced their use of GraphQL the last two years, maybe we can find out how things are going?
The reason this post was written was for users of Sourcehut, especially for people writing to its API. This post isn’t particularly relevant or explanatory for other audiences, but I don’t think it’s supposed to be so that’s fine.
I think this article misses the explanation of why GraphQL over REST. I usually don't like "x versus y" articles, but here both have been tested on SourceHut, so that hindsight should prove useful.
Well, that's too bad. I always thought this was a cool project. But if you can't dev your way into decent performance for a small alpha project using python/flask/sql, I don't think your tools are the problem. And I guarantee that a graphql isn't the solution.
GraphQL as a query language is simply better than REST in most cases imo. REST has too much client side state, which not only has the potential to make things harder for clients to consume, but also has all the inconsistent states to handle where your consumer gets part way through a multiple-REST method workflow, and then bails. REST also absolutely sucks for mutating arrays.
Really I just look at GraphQL as a nice RPC framework. The graph theory operations like field level resolvers are mostly useless. But if you treat each relationship as a node rather than each field, you can get it to work very nicely with a normalized data set. I haven’t found it hard to preserve join efficiency in the backend either, and it so far hasn’t forced me into redundant query operations.
Just as long as you don’t use appsync. Really, don’t even bother.
> GraphQL as a query language is simply better than REST in most cases imo. REST has too much client side state, which not only has the potential to make things harder for clients to consume, but also has all the inconsistent states to handle where your consumer gets part way through a multiple-REST method workflow, and then bails.
How much client state you maintain seems to me to be orthogonal to GraphQL/REST.
Take your example or a multiple-REST workflow. I presume your point was that the workflow could be implemented by a single GraphQL query/mutation/whatever - but just the same, you can put as much code and logic as you like behind a REST call?
> With these, you can deploy a SourceHut instance with no frontend at all, using the GraphQL APIs exclusively.
This reminds me of Kubernetes' design. You have an API server which is practically the Kubernetes from the user's perspective. `kubectl` is just one out of possibly many clients that talk to this API.
The author says that he has soured on Python for “serious, large projects”. While it’s clearly personal opinion, and that’s fair enough, I can’t help but think his choice of framework hasn’t helped him and has likely caused significant slowdown when delivering features.
Looking through some of the code for Sourcehut, there’s an insane amount of boilerplate or otherwise redundant code[1]. The shared code library is a mini-framework, with custom email and validation components[2][3]. In the ‘main’ project we can see the views that power mailing lists and projects[4][5].
I’m totally biased, but I can’t help but think “why Flask, and why not Django” after seeing all of this. Most of the repeated view boilerplate would have gone ([1] could be like 20 lines), the author could have used Django rest framework to get a quality API with not much work (rather than building it yourself[6]) and the pluggable apps at the core of Django seem a perfect fit.
I see this all the time with Flask projects. They start off small and light, and as long as they stay that way then Flask is a great choice. But they often don’t, and as they grow in complexity you end up re-inventing a framework like Django, but worse, whilst getting fatigued by “Python” being bad.
Exactly agreed. I basically only use Flask for things I want to explicitly be single-file these days. For anything larger, I reach for Django, because I know that if I need at least one thing from it (and I always need the ORM/migrations/admin), it will have been worth it.
My current favorite way of building APIs is this Frankenstein's monster of Django/FastAPI, which actually works quite well so far:
FastAPI is a much better way of writing APIs than DRF, I wish it were a Django library, but hopefully compatibility will improve as Django adds async support.
I maintain a fairly complex flask application and cannot see a better tool for that job. Our code looks similar as well. It's boilerplate repeated often for sure, but there will always be that one endpoint where you need that flexibility to do something a highly opinionated framework just won't let you. In the end it's deciding whether you write some extra code with flexibility or some extra code fighting the framework.
Can you show me a comparable codebase in django and how it looks? I'm genuinely curious how people deal with edge cases.
I haven't used Django in years, so maybe things have changed, but I recall two incidents that stick in my mind and prevent me from taking the whole project seriously.
The first was when they removed tracebacks. A singularly useless thing to do, IMO. There's a --show-tracebacks option (or something like that; it was a long time ago) to show tracebacks, but it didn't work. I dug into the code for this one. IIRC, the guy who added the code to suppress tracebacks didn't take into account the CLI option. I patched it to not suppress tracebacks, but there turned out to be another place where tracebacks were suppressed, and I eventually gave up.
The second incident (although, thinking about it, they happened in chronologically reversed order) was when a junior dev came to me with a totally wacky traceback that he couldn't understand.
All he was trying to do was subclass the HTML Form widget, like a good OOP programmer, but it turned out that Django cowboys had used metaclasses to implement HTML Forms, and utterly defeated this poor kid.
I was so mad: Who uses metaclasses to make HTML forms? Overkill much?
(In the event the solution was simple: make a factory function to create the widget then patch it to have the desired behaviour and return it. But you shouldn't have to do that: OOP works as advertised, why fuck with a good thing?)
So, yeah, Django seems to me to be run by cowboys. I can't take it seriously.
FWIW, I'm learning Erlang/OTP and I feel foolish for thinking Python was good for web apps, etc. Don't get me wrong, I love Python (2) but it's not the right solution for every problem.
I've been using home grown RPC for my servers (POST + JSON) for HTTP or binary serialization for lower level access. Works like a charm for years. Never felt like missing anything.
I wanted to learn GraphQL recently and I wrote a small library to automagically generate GraphQL schemas from SQLAlchemy models. [1]
It's inspired by Hasura, the schema is almost the same. It's not optimized at all, but it's a nice way to quickly get started with GraphQL and expose your existing models.
So I've now had the opportunity to use both GraphQL and protocol buffers ("protobufs" is the more typical term) professionally and I have some thoughts on this.
1. Protobufs use integer IDs for fields. GraphQL uses string names. IMHO this is a clear win for protobufs. Changing the name of a GraphQL field is essentially impossible. Once a name is there it's there forever (e.g. mobile client versions are out there forever), so you're going to have to return null from it and create a new one. In protobufs, the name you see in code is nothing more than the client's bindings. Get a copy of the .proto file, change a name (but not the ID number), recompile, and everything will work. The wire format is the same;
2. People who talk about auto-generating GraphQL wrappers for Postgres database schemas (not the author of this post, to be clear, but it's common enough) are missing the point entirely. The whole point of GraphQL is to span heterogeneous and independent data sources;
3. Protobuf's notion of required vs optional fields was a design mistake that's now impossible to rectify without breaking changes. Maybe protobuf v3/gRPC fixed this. I'm honestly not sure.
4. Protobuf is just a wire format plus a way of generating language bindings for it. There are RPC extensions for this (Stubby internally at Google; gRPC externally and no they're not the same thing). GraphQL is a query language. I do think it's better than protobufs in this regard;
5. GraphQL fragments are one of these things that are probably a net positive but they aren't as good as they might appear. You will find in any large codebase that there are key fragments that if you change in any way you'll generate a massive recompile across hundreds or thousands of callsites. And if just one caller uses one of the fields in that fragment, you can't remove it;
6. GraphQL does kind of support union types (e.g. foo as Bar1, foo as Bar2), but it's awkward and my understanding is the mobile code generated is... less than ideal. Still, it's better than not having it. The protobuf equivalent is to have many optional submessages, and there's no way to express that only one of them will be populated;
7. Under the hood I believe the GraphQL query is stored on the server and identified by ID but the bindings for it are baked into the client. Perhaps this is just how FB uses it? It always struck me as somewhat awkward. Perhaps certain GraphQL queries are particularly large? I never bothered to look into the reason for this but given that the bindings are baked into the code it doesn't seem to gain you much;
8. GraphQL usage in Facebook is pervasive and it has first class support in iOS, Android and React. This is in stark contrast to protobufs where protobuf v2 in Google is probably there forever and protobuf v3/gRPC is largely for the outsiders. It's been several years now since I worked at Google but I would be shocked if this had changed or there was even an intention of changing it at this point;
9. The fact that you can do a GraphQL mutation and declare what fields are returned is, IMHO, very nice. It saves really awkward create/update then re-query hops.
10. This is probably a problem only for Google internally but another factor on top of protobuf version was the API version. Build artifacts were declared with this API version, which was actually a huge headache if you wanted to bring in dependencies, some of which were Java APIv1 and others Java APIv2. I don't really understand why you had to make this kind of decision in creating build artifacts. Again, maybe this has improved. I would be surprised however.
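The field-number point in (1) can be seen directly in the wire format. Here's a stdlib-only sketch of protobuf's varint encoding for a length-delimited string field; real code would use the protobuf library:

```python
# Sketch of why protobuf renames are free: the wire format carries only
# the field *number* and wire type, never the name.

def encode_varint(n):
    out = bytearray()
    while True:
        byte = n & 0x7F
        n >>= 7
        if n:
            out.append(byte | 0x80)    # more bytes follow
        else:
            out.append(byte)
            return bytes(out)

def encode_string_field(field_number, value):
    key = (field_number << 3) | 2      # wire type 2 = length-delimited
    data = value.encode("utf-8")
    return encode_varint(key) + encode_varint(len(data)) + data

# Whether the .proto calls field 1 "user_name" or "display_name",
# the bytes on the wire are identical:
wire = encode_string_field(1, "ada")   # b'\x0a\x03ada'
```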
Lastly, as for Sourcehut, I had a look at their home page. I'm honestly still not exactly sure what they are or what value they create. There are 3 pricing plans that provide access to all features so I'd have to dig in to find the difference (hint: I didn't). So it's hard for me to say if GraphQL is an appropriate choice for them. At least their pages loaded fast. That's a good sign.
> Another (potential) advantage of GraphQL is the ability to compose many different APIs into a single, federated GraphQL schema.
If anyone else can share experiences of this sort of problems and solution, I'd be really interested to hear it. I've written non-GQL APIs before that back onto other internal and external services; what am I missing?
I think Drew is referring to composition in terms of API end-user code, not sr.ht code, e.g. making it possible for the user write a single GQL query that combines data from multiple sr.ht services.
I've mentioned my (negative) experience elsewhere in this thread. It wasn't actually me doing the implementation, but the result was a horrendous, unmaintainable and slow code base. Ultimately, no part of the team benefited from the use of GraphQL instead of REST.
To be fair, the same devs that built it using GraphQL would likely have made many of the same mistakes with a REST API, but I do feel it would at least have been easier to reason about the code.
A lot of places have a hell of a time dealing with very nested graphql queries that effectively DoS your app servers, or a resolver that’s very slow. Caching is also an open question. But for the simplest systems, I’d hesitate to recommend graphql at this point.
Author here. Wow, there is a ton of "didn't RTFA" comments here. It seems like half of this thread saw GraphQL in the title, it knocked two gears into place in their head, and they started writing up their own little essay about how bad it is.
I evaluated GraphQL twice before, and discarded it for many of the reasons brought up here. Even this time around, I give a rather lackluster review of it and mention that there are still many caveats. It's not magic, and I'm not entirely in love with it - but it's better than REST.
Query optimization, scalability, authentication, and many other issues raised here were part of the research effort and I would not have moved forward with it if I did not feel that they were adequately addressed.
Before assuming some limitation you have had with it in the past applies to sr.ht, I would recommend reading through the code:
If you're curious for a more detailed run-down of my problems with REST, Python, Flask, and SQLAlchemy, I answered similar comments last night on the Lobsters thread:
I would also like to point out that the last time I thought a rewrite was in order, we got wlroots, which is now the most successful project in its class.
[+] [-] WhatIsDukkha|5 years ago|reply
It seems like exactly the ORM solution/problem but even more abstract and less under control since it pushes the orm out to browser clients and the frontend devs.
ORM suffer from being at beyond arms length from the query analyzer in the database server.
https://en.wikipedia.org/wiki/Query_optimization
A query optimizer that's been tuned over decades by pretty serious people.
Bad queries, overfetching, sudden performance cliffs everywhere.
Graphql actually adds another query language on top of the normal orm problem. (Maybe the answer is that graphql is so simple by design that it has no dark corners but that seems like a matter of mathematical proof that I haven't seen alluded to).
Why is graphql not going to have exactly this problem as we see people actually start to work seriously with it?
Four or five implementations in javascript, haskell and now go. From what I could see none of them were mentioning query optimization as an aspiration.
[+] [-] baddox|5 years ago|reply
I wouldn’t expect the performance issues to be much more problematic than they would be for REST endpoints that offer similar functionality. If you’re offering a public API, then either way you’re going to need to solve for clients who are requesting too many expensive resources. If you control the client and the server, then you probably don’t need to worry about it beyond the testing of your client code you would need to do anyway.
As far as query optimization goes, that’s largely out of scope of GraphQL itself, although many server implementations offer interesting ways to fulfill GraphQL queries. Dataloader is neat, and beyond that, I believe you can do any inspection of the query request you want, so you could for example see the nested path “Publisher -> Book -> Author -> name” and decide to join all three of those tables together. I’m not aware of any tools that provide this optimization automatically, but it’s not difficult to imagine it existing for some ORMs like those in Django or Rails.
[+] [-] kevan|5 years ago|reply
My impression is GraphQL starts to shine when you have multiple backend systems, probably separated based on your org chart, and the frontend team needs to stitch them together for cohesive UX. The benchmark isn't absolute performance here, it's whether it performs better than the poor mobile app making a dozen separate API calls to different backends to stitch together a view.
[+] [-] devit|5 years ago|reply
Without it or a similar system frontend developers have to ask backend developers to create or modify an API endpoint every time the website is redesigned.
Also, it allows to combine data fetching for components and subcomponents automatically without having to do that manually in backend code, and automatically supports fine-grained caching of items.
[+] [-] real_ben_michel|5 years ago|reply
A typical product would require integrations with several existing APIs, and potentially some new ones. These would be aggregated (and normalised) into a single schema built on top of GraphQL. Then the team would build different client UIs and iterate on them.
By having a single queryable schema, it's very easy to build and rebuild interfaces as needed. Tools like Apollo and React are particularly well suited for this, as you can directly inject data into components. The team can also reason on the whole domain, rather than a collection of data sources (easier for trying out new things).
Of course, this can lead to performance issues, but why would you optimise something before validating it with users? Queries might be inefficient, but with just a bit of caching you can ensure an acceptable user experience.
[+] [-] stevenjohns|5 years ago|reply
I'm not sure if it's the implementation - and it could very well be - but there has been more overhead and complexity than with traditional REST APIs. I can't see much value-add.
This becomes a lot more apparent when you start to include TS in the mix.
Perhaps it just wasn't a good use case.
[+] [-] dmitriid|5 years ago|reply
It's attractive primarily to frontend developers. Instead of juggling various APIs (often poorly designed or underdesigned due to conflicting requirements and time constraints), you have a single entry point into the system with almost any view of the data you want.
Almost no one ever talks about what a nightmare it becomes on the server-side, and how inane the implementations are. And how you have to re-do so many things from scratch, inefficiently, because you really have no control of the queries coming into the system.
My takeaway from GraphQL so far has been:
- good for frontend
- usable only for internal projects where you have full control over who has access to your system, and where nobody can bring it down because you forgot an authorisation check on a field somewhere or a protection against unlimited nested queries.
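That last protection is simple to sketch. Here the parsed query is modeled as nested dicts rather than a real GraphQL AST, and the limit is arbitrary:

```python
# Minimal sketch of a query-depth guard, the kind of protection the
# comment above says is easy to forget.

MAX_DEPTH = 5

def query_depth(selection):
    """Depth of a selection set modeled as nested dicts."""
    if not isinstance(selection, dict) or not selection:
        return 0
    return 1 + max(query_depth(sub) for sub in selection.values())

def check_depth(selection, limit=MAX_DEPTH):
    """Reject the query before execution if it nests too deeply."""
    depth = query_depth(selection)
    if depth > limit:
        raise ValueError(f"query depth {depth} exceeds limit {limit}")
    return depth

# { user { friends { friends { friends { name } } } } }
nested = {"user": {"friends": {"friends": {"friends": {"name": None}}}}}
print(check_depth(nested))  # depth 5: allowed
```

A real server would run this as a validation rule over the AST before any resolver fires.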
[+] [-] jayd16|5 years ago|reply
Query chaining/batching and specifying a sub-selection of response data seem like solid features.
The graph schema seems to make good on some of the HATEOAS promises.
I like the idea of GraphQL but the downsides have me worried.
[+] [-] rojobuffalo|5 years ago|reply
GraphQL formalizes the contract between front and back end in a very readable and maintainable way, so they can evolve in parallel and reconcile changes in a predictable, structured place (the GraphQL schema and resolvers). And it allows the frontend, with Relay, to deal with data dependencies in a very elegant and performant way.
[+] [-] eveningcoffee|5 years ago|reply
It clearly looks like questionable adoption for a single organization.
[+] [-] searchableguy|5 years ago|reply
You need a dataloader (batching) on the backend to avoid N+1 queries, along with similar caching work, to get acceptable performance.
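The dataloader pattern itself is small. This is a bare-bones sketch, not any real library: collect the keys the resolvers ask for, then fetch them in one batch instead of N queries (real dataloaders also coalesce per event-loop tick):

```python
# Toy dataloader: resolvers enqueue keys, then one batched fetch
# resolves them all.

class TinyLoader:
    def __init__(self, batch_fn):
        self.batch_fn = batch_fn   # takes a list of keys, returns a dict
        self.pending = []
        self.cache = {}

    def load(self, key):
        """Enqueue a key; returns a thunk resolved after dispatch."""
        self.pending.append(key)
        return lambda: self.cache[key]

    def dispatch(self):
        """Fetch all pending, uncached keys in a single batch."""
        keys = list(dict.fromkeys(
            k for k in self.pending if k not in self.cache))
        if keys:
            self.cache.update(self.batch_fn(keys))
        self.pending.clear()

calls = []
def fetch_authors(ids):
    calls.append(ids)              # pretend: SELECT ... WHERE id IN ids
    return {i: f"author-{i}" for i in ids}

loader = TinyLoader(fetch_authors)
thunks = [loader.load(i) for i in (1, 2, 2, 3)]  # resolvers enqueue keys
loader.dispatch()                                # one batched fetch
print([t() for t in thunks], len(calls))  # four loads, one batch call
```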
You also have cache and batching on the frontend usually. Apollo client (most popular graphql client in js) uses a normalized caching strategy (overkill and a pain).
For rate/abuse limiting, GraphQL requires a completely different approach. It's either a point system based on the number of nodes or edges you request, so that you can calculate the burden of a query before you execute it, or deep introspection of queries to avoid crashing your database. Query whitelisting is another option.
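A back-of-envelope sketch of that point system, with field names, page sizes, and weights all made up for illustration:

```python
# Score a query before executing it: each field costs a point, and
# list fields fan out by their requested page size.

def estimate_cost(selection, list_sizes, multiplier=1):
    cost = 0
    for field, sub in selection.items():
        cost += multiplier  # each resolved field costs one point
        if isinstance(sub, dict):
            # list fields fan out: multiply by the requested page size
            fanout = list_sizes.get(field, 1)
            cost += estimate_cost(sub, list_sizes, multiplier * fanout)
    return cost

# { repos(first: 50) { issues(first: 100) { title } } }
query = {"repos": {"issues": {"title": None}}}
sizes = {"repos": 50, "issues": 100}

print(estimate_cost(query, sizes))  # 5051 points for this query
```

Reject anything above a budget and the nested-query bombs never reach the database.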
There are a few other pain points you need to handle as you scale up. So yeah, definitely not needed if it's only a small project.
[+] [-] cnorthwood|5 years ago|reply
[+] [-] Doxin|5 years ago|reply
- Have the client specify which fields to return, and return only those fields
- Use the above to allow for expanding nested objects when needed
- Specify an API schema somehow.
All GraphQL does is formalize these things into a specification. In my experience the conditional field inclusion is one of the most powerful features. I can simply create a query which contains all of the fields without paying for a performance penalty unless the client actually fetches all those fields simultaneously.
GraphQL queries tend to map rather neatly onto ORM queries. Of course you run into the same sort of nonsense you get with ORMs, such as the N+1 problem and whatnot. The same sorts of tools for fixing those issues are available, since your GraphQL query is just going to call the ORM in any case, with one large addition: introspecting GraphQL queries is much easier than introspecting ORM or SQL queries. I can avoid N+1 problems by seeing if the query is going to look up a nested object and prefetching it. I've yet to see an ORM which allows you to do that on its own.
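A sketch of that introspect-then-prefetch idea, with a made-up query shape and relation names. A real resolver would walk the GraphQL AST and call something like Django's select_related:

```python
# Inspect the incoming selection and decide up front whether to
# prefetch a relation, instead of letting lazy loads cause N+1.

def wants_field(selection, path):
    """True if the query asks for the given nested field path."""
    node = selection
    for field in path:
        if not isinstance(node, dict) or field not in node:
            return False
        node = node[field]
    return True

def prefetch_plan(selection):
    """Relations worth eager-loading for this query (hypothetical)."""
    prefetch = []
    if wants_field(selection, ("books", "author")):
        prefetch.append("author")  # e.g. qs.select_related("author")
    return prefetch

query = {"books": {"title": None, "author": {"name": None}}}
print(prefetch_plan(query))  # ['author']
```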
Lastly GraphQL allows you to break up your API very smartly. Just because some object is nested in another doesn't mean they are nested in source code. One object type simply refers to another object type. If an object has some nested objects that needs query optimizing you can stick that optimization in a single place and stop worrying about it. All the objects referring to it will benefit from the optimization without knowing about it.
GraphQL combines all of the above rather smartly by having your entire API declared as (more or less) a single object. That only works because queries only run if you actually ask for the relevant fields to be returned. It's very elegant if you ask me!
Long story short: yes, you run into the same sort of optimization issues you get with an ORM, but importantly they don't stack on top of the problems your ORM is already causing.
[+] [-] osrec|5 years ago|reply
And when you are explicit about how you want to implement joins etc, you pretty much have to hand code the join anyway, so I don't see the point.
In almost all use cases that I've come across, a standard HTTP endpoint with properly selected parameters works just as well as a GraphQL endpoint, without the overhead of parsing/dealing with GraphQL.
[+] [-] umvi|5 years ago|reply
[+] [-] nawgszy|5 years ago|reply
[+] [-] goto11|5 years ago|reply
[+] [-] DrFell|5 years ago|reply
Their justification for needing it is that the API team takes too long to implement changes, and endpoints never give them the data shape they need.
The silent reason is that server-side code, databases, and security are a big scary unknown they are too lazy to learn.
A big project cannot afford to ask for high standards from frontenders. You need a horde of cheap labor to crank out semi-disposable UIs.
[+] [-] jorams|5 years ago|reply
I think the biggest problem with GraphQL is the JavaScript ecosystem around it, and all of its implicit context. It seems to be built entirely on specific servers and clients, instead of on the general concepts.
Relay[1], a popular client-side library, adds all kinds of requirements in addition to the use of GraphQL. One of those is that until version 8, it required all mutation inputs and outputs to contain a "clientMutationId", which had to be round-tripped. It was an obvious hack for some client-side problem which added requirements to the backend. Somehow it had a specification written for it instead of being fixed before release. This hack is now in public APIs, like every single mutation in the GitHub API v4.
GraphQL also includes "subscriptions", which are described incredibly vaguely and frankly underspecified. There are all kinds of libraries and frameworks that "support subscriptions", but in practice they mean they just support the websocket transport[2] created by Apollo GraphQL.
If you just use it as a way to implement a well-structured API, and use the simplest tools possible to get you there, it's a pleasure to work with.
[1]: https://relay.dev/
[2]: https://github.com/apollographql/subscriptions-transport-ws
[+] [-] tannhaeuser|5 years ago|reply
[1]: https://news.ycombinator.com/item?id=23119810
[+] [-] jamil7|5 years ago|reply
Edit: I remember now that the Apollo team is made up of members of the former Meteor team which worked in a similar way using a client side database.
[+] [-] zapf|5 years ago|reply
Whatever you do, don't even think that GraphQL will solve your problems. You were on the right track staying away from it till now.
I also can't advise enough to stay away from one typed language (Go, in this case) serving data in a different typed language (GraphQL). You will eventually be pulling your hair out jumping through hoops to match types.
After my last web project that required GraphQL and Go, I did some digging around, thinking there had to be a better alternative to this. I have worked with jQuery, React, and GraphQL.
My conclusion was that next time I will stick to turbolinks (https://github.com/turbolinks/turbolinks) and try stimulus (https://stimulusjs.org/).
[+] [-] say_it_as_it_is|5 years ago|reply
Maybe GraphQL adopters aren't sharing their experiences with it in production because they're realizing its faults? People are quick to announce successes and very reluctant to own, let alone share, costly mistakes. Also, people change jobs so often that those who influence a roll-out won't even be around long enough for the post-mortem. GraphQL publicity is consequently positively biased. If the HN community were to follow up with posters who announced their use of GraphQL the last two years, maybe we can find out how things are going?
[+] [-] gbear605|5 years ago|reply
[+] [-] dgellow|5 years ago|reply
[+] [-] tleb_|5 years ago|reply
Thanks Drew and others for SourceHut.
[+] [-] ianamartin|5 years ago|reply
So, I mean, good luck.
[+] [-] leadingthenet|5 years ago|reply
It’s been a 10x+ improvement on Flask, in my experience.
[+] [-] AmericanChopper|5 years ago|reply
Really I just look at GraphQL as a nice RPC framework. The graph theory operations like field level resolvers are mostly useless. But if you treat each relationship as a node rather than each field, you can get it to work very nicely with a normalized data set. I haven’t found it hard to preserve join efficiency in the backend either, and it so far hasn’t forced me into redundant query operations.
Just as long as you don’t use appsync. Really, don’t even bother.
[+] [-] GordonS|5 years ago|reply
How much client state you maintain seems to me to be orthogonal to GraphQL/REST.
Take your example or a multiple-REST workflow. I presume your point was that the workflow could be implemented by a single GraphQL query/mutation/whatever - but just the same, you can put as much code and logic as you like behind a REST call?
[+] [-] mehdix|5 years ago|reply
This reminds me of Kubernetes' design. You have an API server which is practically the Kubernetes from the user's perspective. `kubectl` is just one out of possibly many clients that talk to this API.
Edit: typos.
[+] [-] awinter-py|5 years ago|reply
yup
[+] [-] orf|5 years ago|reply
Looking through some of the code for Sourcehut, there’s an insane amount of boilerplate or otherwise redundant code[1]. The shared code library is a mini-framework, with custom email and validation components[2][3]. In the ‘main’ project we can see the views that power mailing lists and projects[4][5].
I’m totally biased, but I can’t help but think “why Flask, and why not Django” after seeing all of this. Most of the repeated view boilerplate would have gone ([1] could be like 20 lines), the author could have used Django rest framework to get a quality API with not much work (rather than building it yourself[6]) and the pluggable apps at the core of Django seem a perfect fit.
I see this all the time with Flask projects. They start off small and light, and as long as they stay that way, Flask is a great choice. But they often don't, and as they grow in complexity you end up re-inventing a framework like Django, but worse, whilst getting fatigued by “Python” being bad.
1. https://git.sr.ht/~sircmpwn/paste.sr.ht/tree/master/pastesrh...
2. https://git.sr.ht/~sircmpwn/core.sr.ht/tree/master/srht/emai...
3. https://git.sr.ht/~sircmpwn/core.sr.ht/tree/master/srht/vali...
4. https://git.sr.ht/~sircmpwn/hub.sr.ht/tree/master/hubsrht/bl...
5. https://git.sr.ht/~sircmpwn/hub.sr.ht/tree/master/hubsrht/bl...
6. https://git.sr.ht/~sircmpwn/paste.sr.ht/tree/master/pastesrh...
[+] [-] StavrosK|5 years ago|reply
My current favorite way of building APIs is this Frankenstein's monster of Django/FastAPI, which actually works quite well so far:
https://www.stavros.io/posts/fastapi-with-django/
FastAPI is a much better way of writing APIs than DRF, I wish it were a Django library, but hopefully compatibility will improve as Django adds async support.
[+] [-] ramraj07|5 years ago|reply
Can you show me a comparable codebase in django and how it looks? I'm genuinely curious how people deal with edge cases.
[+] [-] carapace|5 years ago|reply
The first was when they removed tracebacks. Singularly useless thing to do IMO. But there's a --show-tracebacks option (or something like that, it was a long time ago) to show tracebacks, but it didn't work. I dug into the code for this one. IIRC, the guy who added the code to suppress tracebacks didn't take into account the CLI option. I patched it to not suppress tracebacks but there turned out to be another place where tracebacks were suppressed, and I eventually gave up.
The second incident (although, thinking about it, they happened in chronologically reversed order) was when a junior dev came to me with a totally wacky traceback that he couldn't understand.
All he was trying to do was subclass the HTML Form widget, like a good OOP programmer, but it turned out that Django cowboys had used metaclasses to implement HTML Forms, and utterly defeated this poor kid.
I was so mad: Who uses metaclasses to make HTML forms? Overkill much?
(In the event the solution was simple: make a factory function to create the widget then patch it to have the desired behaviour and return it. But you shouldn't have to do that: OOP works as advertised, why fuck with a good thing?)
So, yeah, Django seems to me to be run by cowboys. I can't take it seriously.
FWIW, I'm learning Erlang/OTP and I feel foolish for thinking Python was good for web apps, etc. Don't get me wrong, I love Python (2) but it's not the right solution for every problem.
[+] [-] Scarbutt|5 years ago|reply
[+] [-] hn_throwaway_99|5 years ago|reply
1. The biggest mistake GraphQL made was putting 'QL' in the name so people think it's a query language comparable to SQL. It's not: https://news.ycombinator.com/item?id=23120997
2. Some benefits of GraphQL over REST: https://news.ycombinator.com/item?id=23124862
[+] [-] FpUser|5 years ago|reply
[+] [-] iooi|5 years ago|reply
It's inspired by Hasura, the schema is almost the same. It's not optimized at all, but it's a nice way to quickly get started with GraphQL and expose your existing models.
[1] https://github.com/gzzo/graphql-sqlalchemy
[+] [-] cletus|5 years ago|reply
1. Protobufs use integer IDs for fields. GraphQL uses string names. IMHO this is a clear win for protobufs. Changing the name of a GraphQL field is essentially impossible. Once a name is there it's there forever (eg mobile client versions are out there forever), so you're going to have to return null from it and create a new one. In protobufs, the name you see in code is nothing more than the client's bindings. Get a copy of the .proto file, change a name (but not the ID number), recompile, and everything will work. The wire format is the same;
2. People who talk about auto-generating GraphQL wrappers for Postgres database schemas (not the author of this post, to be clear, but it's common enough) are missing the point entirely. The whole point of GraphQL is to span heterogeneous and independent data sources;
3. Protobuf's notion of required vs optional fields was a design mistake that's now impossible to rectify without breaking changes. Maybe protobuf v3/gRPC fixed this. I'm honestly not sure.
4. Protobuf is just a wire format plus a way of generating language bindings for it. There are RPC extensions for this (Stubby internally at Google; gRPC externally and no they're not the same thing). GraphQL is a query language. I do think it's better than protobufs in this regard;
5. GraphQL fragments are one of these things that are probably a net positive but they aren't as good as they might appear. You will find in any large codebase that there are key fragments that if you change in any way you'll generate a massive recompile across hundreds or thousands of callsites. And if just one caller uses one of the fields in that fragment, you can't remove it;
6. GraphQL does kind of support union types (eg foo as Bar1, foo as Bar2) but it's awkward, and my understanding is the generated mobile code is... less than ideal. Still, it's better than not having it. The protobuf equivalent is to have many optional submessages, and there's no way to express that only one of them will be populated;
7. Under the hood I believe the GraphQL query is stored on the server and identified by ID but the bindings for it are baked into the client. Perhaps this is just how FB uses it? It always struck me as somewhat awkward. Perhaps certain GraphQL queries are particularly large? I never bothered to look into the reason for this but given that the bindings are baked into the code it doesn't seem to gain you much;
8. GraphQL usage in Facebook is pervasive and it has first class support in iOS, Android and React. This is in stark contrast to protobufs where protobuf v2 in Google is probably there forever and protobuf v3/gRPC is largely for the outsiders. It's been several years now since I worked at Google but I would be shocked if this had changed or there was even an intention of changing it at this point;
9. The fact that you can do a GraphQL mutation and declare what fields are returned is, IMHO, very nice. It saves really awkward create/update then re-query hops.
10. This is probably a problem only for Google internally but another factor on top of protobuf version was the API version. Build artifacts were declared with this API version, which was actually a huge headache if you wanted to bring in dependencies, some of which were Java APIv1 and others Java APIv2. I don't really understand why you had to make this kind of decision in creating build artifacts. Again, maybe this has improved. I would be surprised however.
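Point 1 can be illustrated with a hypothetical message: because protobuf encodes only the field number on the wire, a rename is invisible to deployed clients.

```protobuf
// Hypothetical schema. Version A, as originally shipped:
message Book {
  string title     = 1;
  int64  author_id = 2;
}

// Version B, after a rename. The wire bytes are identical, because
// serialization only ever writes the field numbers (1 and 2), so old
// and new binaries interoperate without coordination:
message Book {
  string title     = 1;
  int64  writer_id = 2;
}
```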
Lastly, as for Sourcehut, I had a look at their home page. I'm honestly still not exactly sure what they are or what value they create. There are 3 pricing plans that provide access to all features so I'd have to dig in to find the difference (hint: I didn't). So it's hard for me to say if GraphQL is an appropriate choice for them. At least their pages loaded fast. That's a good sign.
[+] [-] crabmusket|5 years ago|reply
If anyone else can share experiences of this sort of problem and its solutions, I'd be really interested to hear it. I've written non-GQL APIs before that back onto other internal and external services; what am I missing?
[+] [-] maddyboo|5 years ago|reply
[+] [-] GordonS|5 years ago|reply
To be fair, the same devs that built it using GraphQL would likely have made many of the same mistakes with a REST API, but I do feel it would at least have been easier to reason about the code.
[+] [-] camgunz|5 years ago|reply
[+] [-] ddevault|5 years ago|reply
I evaluated GraphQL twice before, and discarded it for many of the reasons brought up here. Even this time around, I give a rather lackluster review of it and mention that there are still many caveats. It's not magic, and I'm not entirely in love with it - but it's better than REST.
Query optimization, scalability, authentication, and many other issues raised here were part of the research effort and I would not have moved forward with it if I did not feel that they were adequately addressed.
Before assuming some limitation you have had with it in the past applies to sr.ht, I would recommend reading through the code:
https://git.sr.ht/~sircmpwn/git.sr.ht/tree/master/api
https://git.sr.ht/~sircmpwn/gql.sr.ht
If you're curious for a more detailed run-down of my problems with REST, Python, Flask, and SQLAlchemy, I answered similar comments last night on the Lobsters thread:
https://lobste.rs/s/me5emr/how_why_graphql_will_influence_so...
I would also like to point out that the last time I thought a rewrite was in order, we got wlroots, which is now the most successful project in its class.
Cheers.
[+] [-] hpen|5 years ago|reply