Depends. If all the requests run in parallel, they should hit different instances, which distributes load more efficiently than pinning everything to a single instance. You would actually get the response in 200ms, not to mention that it becomes easier to size each node properly. It also allows a grey area where a response can be partial rather than just pass/fail. As usual, YMMV depending on the use case.
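To make the parallel and partial-response points concrete, here is a rough sketch in Python. Everything in it is made up for illustration: `fetch_part` just sleeps to stand in for a call to one backend instance, and `fan_out` and the `data`/`errors` result shape are my own assumptions, not anything from a specific framework.

```python
import asyncio


async def fetch_part(name: str, delay: float, fail: bool = False) -> str:
    # Hypothetical stand-in for a request to one backend instance.
    await asyncio.sleep(delay)
    if fail:
        raise RuntimeError(f"{name} unavailable")
    return f"{name}-data"


async def fan_out(calls: dict) -> dict:
    # Run every call concurrently, so total latency is roughly that of
    # the slowest call rather than the sum of all of them.
    names = list(calls)
    results = await asyncio.gather(
        *(calls[n] for n in names), return_exceptions=True
    )
    # Keep whatever succeeded and report failures next to it,
    # instead of failing the whole response.
    data, errors = {}, {}
    for name, result in zip(names, results):
        if isinstance(result, BaseException):
            errors[name] = str(result)
        else:
            data[name] = result
    return {"data": data, "errors": errors}


def demo() -> dict:
    return asyncio.run(
        fan_out(
            {
                "user": fetch_part("user", 0.05),
                "orders": fetch_part("orders", 0.2, fail=True),
            }
        )
    )
```

Calling `demo()` should come back after roughly the slowest call (about 0.2s here), with the `user` part filled in and the `orders` failure listed under `errors`. Whether a partial result like that is acceptable is exactly the use-case question.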
cunac|5 years ago
systemvoltage|5 years ago
adamscybot|5 years ago
GraphQL has first-class support for this scenario.