ewencp | 8 years ago

Regarding your first issue, this is very much a matter of defaults. I can't be sure of your exact pipeline and connectors, but if, for example, you were using the JDBC connector, it has supported at least prefixing topic names since the original version, which effectively gives you the namespacing you require (see https://docs.confluent.io/current/connect/connect-jdbc/docs/...). I agree this might not be as ideal as namespacing directly at the Kafka layer for some users. As of Kafka 0.10.2, single message transforms give a lot more flexibility: they can arbitrarily modify the topic name based on the existing topic name, any data in the record, or any info in the transformation config. On the Hadoop/Hive side, I think that limitation may still exist in the connector itself; transformations effectively remove it, since you can arbitrarily adjust the topic the sink connector sees, but this probably isn't an obvious solution. Also, we really would prefer to avoid requiring any coding when using Connect. It's a difficult tradeoff between standardization (same configs everywhere), usability (minimize the configs the user has to set), and simplicity+immediate usability (transformations came later and introduce configuration complexity). I (and other Kafka contributors) would certainly welcome thoughts on how to make this all simpler; I think most software, especially open source software, errs too heavily towards configurability, but clearly in your case you found things not configurable enough.
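To make the transformation approach concrete, here is a rough sketch of a sink connector config fragment that renames topics with the built-in RegexRouter transform, plus a small Python approximation of what that transform does to a topic name. The connector name, topic names, transform alias, and the `teamA_` prefix are all made up for illustration, and the fragment leaves out `connector.class` and everything else a real connector would need.

```python
import json
import re

# Hypothetical connector name, topics, alias, and prefix. The transform class
# and its `regex` / `replacement` settings are the real RegexRouter options.
connector_config = {
    "name": "hdfs-sink-example",
    "config": {
        "topics": "orders,customers",
        "transforms": "addNamespace",
        "transforms.addNamespace.type": "org.apache.kafka.connect.transforms.RegexRouter",
        "transforms.addNamespace.regex": "(.*)",
        "transforms.addNamespace.replacement": "teamA_$1",
    },
}


def route(topic, regex, replacement):
    """Approximate RegexRouter: rewrite the topic only if the whole name matches."""
    # Java writes group references as $1; Python's re module writes them as \1.
    py_replacement = re.sub(r"\$(\d+)", r"\\\1", replacement)
    match = re.fullmatch(regex, topic)
    return match.expand(py_replacement) if match else topic


if __name__ == "__main__":
    cfg = connector_config["config"]
    print(json.dumps(connector_config, indent=2))
    for topic in cfg["topics"].split(","):
        renamed = route(
            topic,
            cfg["transforms.addNamespace.regex"],
            cfg["transforms.addNamespace.replacement"],
        )
        print(topic, "->", renamed)
```

The `route` helper only simulates the renaming locally so you can check a pattern before deploying it; Connect applies the actual transform inside the worker, with no custom code required.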

Re: the point about backpressure, there are plenty of cases where you don't want it. If you want the thing that's producing data to keep humming along even when some downstream app falls behind (let's say Connect dumping the data into HDFS for some downstream batch analytics), you don't want to see backpressure. In Kafka you should just define your retention period to be long enough to cover any slowness/lag in consumer applications -- it's pretty fundamental to its design and use cases that it doesn't have explicit backpressure from consumers back to producers. (You do get backpressure from a single broker back to the producer via the TCP connection, but I assume you meant from consumer back to producer.)
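As a back-of-the-envelope illustration of sizing retention to cover consumer lag, the sketch below derives a `retention.ms` topic setting from the longest outage you expect a consumer to have. The three-day outage and the 2x safety factor are assumptions picked for the example, not recommendations.

```python
from datetime import timedelta


def retention_ms(max_expected_lag, safety_factor=2.0):
    """Return a retention.ms value that covers the worst expected consumer lag."""
    return int(max_expected_lag.total_seconds() * 1000 * safety_factor)


# Hypothetical scenario: the HDFS sink might be down or slow for up to 3 days.
topic_overrides = {"retention.ms": str(retention_ms(timedelta(days=3)))}

if __name__ == "__main__":
    print(topic_overrides)  # {'retention.ms': '518400000'}
```

With these numbers the topic keeps six days of data, so a sink that stalls for three days can still catch up without the producer ever being slowed down.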
