Kafka-avro refactoring #59

Open
ricardohbin opened this issue Feb 15, 2019 · 2 comments

Comments

@ricardohbin
Collaborator

ricardohbin commented Feb 15, 2019

The current code is outdated and behaves in its own particular way: it differs from other Avro implementations (such as the Python and Java ones). This has already been discussed in several issues.

Should we rewrite the code to follow Confluent's CachedSchemaRegistryClient strategy as is and remove all of these fetch strategies, or keep the current implementation based on fetch?
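To make the first option concrete, here is a minimal sketch of an on-demand cache: a schema is fetched from the registry the first time its id is seen and served from memory afterwards. This is only an illustration of the general idea, not Confluent's or kafka-avro's actual code. `LazySchemaCache`, `getById`, and `fakeFetchSchemaById` are hypothetical names, and the registry call is replaced by an in-memory stand-in so the sketch runs without a network.

```javascript
// Hypothetical sketch: not kafka-avro's or Confluent's actual API.
// A schema is fetched on first use, then served from memory.
class LazySchemaCache {
  // fetchSchemaById: async (id) => schema. Injected so the sketch needs no network.
  constructor(fetchSchemaById) {
    this.fetchSchemaById = fetchSchemaById;
    this.pending = new Map(); // schema id -> Promise resolving to the schema
  }

  getById(id) {
    // Cache the promise itself so concurrent callers share one round trip.
    if (!this.pending.has(id)) {
      const promise = this.fetchSchemaById(id).catch((err) => {
        this.pending.delete(id); // allow a retry after a failed fetch
        throw err;
      });
      this.pending.set(id, promise);
    }
    return this.pending.get(id);
  }
}

// Stand-in for an HTTP request to the schema registry (assumption for the demo).
let registryCalls = 0;
async function fakeFetchSchemaById(id) {
  registryCalls += 1;
  return { id, type: "record", name: `Schema${id}` };
}
```

With this approach the first message for each schema id pays for a registry round trip, and every later message with that id is served from the cache.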

Either way, the code style and the examples can be updated to more modern syntax, with async/await, etc.

Our integration tests also need a lot of improvement, and we need some unit tests to make features safer.

But to do all this, we need a complete rewrite of the current code, almost a "new project".

What's the best strategy for a 2.0.0 release? GitHub now supports WIP pull requests, and maybe that can help us with this.

What do you think @thanpolas?

@thanpolas
Contributor

@ricardohbin you have a better grasp of this than I do, so this is your call now. I have only explained the reasons behind certain decisions, like pre-caching the schemas, which I'll restate here for historical purposes. You use this library, you depend on it, your call.

Due to the nature of Kafka, a service can receive huge amounts of data in a single moment; it is well established that Kafka's throughput is hard to match. The service handling the burst should start processing immediately. Having to wait for a round trip to the schema registry (SR) while a stream of messages keeps piling up increases the chances of crashing the server, and it definitely creates a bottleneck that can propagate well beyond the moment it takes to dynamically fetch the schema from the SR.
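The pre-caching idea described above can be sketched roughly as follows: all schemas are loaded once at startup, so that looking one up while consuming is a synchronous in-memory read. This is an illustrative sketch under assumptions, not the library's real implementation. `PreloadedSchemas`, `init`, `get`, and `fakeListSchemas` are hypothetical names, and the registry is an in-memory stand-in.

```javascript
// Hypothetical sketch of pre-caching; names are illustrative, not kafka-avro's API.
class PreloadedSchemas {
  constructor() {
    this.byId = new Map(); // schema id -> schema
  }

  // Called once at startup, before any consumer starts reading.
  // listSchemas: async () => [{ id, ... }]. Injected stand-in for registry calls.
  async init(listSchemas) {
    const schemas = await listSchemas();
    for (const schema of schemas) {
      this.byId.set(schema.id, schema);
    }
  }

  // Synchronous lookup: no network round trip while messages pile up.
  get(id) {
    const schema = this.byId.get(id);
    if (!schema) {
      throw new Error(`Unknown schema id ${id}`);
    }
    return schema;
  }
}

// Stand-in for listing every schema in the registry (assumption for the demo).
async function fakeListSchemas() {
  return [
    { id: 1, name: "User" },
    { id: 2, name: "Order" },
  ];
}
```

The trade-off is visible in `get`: lookups never wait on the network during a burst, but a schema that was not present at startup is simply unknown.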

@ricardohbin
Collaborator Author

Yes, this fetch strategy is very useful in these cases. After a few features (like fetchRefreshRate), it only breaks when a schema is created and a message is sent before the app restarts or the refresh happens.
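The remaining failure window can be illustrated with a small sketch of a periodically refreshed cache. Everything here is an assumption made for illustration: `RefreshingSchemas`, `refreshIntervalMs`, and the in-memory `fakeRegistry` are hypothetical and do not reflect how kafka-avro implements fetchRefreshRate.

```javascript
// Hypothetical sketch; `refreshIntervalMs` and all names here are assumptions,
// not kafka-avro's actual options or internals.
class RefreshingSchemas {
  // listSchemas: async () => [{ id, ... }]. Injected stand-in for registry calls.
  constructor(listSchemas) {
    this.listSchemas = listSchemas;
    this.byId = new Map();
    this.timer = null;
  }

  // Replace the whole cache with the registry's current contents.
  async refresh() {
    const schemas = await this.listSchemas();
    this.byId = new Map(schemas.map((schema) => [schema.id, schema]));
  }

  async start(refreshIntervalMs) {
    await this.refresh(); // initial pre-cache before consuming
    this.timer = setInterval(() => {
      this.refresh().catch((err) => console.error("schema refresh failed", err));
    }, refreshIntervalMs);
    this.timer.unref(); // do not keep the process alive just for refreshes
  }

  stop() {
    clearInterval(this.timer);
  }

  has(id) {
    return this.byId.has(id);
  }
}

// In-memory stand-in for the schema registry (assumption for the demo).
const fakeRegistry = [{ id: 1, name: "User" }];
async function listFakeRegistry() {
  return [...fakeRegistry];
}
```

If a schema is added to `fakeRegistry` after `start` has run, `has` returns false for its id until the next `refresh` completes, which is the window in which a newly produced message cannot be decoded.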

I just opened this issue to get insights from other users too (and to let them know about some features that this lib doesn't have). Rewriting all the code is a noble goal, but it has been working for years. The downside is that the code could be improved a lot, and all of this demands a LOT of time.

@ricardohbin changed the title from "Kafka-avro 2.0" to "Kafka-avro refactoring" on Sep 9, 2019