kafka sink #648
Conversation
Can one of the admins verify this patch? |
We found a Contributor License Agreement for you (the sender of this pull request) and all commit authors, but as best as we can tell these commits were authored by someone else. If that's the case, please add them to this pull request and have them confirm that they're okay with these commits being contributed to Google. If we're mistaken and you did author these commits, just reply here to confirm. |
ok to test |
CLAs look good, thanks! |
Can one of the admins verify this patch? |
ok to test |
Thanks, @vishh , would you please kindly review this PR? : ) |
@huangyuqi if this PR is ready on your end, please remove WIP from the title |
To use the kafka sink add the following flag:

```
--sink=kafka:<KAFKA_SERVER_URL>[?<OPTIONS>]
```
Please indent via four spaces if it's just one line.
@mvdan , thanks so much for your comments. The lines added for the Kafka sink were copied from the sinks above to keep their format consistent. Since these are more than one line, I will keep them as they are. : )
Note that the format was changed weeks ago, the other sink configuration lines are indented with four spaces.
got it, thanks so much. : )
You applied some of the fixes by amending your original commit, i.e. rebasing the history. Please don't do that until the PR is ready, so that the history of changes is kept. |
	return nil, fmt.Errorf("failed to parse url's query string: %s", err)
}

if len(opts["timeseriestopic"]) < 1 {
This is really nitpicky, but usually people test if strings are empty like `if s == ""`
@mvdan , thanks for your attentive comments. Here, the value of an option in the query string is a slice (a key can appear more than once), so I test it with `len()`. Do you agree with me? ^_^
Thanks for all the fixes! The code looks a lot better now. Someone with a better understanding of kafka can probably speak as to the quality of the implementation. |
Thanks @mvdan for your meticulous comments. |
Note that four comments are still standing - they will be hidden once fixed. |
@mvdan , thanks for your attentive comments. In the four remaining comments:
|
if err != nil {
	return fmt.Errorf("failed to transform the items to json : %s", err)
}
message := &proto.Message{Value: []byte(string(jsonItems))}
You are exposing an internal data structure TimeSeries over the wire here. If we update the data structure in the future, this will break all clients of this sink. That is not desirable.
What is the minimum amount of data that you need? Are there any standard formats for exposing metrics and events kind of data through kafka?
Thanks for your comments, @vishh .
I need all the ops information, including metrics and events.
Exposing an internal data structure TimeSeries over the wire is not well advised. I will pick out the relevant information about metrics and events to give to the kafka sink, including: timestamp and values of metrics and events.
SGTM. You can create a versioned structure in this sink.
@vishh , Sorry for the delay. I have added two internal data structures for the kafka sink. Would you please check again? Thanks ^_^
Sorry for the delay @huangyuqi! Thanks for the PR! I have posted a couple of comments. Other than that, this PR LGTM. Let's get this in quickly. |
@huangyuqi: Is this PR ready for review? Have you addressed the pending comments? |
@vishh : this PR is ready, I have fixed all the comments. : ) |
@vishh , Would you please review the PR of kafka sink for some modification about data structure and kafka's client? : ) |
The part about sink setup isn't clear to me yet. Once you clarify the behavior there, this PR is good to go.
SGTM! Just curious - does ES support retention policies? |
Thanks, @vishh :

```go
type BrokerConf struct {
	// Kafka client ID.
	ClientID string

	// LeaderRetryLimit limits the number of connection attempts to a single
	// node before failing. Use LeaderRetryWait to control the wait time
	// between retries.
	// Defaults to 10.
	LeaderRetryLimit int

	// LeaderRetryWait sets a limit to the waiting time when trying to connect
	// to a single node after failure.
	// Defaults to 500ms.
	// Timeout on a connection is controlled by the DialTimeout setting.
	LeaderRetryWait time.Duration

	// AllowTopicCreation enables a last-ditch "send produce request" which
	// happens if we do not know about a topic. This enables topic creation
	// if your Kafka cluster is configured to allow it.
	// Defaults to False.
	AllowTopicCreation bool

	// Any new connection dial timeout.
	// Default is 10 seconds.
	DialTimeout time.Duration

	// DialRetryLimit limits the number of connection attempts to every node in
	// cluster before failing. Use DialRetryWait to control the wait time
	// between retries.
	// Defaults to 10.
	DialRetryLimit int

	// DialRetryWait sets a limit to the waiting time when trying to establish
	// broker connection to single node to fetch cluster metadata.
	// Defaults to 500ms.
	DialRetryWait time.Duration

	...
}
```

In this sink, I set the count of all retries to 1 and the waiting time to zero, meaning that if the client cannot connect to the broker, it will return immediately.
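The fail-fast settings described above can be sketched as follows. The `BrokerConf` struct is redefined locally (mirroring only the fields quoted above) so the sketch is self-contained; `newSinkBrokerConf` and its values are illustrative, not the PR's actual code:

```go
package main

import (
	"fmt"
	"time"
)

// BrokerConf mirrors the relevant fields of the kafka client's
// configuration quoted above, redefined locally for illustration.
type BrokerConf struct {
	ClientID           string
	LeaderRetryLimit   int
	LeaderRetryWait    time.Duration
	AllowTopicCreation bool
	DialTimeout        time.Duration
	DialRetryLimit     int
	DialRetryWait      time.Duration
}

// newSinkBrokerConf applies the fail-fast policy: retry limits of 1 and
// zero wait, so an unreachable broker makes the sink return immediately
// instead of blocking the metrics pipeline.
func newSinkBrokerConf(clientID string) BrokerConf {
	return BrokerConf{
		ClientID:         clientID,
		LeaderRetryLimit: 1,
		LeaderRetryWait:  0,
		DialRetryLimit:   1,
		DialRetryWait:    0,
		DialTimeout:      10 * time.Second, // keep the library default
	}
}

func main() {
	conf := newSinkBrokerConf("heapster-kafka-sink")
	fmt.Printf("retries=%d wait=%s\n", conf.LeaderRetryLimit, conf.LeaderRetryWait)
}
```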
Elasticsearch |
@huangyuqi I really would be interested to see a PR for an ES sink. Is this for events only, or also metrics? Would this fix #227? |
Hi, @rvrignaud , Elasticsearch is a distributed full-text search engine; the ES sink is able to save metrics and events, and we can use the REST API to search. |
This is the error that I was suggesting to avoid. I will post a patch against this PR to fix this. |
@huangyuqi: I opened #683 based on this PR. I tested that against a local kafka cluster and it seems to be resilient against cluster outages. If you are OK with that, we can merge that in. We need a simple integration test for this client. |
I'm closing this PR. |
@vishh , I think something is wrong with the sink's format: ^_^ your format has two issues, let me explain : )
So, the kafka sink's url needs quotation marks to avoid confusion. Maybe we can modify the use of the symbol |
The Kafka sink gets the broker list from the url's query string, so the kafka sink's url needs to follow the URL format. So, I use |
Enclosing the entire sink config inside |
|
@huangyuqi Hi. Would you mind opening a PR, or at least sending a gist, for the Elasticsearch sink you mentioned above? It would be very useful to us to get that feature! The issue is described here: #227. Thanks! |
@rvrignaud , I am very sorry for the delay. I am working on this and will open a PR in the next two days; I have finished most of the sink code. ^_^ |
Hi, @rvrignaud , so sorry for the delay. I have finished the ES sink; would you please help me review it? BTW, hi @vishh , I found that I cannot test any sink in standalone mode. |
@huangyuqi : Hi. Thanks a lot for the PR. Unfortunately I'm not a Golang developer but I'll try to find some time to build and test it. |
@huangyuqi: Why are you not able to test sink in standalone mode? |
Apache Kafka is publish-subscribe messaging rethought as a distributed commit log.
The Kafka sink transforms the monitoring metrics and events into kafka messages with two topics: timeseriestopic and eventstopic.
To use the kafka sink add the following flag:
Normally, a kafka deployment has multiple brokers, so the broker list needs to be configured for the producer. We can set `KAFKA_SERVER_URL` to a dummy value and provide the kafka broker list in the url's query string. Besides, the following options need to be set in the query string:

- `timeseriestopic` - Kafka's topic for timeseries
- `eventstopic` - Kafka's topic for events

Like this: