-
Notifications
You must be signed in to change notification settings - Fork 1.4k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Integration tests migration to pytest framework #1293
Conversation
a212fc6
to
f6ac46c
Compare
@dpkp Is ipv6 enabled on these build VMs? I'm getting the following error in the travis build:
I haven't changed anything related to the Kafka broker bind address, so I'm wondering why of these errors... All my local tests have passed and bound to ipv6 addresses without any problems. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks for putting a lot of hard work into this.
I didn't have time to review everything, but did leave some comments.
It's honestly a bit scary/intimidating to review a massive PR like this, especially when it's changing tests that we rely on for correctness.
What do you think about breaking this apart into multiple PRs? I can think of several logical chunks that would be easier to review individually. Perhaps adding pytest fixtures in one. Then migrating each group of tests to the pytest fixtures as separate PRs. And finally a PR to remove the unittest leftovers. Doing this would both make it conceptually easier to review, and make it easier for you to go through and explain the reasoning behind some of the changes.
Lastly, I saw a bunch of files where the contents were completely removed, and then a bunch more files where the contents show up again. But git didn't show it as a file rename, so I assume you put different tests in different files. What were you trying to accomplish with this re-organization?
Again, thank you for all the hard work, it is appreciated. Just want to be cautious on pulling this in to make sure no accidental oversights.
kafka/protocol/types.py
Outdated
@@ -100,7 +100,7 @@ def decode(cls, data): | |||
|
|||
@classmethod | |||
def repr(cls, value): | |||
return repr(value[:100] + b'...' if len(value) > 100 else b'') | |||
return repr('<None>' if value is None else value[:100] + b'...' if len(value) > 100 else b'') |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
You don't need to cast None
to a string, repr()
will do it automatically and it'll be less confusing (at least to me) to see the naturally expected string representation of None
. Also, if you prepend the None check to the existing length check, then that will get evaluated first, preventing the len()
from throwing the exception.
I also think there's a bug here and the final b''
should actually be value
as the whole point is to print the entire thing if it's less than 100 characters. But IIRC @tvoinarovskyi wrote this and he normally knows what he's doing, so perhaps there's a reason for this?
So all that put together probably looks like:
value[:100] + b'...' if value is not None and len(value) > 100 else value
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Agreed on all counts. I'll wait for @tvoinarovskyi response for the final change in regards to the b''
case.
kafka/client_async.py
Outdated
@@ -526,6 +526,8 @@ def poll(self, timeout_ms=None, future=None, delayed_tasks=True): | |||
timeout_ms = 100 | |||
elif timeout_ms is None: | |||
timeout_ms = self.config['request_timeout_ms'] | |||
elif not isinstance(timeout_ms, int) and not isinstance(timeout_ms, float): |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
isinstance()
supports checking against multiple types in one call:
isinstance(timeout_ms, (int, float))
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Nice. I'll change it.
kafka/client_async.py
Outdated
@@ -526,6 +526,8 @@ def poll(self, timeout_ms=None, future=None, delayed_tasks=True): | |||
timeout_ms = 100 | |||
elif timeout_ms is None: | |||
timeout_ms = self.config['request_timeout_ms'] | |||
elif not isinstance(timeout_ms, int) and not isinstance(timeout_ms, float): | |||
raise RuntimeError('Invalid type for timeout: %s' % (type(timeout_ms),)) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The argument to %
doesn't have to be a tuple if it's just a single argument:
>>> 'Invalid type for timeout: %s' % (type(5),)
"Invalid type for timeout: <type 'int'>"
>>> 'Invalid type for timeout: %s' % type(5)
"Invalid type for timeout: <type 'int'>"
I'd argue that in this case the latter form is more readable.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yes, I know. But I usually prefer to make it explicit as a tuple to avoid ambiguity with the case where the argument is a tuple. It's not the case here, since we know that type(5)
will never be a tuple. But I usually try to use the same style for consistency.
I'm happy to change this since the style for this project is the latter form.
@@ -19,6 +19,6 @@ log4j.appender.stdout=org.apache.log4j.ConsoleAppender | |||
log4j.appender.stdout.layout=org.apache.log4j.PatternLayout | |||
log4j.appender.stdout.layout.ConversionPattern=[%d] %p %m (%c)%n | |||
|
|||
log4j.logger.kafka=DEBUG, stdout | |||
log4j.logger.kafka=INFO, stdout |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Why?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
My bad. This was just for some tests and slipped into the commit. I'll revert this.
test/fixtures.py
Outdated
|
||
def __del__(self): | ||
self.close() |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Why is it safe to remove this?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
My idea is that these should all be instantiated by pytest fixtures, and I'm handling the destruction in the fixture methods, as explained above.
test/test_client_async.py
Outdated
@@ -7,14 +7,14 @@ | |||
# vendored backport module | |||
import kafka.vendor.selectors34 as selectors | |||
|
|||
import pytest |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
this should not be moved per pep8, pytest
should be in 3p library imports separate from stdlib imports
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Noted. I'll change
test/test_client_async.py
Outdated
while not future.is_done: | ||
time.sleep(sleep_secs) | ||
client.poll(timeout_ms=100, delayed_tasks=True) | ||
assert time.time() < start_time + tolerance_secs |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Can we add parens to make the order of operations more obvious?
time.time() < (start_time + tolerance_secs)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Sure
test/test_client_async.py
Outdated
future = client.schedule(task, time.time() + 1) | ||
client.unschedule(task) | ||
assert future.is_done | ||
assert future.value == None |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
should be is None
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
👍
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
btw, you can directly put the thumbsup on the comment (via the smiley face/emoticon dropdown) rather than leaving a separate comment... reduces spam for watchers of the project when all you want to do is say "i agree/support this"
test/test_integration_consumer.py
Outdated
import six | ||
import time | ||
|
||
from six.moves import xrange |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
per pep8, the above imports should be split into a section for stdlib and a section for 3p imports
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Will change... And will re-read PEP8!
@jeffwidman Thanks for the review. |
@jeffwidman Would you know anything about the "protocol not available" errors? I see that other builds in Travis were also affected by the same issue. Even documentation changes fail with that error, so I assume it's an infrastructure problem. |
f6ac46c
to
ca4c754
Compare
@jeffwidman I've reduced the scope of this PR and per your suggestion and will build up from here after this one is approved and merged. The CI builds, though, keep failing, but is again due to those "protocol not available" errors. I tested the changes on my test box for all the Python and Kafka versions and it works well. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks for slimming this down, it is much more approachable.
Did a first-pass, most of it looks straightforward. Had a few questions where I wasn't sure if I'm missing something... once you clarify, then I'll take another pass.
Thank you again for working on this, it is much appreciated!
@@ -1,5 +1,6 @@ | |||
[TYPECHECK] | |||
ignored-classes=SyncManager,_socketobject | |||
generated-members=py.* |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
What does this do?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The py
module seems to have some dynamically generated members and pylint
complains about not finding the ones used by the module:
Module 'py' has no 'path' member (no-member)
This is the same issue reported here. Adding this option to pylint just silences this error.
This will be temporary anyway, as I explain below.
test/fixtures.py
Outdated
log.info(" zk_chroot = %s", self.zk_chroot) | ||
log.info(" replicas = %s", self.replicas) | ||
log.info(" partitions = %s", self.partitions) | ||
log.info(" tmp_dir = %s", self.tmp_dir.strpath) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I wonder whether it's better to have these as separate log entries or as a single log entry that just has newlines? I'm not actually sure.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I personally prefer the way it is. It makes the code more readable, IMO.
test/fixtures.py
Outdated
|
||
def open(self): | ||
if self.running: | ||
self.out("Instance already running") |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Is there a reason to use stdout
here rather than log.info()
?
I noticed throughout this code that some messages are going to the logs and others going to stdout which seems confusing to me, but perhaps there's a reason or unix convention for this... @dpkp your input?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I noticed that as well, but I don't know why. I just kept it the way it had been.
Happy to change it, if needed.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@dpkp can you comment here?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
nudge @dpkp
test/fixtures.py
Outdated
@@ -71,28 +90,29 @@ def test_resource(cls, filename): | |||
@classmethod | |||
def kafka_run_class_args(cls, *args): | |||
result = [os.path.join(cls.kafka_root, 'bin', 'kafka-run-class.sh')] | |||
result.extend(args) | |||
result.extend([str(x) for x in args]) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
at first glance, this change makes sense to me, but I'm curious why not casting to strings wasn't causing problems before?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
In the changes that I'll be submitting next I changed the create_topic
method to be more "robust" and don't rely on the SimpleClient API or on auto_create_topic=true
to be able to create a topic.
The method tries to use the API, if possible (Kafka > 0.10.1), to create a topic but, if not, falls back to using kafka.admin.TopicCommand
to ensure the topic is created successfully.
This command takes numeric command line arguments for the --partitions
and --replication-factor
options and this is what cause me to hit an exception at this point of the code.
I could've just passed them as strings to kafka_run_class_args
instead, but I found this was more elegant and could help avoid further issues later.
import time | ||
import uuid | ||
|
||
from six.moves import urllib | ||
import py |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
What benefit do we get from this new dependency py
?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I had a look at replacing tempdir
with the pytest's tmpdir
fixture as you suggested and I liked the idea.
The pytest's tmpdir
fixture is an instance of the py.path.local
class. Once I finish converting all the test cases to pytest
, the zookeeper
and kafka
fixtures will automatically get a tmpdir
fixture from pytest
and will pass it to the ZookeeperFixture
and KafkaFixture
constructors, respectively.
Before the conversion is complete, though, the constructors will still not be passed tmpdir
as a parameter and will have to instantiate it internally.
Hence, this import is only needed temporarily to allow the unittest
code to continue working while we're going through this conversion process. Once everything is converted to pytest
, we can remove this since the tmpdir
will always be provided by the pytest
fixtures.
import time | ||
import uuid | ||
|
||
from six.moves import urllib | ||
import py | ||
from six.moves import urllib, xrange |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Personally, I prefer using the python3 syntax and then let six
fill in the python 2 equivalent. So use range
rather than xrange
here--they are functionally the same.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Looks like this got missed? GitHub is hiding the comment, but we'd discussed using six.range
rather than six.xrange
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yep, I missed. Will take care in the upcoming PRs
test/conftest.py
Outdated
ret = decorate(func, _set_params) | ||
return ret | ||
|
||
return real_set_params |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
So I agree with the design goal, but can't you skip the setters in favor of pytest
's built-in @pytest.mark.parametrize()
decorator?
It provides the flexibility to both dynamically assign the fixture params for a test, and also to run a single test multiple times using different input permutations.
Am I missing some additional functionality that the custom setters provide?
Also, I'm not sure what the protocol error is about... @dpkp is this a known issue? |
tox.ini
Outdated
@@ -20,6 +20,9 @@ deps = | |||
lz4 | |||
xxhash | |||
py26: unittest2 | |||
decorator | |||
# sugar plugin depends on specifc versions of py | |||
py<1.5,>=1.4.33 |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I don't understand the issue here. But if there is a bug in sugar wrt its dependency management, we should pin sugar not py. When I test locally, this causes a failure:
"Plugin %r could not be loaded: %s!" % (ep.name, e))
pluggy.PluginValidationError: Plugin 'sugar' could not be loaded: (py 1.4.34 (/Users/dpowers/Code/kafka-python/.tox/py27/lib/python2.7/site-packages), Requirement.parse('py>=1.5.0'), set(['pytest']))!
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@asdaraujo can you also address this point?
Overall it looks good to me. Tests pass locally after fixing merge conflicts and removing the tox.ini py version pin. |
@asdaraujo This looks nearly complete to me, and I'd like to see your hard work merged. Do you mind rebasing to fix merge conflicts? Also take a final spin through the comments--I think there are two comments that still need addressing (should be trivial changes). |
@jeffwidman Will do! Just got back from PTO and will get back to this asap. |
ca4c754
to
75fb69b
Compare
@jeffwidman I've addressed the comments and rebased the changes |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
LGTM. Assuming tests pass, I'll wait another day or two before merging just in case @tvoinarovskyi or @dpkp want to take a final pass...
Hi there, sorry for no input on this before, had my hands full. @asdaraujo could you explain the purpose of this PR, as far as I could tell there's no actual migration of test cases themselves to |
The problem I have with this is that it's probably not tested and will require changes later on anyway. For example, the |
@tvoinarovskyi this was because in #1293 (review) I requested he break his changes (which were a massive PR) into several smaller changesets. Your other point sounds valid, at least at first glance. Perhaps it'd be helpful to open all the PR's at once so if there are questions about whether the API is correct that we can look at the final usage. That'd also be useful so that if @asdaraujo is hit by a bus we have a copy of his code still available as all the PR's kinda go together... That said, I don't want to create a big headache for @asdaraujo though, as I really appreciate all the hard work he's doing here to migrate everything to |
@jeffwidman Got it, as this PR does not hinder test runs, we can merge it. If some changes will be needed after, we can do it in following PR's. |
@asdaraujo before I merge, what do you think of the suggestion to have a
|
Thanks, @dpkp. I like the I'm happy to open all the PRs once this is defined. And I'll still try to stay away from buses... |
@tvoinarovskyi @jeffwidman I've had a look into this and I think most of the work is already there and I can simplify things a bit. Looking for your thoughts here before I make any changes. The For the sake of simplicity, I'm thinking on the following changes:
With this, if the default consumer/producer is needed the test can simple use the related consumer/producer fixture. If any specific settings are required, the test can use the Thoughts? |
👍 This sounds like a good plan to me. Let's go ahead and do that and then get it merged so that you're unblocked for the rest of the code changes you are making. Thanks again for your patience and willingness to go through multiple revisions on this. |
Dismissing until refactored per discussion
@jeffwidman Sorry for the delay. This month is kinda crazy for me and I've been flat out. I made the changes along the lines we had discussed before, although I've had to change a few things from the original plan. But the overall idea is the same and the I was also concerned with the earlier comment from @tvoinarovskyi that things were being committed without testing. So after the changes I migrated two test methods to use So, as you'll see, the size of the PR has increased a little bit, but hopefully not overwhelmingly. |
07d54a3
to
726735a
Compare
Sounds good, I started to review but looks like you're still pushing updates... let me know when you're done and I'll give it a final review. |
@jeffwidman I just noticed a small issue that caused Python 3.x tests to fail. Just fixed it and tests are queued again. Not planning to change anything else, unless tests fail again. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Looks great, thanks again for all the hard work!
Travis is being flaky and timing out, I'm restarting and will merge as soon as tests go green.
import time | ||
import uuid | ||
|
||
from six.moves import urllib | ||
import py | ||
from six.moves import urllib, xrange |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Looks like this got missed? GitHub is hiding the comment, but we'd discussed using six.range
rather than six.xrange
import time | ||
import uuid | ||
|
||
from six.moves import xrange |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@jeffwidman Yes, I missed this. I'll take care of this in the upcoming PRs, if you don't mind.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
No worries, at all, I'm happy to clean it up myself after merging
So I'm afraid there are a couple of test failures this time around that look like legit race conditions... I'm not sure if they're problems with the test harness (which I don't mind ignoring other than raising an issue to fix in later PRs) or race condition in You're now more familiar with the test harness than I am... Any idea? |
I'll look into those errors |
@jeffwidman I believe I've found the cause of the racing condition. The client's If the client sends a message to the topic in that state, the broker will throw internally a I've patched the
The offset request will throw the same error as the produce request if the log(s) creation is completed. In my tests I noticed that I sporadically got a few exceptions in the first loop iterations, which stopped usually in less than 1 second, after the brokers finished the creation of the topic' logs. I'm still running tests, but this seems to fix the issues. I think it might be a good idea to move the check above to the I couldn't find anything in this PR that would've had introduced this though. This issues seems to have existed before. I'm still not sure why it started manifesting now. |
dd073d8
to
f091a9a
Compare
If a future was passed as the only positional parameter it would be assigned to the "timeout_ms" parameter erroneously. This mistake would not raise any exception but would lead to odd behaviour later, what could make it extremely difficult to troubleshoot. Adding a type check ensures that an exception is raise earlier to notify the user about the problem.
This commits adds new pytest fixtures in prepation for the migration of unittest.TestCases to pytest test cases. The handling of temporary dir creation was also changed so that we can use the pytest tmpdir fixture after the migration.
24e7172
to
406aa30
Compare
406aa30
to
c65c400
Compare
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This looks good to me. Thanks for all your hard work on this.
I expect we will find some things that could be tweaked, but we can deal with those in followup PRs... I'm going to land this so that we get the framework in and new features can start using these new pytest fixtures.
@asdaraujo when you have time, please put up the next PR here so we can keep pushing this migration across the finish line. I'll try to be faster next time to turn this around. 😃 |
Damn, too fast =)
As for |
The I'm not sure about the |
@tvoinarovskyi @jeffwidman Yes, the As for @jeffwidman I've started working on the new PRs but I have a big personal deadline at the end of next week, so I'll probably be quiet here until early March. |
Sounds good @asdaraujo thanks for the update. We all very much appreciate the help on this, will make everyone's lives much smoother to have everything on |
Let's please also make sure that repl fixtures continue to work without requiring a virtualenv. I very frequently run this in a shell for local testing / debugging / etc:
The above now requires that I have |
Makes sense, I didn't even think about that. Thankfully, as noted in #1293 (comment), this should be temporary:
|
@asdaraujo any idea when you'll be able to pick this up again? Be nice to get the rest of the tests migrated and the GSSAPI stuff under test coverage... |
Bump @asdaraujo any chance you'll be able to finish migrating the legacy tests? I'm hoping to move toward a big 2.0 release with both #1540 and #1196, but for #1196 we need the legacy tests migrated... You understand this testing framework better than anyone else right now, so if you have any spare cycles to finish this migration we'd certainly appreciate it. |
In preparation for the changes I'll need to make to the integration tests to cover the GSSAPI authentication changes in PR #1283, I had to make a few changes to the existing test cases, which were written using unittest.TestCase.
After some discussions with @dpkp and @jeffwidman I decided to first move those test cases to pytest, which is the preferred test framework for the project.