-
Notifications
You must be signed in to change notification settings - Fork 3k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Rethink STM32 I2C v2 HAL #15350
Rethink STM32 I2C v2 HAL #15350
Conversation
@@ -77,39 +77,6 @@ struct serial_s { | |||
#endif | |||
}; | |||
|
|||
struct i2c_s { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It's a bit silly that this structure is copied verbatim into like 10 different objects.h headers, despite all of them being identical and being used by a single version of i2c_api.c. Since I had to modify the structure, I consolidated these into a single version in stm_i2c_api.h
while I was at it.
@multiplemonomials, thank you for your changes. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
There's list of changes in the commit message. Was this pull request squashed? It's not easy to review so much changes within i2c implementation, should this be split into smaller logical changes (my 2 below comments are related to the same remark).
@jeromecoutant Would you find time to review ? |
To be honest, for a complete review, I think not... |
To answer your question, yes this was squashed together from a bunch of commits, many of which were bugfixes. If you want, I can try to break a couple of the standalone fixes (e.g. adding docs, adding return codes) into separate MRs or separate commits. However, I'll warn you, at least 80% of the total changes need to be made at once and reviewed together, or they won't work and they won't make sense. I definitely understand that this type of change is hard to review -- the STM I2C register level operation is complex and not very intuitive in how it maps to Mbed's I2C API. I can count at least 10 different times when my test broke and it turns out the hardware was working in a way completely different from how I'd expected. But please believe me that I've tested it extensively -- I'm pretty confident that if there were issues in the code, my hardware test cases would catch them (even if the FPGA board tests don't, because they only do one transaction before deleting the I2C object). And even if there are some small issues, please believe me that it's an order of magnitude less buggy than it was before. I really don't want to let the perfect be the enemy of the good here, especially when people are constantly running into these I2C bugs. I can personally say that, in ~4 years of Mbed OS development, I have seen developers run into this bug on 4 separate occasions, each time causing major havoc (and one time requiring me to drive all the way to a lab at UC San Diego). I really would like to put it to bed. |
understood, so lets just split what can be (my 2 remarks above, possibly moving struct i2c_s into the header can also be separate logical unit, the rest within i2c implementation can stay to address the bugs there) - separate commits, no need to create another PR. |
a567b81
to
a850734
Compare
Pull request has been modified.
Can you check the styling check failing (just one line). I'll need to fix license check, it looks like something broke in the scancode dependencies |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks for adding 2 commits, it looks good to me (just styling fixes required).
} else { | ||
|
||
if(obj_s->state == STM_I2C_SB_READ_IN_PROGRESS) | ||
{ |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
please keep the styling consistent, it uses {
attached to the condition, and space after if
, see line 1143, please fix all instances in this file
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Is there a command I can run to autoformat everything?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Fixed, dang notepad++...
Regarding the license check, feel free to use my patch mbed-ce#104 |
@0xc0170 I maintain scancode and this is now fixed. Sorry for the churn! |
a850734
to
0c2ee4f
Compare
Jenkins CI Test : ❌ FAILEDBuild Number: 1 | 🔒 Jenkins CI Job | 🌐 Logs & ArtifactsCLICK for Detailed Summary
|
Hmm, do you have any idea why the CI failed? I don't really know how to read the logs |
This are the errors there for DISCO_L475VG_IOT01A board. |
OK, looks like this was a mistake in my original Mbed CE MR. Fixed it there and here. |
CI started |
Jenkins CI Test : ✔️ SUCCESSBuild Number: 2 | 🔒 Jenkins CI Job | 🌐 Logs & ArtifactsCLICK for Detailed Summary
|
Summary of changes
Hi all. As you may have heard, I have been doing a lot of work on my mbed-ce fork to better test and fix up the I2C HAL for STM32 devices. I think that several serious bugs have been fixed, so I'd like to upstream these changes if possible.
In summary, I created a new test program that runs lots of different I2C scenarios against real hardware. As-is, the STM32 I2C v2 HAL failed many of these test cases, and all of the ones that involved single-byte functions (e.g
I2C::write(int)
). These functions are problematic because they are so low-level that the STM32 HAL cannot be used, and chip registers must be accessed directly to implement them. There were several places where the I2C peripheral could get into a bad state, freezing it and preventing subsequent transactions from going through.To fix the issue, I rethought how a lot of the low-level I2C code works, implementing a new
state
enum to track just what the hardware is doing at any one time. After several weeks of nights on the logic analyzer working through test failures, the code is finally able to handle anything I can throw at it, and is much more compliant with how other manufacturers' HALs work.Impact of changes
This PR is a major rethink of the STM32 I2C code for their v2 I2C peripheral. It tries to get it into a more API-compliant and less buggy state. It should fix several known bugs:
Additionally, I found more issues while making this PR that will also be fixed:
The biggest change this PR makes is the addition of a new state enum, obj_s->state. This provides a simple indicator of what the I2C code is doing at any given time, making certain parts of the logic a lot simpler and enabling some of the fixes to work. I also added a lot of new checks based on this state variable, so if you use the I2C object in an incorrect way, you have a fighting chance of getting a useful error code back instead of undefined behavior. (it's still tough though because certain functions do not have the ability to return an error code...)
Also, up until now, the single-byte API would return after enqueuing the byte into hardware, without waiting for the byte to actually be sent. I think this was done to provide a performance increase, but means that the HAL does not comply with standard Mbed behavior because it cannot return a NACK for a byte until later on, after it's been sent. (and the performance increase didn't mean a whole lot anyway, because it still spinlocks for byte N before sending byte N+1). I have switched things around so it's now compliant and returns the correct result for the byte it was received on.
Migration actions required
Documentation
None, this just brings it into compliance with the existing API.
Pull request type
Test results
I ran my new comprehensive I2C test on both a Nucleo L452RE and a Nucleo H743ZI dev board, and all test cases passed. I also ran the EEPROM test case, and everything passed except the 4096 byte ("full chip") operations. This is due to another bug.
Reviewers
@0xc0170
@jeromecoutant