Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[stm32] Add FDCAN support for H7 series #1206

Open
wants to merge 6 commits into
base: develop
Choose a base branch
from

Conversation

WasabiFan
Copy link
Contributor

@WasabiFan WasabiFan commented Sep 2, 2024

This PR updates the stm32-fdcan driver to support the STM32H7 series.

Note that I don't have any prior knowledge of STM32's CAN IP so I am learning it as I go. I'd appreciate a fact-check on what I've done here.

High-level differences between the H7 FDCAN IP and prior families:

  • The message RAM size has been increased from 1.6K to 10K
  • The fixed message RAM layout has been replaced with a set of base address and count registers. SW configures the location and size of each message RAM region.
  • SW can configure the number of data bytes in each frame entry in the TX and RX structures. It now resets to 8 instead of 64.
  • There are some new features I haven't investigated (TTCAN, dedicated RX buffers, maybe others)
  • Misc registers have been re-arranged pursuant to the above.
  • The interrupt line select bits have improved granularity.

This PR does the following:

  • Update existing config register writes for the fields' new locations
  • Add initialization logic to configure the message RAM layout and message size
  • Add a new configuration struct and associated lbuild options to select the size of the message RAM structures
    • This replaces the hard-coded message ram layout computation.
    • For H7, these values are configurable and I have given them slightly higher defaults to take advantage of the larger message RAM. For other families, I have left these set to the fixed values to match the old behavior.
  • Updates the fdcan test so it can fill the new buffer space
  • Adds an example to get CI compilation coverage

Validation:

  • On an STM32H753 I have manually confirmed basic non-FD CAN TX and RX on both FDCAN instances.
    • The fdcan test passes with a caveat I'll note in a comment.
    • Note that the board I'm developing on is not one currently supported by modm (I may open a separate PR for that). The Nucleo example I have included here is untested.
  • I'm marking this PR as a draft for now as I need to do some more testing of filters and CAN-FD.
  • I've compiled the examples for all boards.

Pending TODOs:

  • Fix CAN test error on STM32H753 (last message dropped)
  • Cherry-pick chris-durand@ddf3d17
  • Disable HW prioritized queue mode
  • Validate CAN-FD features on STM32
  • Validate all functionality on SAM to confirm no regressions
  • Validate all functionality on a non-H7 STM32 (e.g. STM32G4) to confirm no regressions

I would appreciate if someone has a Nucleo STM32H7 board and/or a non-H7 board to double check my testing and confirm I haven't broken the other chip families. I have only tested on H7.

@WasabiFan
Copy link
Contributor Author

The H7 reference manual (and even the older G4 manual) have what I believe to be oversights/errors which make me a bit unsure about the data size configuration and message RAM layout. For example, the H7 manual says this about TX queue entries:

A Tx queue buffer allocates four 32-bit words in the message RAM. Therefore the start
address of the next available (free) Tx queue buffer is calculated by adding four times the Tx
queue put index FDCAN_TXFQS.TFQPI (0 … 31) to the Tx buffer start address
FDCAN_TXBC.TBSA.

I believe this to be wrong. If TXESC.TBDS is set such that CAN messages are 8 bytes, it is true. But if it is set to any other value, there needs to be space for those extra bytes. Emperically it does seem that setting TBDS to a higher value grows the queue entries accordingly.

For TX event entries, which are structured similarly, it does say what I believe to be correct:

The Tx event FIFO can be configured to a maximum of 32 elements. The Tx event FIFO
element is described in Tx FIFO. Depending on the configuration of the element size
(FDCAN_TXESC), between two and sixteen 32-bit words (Tn = 3 ..17) are used for storage
of a CAN message data field.

@@ -0,0 +1,82 @@
/*
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This example has been compiled but not run. I would appreciate a test run on one of the standard nucleo/discovery boards by someone who owns one.

src/modm/platform/can/stm32-fdcan/can.cpp.in Show resolved Hide resolved
src/modm/platform/can/stm32-fdcan/message_ram.hpp Outdated Show resolved Hide resolved
@@ -62,6 +107,8 @@ class Instance(Module):
properties["shared_irq_it0"] = False
properties["shared_irq_it1"] = False

properties["message_ram_length_words"] = 2560 if device.identifier.family == "h7" else 1696
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I didn't see a sensible place in the Cube database to pull this number from, but I may not have been looking in the right place.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Is the size of 1696 words really correct here? According to the G4 reference manual it's 212 words (0.8 kB) per FDCAN instance.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Good catch. I have no idea where I got this number.

@salkinium salkinium added advanced 🤯 feature 🚧 ci:hal Triggers the exhaustive HAL compile CI jobs labels Sep 6, 2024
@salkinium salkinium added this to the 2024q3 milestone Sep 6, 2024
@chris-durand
Copy link
Member

@salkinium I'll take a look at this in detail soon. I'm currently traveling within Asia and didn't have access to a computer for a week.

I also need the H7 CAN support at work and will have to ship it in a product. In a first attempt to implement this I've reused the message RAM definitions from the ATSAM MCAN driver which is same hardware IP. You can find the draft code here. It's tested in hardware on a custom H723ZG board.

The ATSAM driver message RAM class also takes a config non-type template parameter, similar to what was done in this PR. I'm wondering if it could be worth unifying both that we don't have maintain two implementations of the same thing. @WasabiFan Could you check whether that makes sense and looks feasible?

@WasabiFan
Copy link
Contributor Author

Thanks for taking a look.

I hadn't realized the SAM CAN IP was identical. I had assumed there was just some shared lineage. I'll take a closer look after work.

At first glance the message RAM looks to be the same and we ought to be able to share it. In fact, the config registers seem identical too, so we could consider sharing the whole driver. It would require some templating for the register defines and the interrupts would still be special-cased.

The main difference I see so far is message RAM addressing. The STM32H7 as I understand it uses a dedicated SRAM for the message RAM which is shared between the N FDCAN IP instances. Addresses loaded into the region start address fields are offsets into the 10k SRAM. Meanwhile, the SAM datasheet says "Multiple MCANs May Share the Same Message RAM" which seems to mean it addresses shared system memory directly. The modm driver for SAM MCAN allocates a local buffer in the CAN driver class and points the CAN IP to that. The MCAN driver does not need to apply an explicit allocation policy for the shared RAM. My PR for H7 currently allows the user to configure their message RAM region sizes and then computes offsets for each FDCAN instance so they sit contiguously in the SRAM. The MCAN message RAM class would need a small amount of rework to support this.

@chris-durand is the request to share the MessageRam class or the whole MCAN/FDCAN driver?

@WasabiFan
Copy link
Contributor Author

I've updated the PR with an implementation which shares the MessageRam class between SAM and STM32 FDCAN. There's a test failure I'll look into later; TODOs now posted in the PR description. I'd appreciate a review of this to get some feedback on the overall change in the meantime.

Note that the SAM CAN driver takes configuration from the user via a template parameter, while the STM32 implementation takes it from lbuild parameters. This is because the STM32 FDCAN instances share a hardware RAM. Each must know how much space the prior instance(s) take up.

I'm not fond of the size parameters in the message RAM config struct storing total element size (including header) rather than just size of the message body. The user shouldn't see or care how large the header is. However, the existing SAM driver does this and I am worried about introducing a subtle foot-gun if I redefine it now for users already configuring the SAM FDCAN driver by hand. So I've left it as-is.

Copy link
Member

@chris-durand chris-durand left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thank you very much, very nice! I'll run it on hardware on Wednesday when I'm back home.

src/modm/board/nucleo_h723zg/board.hpp Outdated Show resolved Hide resolved
src/modm/board/nucleo_h743zi/board.hpp Outdated Show resolved Hide resolved
src/modm/platform/can/stm32-fdcan/can.hpp.in Outdated Show resolved Hide resolved
src/modm/platform/clock/stm32/module.lb Outdated Show resolved Hide resolved
@chris-durand
Copy link
Member

I'm not fond of the size parameters in the message RAM config struct storing total element size (including header) rather than just size of the message body. The user shouldn't see or care how large the header is. However, the existing SAM driver does this and I am worried about introducing a subtle foot-gun if I redefine it now for users already configuring the SAM FDCAN driver by hand. So I've left it as-is.

I agree but I wouldn't want to fix it in a way that silently breaks user code. It could theoretically be done by introducing a new struct and deprecating the old, but I don't think it's worth the effort.

@WasabiFan WasabiFan force-pushed the h7-fdcan branch 2 times, most recently from 4cf6d9d to 6f7418a Compare September 17, 2024 06:24
@WasabiFan
Copy link
Contributor Author

I believe I've updated the PR with the review feedback. I also autosquashed all my fixup commits since I re-arranged some hunks between commits.

The ordering of the HW TX queue is now configurable.

To ensure I haven't regressed existing behavior, I'd appreciate a test in hardware on a) a non-H7 STM32 and b) an ATSAM board. c) one of the Nucleos whose board.hpp has been retrofitted (h723zg or h743zi) would be nice too if possible.

I still have some more validation which I will do on my H753 third-party board.

@chris-durand
Copy link
Member

To ensure I haven't regressed existing behavior, I'd appreciate a test in hardware on a) a non-H7 STM32 and b) an ATSAM board. c) one of the Nucleos whose board.hpp has been retrofitted (h723zg or h743zi) would be nice too if possible.

The unit test fails on both the G474RE and the H723ZG Nucleo board. I'll investigate tomorrow.

@WasabiFan
Copy link
Contributor Author

Sorry, I still haven't fixed that. I see the same on my H7 and haven't looked into it. The final message is being dropped. Those tests were passing before I swapped the message RAM implementation so I assume it's a legitimate bug.

WasabiFan and others added 4 commits September 18, 2024 22:32
The H7 series supports a user-configured message RAM layout, rather than
the static layout of past STM32 CAN IPs. The new message RAM is
identical in structure and configurability to the ATSAM FDCAN IP. This
commit merges the STM32 FDCAN and ATSAM FDCAN message RAM
implementations.

For H7 it also introduces lbuild parameters so the user can choose the amount
of space available for each RX queue, each TX queue, and filters. This is
independent for each FDCAN IP instance, although they share the same 10k RAM.
Queue Mode will send CAN frames in priority order according to their
arbitration ID. This is inconsistent with modm's software-managed queue
which is a strict FIFO. To make the behavior of the driver consistent
with an end-to-end FIFO order, we disable Queue Mode by default.

There is a new lbuild option, "tx_hw_queue_mode", which allows the user
to opt back into the old behavior if desired. This is mostly intended
for if they also set "buffer.tx" to zero and optionally increase
"message_ram.tx_fifo_elements".

Note that the fdcan unit test was implicitly assuming FIFO order. This
change also fixes that test for larger HW TX buffer sizes.
@WasabiFan
Copy link
Contributor Author

The test is fixed. All FDCAN tests are now passing on my STM32H753. It was a dumb off-by-one bug I introduced in the fdcan buffer test when I made it vary the number of messages based on available message RAM/queue space.

@WasabiFan
Copy link
Contributor Author

I did a sanity check of FD features (various larger frames up to 64b, higher data bit rate) with a physical loopback on the H753 and don't see any issues.

@WasabiFan WasabiFan marked this pull request as ready for review September 19, 2024 20:21
@chris-durand
Copy link
Member

chris-durand commented Sep 20, 2024

The test is fixed. All FDCAN tests are now passing on my STM32H753. It was a dumb off-by-one bug I introduced in the fdcan buffer test when I made it vary the number of messages based on available message RAM/queue space.

Nice!

There are still issues on the Nucleo boards. On the H7 the peripheral doesn't work at all clocked from 8 MHz HSE. I assume the frequency is just too low. When I switch the clock mux to Pll1Q the test passes. If we do that we'll have to increase the Pll1Q clock divider from 4 to 5 to not exceed the FDCAN maximum clock of 125 MHz. I'll check if it works on my custom board with a 24 MHz HSE crystal.

The G4 test fails on line 63, but it even does so without the changes from this PR. A message is received although it should be rejected. If I consume the message before the check all remaining tests pass. @WasabiFan Do you have any idea what could be wrong here?

FAIL: fdcan:63 : true == false
FAIL: fdcan:104 : 1801 == 18193
FAIL: fdcan:106 : 0 == 1
FAIL: fdcan:108 : 0 == 1
FAIL: fdcan:106 : 1 == 2
FAIL: fdcan:108 : 1 == 2
FAIL: fdcan:108 : 52 == 2
FAIL: fdcan:106 : 2 == 3
FAIL: fdcan:108 : 2 == 3
FAIL: fdcan:108 : 2 == 3
FAIL: fdcan:108 : 86 == 3
FAIL: fdcan:106 : 3 == 4
FAIL: fdcan:108 : 3 == 4
FAIL: fdcan:108 : 3 == 4
FAIL: fdcan:108 : 3 == 4
FAIL: fdcan:108 : 239 == 4
FAIL: fdcan:106 : 4 == 5
FAIL: fdcan:108 : 4 == 5
FAIL: fdcan:108 : 4 == 5
FAIL: fdcan:108 : 4 == 5
FAIL: fdcan:108 : 4 == 5
FAIL: fdcan:108 : 0 == 5

@chris-durand
Copy link
Member

On my custom board the 24 MHz HSE clock appears to work fine.

@WasabiFan
Copy link
Contributor Author

Re: the clocking, my H7 board has a 16MHz crystal and I confirmed FDCAN works both off that and off a PLL. Although, I'm only now realizing that I was overclocking it by a factor of 2x when running from the PLL because I misread the Fmax table. It's impressive that it still worked.

For those Nucleo boards, can we switch it to use PLL2Q instead?

Re: the G4 test, can you try commenting out the setStandardFilter so there are no standard acceptance criteria? I guess also try disabling the extended filter to confirm that isn't somehow affecting the standard ones. Accepting a message which shouldn't be accepted is a weird failure mode since it implies the acceptance filter is present and recognized but overly broad.

I took another look for differences between the G4 and H7 filter handling. I don't see any functional differences without explicit opt-in. I do notice is that the mask (ID2) layout is documented in a way that is incorrect or at least heavily confusing in the H7 version, but I interpret it to be poor documentation rather than a design difference. The SAM datasheet matches the G4's interpretation of these fields.

Screenshot 2024-09-21 011312

@WasabiFan
Copy link
Contributor Author

Reflecting on the message RAM size review comment posted above: there's a legitimate functional bug I need to fix. The way I implemented computation of each peripheral's message RAM base is wrong for the fixed layout (non-H7) when not all instance modules are included in lbuild. I'll need to make the address computation independent of the enabled peripherals.

@WasabiFan
Copy link
Contributor Author

Yesterday I fixed the message RAM allocation bug. That doesn't explain the G4 failure. It would be good to run the tests on a G4 with FDCAN0 disabled to confirm FDCAN1 still works. If you'd like me to take a closer look at the failure I can order a G4 Nucleo.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
advanced 🤯 ci:hal Triggers the exhaustive HAL compile CI jobs feature 🚧
Development

Successfully merging this pull request may close these issues.

3 participants