Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

STM32: fix SPI write with data >8 Bit #15115

Closed
wants to merge 5 commits into from

Conversation

JojoS62
Copy link
Contributor

@JojoS62 JojoS62 commented Sep 28, 2021

Summary of changes

the low level SPI write calls msp_write_data and passes data in single bytes. Sending data formats with more than 8 Bit are sending only 8 Bit with leading zeros for larger data.
This PR passes the data as pointer to make the type casting depending on the bitshift value working.

Impact of changes

fix sending 16 Bit Problem as in #15113

Migration actions required

Documentation


Pull request type

[x] Patch update (Bug fix / Target update / Docs update / Test update / Refactor)
[] Feature update (New feature / Functionality change / New API)
[] Major update (Breaking change E.g. Return code change / API behaviour change)

Test results

urgent fix

[] No Tests required for this change (E.g docs only update)
[] Covered by existing mbed-os tests (Greentea or Unittest)
[] Tests / results supplied as part of this PR

Reviewers


@jeromecoutant
Copy link
Collaborator

@vznncv
@LMESTM

@ciarmcom ciarmcom added the release-type: patch Indentifies a PR as containing just a patch label Sep 28, 2021
@ciarmcom ciarmcom requested a review from a team September 28, 2021 09:00
@ciarmcom
Copy link
Member

@JojoS62, thank you for your changes.
@ARMmbed/mbed-os-maintainers please review.

0xc0170
0xc0170 previously approved these changes Sep 30, 2021
@0xc0170
Copy link
Contributor

0xc0170 commented Sep 30, 2021

CI started

@mergify mergify bot added needs: CI and removed needs: review labels Sep 30, 2021
@vznncv
Copy link
Contributor

vznncv commented Sep 30, 2021

Hi @JojoS62

When I implement SPI 3-Wire changes, I followed existing implementation of 4-wire logic:

int spi_master_block_write(spi_t *obj, const char *tx_buffer, int tx_length,
char *rx_buffer, int rx_length, char write_fill)
{
struct spi_s *spiobj = SPI_S(obj);
SPI_HandleTypeDef *handle = &(spiobj->handle);
int total = (tx_length > rx_length) ? tx_length : rx_length;
if (handle->Init.Direction == SPI_DIRECTION_2LINES) {
for (int i = 0; i < total; i++) {
char out = (i < tx_length) ? tx_buffer[i] : write_fill;
char in = spi_master_write(obj, out);
if (i < rx_length) {
rx_buffer[i] = in;
}
}
} else {

So I think that the following places should be fixed:

  1. 4-wire logic:

for (int i = 0; i < total; i++) {
char out = (i < tx_length) ? tx_buffer[i] : write_fill;
char in = spi_master_write(obj, out);
if (i < rx_length) {
rx_buffer[i] = in;
}
}

  1. 3-wire write logic:

for (int i = 0; i < tx_length; i++) {
msp_wait_writable(obj);
msp_write_data(obj, tx_buffer[i], bitshift);
}

  1. 3-wire read logic:

for (int i = 0; i < rx_length; i++) {
msp_wait_readable(obj);
rx_buffer[i] = msp_read_data(obj, bitshift);
}

Asynchronous API logic (spi_master_transfer) should be correct since it uses STM HAL library that processes 16-bit mode correctly.

Additionally I have question about SPI interface functions spi_transfer and spi_master_block_write. According documentation tx_length and rx_length are number of bytes to write/read:

/** Write a block out in master mode and receive a value
*
* The total number of bytes sent and received will be the maximum of
* tx_length and rx_length. The bytes written will be padded with the
* value 0xff.
*
* @param[in] obj The SPI peripheral to use for sending
* @param[in] tx_buffer Pointer to the byte-array of data to write to the device
* @param[in] tx_length Number of bytes to write, may be zero
* @param[in] rx_buffer Pointer to the byte-array of data to read from the device
* @param[in] rx_length Number of bytes to read, may be zero
* @param[in] write_fill Default data transmitted while performing a read
* @returns
* The number of bytes written and read from the device. This is
* maximum of tx_length and rx_length.
*/
int spi_master_block_write(spi_t *obj, const char *tx_buffer, int tx_length, char *rx_buffer, int rx_length, char write_fill);

/** Begin the SPI transfer. Buffer pointers and lengths are specified in tx_buff and rx_buff
*
* @param[in] obj The SPI object that holds the transfer information
* @param[in] tx The transmit buffer
* @param[in] tx_length The number of bytes to transmit
* @param[in] rx The receive buffer
* @param[in] rx_length The number of bytes to receive
* @param[in] bit_width The bit width of buffer words
* @param[in] event The logical OR of events to be registered
* @param[in] handler SPI interrupt handler
* @param[in] hint A suggestion for how to use DMA with this transfer
*/
void spi_master_transfer(spi_t *obj, const void *tx, size_t tx_length, void *rx, size_t rx_length, uint8_t bit_width, uint32_t handler, uint32_t event, DMAUsage hint);

But existing asynchronous API implementation (spi_master_transfer) and suggested changes interpreter them as total SPI frame number.

@LMESTM, @0xc0170 Which definition is correct? Are tx_length and rx_length size of corresponding buffers in bytes or number of SPI frames?

@mbed-ci
Copy link

mbed-ci commented Sep 30, 2021

Jenkins CI Test : ✔️ SUCCESS

Build Number: 1 | 🔒 Jenkins CI Job | 🌐 Logs & Artifacts

CLICK for Detailed Summary

jobs Status
jenkins-ci/mbed-os-ci_unittests ✔️
jenkins-ci/mbed-os-ci_build-cloud-example-ARM ✔️
jenkins-ci/mbed-os-ci_cmake-cloud-example-ARM ✔️
jenkins-ci/mbed-os-ci_cmake-cloud-example-GCC_ARM ✔️
jenkins-ci/mbed-os-ci_build-cloud-example-GCC_ARM ✔️
jenkins-ci/mbed-os-ci_build-greentea-GCC_ARM ✔️
jenkins-ci/mbed-os-ci_build-greentea-ARM ✔️
jenkins-ci/mbed-os-ci_build-example-ARM ✔️
jenkins-ci/mbed-os-ci_cmake-example-ARM ✔️
jenkins-ci/mbed-os-ci_cmake-example-GCC_ARM ✔️
jenkins-ci/mbed-os-ci_build-example-GCC_ARM ✔️
jenkins-ci/mbed-os-ci_greentea-test ✔️

@0xc0170
Copy link
Contributor

0xc0170 commented Sep 30, 2021

Question about the buffer sizes. everything is defined in bytes.

@JojoS62
Copy link
Contributor Author

JojoS62 commented Sep 30, 2021

yes, the class documentation says also bytes. But a 16 Bit operation can not send 1 byte, so how to handle this case? The buffer may also be to small and the type cast can crash.
I have used and checked only the 3-wire case, but yes, also the 4-wire spi_master_block_write() looks wrong. It takes bytes and sends them as larger words.
I have checked the grandpa implementation for LPC1768, its the same there in

char out = (i < tx_length) ? tx_buffer[i] : write_fill;

So it seems blockwrites with >8 bit are not working correctly for some longer time. But 16 bit writes are important e.g. for data transfer to a display, otherwise you have to fiddle with byte swapping for a large amount of data.

I'm also confused by

* * ::spi_master_block_write writes `tx_length` words to the bus - Verified by ::fpga_spi_test_common_no_ss and ::fpga_spi_test_common

this says words, what I experienced also as the working behavior in <6.15.0

@0xc0170
Copy link
Contributor

0xc0170 commented Sep 30, 2021

yes, the class documentation says also bytes. But a 16 Bit operation can not send 1 byte, so how to handle this case? The buffer may also be to small and the type cast can crash.

My understanding is that it's user's responsibility to use format/write properly (. You set up the format and then provide buffer (size in bytes, but should be multiple of format specified bits).

@JojoS62
Copy link
Contributor Author

JojoS62 commented Sep 30, 2021

I have checked again the previous behaviour (mbed 6.14.0), it was calling HAL_SPI_Transmit which gets a pointer and a size for datawords, so with 16 bit setting it was sending size * 16 bit.
This is what I understand also from spi_hal.h defined behaviour, in this case the documentation does not fit.

same topics:
#12448
#10399

@vznncv
Copy link
Contributor

vznncv commented Oct 2, 2021

@0xc0170

  1. Question about the buffer sizes. everything is defined in bytes.
    My understanding is that it's user's responsibility to use format/write properly (. You set up the format and then provide buffer (size in bytes, but should be multiple of format specified bits).

    Do I understand correctly that tx_buffer, tx_length, rx_buffer, rx_length in the functions spi_master_block_write and spi_master_transfer should be interpreted in the following ways:

    1. 8-bit SPI mode:

      const uint8_t *tx_frames = (const uint8_t *)tx_buffer;
      int tx_frames_number = tx_length;
      uint8_t *rx_frames = (uint8_t *)rx_buffer;
      int rx_frames_number = rx_length;
      
      // transmit/receive 8-bit frames ...
      
    2. 16-bit SPI mode:

      const uint16_t *tx_frames = (const uint16_t *)tx_buffer;
      MBED_ASSERT(tx_length % 2 == 0);
      int tx_frames_number = tx_length / 2;
      uint16_t *rx_frames = (uint16_t *)rx_buffer;
      MBED_ASSERT(rx_length % 2 == 0);
      int rx_frames_number = rx_length / 2;
      
      // transmit/receive 16-bit frames ...
      
    3. 32-bit SPI mode:

      const uint32_t *tx_frames = (const uint32_t *)tx_buffer;
      MBED_ASSERT(tx_length % 4 == 0);
      int tx_frames_number = tx_length / 4;
      uint32_t *rx_frames = (uint32_t *)rx_buffer;
      MBED_ASSERT(rx_length % 4 == 0);
      int rx_frames_number = rx_length / 4;
      
      // transmit/receive 32-bit frames ...
      

    note: currently only spi_master_transfer has such behavior:

    words = length >> bitshift;

    case SPI_TRANSFER_TYPE_TXRX:
    rc = HAL_SPI_TransmitReceive_IT(handle, (uint8_t *)tx, (uint8_t *)rx, words);
    break;
    case SPI_TRANSFER_TYPE_TX:
    rc = HAL_SPI_Transmit_IT(handle, (uint8_t *)tx, words);
    break;
    case SPI_TRANSFER_TYPE_RX:
    // the receive function also "transmits" the receive buffer so in order
    // to guarantee that 0xff is on the line, we explicitly memset it here
    memset(rx, SPI_FILL_CHAR, length);
    rc = HAL_SPI_Receive_IT(handle, (uint8_t *)rx, words);
    break;

  2. Mbed OS has SPI tests for 16/32 bit mode (hal/tests/TESTS/mbed_hal_fpga_ci_test_shield/spi/main.cpp). Are they run during CI/CD? (although it seems that a bug with a word receiving compensates a bug with word transmitting, so the test doesn't fail).

  3. Extra question about fill_char. spi_master_block_write has an extra argument char write_fill.
    What is expected fill_char behavior in 16/32 bit mode?

@JojoS62
Copy link
Contributor Author

JojoS62 commented Oct 3, 2021

  1. I have no preference about byte or framelength, it should just be the same for all targets and clearly documented in the API reference. There are some more implementations that use the template from LPC1768 I guess, and all are sending only a byte in larger frames. Sending blocks of data was a later extension, but of course it makes sense e.g. for sending display data fast.
  2. The FPGA test is interesting, didn't knew that before.
  3. the fill char should have the same size as the frames for reading / writing. It could be promoted to a larger type, but the SPI should be fast and not try to catch every possible combination.

edit to 1)
but I tend to use size in frames instead of bytes. It makes the asserts unnecessary and reduces the extra conversions. When I want to send 16 bit frame blocks, I usually have the number of 16 bit items. Compatibility cannot be an issue, the function cannot have worked for a long time. I wonder why this was not reported before, except the few issues that I mentioned already.

@jeromecoutant
Copy link
Collaborator

2. Mbed OS has SPI tests for 16/32 bit mode (`hal/tests/TESTS/mbed_hal_fpga_ci_test_shield/spi/main.cpp`). Are they run during CI/CD? (although it seems that a bug with a word receiving compensates a bug with word transmitting, so the test doesn't fail).

For example, in the above Jenkins CI Test comment, if you go in the Logs & Artifacts link,
you can see that 4 targets are with FPGA

@jeromecoutant
Copy link
Collaborator

So let's merge ? or it needs update ?

@0xc0170
Copy link
Contributor

0xc0170 commented Oct 6, 2021

If no objections, I'll merge this tomorrow.

@JojoS62
Copy link
Contributor Author

JojoS62 commented Oct 6, 2021

this PR makes the SPI working like before in mbed-os-6.14, I would appreciate this.
The point is only, that this does not comply with the documentation. This can be another issue, to check also other targets and the FPGA test.

@vznncv
Copy link
Contributor

vznncv commented Oct 6, 2021

this PR makes the SPI working like before in mbed-os-6.14,

No, data reading isn't fixed.

this PR makes the SPI working like before in mbed-os-6.14, I would appreciate this.

I don't like idea of "bug" compatibility fixes.

If you run the following code with mbed-os-6.14:

#include "mbed.h"

static DigitalOut user_led(LED1, 1);

int main()
{
    PinName mosi = PB_5;
    PinName miso = PB_4;
    PinName sclk = PB_3;
    PinName ssel = PB_6;

    DigitalOut ssel_out(ssel, 1);

    int count = 0;

    size_t tx_word_len;
    size_t rx_word_len;
    const uint16_t tx_data[8] = {0x1122, 0x3344}; // allocate extra memory to prevent "out of range" error
    uint16_t rx_data[8] = {0};

    while (true) {
        user_led = 1;
        printf("Demo run %i\n", count);

        // 3-wire demo
        // note: don't read data in 3-wire mode to get the same 4-wire results
        tx_word_len = 2;
        rx_word_len = 0;
        {
            SPI spi(mosi, NC, sclk);
            spi.format(16);

            ssel_out = 0;

            // transfer by 1 word per SPI::write call
            for (size_t i = 0; i < tx_word_len; i++) {
                spi.write(tx_data[i]);
            }
            wait_us(64);
            // transfer by bulk SPI::write call
            spi.write((const char *)tx_data, tx_word_len * sizeof(*tx_data),
                      (char *)rx_data, rx_word_len * sizeof(*rx_data));
            wait_us(64);
            // asynchronous transfer
            spi.transfer((const char *)tx_data, tx_word_len * sizeof(*tx_data),
                         (char *)rx_data, rx_word_len * sizeof(*rx_data), nullptr);
            wait_us(64);

            ssel_out = 1;
        }

        // 4-wire demo
        tx_word_len = 2;
        rx_word_len = 2;
        {
            SPI spi(mosi, miso, sclk);
            spi.format(16);

            ssel_out = 0;

            // transfer by 1 word per SPI::write call
            for (size_t i = 0; i < tx_word_len; i++) {
                spi.write(tx_data[i]);
            }
            wait_us(64);
            // transfer by bulk SPI::write call
            spi.write((const char *)tx_data, tx_word_len * sizeof(*tx_data),
                      (char *)rx_data, rx_word_len * sizeof(*rx_data));
            wait_us(64);
            // asynchronous transfer
            spi.transfer((const char *)tx_data, tx_word_len * sizeof(*tx_data),
                         (char *)rx_data, rx_word_len * sizeof(*rx_data), nullptr);
            wait_us(64);

            ssel_out = 1;
        }


        ThisThread::sleep_for(100ms);
        user_led = 0;
        ThisThread::sleep_for(1900ms);
        count++;
    }

    return 0;
}

You will see the following SPI results:

  1. 3-wire mode

3-wire-bug

  1. 4-wire mode

4-wire-bug

I.e actual int SPI::write(const char *tx_buffer, int tx_length, char *rx_buffer, int rx_length) behavior differs from SPI::transfer and depends on SPI mode (3 wire or 4 wire). It isn't correct. I think that all SPI methods should handle data in 16 bit mode in the same way (i.e. result should be the same)

My understanding is that it's user's responsibility to use format/write properly (. You set up the format and then provide buffer (size in bytes, but should be multiple of format specified bits).

I have no preference about byte or framelength, it should just be the same for all targets and clearly documented in the API reference

Since poring guide (https://os.mbed.com/docs/mbed-os/v6.15/porting/spi-port.html) doesn't provide clear information about 16-bit mode implementation, I'd suggest to adhere to SPI class documentation where tx_length and rx_length are explicitly called as bytes:

/** Write to the SPI Slave and obtain the response.
*
* The total number of bytes sent and received will be the maximum of
* tx_length and rx_length. The bytes written will be padded with the
* value 0xff.
*
* @param tx_buffer Pointer to the byte-array of data to write to the device.
* @param tx_length Number of bytes to write, may be zero.
* @param rx_buffer Pointer to the byte-array of data to read from the device.
* @param rx_length Number of bytes to read, may be zero.
* @return
* The number of bytes written and read from the device. This is
* maximum of tx_length and rx_length.
*/
virtual int write(const char *tx_buffer, int tx_length, char *rx_buffer, int rx_length);

/** Start non-blocking SPI transfer using 8bit buffers.
*
* This function locks the deep sleep until any event has occurred.
*
* @param tx_buffer The TX buffer with data to be transferred. If NULL is passed,
* the default SPI value is sent.
* @param tx_length The length of TX buffer in bytes.
* @param rx_buffer The RX buffer which is used for received data. If NULL is passed,
* received data are ignored.
* @param rx_length The length of RX buffer in bytes.
* @param callback The event callback function.
* @param event The event mask of events to modify. @see spi_api.h for SPI events.
*
* @return Operation result.
* @retval 0 If the transfer has started.
* @retval -1 If SPI peripheral is busy.
*/
template<typename Type>
int transfer(const Type *tx_buffer, int tx_length, Type *rx_buffer, int rx_length, const event_callback_t &callback, int event = SPI_EVENT_COMPLETE)
{
if (spi_active(&_peripheral->spi)) {
return queue_transfer(tx_buffer, tx_length, rx_buffer, rx_length, sizeof(Type) * 8, callback, event);
}
start_transfer(tx_buffer, tx_length, rx_buffer, rx_length, sizeof(Type) * 8, callback, event);
return 0;
}

Such behavior may be implemented with changes like the followings in the stm_spi_api.c:

...
static inline int spi_get_word_from_buffer(const void *buffer, int bitshift)
{
    if (bitshift == 1) {
        return *((uint16_t *)buffer);
#ifdef HAS_32BIT_SPI_TRANSFERS
    } else if (bitshift == 2) {
        return *((uint32_t *)buffer);
#endif /* HAS_32BIT_SPI_TRANSFERS */
    } else {
        return *((uint8_t *)buffer);
    }
}

static inline void spi_put_word_to_buffer(void *buffer, int bitshift, int data)
{
    if (bitshift == 1) {
        *((uint16_t *)buffer) = data;
#ifdef HAS_32BIT_SPI_TRANSFERS
    } else if (bitshift == 2) {
        *((uint32_t *)buffer) = data;
#endif /* HAS_32BIT_SPI_TRANSFERS */
    } else {
        *((uint8_t *)buffer) = data;
    }
}
...
static int spi_master_one_wire_transfer(spi_t *obj, const char *tx_buffer, int tx_length,
                                        char *rx_buffer, int rx_length) {
    ...
    const int word_size = 0x01 << bitshift;
    ...
       for (int i = 0; i < tx_length; i += word_size) {
            msp_wait_writable(obj);
            msp_write_data(obj, spi_get_word_from_buffer(tx_buffer + i, bitshift), bitshift);
       }
    ...
    // SPI_IP_VERSION_V2
    ...
    /* Receive data */
        for (int i = 0; i < rx_length; i += word_size) {
            msp_wait_readable(obj);
            spi_put_word_to_buffer(rx_buffer + i, bitshift, msp_read_data(obj, bitshift));
        }
    ...
    // SPI_IP_VERSION_V1
    ...
        for (int i = 0; i < rx_length; i += word_size) {
            core_util_critical_section_enter();
            LL_SPI_Enable(SPI_INST(obj));
            /* Wait single SPI clock cycle. */
            wait_ns(baudrate_period_ns);
            LL_SPI_Disable(SPI_INST(obj));
            core_util_critical_section_exit();

            msp_wait_readable(obj);
            spi_put_word_to_buffer(rx_buffer + i, bitshift, msp_read_data(obj, bitshift));
        }
    ...
int spi_master_block_write(spi_t *obj, const char *tx_buffer, int tx_length,
                           char *rx_buffer, int rx_length, char write_fill)
{
    ...
    const int bitshift = datasize_to_transfer_bitshift(handle->Init.DataSize);
    MBED_ASSERT(tx_length >> bitshift << bitshift == tx_length);
    MBED_ASSERT(rx_length >> bitshift << bitshift == rx_length);
    int total = (tx_length > rx_length) ? tx_length : rx_length;
    int write_fill_frame = write_fill;
    for (int i = 0; i < bitshift; i++) {
        write_fill_frame = (write_fill_frame << 8) | write_fill;
    }

    if (handle->Init.Direction == SPI_DIRECTION_2LINES) {
        const int word_size = 0x01 << bitshift;
        for (int i = 0; i < total; i += word_size) {
            int out = (i < tx_length) ? spi_get_word_from_buffer(tx_buffer + i, bitshift) : write_fill_frame;
            int in = spi_master_write(obj, out);
            if (i < rx_length) {
                spi_put_word_to_buffer(rx_buffer + i, bitshift, in);
            }
        }
    } else {
    ...

Such solution has the following advantages/disadvantages:

  • (+) behavior matches SPI class documentation (I think that main interface class documentation should be primary source of truth of API behavior)
  • (+) all methods handle data in 16-bit mode in the same ways (SPI::write or SPI::transfer, 4-wire or 3-wire mode)
  • (-) it doesn't match behavior of current version and mbed-os 6.14 or lower

@JojoS62 @0xc0170

What do you think about it? If it's ok, @JojoS62 please implement "bytes" approach.

@JojoS62
Copy link
Contributor Author

JojoS62 commented Oct 20, 2021

its still on my todo list, I was struggeling with cmake.
I will use the modifications that @vznncv suggested.
Found also your hidden secrets in https://github.com/ARMmbed/mbed-os/tree/master/targets/TARGET_STM#readme
That is also interesting about performance and a (target specific) DMA implementation.
@vznncv When you have tested already your modifications and you have more time, than you can take over and I'll close my PR, no problem.

@mergify mergify bot dismissed 0xc0170’s stale review October 27, 2021 17:03

Pull request has been modified.

@JojoS62
Copy link
Contributor Author

JojoS62 commented Oct 27, 2021

its driving me nuts. I have used the test program from @vznncv on a H743. There the known problem was shown, but also the last asynch transfer was completely missing.
Then I used a F407, with different result: 3-wire does not work at all with this mix, 4-wire looks ok exept bulk write.

F407VG / 16 bit
grafik

for 8 bit, the result is ok.

@jeromecoutant
Copy link
Collaborator

its driving me nuts. I have used the test program from @vznncv on a H743. There the known problem was shown, but also the last asynch transfer was completely missing. Then I used a F407, with different result: 3-wire does not work at all with this mix, 4-wire looks ok exept bulk write.

Note that ST SPI IP has changed with STM32H7:
https://github.com/ARMmbed/mbed-os/blob/master/targets/TARGET_STM/TARGET_STM32H7/spi_device.h#L22

@JojoS62
Copy link
Contributor Author

JojoS62 commented Oct 28, 2021

yes, thanks, I have seen also the different treatment in the source code.
It may also be a problem with contructing and destructing the SPI object (too fast), I haven't checked this yet. For a test, I will focus first now on the F4 and try static SPI instances to avoid possible initialization problems. This should work also.
I see also that my rebase pulled in a lot of commits, I will fix it also.
With F4 and mbed-os-6.14.0, I get the same test result as Konstantin.

@JojoS62
Copy link
Contributor Author

JojoS62 commented Oct 28, 2021

@vznncv I've tried now your modifications, but I get the same wrong result with mbed-os-6.15, my modification and also your modification. I hope I got it right, its better when you send a git patch or the whole file.

mbed-6.14.0 3-Wire 1 word:
grafik

mbed-6.15.0 3-Wire 1 word:
grafik

mbed-6.15.0 + modifications:
grafik

it is strange, it is sending too many clocks. How can this happen? Can you confirm?
I have tested on a F407.

in spi_master_write. This fixes the problem with too many spi_sck
@JojoS62
Copy link
Contributor Author

JojoS62 commented Oct 28, 2021

this is now working, but is not using spi_master_one_wire_transfer(), so it still needs to be checked why this is not working.

grafik

@JojoS62 JojoS62 marked this pull request as draft October 28, 2021 21:46
@JojoS62
Copy link
Contributor Author

JojoS62 commented Oct 29, 2021

now it looks fine on the F407, also with LL SPI. I added a SPI_1LINE_TX(handle);before enabling SPI, this prevented it to start generating clocks. There is only one glitch in clock, but before setting spi enable.
Now it needs to be tested also on H7, is someone else in for testing?

on F407:
grafik

on H743:
mbed-os-6-spi-trial-1812d4fd5df4a13e127b6c13aa9bb9f2dee53437-4Wire-NOK

@JojoS62
Copy link
Contributor Author

JojoS62 commented Oct 30, 2021

some points I've found so far:

  • the driver for F4 is working now
  • disabling/enabling is expensive and should be avoided in read/write. For the the F4, it is working with a fixed setting
  • F4: after initializing, a set direction avoids start firing clocks
  • H7: set direction is locked when SPI is enabled. disable/enable is also expensive here
  • the testprogram had an error in casting to (uint8_t*) for transfer function. The template evaluates the bit_length to 8 bits instead of 16.
  • the testprogram is not working yet with transfer async 4-wire.
  • transfer async 4 wire calls HAL_SPI_TransmitReceive_IT. Validatings are ok, data is written to TXDR, but no data is sent on SPI
  • transfer async 4 wire also does not work in mbed-os-6.14.0, so before Konstantins changes
  • when this function is used with HAL_SPI_TransmitReceive instead of HAL_SPI_TransmitReceive_IT, the data is sent properly
  • when used with 8 Bit format, the async 4 wire is sending
  • a test program with CubeMX, same settings and same hardware is working with HAL_SPI_TransmitReceive_IT

the problem is definetly the complex enable/disable sequence on the H7. And also the mix of LL and HAL makes it damned hard to handle.
I recommend to switch back to HAL functions, they are safer more compatible between MCU series. For better performance when sending data blocks, the transfer function should be used.

@vznncv
Copy link
Contributor

vznncv commented Oct 31, 2021

Hi @JojoS62

I'll check your changes with my boards (F411 and H743) in 2-3 day.

Notes:
I see that you have changed 3-Wire SPI logic for optimization purposes: keep SPI enabled with TX during idle. But there are some notes:

  • according F4 reference manual (and code of HAL library), the SPI should disabled when we need to change transfer direction
  • you have deleted LL_SPI_SetTransferDirection(SPI_INST(obj), LL_SPI_HALF_DUPLEX_TX); line from transmit part, but I don't see any line that restores TX mode after data receiving.

@JojoS62
Copy link
Contributor Author

JojoS62 commented Nov 1, 2021

* according F4 reference manual (and code of HAL library), the SPI should disabled when we need to change transfer direction

yes, that will be better. Also the transfer for H7 needs to start in disabled state. I wanted to shorten the gap between two writes, it takes also on the fast H7 a few microsecnds.

I have published the current testcode on github:
https://github.com/JojoS62/testSPI

And played also with higher SPI speeds. When the H7 is used with PLL3P, then a modification in the init code is neccessary, it destroys a previous set PLL3 setting.
But for higher speeds, DMA would be better and that is missing in mbed-os. Because there are so many options, it will be easier to use CubeMX code in a subclassed SPI. Not portable, but I think such features are beyond a versatile OS.

for SPI in H7, there is a valuable resources in AN5543 and www.st.com/content/ccc/resource/training/technical/product_training/group0/52/17/7a/28/2b/90/41/7d/STM32H7-Peripheral-Serial_Peripheral_interface_SPI/files/STM32H7-Peripheral-Serial_Peripheral_interface_SPI.pdf/_jcr_content/translations/en.STM32H7-Peripheral-Serial_Peripheral_interface_SPI.pdf

@jeromecoutant
Copy link
Collaborator

FYI: I executed SPI non regression basic tests with all STM32 families with the current patches
They are all OK

@JojoS62
Copy link
Contributor Author

JojoS62 commented Nov 12, 2021

I will try to continue tomorrow and fix the issues that @vznncv mentioned.

@vznncv
Copy link
Contributor

vznncv commented Nov 28, 2021

Hi @JojoS62

Sorry for delayed answer. I was busy and didn't have enough time to check changes/fixes.

But finally I have updated my test example and test/check spi.

demo/example results

Example: https://github.com/vznncv/mbed-os-stm32-spi-3-wire-demo/tree/iss_spi_16bit

The example communicates with sensor (it sends data to 6 "free" 8-bit registers and then reads them).
I have updated this example to use 16-bit mode for data writing/reading. The test code isn't
efficient, since I need to switch between 8-bit and 16-bit mode often to send address byte, but anyway it allows to
automatically check the correctness of 16-bit transfer.
Additionally, I have added basic "loopback" example to test 16-bit mode with 4 wires.

For testing purposes I have used patches that I suggested (+some bug fixes that I made during debugging). The code with
patches: https://github.com/vznncv/mbed-os/tree/iss_spi_16bit (git patch: 0001-Fix-SPI-16-bit-logic.patch.txt)

I have tested it with 3 MCU:

  1. STM32F103C8, result logs:

  2. STM32F411CE, result logs:

  3. STM32H743VI, result logs:

Summary:

  1. synchronous API (3 and 4 wire modes) works without problems (or at least I haven't found any issues)
  2. asynchronous API works in some cases, but it has the following problems:
    • STM32H7 - 16 bit and 4-wire mode doesn't work, as you mentioned above
    • STM32F4, STM32F1 - 3-wire mode doesn't work correctly (there are some problems with HAL library)

SPI enabling/disabling optimization notes

  1. The default HAL library behavior for 3-wire mode (HAL_SPI_Transmit, HAL_SPI_Receive, HAL_SPI_Receive_IT
    , HAL_SPI_Transmit_IT) - disable SPI between transaction.

  2. For enabling/disabling optimization you need to adjust spi_irq_handler_asynch interrupt handler,
    since it calls the HAL_SPI_IRQHandler, that calls SPI_CloseRx_ISR (via SPI_RxISR_16BIT, ... callbacks), that disables SPI.

  3. disabling/enabling is expensive and should be avoided in read/write
    H7: set direction is locked when SPI is enabled. disable/enable is also expensive here

    I temporarily update spi_master_one_wire_transfer for data writing:

    LL_SPI_Disable(SPI_INST(obj));
    for (int i = 0; i < tx_length; i += word_size) {
        LL_SPI_Enable(SPI_INST(obj));
        msp_wait_writable(obj);
        msp_write_data(obj, spi_get_word_from_buffer(tx_buffer + i, bitshift), bitshift);
        msp_wait_not_busy(obj);
        LL_SPI_Disable(SPI_INST(obj));
    }
    

    But I don't note any significant changes. Only tiny delay (~0.5 us) between frames (I have checked it with
    STM32F411CE and 1 MHz SPI frequency). The base SPI transaction setup/validation code gives more overhead. I'm not
    sure if such optimization gives significant performance boost.

    Do you have any comparison table of SPI performance with/without optimization?

  4. I recommend to switch back to HAL functions, they are safer more compatible between MCU series. For better performance when sending data blocks, the transfer function should be used.

    Yes, they are safer and more compatible between MCU series, but unfortunately they don't implement correct 3-wire
    mode for SPI data reading (except STM32H7 series).

Summary

I'd suggest to split this pull request into 2 ones:

  1. Basic SPI fix to work with 16 bit mode correctly (at least synchronous API). If it's needed, I can create it.
  2. SPI 3-wire mode optimization, that you have suggested.

What do you think about it?

@vznncv
Copy link
Contributor

vznncv commented Dec 2, 2021

STM32H7 SPI async API (16 bit and 4-wire mode) notes

I have compared CubeMX demo program and stm_spi_api.c and found problem: STM32H7 HAL API assumes that SPI is disabled between function (HAL_SPI_TransmitReceive_IT, HAL_SPI_TransmitReceive, etc.) calls. It's needed for transfer size modification (CR2 register), that can be set only if SPI is disables. If SPI is enabled before synchronous API call (HAL_SPI_TransmitReceive) it doesn't hinder transfer, but may cause problems with asynchronous API.

After adding helper code that disables SPI before HAL_SPI_TransmitReceive_IT call and enables it after transmission, the problems with async API disappears: vznncv@44e2735

@mergify
Copy link

mergify bot commented Jan 17, 2022

This PR cannot be merged due to conflicts. Please rebase to resolve them.

@JojoS62
Copy link
Contributor Author

JojoS62 commented Jan 24, 2022

closed in favour of #15206
thanks @vznncv

@0xc0170 0xc0170 closed this Jan 26, 2022
@mergify mergify bot removed needs: work devices: st release-type: patch Indentifies a PR as containing just a patch labels Jan 26, 2022
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

6 participants