Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

SAI: audio pll, sai peripheral driver, and small rtic synthesizing sample #143

Open
wants to merge 8 commits into
base: main
Choose a base branch
from

Conversation

SpinFast
Copy link
Contributor

@SpinFast SpinFast commented Nov 5, 2023

A work in progress port of the mcux sdk SAI peripheral driver to imxrt-hal

This has to be one of the more complicated peripherals to configure given the sheer number of clocking, polarity, timing, and input/output options.

The first SAI peripheral additionally seems to allow for more than a single tx/rx channel which adds to the complications. After an initial attempt I decided to retry basing things on the mcux sdk driver which works well enough.

The basic idea is a configuration struct is constructed followed by handles for tx or rx channels. I've gotten most of the enums/config structs setup and a bare bones sample sort of in progress.

@mciantyre mciantyre linked an issue Nov 6, 2023 that may be closed by this pull request
@mciantyre
Copy link
Member

Thanks for adding this. I'm not familiar with the SAI peripheral, but I'll keep studying the reference manual to ramp up.

No preference on designs; using a primary Sai object as a factory for TX / RX channels looks like it will work, especially if channels have distinct registers they're working with. If you need a RAL trick for fabricating new sai::Instance<N> objects for the channels, these advanced usage tips might help.

@SpinFast SpinFast force-pushed the imxrt-sai branch 2 times, most recently from 70001f1 to 1fd9486 Compare November 7, 2023 20:59
@SpinFast
Copy link
Contributor Author

SpinFast commented Nov 7, 2023

Thanks for adding this. I'm not familiar with the SAI peripheral, but I'll keep studying the reference manual to ramp up.

No preference on designs; using a primary Sai object as a factory for TX / RX channels looks like it will work, especially if channels have distinct registers they're working with. If you need a RAL trick for fabricating new sai::Instance<N> objects for the channels, these advanced usage tips might help.

That was helpful thank you. I have 3 layers to this cake.

The peripheral, opt in tx/rx things with pins, opt in tx/rx channels with a data pin each (each with its own FIFO and DMA address and such), and then in the end I'd like to be able to read/write to each channel or maybe all channels at the same time.

The audio frame data can represent multichannel audio itself, so for example mono, stereo, or N many audio channels being tx/rx'ed from a codec chip which itself has multiple dacs/adcs for talking to headphones, analog mics, line in/out, etc.

It's all very confusing as I'm relatively new to these serial audio formats but its interesting to learn as I go.

@SpinFast
Copy link
Contributor Author

SpinFast commented Nov 9, 2023

Added various needed clocking modules/functions for setting up the audio pll (pll4), connecting it up to each sai clock root and setting a divider up.

It's not entirely clear the pll setup is correct but I believe it to be. The pll is bypassed, disabled, and powered down before then updating the various pll num, denom, div_select options, and then restarting it. I'd definitely appreciate some feedback on this part at the moment @mciantyre

@SpinFast SpinFast force-pushed the imxrt-sai branch 2 times, most recently from 7201007 to 8ebe470 Compare November 10, 2023 18:28
@mciantyre
Copy link
Member

I think the PLL4 setup is working. If you have a scope / logic analyzer available, the hal_clock_out example might be helpful. Using a serial console menu, we can route clocks to the CLKO1 and CLKO2 outputs, which have test points on some EVKs. Sorry in advanced for not having it set up for 1060 MCUs and EVKs.

I gave this a shot on my 1010EVK. After configuring PLL4 with the reset function, I routed PLL4 to CLKO1 and introduced an extra divide-by-8. I saw 22.7 MHz on TP34, which about what I expected to see. I pushed my test if you'd like to reproduce.

Screenshot 2023-11-10 at 21 29 16

@SpinFast SpinFast force-pushed the imxrt-sai branch 2 times, most recently from ca90d3a to 93c3098 Compare November 13, 2023 23:40
@SpinFast
Copy link
Contributor Author

SpinFast commented Nov 13, 2023

As I noted on matrix, I concocted a python script to find optimal clock tree values given a desired output clock and it works pretty well! It requires python and gekko, and uses apopt to find a solution (in 1.6s on my machine). Sadly the solver (apopt) isn't FOSS so probably not great to add to the tree as I'm unsure of what the implications would be.

https://gist.github.com/SpinFast/aa1ec02f177b89858c7e1ffb9e77b543

@Dirbaio noted Z3 could perhaps be used to solve the same sort of problem and is FOSS licensed, it also happens to have a Rust wrapper. Maybe a build.rs clock tree builder could be concocted one day. Would be neat!

I setup the 1060evk and 1010evk clocks to match in board file for 10xx (presumably the other 10xxevks could vary if needed but I'd guess stereo 48KHz 16bit audio would be just fine anywhere)

In the end I'd like to playback audio with sai1 to the codec and sai3 to medium quality sound, so you can hear the sound out of the little speaker on the evk

@SpinFast
Copy link
Contributor Author

I got the sai1 root clock working like I'd expect I believe in looking at the output of the hal_clock_out sample. That's a helpful example to have around

@teburd
Copy link
Member

teburd commented Nov 14, 2023

Glad to see this being done, I did try out the clock out sample with this PR and got what measured around 2.3MHz, I would have expected 48000*16*2 = 1536000 so maybe a divider is off or something.

@SpinFast SpinFast force-pushed the imxrt-sai branch 5 times, most recently from 925799f to 4caa116 Compare November 16, 2023 18:03
@SpinFast
Copy link
Contributor Author

Depends on imxrt-rs/imxrt-iomuxc#40

@SpinFast SpinFast force-pushed the imxrt-sai branch 8 times, most recently from b1ef698 to d64ae47 Compare November 23, 2023 19:26
@SpinFast
Copy link
Contributor Author

I think this is mostly ready to review, still waiting on the chiptool merge and release I guess for the wm8960 driver, I might give that another week before simply including the generated files in a release and moving on, leaving the generation at build stuff in branch

@SpinFast SpinFast force-pushed the imxrt-sai branch 6 times, most recently from c6ba788 to 34a4da3 Compare December 17, 2023 15:48
@SpinFast
Copy link
Contributor Author

1176 issue is tied to an issue in the 1176 svd, see imxrt-rs/imxrt-ral#45

Copy link
Contributor

@Florob Florob left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Just my 2 cents:

  • As a personal preference I would capitalize things like "PLL", "SAI"; "I2S, etc." in commit descriptions and comments
  • It looks to me like Rx is not in a usable state yet, is that correct?
  • Do you have a plan for how RX pin config works on setups where BCLK is shared (or at least I think that is what 1020/1060 EVKs are doing)? Can we e.g. let () implement RxBclk?

//24576kHz/16 = 1536000.000 Hz
//Ideal bclk is (48000*2*16) = 1536000
//24576kHz/512 = 48kHz
//Ideal frame sync is 48000
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The commit message additionally suggests that this might be WM8960 specific.
Fortunately this is AFAICT usable much more generally. 24.576MHz can often be used for various 48kHz based sampling rates with 16 or 32bit. AKM has nice tables showing this in their datasheets.E.g. AK4452 Tables 3+4/6+7 show this is usable up to hex speed. BICK requirements probably further limit it to 16bit @ 384kHz.

clock_gate::sai::<1>(),
//#[cfg(not(feature = "imxrt1010"))]
//clock_gate::sai::<2>(),
//clock_gate::sai::<3>(),
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Is this unfinished?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It is

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm not sure this commented code would work. The board crate can't detect an imxrt-hal crate feature called imxrt1010, since they're different crates. And as I write this, board doesn't have its own imxrt1010 feature.

I have two ideas that might work:

  1. Each board has its own clock gate collection (one example, another example). Since the SAI instance used by a board depends on the board, we can put the clock gate locator there. Board configuration considers the combination of these common clocks and the board-specific clocks, so it's equivalent to what we're going for.
  2. Change each 10xx board to expose a constant, like SAI_INSTANCE, that's used like clock_gate::sai::<{board_impl::SAI_INSTANCE}>() in this collection of common clock gate locators.

I prefer the first approach. That should make it easier to introduce a board that doesn't immediately support an SAI example, and there's precedent.

@@ -66,6 +66,13 @@ pub(crate) mod ccm {
clock_gate::lpi2c::<4>(),
];

/// All SAI clock gates.
pub const SAI_CLOCK_GATES: &[clock_gate::Locator] = &[
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm not convinced adding these arrays (here and for 1010) is sensible.
Other *_CLOCK_GATES contain gates that are fed from the same clock root. That way they can easily be disabled before making changes to the clock root. SAI however has a separate clock root per instance, so we usually don't have to gate all of the instances.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I agree. I remember suggesting / doubling-down on the 1010 approach in an experiment, but I was being hasty. Sorry for misleading!

If each SAI peripheral has its own clock root, then we can skip this collection. ee76313 notes that we're taking a similar approach for FlexIO.


/// Get the status register of the transmitter, this can be used in conjuction with
/// status field masks to determine the state of the SAI peripheral.
pub fn status(&mut self) -> u32 {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This should follow the example set by other status() methods returning a Status struct generated by bitflags.

///
/// sai_tx.set_int_mask(TCSR::FEIE::mask | TCSR::FWIE::mask | TCSR::FRIE::mask);
/// ```
pub fn set_int_mask(&mut self, mask: u32) {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This could use some more type safety. I think it would be good if it followed the same API as flexpwm::Submodule::set_interrupts().

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes it could, will take a closer look at the flexpwm driver for ideas

src/common/sai.rs Outdated Show resolved Hide resolved
src/common/sai.rs Outdated Show resolved Hide resolved
//

/// How frequently (milliseconds) should we poll audio
const AUDIO_POLL_MS: u32 = 1000 * (board::PIT_FREQUENCY / 1_000);
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It seems like the PIT is only used for printing debug information now. Variable/constant names and comments should probably be updated accordingly.

clock_gate::sai::<1>(),
//#[cfg(not(feature = "imxrt1010"))]
//clock_gate::sai::<2>(),
//clock_gate::sai::<3>(),
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm not sure this commented code would work. The board crate can't detect an imxrt-hal crate feature called imxrt1010, since they're different crates. And as I write this, board doesn't have its own imxrt1010 feature.

I have two ideas that might work:

  1. Each board has its own clock gate collection (one example, another example). Since the SAI instance used by a board depends on the board, we can put the clock gate locator there. Board configuration considers the combination of these common clocks and the board-specific clocks, so it's equivalent to what we're going for.
  2. Change each 10xx board to expose a constant, like SAI_INSTANCE, that's used like clock_gate::sai::<{board_impl::SAI_INSTANCE}>() in this collection of common clock gate locators.

I prefer the first approach. That should make it easier to introduce a board that doesn't immediately support an SAI example, and there's precedent.

@@ -66,6 +66,13 @@ pub(crate) mod ccm {
clock_gate::lpi2c::<4>(),
];

/// All SAI clock gates.
pub const SAI_CLOCK_GATES: &[clock_gate::Locator] = &[
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I agree. I remember suggesting / doubling-down on the 1010 approach in an experiment, but I was being hasty. Sorry for misleading!

If each SAI peripheral has its own clock root, then we can skip this collection. ee76313 notes that we're taking a similar approach for FlexIO.

) {
// Set the mclk pin to be an output
unsafe {
ral::write_reg!(ral::iomuxc_gpr, ral::iomuxc_gpr::IOMUXC_GPR::instance(), GPR1, SAI1_MCLK_DIR: 1);
Copy link
Member

@mciantyre mciantyre Apr 25, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

CI drew my eye here. Regarding chip portability:

  • For the 1010, 1020, and 1060 series, we should target the proper field based on the const generic N. As written, an Sai<3, ...>::split will (re)configure the SAI1 clock direction, not the SAI3 clock direction.
  • For the 1170 series, we should target the proper register and field based on the const generic N.

If splitting depends on a chip-specific behavior, we might only want to expose it when a HAL chip feature is turned on. The way we enable DMA APIs for peripherals is a reference for how it could work. Otherwise, I think we'll need a common module to take a dependency on chip, which hasn't been done yet.

Other observations:

  • For both the 1010 and the 1060, there are other bit fields in this GPR1 register. A write_reg! will set those other fields to zero. Consider using a modify_reg! here to preserve other bits that are already set high.
  • If the user isn't providing a &mut to the IOMUXC_GPR instance, we'll want to know we're in a critical section while we perform our read-modify-write.

Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Would something like this suffice? Also is this instance() call kosher?

// could this stay in common/sai.rs?
#[cfg(family = "imxrt10xx")]
unsafe fn configure_sai_mclk_dir<const N: u8>(direction: u8) {
    match N {
        1 => ral::modify_reg!(ral::iomuxc_gpr, ral::iomuxc_gpr::IOMUXC_GPR::instance(), GPR1, SAI1_MCLK_DIR: direction),
        2 => ral::modify_reg!(ral::iomuxc_gpr, ral::iomuxc_gpr::IOMUXC_GPR::instance(), GPR1, SAI2_MCLK_DIR: direction),
        3 => ral::modify_reg!(ral::iomuxc_gpr, ral::iomuxc_gpr::IOMUXC_GPR::instance(), GPR1, SAI3_MCLK_DIR: direction),
        _ => panic!("Unsupported SAI instance: {}", N),
    }
}

#[cfg(family = "imxrt11xx")]
unsafe fn configure_sai_mclk_dir<const N: u8>(direction: u8) {
    match N {
        1 => ral::modify_reg!(ral::iomuxc_gpr, ral::iomuxc_gpr::IOMUXC_GPR::instance(), GPR0, SAI1_MCLK_DIR: direction),
        2 => ral::modify_reg!(ral::iomuxc_gpr, ral::iomuxc_gpr::IOMUXC_GPR::instance(), GPR1, SAI2_MCLK_DIR: direction),
        3 => ral::modify_reg!(ral::iomuxc_gpr, ral::iomuxc_gpr::IOMUXC_GPR::instance(), GPR2, SAI3_MCLK_DIR: direction),
        4 => ral::modify_reg!(ral::iomuxc_gpr, ral::iomuxc_gpr::IOMUXC_GPR::instance(), GPR2, SAI4_MCLK_DIR: direction),
        _ => panic!("Unsupported SAI instance: {}", N),
    }
}
Suggested change
ral::write_reg!(ral::iomuxc_gpr, ral::iomuxc_gpr::IOMUXC_GPR::instance(), GPR1, SAI1_MCLK_DIR: 1);
configure_sai_mclk_dir::<N>(1);

Copy link
Member

@mciantyre mciantyre Jun 26, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Would something like this suffice?

I haven't tested, but I'm not sure if the family = "imxrt10xx" branch

2 => ral::modify_reg!(ral::iomuxc_gpr, ral::iomuxc_gpr::IOMUXC_GPR::instance(), GPR1, SAI2_MCLK_DIR: direction),

will compile for all MCUs. The 1010 MCUs don't have a SAI2_MCLK_DIR field. But don't worry; CI should catch that. If that's the only issue, a conditional compile like

#[cfg(not(chip="imxrt1010"))]
2 => ral::modify_reg!(ral::iomuxc_gpr, ral::iomuxc_gpr::IOMUXC_GPR::instance(), GPR1, SAI2_MCLK_DIR: direction),

might fix it.

Separately, I recommend using the instance validity trick shown here to turn those panics into unreachable branches. A 1010 user might appreciate when their accidental configure_sai_mclk_dir::<2>(1) turns into a compile error instead of panicking at runtime.

The input direction: u8 could be is_output: bool to clarify the configuration we're performing. Additionally, a bool ensures that we can only pass two values into this one-bit-wide field.

Also is this instance() call kosher?

That'll have the same chance for race conditions as the previous code. If it's convenient to hide the IOMUXC_GPR dependency from our users, we can wrap the entire match in a critical_section.

could this stay in common/sai.rs?

Let's see how it plays out. If it remains in common/sai.rs, I think we'll need to have a configure_sai_mclk_dir implementation for when there is no family cfg selection. I'm not sure what that implementation would do.

@SpinFast
Copy link
Contributor Author

SpinFast commented Jun 7, 2024

I'm still looking to work on this at some point this summer and clean it up, but if anyone wishes to continue the work please don't let me stop you!

@mciantyre
Copy link
Member

I pushed some commits after a deeper review of the SAI clock contributions. If they look questionable, we can drop them.

@Florob
Copy link
Contributor

Florob commented Jun 24, 2024

I have fixups for bitflags based Interrupts/Status as well as a rebase to current main (RTIC v2) on https://github.com/Florob/imxrt-hal/tree/imxrt-sai, if someone wants to apply those here.

@dwbrite
Copy link

dwbrite commented Jun 26, 2024

For a demo sai1 impl on teensy we'll need:

TX_DATA0
    GPIO_B1_01
TX_DATA1 xor RX_DATA3
    GPIO_B0_12
TX_DATA2 xor RX_DATA2
    GPIO_B0_11
TX_DATA3 xor RX_DATA1
    GPIO_B0_10
RX_DATA0
    GPIO_B1_00

// SAI1 Clock and Sync

TX_BCLK / TX_SYNC nc on 4.0 (generally tied to rx counterparts?) GPIO_B1_02 / GPIO_B1_03 on 4.1
RX_BCLK
    GPIO_AD_B1_11
RX_SYNC
    GPIO_AD_B1_10
MCLK
    GPIO_AD_B1_09

@dwbrite
Copy link

dwbrite commented Jun 26, 2024

And, for my own sake, the action items in this thread, since it can be difficult to keep track:

merge interrupt/API fixes+ rebase to main
🔲 address SAI chip portability + bug fixes
🔲 delete SAI_CLOCK_GATES array?
🔲 per-board clock gate collections?
🔲 support teensy4 😉

Is there anything I'm missing?

@mciantyre
Copy link
Member

I think that's the gist!

My imxrt-sai branch has Florob's and my suggestions. If it's OK with @SpinFast, I'm happy to apply the fixups and force push to this branch.

@SpinFast
Copy link
Contributor Author

SpinFast commented Jun 26, 2024

I think that's the gist!

My imxrt-sai branch has Florob's and my suggestions. If it's OK with @SpinFast, I'm happy to apply the fixups and force push to this branch.

Please do! Let me know if there's anything easy I can do to help.

Provides a pll4 reset function with all the options to set the
div_select, num, denom values with asserts ensuring bounds are kept
based on the data sheet. It would be nice to assert these at compile
time if possible and could be a future improvement to ensure the pll
options are valid.

The sai clock gating, selection, and divisor settings are also added
enabling the full clocking of a sai peripheral from the audio pll to the
peripheral input clock.

The board clocks for sai1 are setup to generate an appropriate clock for
a 48KHz 16bit stereo i2s stream by default that a wolfson wm8960 codec
can pick up for playback.
SAI can be modeled like a UART in some regards where there are a Tx/Rx
pair. Unlike a UART though SAI in iMXRT can have multiple data channels
connected to the same peripheral. While each channel has its own fifo
and pin they are still part of the same peripheral instance. Meaning the
clock, status, FIFO watermark, and more are all the shared.

This adds some complications. Use of the Tx and Rx together is also
optional, its quite possible only Tx or Rx in use.

So modeling this API requires some interesting use of generics, but in
effect the Sai struct is a builder that enables building a Sai Tx/Rx
optional pair given pins and builder like options.

The goal in the end is to have static const generated register fields
from perhaps a const fn friendly builder type (SaiConfig).

From there a Tx/Rx pair provide functionality for writing
frames of audio samples per channel, managing interrupt masks, checking
status, and enabling/disabling the transmitter or receiver.

In some instances the Tx/Rx pair are *synchronized* meaning between the
pair one will drive the frame sync clock of the other.

This leads to a small potential race at the moment if a Tx and Rx are
setup as synchronized and the tx/rx is enabled/disabled independent of
the other that I can see. I don't have a great way of solving this
particular setup in a safe way at the moment.
Provides a sample application for playing back audio without DMA
directly writing to the FIFO of the SAI peripheral, which should result
in an I2S signaling to the WM8960 codec on the EVK boards. The codec
should then output a nice square wave on one channel and a sine wave on
the other.

The example in this case uses a PIT timer to print the status and a SAI1
interrupt noting a FIFO request to synthesize samples when needed.
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

SAI/I2S Support
5 participants