HDL verification #24

Closed
gkasprow opened this issue Jul 22, 2020 · 53 comments

Comments

@gkasprow
Member

@sbourdeauducq did you look at the SoC pin assignments? We want to produce it pretty soon.
@filipswit did you run the SI-DDR test after routing?

@filipswit
Collaborator

nope, I didn't.

@sbourdeauducq
Member

@sbourdeauducq did you look at the SoC pin assignments?

No. We are scrambling to get things to work on the ZC706 before the deadline, lots of difficult problems appeared when we tried to run non-trivial kernels.

@sbourdeauducq
Member

Could you publish PDF schematics (we don't have Windows machines or Altium, and the online viewer is a bit unwieldy), and a machine-readable version of the FPGA pin assignments?

@filipswit
Collaborator

@sbourdeauducq
Member

Thx

@sbourdeauducq
Member

What is the purpose of the PL watchdog? The PL is generally quite well-behaved, it's the PS that is often troublesome.

@sbourdeauducq
Member

Table on Sheet 7 says " 1 | 1 => SD Card (not on board)", but now we have an SD card.

@sbourdeauducq
Member

Clock-capable EEM pins (indexed 0) must go to CC pins!
The clock rules on the PL are the same as regular 7-series FPGAs.
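
A check like this can be run against the machine-readable pin assignment. The following is only a minimal sketch: the net names (`EEM0_0_P` and so on) and the pin choices are made up for illustration, and it relies on the 7-series convention that clock-capable pins carry `MRCC` or `SRCC` in their pin function name.

```python
import re

# Hypothetical excerpt of a pin assignment: net name -> pin function name.
# Net names and pin choices are invented for this example.
ASSIGNMENT = {
    "EEM0_0_P": "IO_L12P_T1_MRCC_35",
    "EEM0_0_N": "IO_L12N_T1_MRCC_35",
    "EEM1_0_P": "IO_L3P_T0_DQS_35",    # pair 0 on a pin that is not clock-capable
    "EEM1_0_N": "IO_L3N_T0_DQS_35",
    "EEM1_1_P": "IO_L13P_T2_MRCC_35",  # pair 1 is not required to be on a CC pin
}

# Pair 0 of every EEM connector is the clock-capable one (assumed naming).
CLOCK_NET = re.compile(r"^EEM\d+_0_[PN]$")


def is_clock_capable(pin_function):
    """7-series clock-capable inputs have MRCC or SRCC in the pin function name."""
    return "MRCC" in pin_function or "SRCC" in pin_function


def misplaced_clock_nets(assignment):
    """Return clock-capable EEM nets that do not sit on a CC pin."""
    return sorted(
        net
        for net, pin in assignment.items()
        if CLOCK_NET.match(net) and not is_clock_capable(pin)
    )


print(misplaced_clock_nets(ASSIGNMENT))
```

With the sample data, only the two `EEM1_0` nets are reported.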

@marmeladapk
Member

@filipswit It would be better to tag a commit and make a (preliminary) release and post PDF there. It won't clutter up the repository (which can significantly increase in size due to many binary files) as PDF is an output file not a source. We follow this approach in other Sinara repositories.

@sbourdeauducq
Member

I suggest connecting RGMII to the PS (MIO). We have a driver for it so we might as well use it; and other Zynq code (e.g. some people love embedded Linux) also typically expects the PS Ethernet controller.
What's the idea with Ethernet anyway? Are we going to use primarily the RJ45 jack (with PS MAC) or the SFP (with transceiver and PL MAC)?

@hartytp

hartytp commented Jul 23, 2020

What's the idea with Ethernet anyway? Are we going to use primarily the RJ45 jack (with PS MAC) or the SFP (with transceiver and PL MAC)?

I'd love to have Ethernet on the RJ45 in addition to the standard 4 SFPs.

As an example use case: it's been an ambition of ours for a while to have a feature which lets us stream out diagnostic data via ethernet. e.g. ADC traces to implement scope-like functionality. Would be great to have this on satellites while still maintaining our 3 downstream DRTIO ports.

@sbourdeauducq
Member

I2C can go to the PS so we can autodetect the cards without any gateware programmed (@jordens might like this).
It would be fun to run yosys and prjxray on the PS and have the card build its own gateware :)

Many of the other GPIO signals currently in PL bank 13 (fan, pushbutton, error LED, maybe some of the user LEDs, etc.) can go to the PS as well, but definitely not the WRPLL I2C. The Zynq architecture is PS-centric (for better or worse, IMO mostly the latter), and the card should reflect that.

Note: if we need to free resources, the QSPI NOR flash can go. The SD card works well here (both r+w) and we don't currently plan to use the flash at all.

@gkasprow
Member Author

@filipswit Moreover, the EEMx signals should be in the same bank. Keep this in mind when changing the CC pin assignments.
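
The bank constraint can be checked from the pin assignment too. This is a rough sketch that reads the rule as "all nets of one EEM connector share a bank", which is an assumption about what is meant here. The net names and pins are invented, and the bank is taken from the trailing number of the 7-series pin function name.

```python
import re
from collections import defaultdict

# Hypothetical pin assignment excerpt: net name -> pin function name.
ASSIGNMENT = {
    "EEM0_0_P": "IO_L12P_T1_MRCC_35",
    "EEM0_0_N": "IO_L12N_T1_MRCC_35",
    "EEM0_1_P": "IO_L4P_T0_35",
    "EEM1_0_P": "IO_L13P_T2_MRCC_34",
    "EEM1_1_P": "IO_L2P_T0_13",  # strays into another bank
}


def bank_of(pin_function):
    """The bank number is the last underscore-separated field of the name."""
    return int(pin_function.rsplit("_", 1)[1])


def banks_per_eem(assignment):
    """Map each EEM connector (assumed EEM<n>_ prefix) to the banks it touches."""
    banks = defaultdict(set)
    for net, pin in assignment.items():
        match = re.match(r"(EEM\d+)_", net)
        if match:
            banks[match.group(1)].add(bank_of(pin))
    return dict(banks)


def split_eems(assignment):
    """Return the EEM connectors whose nets are spread over several banks."""
    return sorted(
        eem for eem, banks in banks_per_eem(assignment).items() if len(banks) > 1
    )


print(split_eems(ASSIGNMENT))
```

Running both this and a CC-pin check after every reassignment would catch the case where moving a clock pair to a CC pin pulls it out of the bank used by the rest of the connector.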

@sbourdeauducq
Member

Why do we have an EEPROM (IC8) with a dedicated I2C bus?
Do we need the DS2411R identifier device when we can get I2C EEPROMs with a unique ID burned in instead?

@gkasprow
Member Author

@sbourdeauducq Ethernet is routed to PL for flexibility. The embedded MAC can talk to it via EMIO easily.

@gkasprow
Member Author

I2C goes to PL for flexibility. We had issues with PS I2C reliability and had to use an IP core instead. But that was an issue with early silicon versions.

@sbourdeauducq
Member

Ethernet is routed to PL for flexibility. The embedded MAC can talk to it via EMIO easily.

It's more a compromise than "flexibility": This requires the gateware to be programmed for Ethernet to work, and then we cannot download gateware from the network during boot, for example.
Also, I am worried about timing issues with the gateware routing delays between EMIO and PL I/O pins.

We had issues with PS I2C reliability

PS peripherals are generally garbage indeed, but in the case of I2C I think we can use the MIO in GPIO mode and bit-bang from software. That should work around the idiocy built into the PS I2C controller. If you're worried we could demonstrate that on a devkit.
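
To make the bit-banging idea concrete, here is a minimal sketch of a software I2C master. It is not the zynq-rs code: the `FakeBus` class is a hypothetical stand-in for two MIO pins in open-drain GPIO mode plus a slave that ACKs every byte. Timing delays, clock stretching and reads are left out.

```python
class FakeBus:
    """Simulated open-drain SCL/SDA pair with a slave that ACKs every byte."""

    def __init__(self):
        self.scl = 1
        self.master_sda = 1  # 1 = released (pulled up), 0 = driven low
        self.slave_sda = 1
        self.frame = []      # bits sampled on SCL rising edges
        self.received = []   # complete bytes seen by the slave
        self.events = []     # START / STOP conditions

    def get_sda(self):
        return self.master_sda & self.slave_sda  # wired-AND

    def set_sda(self, level):
        before = self.get_sda()
        self.master_sda = level
        after = self.get_sda()
        if self.scl == 1 and before != after:  # SDA edge while SCL is high
            self.events.append("START" if after == 0 else "STOP")
            self.frame = []

    def set_scl(self, level):
        if level == 1 and self.scl == 0:    # rising edge: SDA is sampled
            self.frame.append(self.get_sda())
        elif level == 0 and self.scl == 1:  # falling edge: slave reacts
            if len(self.frame) == 8:
                self.slave_sda = 0          # slave drives ACK
            elif len(self.frame) == 9:
                self.slave_sda = 1          # slave releases SDA
                bits = "".join(str(b) for b in self.frame[:8])
                self.received.append(int(bits, 2))
                self.frame = []
        self.scl = level


class BitBangI2C:
    """Write-only I2C master driven purely by pin toggling."""

    def __init__(self, bus):
        self.bus = bus

    def start(self):
        self.bus.set_sda(1)
        self.bus.set_scl(1)
        self.bus.set_sda(0)  # SDA falls while SCL is high
        self.bus.set_scl(0)

    def stop(self):
        self.bus.set_sda(0)
        self.bus.set_scl(1)
        self.bus.set_sda(1)  # SDA rises while SCL is high

    def write_byte(self, byte):
        for i in range(7, -1, -1):  # MSB first
            self.bus.set_sda((byte >> i) & 1)
            self.bus.set_scl(1)
            self.bus.set_scl(0)
        self.bus.set_sda(1)         # release SDA so the slave can ACK
        self.bus.set_scl(1)
        acked = self.bus.get_sda() == 0
        self.bus.set_scl(0)
        return acked


bus = FakeBus()
i2c = BitBangI2C(bus)
i2c.start()
acked = i2c.write_byte(0xA0)
i2c.stop()
print(acked, [hex(b) for b in bus.received], bus.events)
```

On real hardware the three pin accessors would wrap the MIO GPIO registers, and each edge would need a delay to respect the bus speed.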

@sbourdeauducq
Member

I'm a little skeptical of the analog hack that involves AC coupling of the fan PWM signal. Has this been tested before?

@sbourdeauducq
Member

Don't we have the same I2C bug with the FTDI chip that caused contention on Kasli v2?

@sbourdeauducq
Member

sbourdeauducq commented Jul 23, 2020

The SRST signal absolutely needs to go on the FTDI chip GPIO directly, with a configuration that is strictly identical to a known-good programming adapter. Controlling it via FTDI I2C, switches and expanders in OpenOCD (or worse, Vivado, which already crashes every 15min when connected to a Zynq device) will be an absolute disaster and result in insane levels of frustration.

For example, SRST control mostly works with the Olimex ARM-USB-TINY-H, but I could not get it to work with the internal JTAG module of the ZC706.

Note that even with SRST control, there are still plenty of PS JTAG bugs and weird hardware states that require a power cycle to get out of.

Even if we do manage to get the FTDI chip and the other I2C devices to behave, SRST-on-I2C would create one more such state where the Zynq chip locks up the I2C bus and control cannot be recovered as the programmer needs to pulse SRST to access the Zynq.

@sbourdeauducq
Member

And, someone should check, but I think that if you do not send a SRST pulse (constant level is not sufficient) then the ARM cores do not appear on the JTAG chain, only the PL does. So this signal is really important.

@sbourdeauducq
Member

SRST also needs to be exposed on J8.

@sbourdeauducq
Member

Nitpick: do we need a "general purpose pushbutton"? Kasli doesn't have one and there's no use case for it AFAIK.

@sbourdeauducq
Member

As an example use case: it's been an ambition of ours for a while to have a feature which lets us stream out diagnostic data via ethernet. e.g. ADC traces to implement scope-like functionality. Would be great to have this on satellites while still maintaining our 3 downstream DRTIO ports.

What about using the DRTIO aux channel for this?

@hartytp

hartytp commented Jul 23, 2020

What about using the DRTIO aux channel for this?

@dnadlinger but I believe the idea was to keep things distributed to avoid bottlenecks of doing everything through the master

@dnadlinger
Member

I recently discussed this with @pmldrmota, as he is looking into this with our summer student Miray. Basically, using the aux channel would be the preferred solution if we can get the "relay" bandwidth on the root of the tree, i.e. the core device, high enough (some dozen Mb/s to Ethernet) without interfering with kernel RPCs too much.

If that turns out to be hard (which it might well be on or1k), putting Ethernet on the satellites would be a "scalable" fallback option.

@filipswit
Collaborator

What is the purpose of the PL watchdog? The PL is generally quite well-behaved, it's the PS that is often troublesome.

I can remove it or connect to PS

Do we need the DS2411R identifier device when we can get I2C EEPROMs with a unique ID burned in instead?

I can remove it

I'm a little skeptical of the analog hack that involves AC coupling of the fan PWM signal. Has this been tested before?

It's a copy from Kasli, so I assume it works.

Don't we have the same I2C bug with the FTDI chip that caused contention on Kasli v2?

What kind of bug?

The SRST signal absolutely needs to go on the FTDI chip GPIO directly,

Ok will swap it from I2C expander to FTDI GPIO.

Nitpick: do we need a "general purpose pushbutton"? Kasli doesn't have one and there's no use case for it AFAIK.

I thought it might be useful. It will be removed.

@sbourdeauducq
Member

sbourdeauducq commented Jul 26, 2020

What is the purpose of the PL watchdog? The PL is generally quite well-behaved, it's the PS that is often troublesome.

I can remove it or connect to PS

There are some PS states which are unrecoverable without a power cycle (and some include JTAG breakage). I don't know if using the POR signal instead of a power cycle would work. Maybe it is worth keeping the watchdog as an option that can pulse POR, with an easy option to disable it (e.g. jumper or solder bridge) so it does not get in the way of early development and/or can be permanently disabled if it causes trouble.
Also, keep in mind that loading a program into the Zynq from JTAG (PS+PL) takes dozens of seconds during which the Zynq chip is frozen, and if you pulse POR during that time then it won't work. Booting it from the SD card is also not instantaneous, and a POR pulse will interrupt the boot process. Any watchdog circuit must be designed accordingly.
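
As a back-of-the-envelope illustration of that constraint, the sketch below lists the phases in which a watchdog would fire a POR because the PS cannot service it. Every number here is an assumption chosen for the example (only "dozens of seconds" for a JTAG load comes from this thread); none of them is taken from the schematics or a datasheet.

```python
# Illustrative watchdog timeout, not the value of any particular chip.
WATCHDOG_TIMEOUT_S = 1.6

# Phases during which the PS cannot kick the watchdog, with assumed durations.
QUIET_PHASES_S = {
    "JTAG load of PS+PL": 30.0,  # "dozens of seconds"
    "SD card boot": 3.0,         # assumed
    "normal operation": 0.1,     # assumed worst-case gap between kicks
}


def phases_that_trip(timeout_s, quiet_phases_s):
    """Return the phases whose silent period exceeds the watchdog timeout."""
    return [
        name for name, duration in quiet_phases_s.items() if duration > timeout_s
    ]


print(phases_that_trip(WATCHDOG_TIMEOUT_S, QUIET_PHASES_S))
```

Any phase that shows up in the result needs either a disable mechanism (jumper, solder bridge, gating signal) or a longer timeout.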

Do we need the DS2411R identifier device when we can get I2C EEPROMs with a unique ID burned in instead?

I can remove it

OK.

I'm a little skeptical of the analog hack that involves AC coupling of the fan PWM signal. Has this been tested before?

It's a copy from Kasli, so I assume it works.

AFAIK nobody tested it.

Don't we have the same I2C bug with the FTDI chip that caused contention on Kasli v2?

What kind of bug?

sinara-hw/Kasli#78

The SRST signal absolutely needs to go on the FTDI chip GPIO directly,

Ok will swap it from I2C expander to FTDI GPIO.

Make sure this matches a layout that OpenOCD and - if possible - Vivado already supports.

@marmeladapk
Member

marmeladapk commented Jul 26, 2020 via email

@filipswit
Collaborator

Schematics and actual pin assignment:

Kasli-SOC.PDF

Kasli-SOC.zip

@sbourdeauducq
Member

Which OpenOCD cable file should I use to set up SRST?

@sbourdeauducq
Member

sbourdeauducq commented Aug 4, 2020

How is the watchdog getting disabled during JTAG programming (which takes more than 700ms)?
The POR pulse does not need to be as long as 1s.
What is the purpose of C75? Glitch filtering? I wonder if slow rising edges on POR_B are good. And glitch filtering may not be necessary. The TRM states: "The PS_POR_B reset pin is held Low until all PS power supplies are at their required voltage levels and PS_CLK is active. It can be asynchronously asserted and is internally synchronized and filtered. The filter prevents High-going glitches from entering the PS while the signal is intended to be held Low. It does not filter Low-going glitches when the signal is intended to be held high. Any Low-going glitch that is detected causes an immediate reset of the device."

If there are too many issues with this watchdog chip and associated circuitry (IC29A, IC29B, C75...), we might as well remove them, simply keeping IC30. Also, remove PB2 (oftentimes, ARTIQ boards are used remotely without physical access to pushbuttons) and connect I2C_exp_POR to MR# of IC30.

I suggest connecting I2C_exp_POR to an FTDI GPIO directly. Otherwise, the Zynq may lock up the I2C bus (all it takes is holding SCL or SDA low), which is a common situation, and then we can't POR it to recover control.
Considering the state of FTDI chips and associated software, this requires careful prior testing:

  • the GPIO is actually working with the FTDI chip configured as intended on this board (JTAG, I2C, UART)
  • toggling the GPIO does not break JTAG, UART, or I2C. FTDI GPIO controllers are extremely buggy and there are lots of surprises like that.
  • using OpenOCD with default cable files does not trigger POR
  • plugging the USB cable with the default (UART) driver attaching does not trigger POR

IC30 is only monitoring 3.3V, and 3.3V comes before VCCDDR according to the table on page 14. Isn't VCCDDR a PS supply?

I still think more signals should be routed to the MIO instead of the PL, in particular the RGMII ones where timing is critical and where we may want to boot from the network including the PL. PL Ethernet, if required, can still be done with SFP and transceiver. Also, have you copied exactly the Ethernet circuit of a devkit? Is the PHY functional immediately after power up without having to fiddle with MDIO?

What is the idea with the "internal" I2C bus?

@filipswit
Collaborator

filipswit commented Aug 4, 2020

When we connect PS_POR to IC30 MR#, we can skip IC29 and simplify it a little.

Is there any difference to which FTDI output I connect PS_POR?

From the UG: "The PS_POR_B input is required to be asserted to GND during the power-on sequence until VCCPINT, VCCPAUX and VCCO_MIO0 have reached minimum operating levels", so P3V3 is the last PS supply to rise.
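
The argument can be written down as a small ordering check. This is only a sketch: the power-up order and the mapping from board rails to Zynq supplies below are hypothetical placeholders, and the real values have to come from the sequencing table in the schematics. Only the three required supply names come from the UG quote.

```python
# Supplies the quoted UG passage names as prerequisites for releasing PS_POR_B.
REQUIRED_BEFORE_POR_RELEASE = {"VCCPINT", "VCCPAUX", "VCCO_MIO0"}

# Hypothetical power-up order (earliest first) and rail-to-supply mapping.
POWER_UP_ORDER = ["P1V0", "P1V8", "P3V3", "VCCDDR"]
RAIL_FEEDS = {
    "P1V0": {"VCCPINT"},
    "P1V8": {"VCCPAUX"},
    "P3V3": {"VCCO_MIO0"},
    "VCCDDR": {"VCCO_DDR"},
}


def last_required_rail(order, feeds, required):
    """Return the last rail in the sequence that feeds a required supply."""
    last = None
    for rail in order:
        if feeds.get(rail, set()) & required:
            last = rail
    return last


def supervisor_covers_por(monitored_rail, order, feeds, required):
    """True if the monitored rail is the last required one to come up."""
    return last_required_rail(order, feeds, required) == monitored_rail


print(supervisor_covers_por(
    "P3V3", POWER_UP_ORDER, RAIL_FEEDS, REQUIRED_BEFORE_POR_RELEASE
))
```

Under these assumptions a supervisor on P3V3 is sufficient even though VCCDDR rises later, because the quoted requirement does not list the DDR supply. The check only covers ordering, not whether the earlier rails are still within tolerance.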

Actually, I can connect the temperature sensors to SHARED_I2C. IC8 can be removed as well; I just hadn't noticed that there is an IC5 EEPROM on the MUX sheet.

Ethernet is copied from a verified design: https://github.com/BerkeleyLab/Marble

@sbourdeauducq
Member

Is there any difference to which FTDI output I connect PS_POR?

As long as the four tests above are OK, no.

@sbourdeauducq
Member

PS peripherals are generally garbage indeed, but in the case of I2C I think we can use the MIO in GPIO mode and bit-bang from software.

It works:
https://git.m-labs.hk/M-Labs/zynq-rs/pulls/58

Please connect I2C (and other things) to the PS. As long as there is no case for gateware acceleration/control, things should go to the PS by default.

@filipswit
Collaborator

I made the above changes. @sbourdeauducq I2C, interrupts, and the switch reset are routed to the PS now.
Below are the current schematics and pin assignment. If everything is fine, I will put the final layout in the repo over the weekend.
Kasli-SOC.zip

Kasli-SOC.PDF

@sbourdeauducq
Member

Which OpenOCD cable file should I use to set up SRST?

What about this question?

Considering the state of FTDI chips and associated software, this requires careful prior testing:

Have you done these tests?

RGMII should still go to the PS.

I also recommend connecting at least LED_ERR to the PS so that errors that occur before PL startup can be reported on the front panel as well.

Should there be pullups on I2C_SW_RESET so that the I2C switches are automatically reset when the Zynq is reset (please check if a simple pullup resistor would work: timing, state of MIO during reset)? Then the other I2C_SW_RESET line to the FTDI chip can perhaps be removed, since we could POR the Zynq to recover from a stuck I2C bus situation.

@filipswit
Collaborator

Which OpenOCD cable file should I use to set up SRST?

Sorry, I've never used OpenOCD. I can't figure out what 'cable files' are.

I connected it according to this connection list:
− RXD(5) - TDI
− TXD(1) - TCK
− RTS(3) - TDO
− CTS(11) - TMS
− DTR(2) - TRST
− DCD(10) - SRST

from http://openocd.org/doc/pdf/openocd.pdf

Considering the state of FTDI chips and associated software, this requires careful prior testing:

Have you done these tests?

I don't have the means to run these tests.

Should there be pullups on I2C_SW_RESET so that the I2C switches are automatically reset when the Zynq is reset (please check if a simple pullup resistor would work: timing, state of MIO during reset)? Then the other I2C_SW_RESET line to the FTDI chip can perhaps be removed, since we could POR the Zynq to recover from a stuck I2C bus situation.

I will add PU/PD footprints with the PU mounted by default, and leave the I2C_SW_RESET line to the FTDI as is; if it turns out not to be needed, it can be marked as not mounted in the future.

Because we connect Ethernet to the PS, it will cause some more layout rework with LVDS and supplies, and probably something more. @sbourdeauducq so now I can connect some more stuff to the PS; could you please point out exactly which signals you would like to be connected to the PS instead of the PL?

@sbourdeauducq
Member

Sorry, I've never used OpenOCD. I can't figure out what 'cable files' are.

I don't have the means to run these tests.

Someone ought to figure this out, study it in detail, and run the tests carefully before board fabrication. Both the FTDI chip and the Zynq JTAG/reset interface are obnoxious crap, which, combined together, really have a lot of potential for frustrating problems if we are not very careful.
Vivado support is also desirable, if possible at all without breaking OpenOCD. For example the built-in JTAG adapter of ZC706 works with Vivado as far as Zynq stuff works with Vivado, but is very unreliable with OpenOCD due to obscure SRST-related problems. It may be a good idea to have someone spend time on understanding why this is the case. I just wanted my code to run so I used an external Olimex cable instead (incompatible with Vivado), and "only" spent two days on this problem.
I do not currently have more time for these onerous tasks, but not doing it right now is just kicking the can down the road and making the problem worse since it will then involve PCB rework and/or fabrication of adapters to connect external JTAG cables to the broken internal programming system. If you want to go that route, please make the JTAG, POR, and SRST signals easy to disconnect and rewire to an external adapter (for JTAG and SRST, ARM-USB-TINY-H works mostly correctly on ZC706).

Because we connect Ethernet to the PS, it will cause some more layout rework with LVDS and supplies, and probably something more.

If you want some simplification you can drop both SPI flash chips - we don't plan to use them, the SD card does the job.

@sbourdeauducq so now I can connect some more stuff to the PS; could you please point out exactly which signals you would like to be connected to the PS instead of the PL?

  • I2C (except WRPLL): definitely PS
  • GPIO_INTx, I2C_SW_RESET: same as above
  • RGMII, MDIO, PHY_RSTn: definitely PS
  • Fan PWM: not sure about this one since I don't know if the PS PWM controller works correctly (if I did I would say move to PS), and wasting CPU cycles on toggling the fan GPIO sounds annoying. Maybe just leave it to PL.
  • DDBUS: good idea to leave on PL since we can use it e.g. with Microscope, but maybe rename it to UART_PL (and the other one UART_PS).
  • LEDs: ERROR LED should go to PS as I said, for LEDs in general maybe a mix of PS/PL is good since LEDs are sometimes useful to debug/test simple gateware.
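
The list above could be turned into an automated check on the pin assignment file, roughly like this. It is a sketch under assumptions: the net name patterns and sample pins are hypothetical, and it distinguishes PS from PL by the Zynq-7000 convention that MIO pins are named `PS_MIO<n>_<bank>`.

```python
import fnmatch

# Policy transcribed from the list above; the first matching pattern wins.
# Net name patterns are hypothetical and must be adapted to the real netlist.
POLICY = [
    ("WRPLL_*", "PL"),    # WRPLL I2C stays on the PL
    ("I2C_*", "PS"),      # includes I2C_SW_RESET
    ("GPIO_INT*", "PS"),
    ("RGMII_*", "PS"),
    ("MDIO*", "PS"),
    ("PHY_RSTn", "PS"),
    ("FAN_PWM", "PL"),    # tentative in the list above
    ("DDBUS_*", "PL"),
    ("LED_ERR", "PS"),
]


def side_of(pin_function):
    """Classify a pin as PS (MIO) or PL (fabric I/O) from its function name."""
    return "PS" if pin_function.startswith("PS_MIO") else "PL"


def expected_side(net):
    """Return the side the policy asks for, or None if it has no opinion."""
    for pattern, side in POLICY:
        if fnmatch.fnmatchcase(net, pattern):
            return side
    return None


def violations(assignment):
    """Return nets that sit on the wrong side according to the policy."""
    return sorted(
        net
        for net, pin in assignment.items()
        if expected_side(net) not in (None, side_of(pin))
    )


# Invented sample assignment: net name -> pin function name.
ASSIGNMENT = {
    "I2C_SDA": "PS_MIO50_501",
    "I2C_SW_RESET": "IO_L5P_T0_13",
    "RGMII_TXC": "IO_L11P_T1_SRCC_13",
    "WRPLL_SDA": "IO_L7N_T1_13",
    "LED_ERR": "PS_MIO37_501",
    "LED_USER0": "IO_L9P_T1_DQS_13",
}
print(violations(ASSIGNMENT))
```

User LEDs have no rule on purpose, since a mix of PS and PL is suggested for them.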

Why are there still two general purpose I2C buses (SDA_int/SCL_int and I2C_2V5_SW_SDA/I2C_2V5_SW_SCL)?

@sbourdeauducq
Member

Are you leaving the 1-wire ID thing? It's still on the latest schematics.

@gkasprow
Member Author

The fewer different interfaces, the better. ID chips with a MAC address are more useful.

@filipswit
Collaborator

@sbourdeauducq are you sure you are looking at the latest files?

@sbourdeauducq
Member

No, my mistake. But please remove https://github.com/sinara-hw/Kasli-SOC/tree/master/PDF

@filipswit
Collaborator

I will do it with the next repo update.

@filipswit
Collaborator

The SPI flash was removed due to routing problems.
I added 2.54 mm jumpers for the POR and SRST signals. I hope that this will be fine for debugging.
image

Kasli-SOC.PDF

Kasli-SOC_20200920.zip

@sbourdeauducq
Member

For POR I would put the jumper before the supervisory IC. The manual is clear that POR should be asserted until supplies are stable, so removing the jumper would otherwise always cause problems.

The issue with these signals really comes from:

  1. crappy FTDI chips
  2. crappy drivers
  3. crappy JTAG software
  4. stupid attitude from Xilinx

@sbourdeauducq
Member

Anyway, this solution forgoes Vivado hardware manager support. Only dissecting Xilinx/Digilent adapters for supported POR/SRST configurations, putting a Xilinx compatible JTAG header, or putting one of those silly Digilent FTDI modules would enable it.
I hate Vivado and the current solution with probably-working FTDI GPIOs and jumpers for easy debugging/workarounds would be somewhat okay with me personally, but it will cause problems if you/others want to develop with the official Xilinx toolflow.

@sbourdeauducq
Member

If I understand correctly, the IC42 inverter would fix the equivalent of sinara-hw/Kasli#78.
I'm still not a fan of multimastering I2C and would prefer an "I2C proxy firmware" that talks over UART (that's easily developed, and loaded by JTAG into the Zynq OCM). Do we keep the FTDI-I2C at all (@jordens)?

@jordens
Member

jordens commented Sep 30, 2020

Yes

@hartytp

hartytp commented Sep 30, 2020

@sbourdeauducq can we move this to its own issue to make following the conversation easier?

Do we keep the FTDI-I2C at all (@jordens)?

The USB-I2C interface provides useful functionality which I'd be sad to lose. What would you replace it with? Edit: okay, re-reading, you want to proxy it. Well, in that case, if you'd provide the same level of functionality it's not something I feel overly strongly about. The proxy solution feels like more work though given the underlying issue now seems to be satisfactorily resolved on Kasli...

@sbourdeauducq
Member

As I mentioned: replace it with an I2C proxy firmware.
The hardware does not have problems but the ARTIQ I2C firmware implementation does not support multimastering, and likely other implementations such as libftdi do not support it well either. Multimastering is complex and fragile and implementing it correctly is harder than loading proxy firmware/gateware.
You will hit bugs if you try to access the I2C bus while ARTIQ is running and attempting to use it at the same time.
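
A toy model shows the failure mode. The sketch below is a simplification, not any real firmware: it clocks two masters in lockstep on one wired-AND SDA line and ignores clock synchronization and bus-busy detection, which a real multi-master implementation also needs. The `shared_transfer` function and its arguments are made up for this example.

```python
def shared_transfer(intended_bytes, arbitrate):
    """Return the byte that appears on SDA when several masters send at once.

    intended_bytes: one byte per master, all clocked out MSB first in lockstep.
    arbitrate: if True, a master that releases SDA (sends 1) but reads back 0
               knows it lost arbitration and stops driving the bus.
    """
    active = [True] * len(intended_bytes)
    bus_byte = 0
    for i in range(7, -1, -1):
        # A master that has backed off leaves SDA released (reads as 1).
        driven = [
            (byte >> i) & 1 if is_active else 1
            for byte, is_active in zip(intended_bytes, active)
        ]
        line = min(driven)  # open-drain bus: any 0 wins
        bus_byte = (bus_byte << 1) | line
        if arbitrate:
            active = [
                is_active and bit == line
                for is_active, bit in zip(active, driven)
            ]
    return bus_byte


# Two masters address different devices at the same time.
print(hex(shared_transfer([0xA0, 0x90], arbitrate=False)))
print(hex(shared_transfer([0xA0, 0x90], arbitrate=True)))
```

Without arbitration the bus carries 0x80, which neither master intended. With arbitration the second master's 0x90 gets through intact, but the first master then has to notice the loss and retry, and that is the part that is easy to get wrong.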

@sbourdeauducq
Member

Anyway as long as the hardware doesn't have problems, I don't really care about this issue, and we can always DNP the related components later.

@sbourdeauducq
Member

We have firmware now:
https://nixbld.m-labs.hk/build/129246
https://git.m-labs.hk/M-Labs/artiq-zynq/src/branch/master/demo.json
(Still a few things missing like Si5324 programming but that shouldn't take long)
