I2C SDA 0.8V when pulled low by Si5324 #78

Closed
hartytp opened this issue Jun 30, 2020 · 66 comments

Comments

@hartytp

hartytp commented Jun 30, 2020

https://chat.m-labs.hk/m-labs/pl/9anbfbwbw3yr8j7xctdsx6isxr

@hartytp
Author

hartytp commented Jun 30, 2020

@gkasprow @marmeladapk any ideas?

@hartytp
Author

hartytp commented Jun 30, 2020

(This currently seems to stop some boards booting with ARTIQ).

@hartytp
Author

hartytp commented Jun 30, 2020

The Si5324 datasheet specifies a maximum low-level output voltage of 0.4 V at 3V3 (sinking 3 mA).

image

@hartytp
Author

hartytp commented Jul 1, 2020

@gkasprow any thoughts about this? I can send you a board showing these symptoms if that will help you debug?

@marmeladapk
Member

@hartytp I'll take a look at it next week.

@hartytp
Author

hartytp commented Jul 2, 2020

Thanks @marmeladapk !

I have one more Kasli v2.0 at the moment. It also displays this issue. Do you need me to post it to you, or do you have other v2.0 with this problem?

It's possible that there has been some hardware damage to the two boards I have, but it seems unlikely since they both displayed this straight out of the box and I've never seen it before...

@marmeladapk
Member

marmeladapk commented Jul 2, 2020 via email

@dnadlinger
Member

I have one more Kasli v2.0 at the moment.

sb0 now has two of our boards, so I'm not sure how many we have left. (Thinking about it, perhaps you meant to ship only one – apologies.)

@gkasprow
Member

gkasprow commented Jul 2, 2020

@hartytp It looks like the pullup current is 6mA or more.
If you enable all I2C ports, the pullup currents add up.
With 16x10k we get 3.3/625R=5.3mA
We also have 2k2 which adds another 1.5mA.
So, it looks like all I2C ports are enabled, which does not make much sense because the I2C addresses overlap.
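
The arithmetic above can be checked with a short sketch. It assumes a 3.3 V pull-up rail, sixteen 10k pull-ups (one per switch output) and a single 2k2 pull-up, as quoted in the comment; the function and variable names are illustrative only.

```python
def pullup_current_ma(v_rail, resistors_ohm):
    """Total current in mA that a device must sink to hold the line at 0 V."""
    return sum(v_rail / r for r in resistors_ohm) * 1e3

V_RAIL = 3.3                      # assumed pull-up rail voltage in volts
switch_pullups = [10e3] * 16      # all 16 switch outputs enabled at once
bus_pullup = [2.2e3]              # the additional 2k2 pull-up

i_switch = pullup_current_ma(V_RAIL, switch_pullups)
i_bus = pullup_current_ma(V_RAIL, bus_pullup)
i_total = i_switch + i_bus

print(f"{i_switch:.1f} mA + {i_bus:.1f} mA = {i_total:.1f} mA")
```

With every port enabled, the total comes out well above the 3 mA sink current at which the Si5324 low level is specified.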

@DonaldKellett

DonaldKellett commented Jul 3, 2020

Following m-labs/artiq#1480 , the voltage at the SDA pin in the shared I2C bus remains at 0.85V, which causes an "Si5324 failed to ack register" error following a command to write to address 0x68. Please find attached a sigrok trace trace.txt related to this issue. To reproduce the trace, connect wires D0, D1 and GND of your logic analyzer to SDA, SCL (shared I2C bus) and GND respectively on the Kasli 2.0 board.

@gkasprow
Member

gkasprow commented Jul 3, 2020

@DonaldKellett are you sure that multiple ports are NOT enabled on the I2C switches?

@hartytp
Author

hartytp commented Jul 3, 2020

@gkasprow how would that cause this? The only pull-up connected to shared SDA is R189 isn't it?

@hartytp
Author

hartytp commented Jul 3, 2020

Never mind, turns out I didn't remotely understand how the I2C switches work...

image

I take your point: if you enable lots of channels at once then you get a much stronger pull-up, which would completely explain why this happens (and why it didn't happen with the CTI tests)...

@gkasprow
Member

gkasprow commented Jul 3, 2020

@hartytp there are two types of I2C switches. The first type lets you choose 1 of n outputs (an I2C multiplexer): you simply write the output number to the register and that's it. The ones we use are the second type, true I2C switches rather than muxes, so several outputs can be enabled at the same time.

@gkasprow
Member

gkasprow commented Jul 3, 2020

To avoid such issues in the future, we can increase the pull-up resistances at the switch outputs, say 10-fold.

@sbourdeauducq
Member

@gkasprow As you can see from the I2C logic analyzer trace (open it with sigrok), just before the Si5324 failure, the firmware writes the 0x00 0x08 control bytes to the two respective switches (note that the LA registers the 0.8V as 0). This is supposed to select only one channel - correct? In m-labs/artiq#1480 we tried repeating the control byte, but no effect.
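
As a rough illustration of how those control bytes would be interpreted, the sketch below assumes the switches are TCA9548A-type parts (the part named later in this thread), whose control register is a bitmask in which bit n enables channel n. Which byte goes to which switch is an assumption here, and the helper name is hypothetical.

```python
def enabled_channels(control_byte):
    """Return the channel numbers whose enable bit is set (bit n -> channel n)."""
    return [n for n in range(8) if control_byte & (1 << n)]

# Control bytes seen in the trace; the assignment to switches is assumed
for name, byte in (("first switch", 0x00), ("second switch", 0x08)):
    print(f"{name}: 0x{byte:02x} -> channels {enabled_channels(byte)}")
```

Under that assumption, 0x00 disables every channel on one switch and 0x08 enables only channel 3 on the other, whereas a value such as 0xFF would enable all eight at once.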

@hartytp
Author

hartytp commented Jul 4, 2020

@gkasprow if your hypothesis is that the switch is selecting multiple channels at once, can we quickly test that by just shorting a few of the other outputs to ground and seeing if that affects the output? If it doesn’t then that can’t be the issue...

@gkasprow
Member

gkasprow commented Jul 5, 2020

The question is: what is the state of the registers before the firmware writes the 0x00 and 0x08?

@DonaldKellett

I've just looked into the I2C_SW test points on my Kasli 2.0 board and the voltage readings of the pins are as follows:

  • SDA: approx. 1.15V
  • SCL: approx. 2.45V

Please find attached the sigrok trace and annotations for these test points. As far as I can tell, this trace is more or less the same as the previous one, but I managed to get some interesting annotations from PulseView by selecting "24xx EEPROM" as the stack decoder on top of the I2C decoder.

To reproduce the trace, connect D0, D1 and GND of the logic analyzer to SDA, SCL (in I2C_SW) and GND of the Kasli 2.0 board respectively.

@gkasprow
Member

gkasprow commented Jul 6, 2020

@DonaldKellett this still does not answer the question I asked. Please check whether the other ports are enabled! Just probe the switch outputs while trying to talk to the Silabs chip.

@DonaldKellett

DonaldKellett commented Jul 7, 2020

I looked into this again. From my understanding of the above discussion, I need to probe the SCx/SDx output pins on my board to determine whether more than one of them is active at any given time. A quick look at the Kasli schematics indicates that these pins are located on IC14, a tiny IC on the back of my board, which makes it difficult to solder wires directly onto it. I then looked at the PCB layout on page 20 of the Kasli schematics for suitable soldering points away from the IC, but the PDF does not display the PCB layout very well and I couldn't open the PCB_Kasli.PCBDOC file online in the Altium 365 viewer.

@gkasprow do you happen to know what is the best place to solder on another port on this pcb? Update: I think I figured it out.

@marmeladapk
Member

@DonaldKellett You could probe I2C signals on EEM connectors.

@marmeladapk
Member

marmeladapk commented Jul 8, 2020

I probed Kasli v2.0 and compared it with v1.1. Measurements:

v2.0, measured on shared bus (so closest to Si5324):
tek00020

Low level from Si5324 during reads is around 830 mV (I measured it with scope).


v2.0, measured on 3V3_SW bus (between switches and voltage translator):
tek00022

Low level from Si5324 during reads is around 1.1 V.


v2.0, measured on FPGA I2C bus (between FPGA and voltage translator):
tek00023

Low level from Si5324 during reads is around 1.1 V.


v1.1, measured on Si5324 bus:
tek00025

Low level from Si5324 during reads is around 0 V.


v1.1, measured on 3V3_SW bus (between switches and voltage translator):
tek00026

Low level from Si5324 during reads is around 530 mV.


v1.1, measured on FPGA I2C bus (between FPGA and voltage translator):
tek00027

Low level from Si5324 during reads is around 0 V.

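| Board | Si5324 bus | Between switches and voltage translator | FPGA bus |
|-------|------------|------------------------------------------|----------|
| v2.0  | 830 mV     | 1.1 V                                    | 1.1 V    |
| v1.1  | 0 V        | 530 mV                                   | 0 V      |
| v1.0  | 120 mV     | 140 mV                                   | 0 V      |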

The Artix datasheet specifies a maximum low-level input voltage of 0.7 V for 2V5 CMOS, so it's only a coincidence that it works on my boards.
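
A minimal sketch of that comparison, using the FPGA-bus low levels from the measurements above and the 0.7 V threshold quoted from the datasheet; the names are illustrative only.

```python
V_IL_MAX = 0.7  # V, low-level threshold quoted above for 2V5 CMOS

# Low levels in volts measured on the FPGA I2C bus, from the summary above
fpga_bus_low = {"v2.0": 1.1, "v1.1": 0.0, "v1.0": 0.0}

def is_valid_low(voltage):
    """True if the level is guaranteed to be read as a logic low."""
    return voltage <= V_IL_MAX

for board, voltage in fpga_bus_low.items():
    verdict = "OK" if is_valid_low(voltage) else "above the guaranteed low threshold"
    print(f"{board}: {voltage} V -> {verdict}")
```

Only the v2.0 reading falls outside the guaranteed range, which matches the observation that some v2.0 boards work and others do not.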

Four things changed between v1.1 and v2.0:

  1. voltage translator was changed from bus repeater with voltage translation (TCA9517) to voltage translator (PCA9306)
  2. pullup resistors on FPGA bus and Si5324 bus were changed from 2k2 to 10k
  3. Si5324 now shares bus with 2 GPIO extenders and SFP3
  4. Si5324 doesn't have TCA9517 bus repeater on its bus now

See #46 for rationale of those changes.

The initial 830 mV level surprised me. I know that the switches are set to enable only this bus, so the Si5324 has to drive 10k || 2k2 || 10k || 10k to ground, which is around 1k3. However, in v1.0 it had to drive 2k2 || 2k2 to ground, which is a slightly stronger pull-up. Either way, a 3 mA sink should be enough.
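
The parallel-resistance figures can be reproduced with a short sketch. It assumes all pull-ups go to a 3.3 V rail and that the line is pulled all the way to 0 V, which slightly overestimates the current; the helper name is illustrative.

```python
def parallel(*resistors_ohm):
    """Equivalent resistance of resistors connected in parallel."""
    return 1.0 / sum(1.0 / r for r in resistors_ohm)

V_RAIL = 3.3  # assumed pull-up rail voltage in volts

r_v20 = parallel(10e3, 2.2e3, 10e3, 10e3)   # v2.0 pull-up network
r_v10 = parallel(2.2e3, 2.2e3)              # v1.0 pull-up network

i_v20 = V_RAIL / r_v20 * 1e3                # sink current in mA
i_v10 = V_RAIL / r_v10 * 1e3

print(f"v2.0: {r_v20:.0f} ohm, {i_v20:.2f} mA")
print(f"v1.0: {r_v10:.0f} ohm, {i_v10:.2f} mA")
```

Both networks need roughly 3 mA or less, so the pull-ups listed here do not by themselves explain an 830 mV low level.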

In v1.1 Si5324 was "shielded" from rest of the I2C by a repeater. To check if 4. changes anything I'll measure v1.0 later this week, but AFAIR there were no issues with Si5324 there.

Sharing bus with other devices probably doesn't matter. After measuring v1.0 I'll check if changing resistors between switches and v. translator from 2k2 to 10k will help.

@jordens
Member

jordens commented Jul 8, 2020

Argh.

@hartytp
Author

hartytp commented Jul 8, 2020

@marmeladapk One thing I don't understand from your measurements is that it looks like there is a 270 mV voltage drop across the TCA9548ARGER (830 mV low level on SHARED_SDA and 1V1 on I2C_3V3_SW_SDA). The max resistance is specified as being 30 Ω (see below). Unless I'm missing something, that would suggest there is 9 mA flowing through it, which would suggest that the pull-up is really only a couple of hundred ohms, wouldn't it, or am I misunderstanding things?

image
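
The estimate in this comment can be written out as a small sketch. It uses the measured low levels and the 30 Ω maximum on-resistance quoted above, and assumes a 3.3 V pull-up rail; variable names are illustrative.

```python
R_ON_MAX = 30.0        # ohm, maximum switch on-resistance quoted above
V_SHARED_LOW = 0.83    # V, low level on SHARED_SDA
V_SW_LOW = 1.1         # V, low level on I2C_3V3_SW_SDA
V_RAIL = 3.3           # V, assumed pull-up rail

# Current through the switch implied by the voltage drop across it
i_switch = (V_SW_LOW - V_SHARED_LOW) / R_ON_MAX

# Effective pull-up that would source that current on the upstream side
r_pullup = (V_RAIL - V_SW_LOW) / i_switch

print(f"current through switch: {i_switch * 1e3:.1f} mA")
print(f"implied pull-up: {r_pullup:.0f} ohm")
```

Because 30 Ω is the maximum on-resistance, the real current would be at least this large and the implied pull-up at most this value.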

@jordens
Member

jordens commented Jul 8, 2020

And the other strange thing is that the FPGA drives below 0.1 V on v1.1 but only to about 0.3 V on v2.0.

@marmeladapk
Member

v1.0, Si5324 bus:
obraz


v1.0, between switches and repeater:
obraz


v1.0, FPGA bus:

obraz

So it seems that lack of repeater (4.) directly before Si5324 is not a problem. Perhaps having two PCA9306 translating voltage to the same bus is problematic? (USB)

I'll continue on Friday.

@gkasprow
Member

gkasprow commented Jul 8, 2020

@marmeladapk did you check if the switches have multiple outputs enabled?

@hartytp
Author

hartytp commented Jul 9, 2020

@marmeladapk did you check if the switches have multiple outputs enabled?

@gkasprow I don't think that can be the issue here (although, I agree it's worth sanity checking).

Among other things, my calculation above suggests that 9mA must be flowing through the I2C switch (and, that's based on worst-case assumptions about switch resistance). With 10k pull-ups, that would require 27 channels, which is more than we have....
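
For reference, the channel count follows from dividing the required current by the current of one 10k pull-up. The sketch assumes a 3.3 V rail and a line pulled fully to ground; names are illustrative.

```python
V_RAIL = 3.3          # V, assumed pull-up rail
R_PULLUP = 10e3       # ohm, pull-up on each switch output
I_REQUIRED = 9e-3     # A, current estimated from the drop across the switch
CHANNELS_AVAILABLE = 16

i_per_channel = V_RAIL / R_PULLUP            # current contributed by one channel
channels_needed = I_REQUIRED / i_per_channel

print(f"{i_per_channel * 1e3:.2f} mA per channel, "
      f"{channels_needed:.0f} channels needed, {CHANNELS_AVAILABLE} available")
```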

@gkasprow
Member

gkasprow commented Jul 9, 2020

This could be an assembly error: somebody fitted resistors with a 10x or 100x lower value.

@hartytp
Author

hartytp commented Jul 10, 2020

Insert inverter on enable pin, so that by default IC22 is disabled (preferred by me)

I don't have the layout in front of me, but is that something we can easily hack into the existing boards with a scalpel and some glue?

@jordens
Member

jordens commented Jul 10, 2020

We're already programming the FTDI EEPROM. I bet there is an option to solve this properly.
https://github.com/quartiq/kasli-i2c/blob/master/kasli-ft4232h.conf.in
Try adding suspend_pull_downs=true.

@jordens
Member

jordens commented Jul 10, 2020

I don't have the layout in front of me, but is that something we can easily hack into the existing boards with a scalpel and some glue?

Why not just look into programming the EEPROM properly?

@hartytp
Author

hartytp commented Jul 10, 2020

Why not just look into programming the EEPROM properly?

If that works reliably then it's fine by me.

@marmeladapk
Member

marmeladapk commented Jul 10, 2020

@jordens It may not work if you're connected to the terminal (so the FTDI is not in USB suspend mode). But I'll check.

@sbourdeauducq
Member

I still think the enable pin should be inverted in v2.1 to make the hardware friendlier.

@jordens
Member

jordens commented Jul 10, 2020

@jordens It may not work if you're connected to the terminal (so the FTDI is not in USB suspend mode).

I meant look for the actual option that sets the state of the FTDI interfaces that are not being used. Note that the four ports are four different and independent USB interfaces. Connecting to one doesn't mean much for the others.

@hartytp
Author

hartytp commented Jul 10, 2020

@jordens I hadn't seen https://github.com/quartiq/kasli-i2c/blob/master/kasli-ft4232h.conf.in before (I've been using ftprog). That's a nice util.

@jordens
Member

jordens commented Jul 10, 2020

If that works reliably then it's fine by me.

I can't see how properly setting up the EEPROM could be nearly as unreliable as "a scalpel and some glue".

@jordens
Member

jordens commented Jul 10, 2020

I still think the enable pin should be inverted in v2.1 to make the hardware friendlier.

Because of the implied incompatibility this is a bad and shortsighted idea.
Why not just configure it properly?

@sbourdeauducq
Member

sbourdeauducq commented Jul 10, 2020

Removing IC22 is easier than programming FTDI chips, especially since FTDI chips are prone to all sorts of bugs (example: https://twitter.com/marcan42/status/695292366639378433).

@sbourdeauducq
Member

Why not just configure it properly?

Why not just make hardware that is well-behaved by default, especially since the contention may cause permanent damage?

@sbourdeauducq
Member

Can't the incompatibility issue be resolved with ~3 lines of code that look at the hardware version and invert the pin?

@hartytp
Author

hartytp commented Jul 10, 2020

Removing IC22 is easier than programming FTDI chips

Removing IC22 would break all the USB I2C tooling which would be a big regression IMHO

@hartytp
Author

hartytp commented Jul 10, 2020

Another option is to just add a couple of current limiting resistors between the FTDI chip and IC22. That way we're guaranteed that programming errors can't cause hardware damage (which feels like something we should aim for) at close to 0 extra cost/complexity...

@marmeladapk
Member

@jordens

suspend_pull_downs doesn't help, regardless of whether the console is open.

The EEPROM contents have no effect on the selected mode with the exception of selecting the TXDEN for RS485 mode when asynchronous serial interface has been selected in software. If the device is reset, then the 4 channels must be reconfigured into the required mode.

So programming FTDI cannot fix it by default unless there are other options to set in EEPROM (there aren't to my knowledge).

I don't have the layout in front of me, but is that something we can easily hack into the existing boards with a scalpel and some glue?

@hartytp It's possible but not easy. If you want to add an inverter now, an easier fix would be to lift pin 8 of IC22 (the EN pin) and add an inverter (a transistor) on top of it, plus some wires to a pull-down resistor nearby.

I still think the enable pin should be inverted in v2.1 to make the hardware friendlier.

I'm also for it. It will be a proper fix (and currently seems the only possible permanent one).

@marmeladapk
Member

Not sure if this is related to the original problem, but @hartytp's board that @DonaldKellett was using has developed what looks like a hardware failure with I2C_SW SCL and SDA now permanently stuck low.

@sbourdeauducq This looks like a hardware failure. Probably ESD. Let's hope that it's the voltage translator that's fried, not FPGA inputs.

@jordens
Member

jordens commented Jul 10, 2020

Yes. There is no EEPROM fix. I went through the entire thing, including the not-yet exposed bits of libftdi. That pin is DTR and is an active low output. My proposal is the following:

  • Disconnect BDBUS4, pin 26 from its via
  • Connect the via to BDBUS6, pin 28
  • In I2C software always drive both BDBUS4 and 6.

DCD is an input with a weak enough pull-up. This is forwards and backwards compatible, can be done in rework, doesn't need new chips, doesn't change the logic, and needs no version detection and special treatment in software.

@hartytp
Author

hartytp commented Jul 10, 2020

Any objections to pushing out a v2.1 release with minor fixes soon?

@jordens
Member

jordens commented Jul 10, 2020

That seems much too early given that a lot of the board hasn't really been tested thoroughly yet. Little data/solutions on the fan, WR clock recovery, 4 SFP, SMPS noise, panel mechanics, just to name a few.

@hartytp hartytp mentioned this issue Jul 10, 2020
Closed
@marmeladapk
Member

I'll check @jordens' proposal the next time I'm in our lab.

@jordens would you have a moment to check if I2C from FTDI works as expected? Or maybe you have already checked it? (even without the fix for current problem)

@jordens
Member

jordens commented Jul 14, 2020

If you fix the fundamentally broken ARTIQ I2C implementation it works fine. Don't ever actively drive SCL/SDA high (unless you want to do some risky high speed tricks), especially not in a multi-master topology. That may also be the root of a lot of other problems (including what prompted #46 and the issues with Banker and Fastino I2C).
Unfortunately, the switch reset polarity is now inverted, which requires special treatment.
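
The open-drain rule described above can be illustrated with a small simulation. This is a hypothetical model, not the ARTIQ or kasli-i2c code: each device either pulls the shared line low or releases it, and the pull-up resistor provides the high level.

```python
class OpenDrainPin:
    """Simulated open-drain pin on a shared, pulled-up line (hypothetical model)."""

    def __init__(self, line):
        self.line = line          # list of all pins attached to the same wire
        self.pulling_low = False  # released (high-Z) by default
        line.append(self)

    def write(self, bit):
        # 0: drive the line low. 1: release it (high-Z); never drive it high.
        self.pulling_low = (bit == 0)

    def read(self):
        # Wired-AND: the line is low if any attached device pulls it low.
        return 0 if any(pin.pulling_low for pin in self.line) else 1

sda = []
master = OpenDrainPin(sda)
slave = OpenDrainPin(sda)

master.write(1)   # master releases SDA, for example while waiting for an ACK
slave.write(0)    # slave pulls SDA low
print("SDA while slave pulls low:", master.read())

slave.write(0 if False else 1)   # slave releases SDA again
print("SDA after release:", master.read())
```

Because no device ever drives the line high, two devices can never fight each other: the worst case is that both pull low at the same time.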

@jordens
Member

jordens commented Jul 14, 2020

m-labs/artiq#1484

@marmeladapk
Member

marmeladapk commented Jul 24, 2020

I checked that @jordens' solution works fine (though I didn't cut the trace yet, I made a non-permanent fix).

ARTIQ implementation and @jordens' bitbang kasli-i2c work fine. However pyftdi in MPSSE mode has problems. I checked that I can talk to other devices but muxes don't seem to react to changes in their registers.

  • I can talk to Si5324 (readback of chip id) when mux is switched (and left) by ARTIQ or kasli-i2c on boot and I get correct values.
  • I can talk to muxes, write to them and read back values I wrote.
  • When I reset the mux and set it again I can't talk to Si5324 and there's no signal on shared bus. Even if I enable all ports on both muxes, nothing happens on any bus.

I've seen this behaviour before and it was caused by simultaneous changes of SDA and SCL (I2C from misoc). Now it doesn't seem to be the case, but I'm not sure if it's worth dwelling on this further since pyftdi in MPSSE mode has terrible timings and weird voltage levels (see screenshot).
tek00037

@sbourdeauducq
Member

The I2C proxy bitstream route avoids all these problems at once.

@jordens
Member

jordens commented Jul 26, 2020

The I2C MPSSE mode didn't seem worth using. I added it to compare with bitbanging and it wasn't faster in the typical use cases. I wouldn't be surprised if pyftdi doesn't implement i2c-mpsse correctly (iirc it's not a function of the MPSSE but a use case).

@pathfinder49

pathfinder49 commented Feb 22, 2021

Yes. There is no EEPROM fix. I went through the entire thing, including the not-yet exposed bits of libftdi. That pin is DTR and is an active low output. My proposal is the following:

  • Disconnect BDBUS4, pin 26 from its via
  • Connect the via to BDBUS6, pin 28
  • In I2C software always drive both BDBUS4 and 6.

DCD is an input with a weak enough pull-up. This is forwards and backwards compatible, can be done in rework, doesn't need new chips, doesn't change the logic, and needs no version detection and special treatment in software.

I've got an unpatched Kasli 2.0.0. Is this the agreed way to fix this on existing boards? Is the software part of the fix included in ARTIQ? Edit: looks like it's been merged in.
