cpuinfo returns 0 for cpuinfo_get_l2_caches_count on some devices #4654

ScottTodd · 2021-01-28T23:25:00Z

Discord discussion: https://discord.com/channels/689900678990135345/689906000043573354/804488403936215090

cpuinfo_get_l2_caches_count() returns 0 on some machines, so our code that picks the number of workers based on the L2 cache count bails to a single-threaded fallback:

https://github.com/google/iree/blob/50d9823218605f1707abb375419df47c5d2ef28c/iree/task/topology.c#L356-L378

benvanik · 2021-01-28T23:56:48Z

Already checked: https://github.com/google/iree/blob/79684f2a96b749e2e5bc049b9185de9e77b47020/iree/task/topology.c#L347-L350

benvanik · 2021-01-28T23:58:13Z

Leaving this here for now, as we are working around it by bailing to a fallback of 1 thread when this happens. If someone with a machine that returns 0 can debug it or file an upstream cpuinfo bug, that'd be helpful. Otherwise you get 1 thread :)

bjacob · 2021-01-29T00:30:28Z

Notes:

It is common even for real CPUs (not just emulators) on the ARM architecture to have no L2 cache, either because they skip from L1 to L3 (typical on some phone-class CPUs) or because they have nothing at all beyond L1 (some microcontroller-class CPUs).
Of course, when running on an emulator, it's even less surprising that there would be 0 L2 caches.

TLDR: this seems normal at both the qemu and cpuinfo level, and will happen not only on qemu.

benvanik · 2021-01-29T01:30:38Z

This is super useful information @bjacob - thanks! The current strategy (heh, "strategy" is a stretch) is just a placeholder anyway. I would love to sync more about reasonable defaults/specializations/etc. I was mostly just waiting until mahesh's queue is flushed with the linalg on tensors work, so we have some parallel workloads, and until the CPU threading lands in the HAL rewrite, so we'd have something concrete to talk about.

bjacob · 2021-01-29T02:05:16Z

No problem! Whenever you want to get back to this, you could take a look at this code that I wrote, with much help from Marat, for ruy's needs. It's not exactly the same as what you're doing, but it does wrestle with a similar issue of using cpuinfo to distinguish shared vs non-shared ("local") caches:
https://github.com/google/ruy/blob/2887692065c38ef6617f423feafc6b69dd0a0681/ruy/cpuinfo.cc#L42-L83
the key part is this condition, which Marat wrote down for me:
https://github.com/google/ruy/blob/2887692065c38ef6617f423feafc6b69dd0a0681/ruy/cpuinfo.cc#L59-L63
With this logic, we detect which cache is effectively the last level of cache that is local to each processor, which would indeed be the L2 cache on a majority of current CPUs but would be the L1 cache when there is no L2 --- i.e. avoiding making any assumptions about level N cache being special for any particular value of N.
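A minimal sketch of the idea described above, not the ruy code itself: walk the cache hierarchy from L1 outward and remember the outermost level that is still local to a single core. The `cache_desc_t` struct, the `core_of` array, and both function names are hypothetical stand-ins so the logic can run without linking cpuinfo. The locality test assumed here is that the first and last logical processors served by a cache belong to the same core.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

// Hypothetical stand-in for a cache descriptor; not cpuinfo's actual struct.
typedef struct {
  uint32_t size;             // Cache size in bytes.
  uint32_t processor_start;  // First logical processor served by this cache.
  uint32_t processor_count;  // Number of logical processors served (>= 1).
} cache_desc_t;

// A cache is treated as "local" when the first and last logical processors
// it serves sit on the same core. core_of[i] is the core index of
// logical processor i (an assumption made for this sketch).
static bool cache_is_local(const cache_desc_t* cache, const uint32_t* core_of) {
  uint32_t first = cache->processor_start;
  uint32_t last = cache->processor_start + cache->processor_count - 1;
  return core_of[first] == core_of[last];
}

// Walks levels[0..level_count) from L1 outward; an absent level is NULL.
// Returns the size of the last local cache seen, or 0 if none is local.
static uint32_t last_local_cache_size(const cache_desc_t* const* levels,
                                      size_t level_count,
                                      const uint32_t* core_of) {
  uint32_t local_size = 0;
  for (size_t i = 0; i < level_count; ++i) {
    if (levels[i] == NULL) continue;  // e.g. a CPU that skips from L1 to L3.
    if (cache_is_local(levels[i], core_of)) local_size = levels[i]->size;
  }
  return local_size;
}
```

With a private L2 this picks the L2; with no L2, or with an L2 shared across cores, it falls back to the L1, so no level number is treated as special.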

benvanik · 2021-01-29T02:10:06Z

That's fantastic code and effectively what I was reaching for when I originally wrote this and then punted on it. I'll see if I can adapt that. Did you find any way to test that besides actually grabbing a device with certain characteristics? (It'd be cool if cpuinfo could mock out certain devices for testing - like "pretend you are X" - maybe it can?)

bjacob · 2021-01-29T02:11:48Z

Also a note about ARM architecture CPUs: in addition to the above-mentioned case where there is no L2 cache, there are other CPUs with only L1 and L2, with the L2 cache being shared across cores!

Maybe ask Marat directly about cpuinfo mocking - I don't know personally.

bjacob · 2021-01-29T02:35:38Z

Ah, cpuinfo does support mocking: in its public include/ directory, right beside cpuinfo.h, there is cpuinfo-mock.h:
https://github.com/pytorch/cpuinfo/blob/master/include/cpuinfo-mock.h

ScottTodd added bug 🐞 Something isn't working hal/cpu Runtime Host/CPU-based HAL backend labels Jan 28, 2021

ScottTodd assigned benvanik Jan 28, 2021

benvanik changed the title ~~Dylib driver creation fails on devices which report zero l2 caches through cpuinfo~~ cpuinfo returns 0 for cpuinfo_get_l2_caches_count on some devices Jan 28, 2021

benvanik removed their assignment Jan 28, 2021

benvanik added the help wanted Extra attention is needed label Jan 28, 2021

benvanik added a commit that referenced this issue Jan 29, 2021

Add fallbacks for 0 cpuinfo queries.

6de6fdf

Fallback to 1 thread for when cpuinfo_get_cores_count()==0 and core count when cpuinfo_get_l2_caches_count()==0. Issue #4654 is tracking making this better.
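The fallback order in that commit message can be sketched roughly as follows. This is not the actual IREE code: `pick_worker_count` is a hypothetical helper that takes the two cpuinfo query results as plain integers, and the "one worker per L2 cache" default is an assumption based on the issue description.

```c
#include <stdint.h>

// Hypothetical helper, not IREE's implementation. The arguments stand in
// for the results of cpuinfo_get_cores_count() and
// cpuinfo_get_l2_caches_count() so the logic can run without cpuinfo.
static uint32_t pick_worker_count(uint32_t core_count,
                                  uint32_t l2_cache_count) {
  if (core_count == 0) {
    // cpuinfo could not enumerate any cores: single-threaded fallback.
    return 1;
  }
  if (l2_cache_count == 0) {
    // No L2 reported (emulators, L1-to-L3 designs, L1-only parts):
    // fall back to the core count.
    return core_count;
  }
  // Assumed default for this sketch: one worker per L2 cache.
  return l2_cache_count;
}
```

Keeping the queries outside the helper makes each fallback path easy to exercise with made-up counts.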

benvanik mentioned this issue Feb 5, 2021

MobileBert Performance Regression on Pixel 4 CPU at 8d74bfa28dd35203fba700aa1f1d88c200c848cd #4686

Closed

benvanik mentioned this issue Mar 5, 2022

Decompose cpuinfo usage and plumb through new information. #8469

Open

7 tasks
