cpuinfo returns 0 for cpuinfo_get_l2_caches_count on some devices #4654
Comments
Leaving this here for now, as we are working around it by bailing to a fallback of 1 thread when this happens. If someone has a machine that returns this, it'd be helpful if they could debug it, file an upstream cpuinfo bug, etc. Otherwise you get 1 thread :)
Notes:
TL;DR: this seems normal at both the qemu and cpuinfo levels, and it will happen on more than just qemu.
This is super useful information @bjacob, thanks! The current strategy (heh, "strategy" is a stretch) is just a placeholder anyway. I'd love to sync more about reasonable defaults/specializations/etc. I was mostly waiting until mahesh's queue is flushed with the linalg-on-tensors work, so we have some parallel workloads, and until the CPU threading lands in the HAL rewrite, so we'd have something concrete to talk about.
No problem! Whenever you want to get back to this, you could take a look at this code that I wrote with much help from Marat for ruy's needs. It's not exactly the same as what you're doing, but it does wrestle with a similar issue of using cpuinfo to distinguish shared vs non-shared ("local") caches:
That's fantastic code and effectively what I was reaching for when I originally wrote this and then punted on it. I'll see if I can adapt that. Did you find any way to test that besides actually grabbing a device with certain characteristics? (It'd be cool if cpuinfo could mock out certain devices for testing, like "pretend you are X". Maybe it can?)
Also, a note about ARM architecture CPUs: in addition to the above-mentioned case where there is no L2 cache, there are other CPUs with only L1 and L2, with the L2 cache being shared across cores! Maybe ask Marat directly about cpuinfo mocking; I don't know personally.
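To make the shared vs local distinction above concrete, here is a minimal sketch. It is not ruy's or cpuinfo's actual code: `cache_range_t`, `cache_is_local`, and the `core_of_processor` table are hypothetical names used only for illustration. It assumes each cache covers a contiguous range of logical processors and that the logical processors of one core are numbered contiguously, so a cache is "local" when the first and last processors it covers belong to the same core.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical description of a cache: the contiguous range of logical
 * processors that share it. */
typedef struct {
  uint32_t processor_start;
  uint32_t processor_count;
} cache_range_t;

/* Returns true if the cache is private to a single core ("local").
 * core_of_processor[i] is the core index of logical processor i (an
 * assumed lookup table); processor_total is the length of that table.
 * Relies on the assumption that a core's processors are contiguous, so
 * comparing the first and last covered processors is sufficient. */
bool cache_is_local(const cache_range_t* cache,
                    const uint32_t* core_of_processor,
                    size_t processor_total) {
  if (cache == NULL || cache->processor_count == 0) return false;
  uint32_t last = cache->processor_start + cache->processor_count - 1;
  if (last >= processor_total) return false; /* range falls outside the table */
  return core_of_processor[cache->processor_start] == core_of_processor[last];
}
```

With two cores of two logical processors each, a cache covering processors 0-1 would count as local, while an L2 covering processors 0-3 would count as shared across cores.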
Ah, cpuinfo does support mocking: in its public
Fall back to 1 thread when cpuinfo_get_cores_count()==0, and to the core count when cpuinfo_get_l2_caches_count()==0. Issue #4654 is tracking making this better.
Discord discussion: https://discord.com/channels/689900678990135345/689906000043573354/804488403936215090
cpuinfo_get_l2_caches_count() returns 0 on some machines, so our code that tries to pick the number of workers based on the L2 cache count bails to a single-threaded fallback: https://github.com/google/iree/blob/50d9823218605f1707abb375419df47c5d2ef28c/iree/task/topology.c#L356-L378
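For readers following along, the fallback rule discussed in this thread can be sketched roughly as below. This is a minimal illustration, not the code in topology.c: `choose_worker_count` is a hypothetical helper, it takes the counts as plain arguments (the caller would obtain them from cpuinfo_get_cores_count() and cpuinfo_get_l2_caches_count()), and "one worker per L2 cache" is an assumption about what "based on the L2 cache count" means.

```c
#include <stdint.h>

/* Hypothetical helper: picks a worker count from counts that the caller
 * has already queried from cpuinfo. */
uint32_t choose_worker_count(uint32_t core_count, uint32_t l2_cache_count) {
  /* No core information at all: fall back to a single thread. */
  if (core_count == 0) return 1;
  /* Cores are known but no L2 caches are reported: use the core count. */
  if (l2_cache_count == 0) return core_count;
  /* Assumed default: one worker per L2 cache. */
  return l2_cache_count;
}
```

Keeping the decision in a pure function like this also makes it testable without a device that actually reports zero L2 caches.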