update makefiles to build with local tools instead of /build/toolchain #2
Comments
From [email protected] on February 12, 2009 13:02:45 (note that Google Code does not automatically include commit logs into issues) issue #2: build using local tools
I also made a decision to simplify the exports dirs
I believe Linux is in pretty good shape now.
From [email protected] on February 14, 2009 07:53:45 We don't build on RHEL3: If typedef conflicts arise (such as on RHEL3), currently you'll have to resolve them manually by defining one of our DR_DO_NOT_DEFINE_* defines (DR_DO_NOT_DEFINE_uint, DR_DO_NOT_DEFINE_ushort, etc.). Eventually we'll have a pre-make step that automates this.
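The opt-out described in the comment above can be sketched roughly as follows. This is only an illustration of the guard pattern: the block marked as a stand-in is not the real DynamoRIO header, the `sketch` namespace and helper functions are hypothetical, and the "platform" typedef is simulated inline.

```cpp
#include <cassert>
#include <cstddef>

namespace sketch {

// The build opts out of the library's 'uint' because the platform (simulated
// by the typedef on the next line) already provides one.
#define DR_DO_NOT_DEFINE_uint
typedef unsigned int uint; // stand-in for a pre-existing platform typedef

// --- stand-in for the guarded typedefs in the library header ---
#ifndef DR_DO_NOT_DEFINE_uint
typedef unsigned int uint;
#endif
#ifndef DR_DO_NOT_DEFINE_ushort
typedef unsigned short ushort;
#endif
// --- end stand-in ---

// Both names are usable afterwards, whichever side supplied them.
inline std::size_t uint_bytes() { return sizeof(uint); }
inline std::size_t ushort_bytes() { return sizeof(ushort); }

// Sanity check that holds on any conforming implementation.
inline void verify_typedefs() { assert(sizeof(uint) >= sizeof(ushort)); }

} // namespace sketch
```

In a real build the define would more likely be passed on the compiler command line (for example `-DDR_DO_NOT_DEFINE_uint`) than written in source.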
From [email protected] on February 15, 2009 10:37:59 r14: build Windows without the VMware toolchain
From [email protected] on February 16, 2009 11:44:15 Split the issues mentioned in the first post into issue #17, issue #18, and issue #19. Resolving this one since developers with certain setups can now build. Status: Verified
This patch adds the appropriate macros, tests and codec entries to encode the following variants:

LDFF1B { <Zt>.H }, <Pg>/Z, [<Xn|SP>{, <Xm>}]
LDFF1B { <Zt>.S }, <Pg>/Z, [<Xn|SP>{, <Xm>}]
LDFF1B { <Zt>.D }, <Pg>/Z, [<Xn|SP>{, <Xm>}]
LDFF1B { <Zt>.B }, <Pg>/Z, [<Xn|SP>{, <Xm>}]
LDFF1D { <Zt>.D }, <Pg>/Z, [<Xn|SP>{, <Xm>, LSL #3}]
LDFF1H { <Zt>.H }, <Pg>/Z, [<Xn|SP>{, <Xm>, LSL #1}]
LDFF1H { <Zt>.S }, <Pg>/Z, [<Xn|SP>{, <Xm>, LSL #1}]
LDFF1H { <Zt>.D }, <Pg>/Z, [<Xn|SP>{, <Xm>, LSL #1}]
LDFF1SB { <Zt>.H }, <Pg>/Z, [<Xn|SP>{, <Xm>}]
LDFF1SB { <Zt>.S }, <Pg>/Z, [<Xn|SP>{, <Xm>}]
LDFF1SB { <Zt>.D }, <Pg>/Z, [<Xn|SP>{, <Xm>}]
LDFF1SH { <Zt>.S }, <Pg>/Z, [<Xn|SP>{, <Xm>, LSL #1}]
LDFF1SH { <Zt>.D }, <Pg>/Z, [<Xn|SP>{, <Xm>, LSL #1}]
LDFF1SW { <Zt>.D }, <Pg>/Z, [<Xn|SP>{, <Xm>, LSL #2}]
LDFF1W { <Zt>.S }, <Pg>/Z, [<Xn|SP>{, <Xm>, LSL #2}]
LDFF1W { <Zt>.D }, <Pg>/Z, [<Xn|SP>{, <Xm>, LSL #2}]

Issue #3044
This patch adds the appropriate macros, tests and codec entries to encode the following variants:

LD1B { <Zt>.D }, <Pg>/Z, [<Xn|SP>, <Zm>.D]
LD1D { <Zt>.D }, <Pg>/Z, [<Xn|SP>, <Zm>.D, LSL #3]
LD1D { <Zt>.D }, <Pg>/Z, [<Xn|SP>, <Zm>.D]
LD1H { <Zt>.D }, <Pg>/Z, [<Xn|SP>, <Zm>.D, LSL #1]
LD1H { <Zt>.D }, <Pg>/Z, [<Xn|SP>, <Zm>.D]
LD1SB { <Zt>.D }, <Pg>/Z, [<Xn|SP>, <Zm>.D]
LD1SH { <Zt>.D }, <Pg>/Z, [<Xn|SP>, <Zm>.D, LSL #1]
LD1SH { <Zt>.D }, <Pg>/Z, [<Xn|SP>, <Zm>.D]
LD1SW { <Zt>.D }, <Pg>/Z, [<Xn|SP>, <Zm>.D, LSL #2]
LD1SW { <Zt>.D }, <Pg>/Z, [<Xn|SP>, <Zm>.D]
LD1W { <Zt>.D }, <Pg>/Z, [<Xn|SP>, <Zm>.D, LSL #2]
LD1W { <Zt>.D }, <Pg>/Z, [<Xn|SP>, <Zm>.D]
LDFF1B { <Zt>.D }, <Pg>/Z, [<Xn|SP>, <Zm>.D]
LDFF1D { <Zt>.D }, <Pg>/Z, [<Xn|SP>, <Zm>.D, LSL #3]
LDFF1D { <Zt>.D }, <Pg>/Z, [<Xn|SP>, <Zm>.D]
LDFF1H { <Zt>.D }, <Pg>/Z, [<Xn|SP>, <Zm>.D, LSL #1]
LDFF1H { <Zt>.D }, <Pg>/Z, [<Xn|SP>, <Zm>.D]
LDFF1SB { <Zt>.D }, <Pg>/Z, [<Xn|SP>, <Zm>.D]
LDFF1SH { <Zt>.D }, <Pg>/Z, [<Xn|SP>, <Zm>.D, LSL #1]
LDFF1SH { <Zt>.D }, <Pg>/Z, [<Xn|SP>, <Zm>.D]
LDFF1SW { <Zt>.D }, <Pg>/Z, [<Xn|SP>, <Zm>.D, LSL #2]
LDFF1SW { <Zt>.D }, <Pg>/Z, [<Xn|SP>, <Zm>.D]
LDFF1W { <Zt>.D }, <Pg>/Z, [<Xn|SP>, <Zm>.D, LSL #2]
LDFF1W { <Zt>.D }, <Pg>/Z, [<Xn|SP>, <Zm>.D]
PRFB <prfop>, <Pg>, [<Xn|SP>, <Zm>.D]
PRFD <prfop>, <Pg>, [<Xn|SP>, <Zm>.D, LSL #3]
PRFH <prfop>, <Pg>, [<Xn|SP>, <Zm>.D, LSL #1]
PRFW <prfop>, <Pg>, [<Xn|SP>, <Zm>.D, LSL #2]
ST1B { <Zt>.D }, <Pg>, [<Xn|SP>, <Zm>.D]
ST1D { <Zt>.D }, <Pg>, [<Xn|SP>, <Zm>.D, LSL #3]
ST1D { <Zt>.D }, <Pg>, [<Xn|SP>, <Zm>.D]
ST1H { <Zt>.D }, <Pg>, [<Xn|SP>, <Zm>.D, LSL #1]
ST1H { <Zt>.D }, <Pg>, [<Xn|SP>, <Zm>.D]
ST1W { <Zt>.D }, <Pg>, [<Xn|SP>, <Zm>.D, LSL #2]
ST1W { <Zt>.D }, <Pg>, [<Xn|SP>, <Zm>.D]

Issue: #3044
This patch adds the appropriate macros, tests and codec entries to encode the following variants:

LD1B { <Zt>.D }, <Pg>/Z, [<Xn|SP>, <Zm>.D, <extend>]
LD1B { <Zt>.S }, <Pg>/Z, [<Xn|SP>, <Zm>.S, <extend>]
LD1D { <Zt>.D }, <Pg>/Z, [<Xn|SP>, <Zm>.D, <extend> #3]
LD1D { <Zt>.D }, <Pg>/Z, [<Xn|SP>, <Zm>.D, <extend>]
LD1H { <Zt>.D }, <Pg>/Z, [<Xn|SP>, <Zm>.D, <extend> #1]
LD1H { <Zt>.D }, <Pg>/Z, [<Xn|SP>, <Zm>.D, <extend>]
LD1H { <Zt>.S }, <Pg>/Z, [<Xn|SP>, <Zm>.S, <extend> #1]
LD1H { <Zt>.S }, <Pg>/Z, [<Xn|SP>, <Zm>.S, <extend>]
LD1SB { <Zt>.D }, <Pg>/Z, [<Xn|SP>, <Zm>.D, <extend>]
LD1SB { <Zt>.S }, <Pg>/Z, [<Xn|SP>, <Zm>.S, <extend>]
LD1SH { <Zt>.D }, <Pg>/Z, [<Xn|SP>, <Zm>.D, <extend> #1]
LD1SH { <Zt>.D }, <Pg>/Z, [<Xn|SP>, <Zm>.D, <extend>]
LD1SH { <Zt>.S }, <Pg>/Z, [<Xn|SP>, <Zm>.S, <extend> #1]
LD1SH { <Zt>.S }, <Pg>/Z, [<Xn|SP>, <Zm>.S, <extend>]
LD1SW { <Zt>.D }, <Pg>/Z, [<Xn|SP>, <Zm>.D, <extend> #2]
LD1SW { <Zt>.D }, <Pg>/Z, [<Xn|SP>, <Zm>.D, <extend>]
LD1W { <Zt>.D }, <Pg>/Z, [<Xn|SP>, <Zm>.D, <extend> #2]
LD1W { <Zt>.D }, <Pg>/Z, [<Xn|SP>, <Zm>.D, <extend>]
LD1W { <Zt>.S }, <Pg>/Z, [<Xn|SP>, <Zm>.S, <extend> #2]
LD1W { <Zt>.S }, <Pg>/Z, [<Xn|SP>, <Zm>.S, <extend>]
LDFF1B { <Zt>.D }, <Pg>/Z, [<Xn|SP>, <Zm>.D, <extend>]
LDFF1B { <Zt>.S }, <Pg>/Z, [<Xn|SP>, <Zm>.S, <extend>]
LDFF1D { <Zt>.D }, <Pg>/Z, [<Xn|SP>, <Zm>.D, <extend> #3]
LDFF1D { <Zt>.D }, <Pg>/Z, [<Xn|SP>, <Zm>.D, <extend>]
LDFF1H { <Zt>.D }, <Pg>/Z, [<Xn|SP>, <Zm>.D, <extend> #1]
LDFF1H { <Zt>.D }, <Pg>/Z, [<Xn|SP>, <Zm>.D, <extend>]
LDFF1H { <Zt>.S }, <Pg>/Z, [<Xn|SP>, <Zm>.S, <extend> #1]
LDFF1H { <Zt>.S }, <Pg>/Z, [<Xn|SP>, <Zm>.S, <extend>]
LDFF1SB { <Zt>.D }, <Pg>/Z, [<Xn|SP>, <Zm>.D, <extend>]
LDFF1SB { <Zt>.S }, <Pg>/Z, [<Xn|SP>, <Zm>.S, <extend>]
LDFF1SH { <Zt>.D }, <Pg>/Z, [<Xn|SP>, <Zm>.D, <extend> #1]
LDFF1SH { <Zt>.D }, <Pg>/Z, [<Xn|SP>, <Zm>.D, <extend>]
LDFF1SH { <Zt>.S }, <Pg>/Z, [<Xn|SP>, <Zm>.S, <extend> #1]
LDFF1SH { <Zt>.S }, <Pg>/Z, [<Xn|SP>, <Zm>.S, <extend>]
LDFF1SW { <Zt>.D }, <Pg>/Z, [<Xn|SP>, <Zm>.D, <extend> #2]
LDFF1SW { <Zt>.D }, <Pg>/Z, [<Xn|SP>, <Zm>.D, <extend>]
LDFF1W { <Zt>.D }, <Pg>/Z, [<Xn|SP>, <Zm>.D, <extend> #2]
LDFF1W { <Zt>.D }, <Pg>/Z, [<Xn|SP>, <Zm>.D, <extend>]
LDFF1W { <Zt>.S }, <Pg>/Z, [<Xn|SP>, <Zm>.S, <extend> #2]
LDFF1W { <Zt>.S }, <Pg>/Z, [<Xn|SP>, <Zm>.S, <extend>]
PRFB <prfop>, <Pg>, [<Xn|SP>, <Zm>.D, <extend>]
PRFB <prfop>, <Pg>, [<Xn|SP>, <Zm>.S, <extend>]
PRFD <prfop>, <Pg>, [<Xn|SP>, <Zm>.D, <extend> #3]
PRFD <prfop>, <Pg>, [<Xn|SP>, <Zm>.S, <extend> #3]
PRFH <prfop>, <Pg>, [<Xn|SP>, <Zm>.D, <extend> #1]
PRFH <prfop>, <Pg>, [<Xn|SP>, <Zm>.S, <extend> #1]
PRFW <prfop>, <Pg>, [<Xn|SP>, <Zm>.D, <extend> #2]
PRFW <prfop>, <Pg>, [<Xn|SP>, <Zm>.S, <extend> #2]
ST1B { <Zt>.D }, <Pg>, [<Xn|SP>, <Zm>.D, <extend>]
ST1B { <Zt>.S }, <Pg>, [<Xn|SP>, <Zm>.S, <extend>]
ST1D { <Zt>.D }, <Pg>, [<Xn|SP>, <Zm>.D, <extend> #3]
ST1D { <Zt>.D }, <Pg>, [<Xn|SP>, <Zm>.D, <extend>]
ST1H { <Zt>.D }, <Pg>, [<Xn|SP>, <Zm>.D, <extend> #1]
ST1H { <Zt>.D }, <Pg>, [<Xn|SP>, <Zm>.D, <extend>]
ST1H { <Zt>.S }, <Pg>, [<Xn|SP>, <Zm>.S, <extend> #1]
ST1H { <Zt>.S }, <Pg>, [<Xn|SP>, <Zm>.S, <extend>]
ST1W { <Zt>.D }, <Pg>, [<Xn|SP>, <Zm>.D, <extend> #2]
ST1W { <Zt>.D }, <Pg>, [<Xn|SP>, <Zm>.D, <extend>]
ST1W { <Zt>.S }, <Pg>, [<Xn|SP>, <Zm>.S, <extend> #2]
ST1W { <Zt>.S }, <Pg>, [<Xn|SP>, <Zm>.S, <extend>]

Issue: #3044
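For readers unfamiliar with the `<extend>` forms listed above, the per-element address computation can be sketched as follows. This is a minimal model of the architectural addressing rule (base plus a zero- or sign-extended 32-bit offset, optionally shifted), not DynamoRIO codec code; the `sketch` namespace, the `Extend` enum, and both function names are hypothetical.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

namespace sketch {

enum class Extend { UXTW, SXTW }; // the two values <extend> can take

// Address for one element: base + (extend(low 32 bits of offset) << shift).
// 'shift' is the optional #1/#2/#3 amount, or 0 when it is absent.
inline std::uint64_t element_address(std::uint64_t base, std::uint64_t offset_elem,
                                     Extend ext, unsigned shift)
{
    assert(shift <= 3);
    const std::uint32_t low = static_cast<std::uint32_t>(offset_elem);
    const std::uint64_t extended = (ext == Extend::SXTW)
        ? static_cast<std::uint64_t>(static_cast<std::int64_t>(static_cast<std::int32_t>(low)))
        : static_cast<std::uint64_t>(low);
    return base + (extended << shift); // wraps modulo 2^64 like the hardware add
}

// Addresses accessed by a gather or scatter: only lanes whose predicate bit is
// set are touched.
inline std::vector<std::uint64_t> active_addresses(std::uint64_t base,
                                                   const std::vector<std::uint64_t> &offsets,
                                                   const std::vector<bool> &pred,
                                                   Extend ext, unsigned shift)
{
    assert(offsets.size() == pred.size());
    std::vector<std::uint64_t> out;
    for (std::size_t i = 0; i < offsets.size(); ++i) {
        if (pred[i])
            out.push_back(element_address(base, offsets[i], ext, shift));
    }
    return out;
}

} // namespace sketch
```

With a base of 0x1000 and an offset element of 0xFFFFFFFF, `UXTW #2` addresses 0x400000FFC while `SXTW #2` treats the offset as -1 and addresses 0xFFC.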
This patch adds the appropriate macros, tests and codec entries to encode the following variants:

LD1D { <Zt>.D }, <Pg>/Z, [<Xn|SP>, <Xm>, LSL #3]
LD1H { <Zt>.H }, <Pg>/Z, [<Xn|SP>, <Xm>, LSL #1]
LD1H { <Zt>.S }, <Pg>/Z, [<Xn|SP>, <Xm>, LSL #1]
LD1H { <Zt>.D }, <Pg>/Z, [<Xn|SP>, <Xm>, LSL #1]
LD1SH { <Zt>.S }, <Pg>/Z, [<Xn|SP>, <Xm>, LSL #1]
LD1SH { <Zt>.D }, <Pg>/Z, [<Xn|SP>, <Xm>, LSL #1]
LD1SW { <Zt>.D }, <Pg>/Z, [<Xn|SP>, <Xm>, LSL #2]
LD1W { <Zt>.S }, <Pg>/Z, [<Xn|SP>, <Xm>, LSL #2]
LD1W { <Zt>.D }, <Pg>/Z, [<Xn|SP>, <Xm>, LSL #2]
LD2D { <Zt1>.D, <Zt2>.D }, <Pg>/Z, [<Xn|SP>, <Xm>, LSL #3]
LD2H { <Zt1>.H, <Zt2>.H }, <Pg>/Z, [<Xn|SP>, <Xm>, LSL #1]
LD2W { <Zt1>.S, <Zt2>.S }, <Pg>/Z, [<Xn|SP>, <Xm>, LSL #2]
LD3D { <Zt1>.D, <Zt2>.D, <Zt3>.D }, <Pg>/Z, [<Xn|SP>, <Xm>, LSL #3]
LD3H { <Zt1>.H, <Zt2>.H, <Zt3>.H }, <Pg>/Z, [<Xn|SP>, <Xm>, LSL #1]
LD3W { <Zt1>.S, <Zt2>.S, <Zt3>.S }, <Pg>/Z, [<Xn|SP>, <Xm>, LSL #2]
LD4D { <Zt1>.D, <Zt2>.D, <Zt3>.D, <Zt4>.D }, <Pg>/Z, [<Xn|SP>, <Xm>, LSL #3]
LD4H { <Zt1>.H, <Zt2>.H, <Zt3>.H, <Zt4>.H }, <Pg>/Z, [<Xn|SP>, <Xm>, LSL #1]
LD4W { <Zt1>.S, <Zt2>.S, <Zt3>.S, <Zt4>.S }, <Pg>/Z, [<Xn|SP>, <Xm>, LSL #2]
LDNT1D { <Zt>.D }, <Pg>/Z, [<Xn|SP>, <Xm>, LSL #3]
LDNT1H { <Zt>.H }, <Pg>/Z, [<Xn|SP>, <Xm>, LSL #1]
LDNT1W { <Zt>.S }, <Pg>/Z, [<Xn|SP>, <Xm>, LSL #2]
ST1D { <Zt>.D }, <Pg>, [<Xn|SP>, <Xm>, LSL #3]
ST1H { <Zt>.<Ts> }, <Pg>, [<Xn|SP>, <Xm>, LSL #1]
ST1W { <Zt>.<Ts> }, <Pg>, [<Xn|SP>, <Xm>, LSL #2]
ST2D { <Zt1>.D, <Zt2>.D }, <Pg>, [<Xn|SP>, <Xm>, LSL #3]
ST2H { <Zt1>.H, <Zt2>.H }, <Pg>, [<Xn|SP>, <Xm>, LSL #1]
ST2W { <Zt1>.S, <Zt2>.S }, <Pg>, [<Xn|SP>, <Xm>, LSL #2]
ST3D { <Zt1>.D, <Zt2>.D, <Zt3>.D }, <Pg>, [<Xn|SP>, <Xm>, LSL #3]
ST3H { <Zt1>.H, <Zt2>.H, <Zt3>.H }, <Pg>, [<Xn|SP>, <Xm>, LSL #1]
ST3W { <Zt1>.S, <Zt2>.S, <Zt3>.S }, <Pg>, [<Xn|SP>, <Xm>, LSL #2]
ST4D { <Zt1>.D, <Zt2>.D, <Zt3>.D, <Zt4>.D }, <Pg>, [<Xn|SP>, <Xm>, LSL #3]
ST4H { <Zt1>.H, <Zt2>.H, <Zt3>.H, <Zt4>.H }, <Pg>, [<Xn|SP>, <Xm>, LSL #1]
ST4W { <Zt1>.S, <Zt2>.S, <Zt3>.S, <Zt4>.S }, <Pg>, [<Xn|SP>, <Xm>, LSL #2]
STNT1D { <Zt>.D }, <Pg>, [<Xn|SP>, <Xm>, LSL #3]
STNT1H { <Zt>.H }, <Pg>, [<Xn|SP>, <Xm>, LSL #1]
STNT1W { <Zt>.S }, <Pg>, [<Xn|SP>, <Xm>, LSL #2]

issues: #3044
Change-Id: Ic9d21cd27f8b2a52bd8bc2ced76ba79f5deae69a
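As a rough illustration of the scalar-plus-scalar forms above, the address of each lane of a single-register contiguous access (the LD1*/ST1*/LDNT1*/STNT1* entries) can be modelled as below. It assumes the memory element size equals `1 << shift`, which holds for the `LSL #1/#2/#3` forms in this list; the multi-register LD2 to LD4 and ST2 to ST4 forms interleave their registers and are not modelled. The `sketch` namespace and function name are hypothetical.

```cpp
#include <cassert>
#include <cstdint>

namespace sketch {

// Address of lane 'lane' for "[<Xn|SP>, <Xm>, LSL #shift]":
//   Xn + ((Xm + lane) << shift)
// i.e. the index register counts memory elements, not bytes.
inline std::uint64_t contiguous_lane_address(std::uint64_t xn, std::uint64_t xm,
                                             unsigned shift, std::uint64_t lane)
{
    assert(shift >= 1 && shift <= 3); // LSL #1 (H), #2 (W), #3 (D)
    return xn + ((xm + lane) << shift);
}

} // namespace sketch
```

For example, an LD1W with Xn = 0x2000 and Xm = 3 reads lane 0 from 0x200C and lane 1 from 0x2010.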
This patch adds the appropriate macros, tests and codec entries to encode the following variants:

LD1RQB { <Zt>.D }, <Pg>/Z, [<Xn|SP>{, #<simm>}]
LD1RQB { <Zt>.D }, <Pg>/Z, [<Xn|SP>, <Xm>]
LD1RQD { <Zt>.D }, <Pg>/Z, [<Xn|SP>{, #<simm>}]
LD1RQD { <Zt>.D }, <Pg>/Z, [<Xn|SP>, <Xm>, LSL #3]
LD1RQH { <Zt>.H }, <Pg>/Z, [<Xn|SP>{, #<simm>}]
LD1RQH { <Zt>.H }, <Pg>/Z, [<Xn|SP>, <Xm>, LSL #1]
LD1RQW { <Zt>.S }, <Pg>/Z, [<Xn|SP>{, #<simm>}]
LD1RQW { <Zt>.S }, <Pg>/Z, [<Xn|SP>, <Xm>, LSL #2]

Issue: #3044
Switches from using the tid in scheduler_launcher to distinguish inputs to the input ordinal. Tid values can be duplicated so they should not be used as unique identifiers across workloads.

Tested: No automated test currently relies on the launcher; it is there for experimentation and as an example for how to use the scheduler, so we want it to use the recommended techniques. I ran it on the threadsig app and confirmed record and replay are using ordinals:

===========================================================================
$ rm -rf drmemtrace.*.dir; bin64/drrun -stderr_mask 12 -t drcachesim -offline -- ~/dr/test/threadsig 16 2000 && bin64/drrun -t drcachesim -simulator_type basic_counts -indir drmemtrace.*.dir > COUNTS 2>&1 && clients/bin64/scheduler_launcher -trace_dir drmemtrace.*.dir/trace -num_cores 4 -sched_quantum 2000 -record_file record.zip > RECORD 2>&1 && clients/bin64/scheduler_launcher -trace_dir drmemtrace.*.dir/trace -num_cores 4 -replay_file record.zip > REPLAY 2>&1 && tail -n 4 RECORD REPLAY
Estimation of pi is 3.141592674423126
Received 89 alarms
==> RECORD <==
Core #0: 16 15 16 15 16 0 15 16 15 8 16 6 5 7
Core #1: 9 3 12 16 11 16 8 0 16 0 16 1 16
Core #2: 3 14 16 14 16 0 15 16 8 16 2 6 8 1 10
Core #3: 13 3 13 9 11 12 16 6 16 6 16 2 4
==> REPLAY <==
Core #0: 16 15 16 15 16 0 15 16 15 8 16 6 5 7
Core #1: 9 3 12 16 11 16 8 0 16 0 16 1 16
Core #2: 3 14 16 14 16 0 15 16 8 16 2 6 8 1 10
Core #3: 13 3 13 9 11 12 16 6 16 6 16 2 4
===========================================================================

Issue: #5843
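The motivation for ordinals over tids can be shown with a small sketch. The `InputInfo` struct and both helper functions are hypothetical and are not part of the scheduler's interface; the point is only that tids may repeat across workloads, while an input's position in the overall input list cannot.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <set>
#include <vector>

namespace sketch {

// Hypothetical description of one scheduler input (one traced thread).
struct InputInfo {
    int workload;     // which workload the thread came from
    std::int64_t tid; // thread id recorded in that workload's trace
};

// Number of distinct identifiers when inputs are keyed by tid alone.
inline std::size_t distinct_by_tid(const std::vector<InputInfo> &inputs)
{
    std::set<std::int64_t> ids;
    for (const InputInfo &in : inputs)
        ids.insert(in.tid);
    return ids.size();
}

// Number of distinct identifiers when inputs are keyed by ordinal, i.e. by
// their index in the input list. Always one per input.
inline std::size_t distinct_by_ordinal(const std::vector<InputInfo> &inputs)
{
    std::set<std::size_t> ids;
    for (std::size_t ordinal = 0; ordinal < inputs.size(); ++ordinal)
        ids.insert(ordinal);
    return ids.size();
}

} // namespace sketch
```

With inputs `{0, 100}`, `{0, 101}` and `{1, 100}`, keying by tid yields only two identifiers for three inputs, whereas ordinals yield three.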
Removes the original (but never implemented) report_time() heartbeat design in favor of the simulator passing the current time to a new version of next_record().

Implements QUANTUM_TIME by recording the start time of each input when it is first scheduled and comparing to the new time in next_record(). Switches are only done at instruction boundaries for simplicity of interactions with record-replay and skipping.

Adds 2 unit tests.

Adds time support with wall-clock time to the scheduler_launcher. This was tested manually on some sample traces. For threadsig traces, with DEPENDENCY_TIMESTAMPS, the quantum doesn't make a huge difference as the timestamp ordering imposes significant constraints. I added an option to ignore the timestamps ("-no_honor_stamps") and there we really see the effects of the smaller quanta with more context switches.

===========================================================================
With timestamp deps and a 2ms quantum (compare to 2ms w/o deps below):
$ clients/bin64/scheduler_launcher -trace_dir drmemtrace.*.dir/trace -num_cores 4 -sched_quantum 2000 -sched_time -verbose 1 -honor_stamps
Core #0: 15 12 1 15 1 15 7 12 15 7 6 9 5
Core #1: 13 10 11 15 12 10 15 12 10 15 10 11 10 15 10 8 2
Core #2: 16 11 15 10 11 15 11 15 11 15 4 7 12 4 0 14
Core #3: 3 1 15 12 10 15 12 10 1 12 15 7 15 4 15
===========================================================================
Without, but a long quantum of 20ms:
$ clients/bin64/scheduler_launcher -trace_dir drmemtrace.*.dir/trace -num_cores 4 -sched_quantum 20000 -sched_time -verbose 1 -no_honor_stamps
Core #0: 0 5 8 12 16 4 11 15
Core #1: 1 4 9 14 0 7 9 0
Core #2: 2 6 10 13 1 6 13 1
Core #3: 3 7 11 15 2 3 8 10 14 16
===========================================================================
Without, but a smaller quantum of 2ms:
$ clients/bin64/scheduler_launcher -trace_dir drmemtrace.*.dir/trace -num_cores 4 -sched_quantum 2000 -sched_time -verbose 1 -no_honor_stamps
Core #0: 0 5 9 13 1 7 9 13 0 4 8 11 15 3 7 9 13 3 5 10 13 1 7 6 12 14 7 8 11 0 7 8 10 16 3 4 9 15 14 2 6 11 0 1 5 10 16 7 8 12 13 3 8 6 15 0 9 11 13
Core #1: 1 4 8 12 16 2 6 11 15 1 5 10 14 2 8 11 16 1 7 9 15 0 4 9 15 0 2 6 12 16 3 5 12 13 1 5 10 16 7 8 12 13 3 4 9 15 0 1 5 10 16 7 2 9 13 1 15
Core #2: 2 7 10 14 0 4 8 12 16 2 6 12 16 1 5 10 15 0 4 6 12 14 2 8 11 16 3 5 10 13 1 4 9 15 14 2 6 11 0 1 5 10 16 7 8 12 13 3 4 9 11 14 4 10 11 14 4 16 0
Core #3: 3 6 11 15 3 5 10 14 3 7 9 13 0 4 6 12 14 2 8 11 16 3 5 10 13 1 4 9 15 14 2 6 11 0 7 8 12 13 3 4 9 15 14 2 6 11 14 2 6 15 0 1 5 12 16 2 12 1
===========================================================================
Without, but a tiny quantum of 200us:
$ clients/bin64/scheduler_launcher -trace_dir drmemtrace.*.dir/trace -num_cores 4 -sched_quantum 200 -sched_time -verbose 1 -no_honor_stamps
Core #0: 0 4 7 11 15 2 6 10 14 1 7 9 12 16 5 10 12 4 8 3 11 12 1 8 11 16 7 8 15 12 0 6 12 7 13 10 2 8 15 16 2 3 15 6 11 7 13 6 10 1 8 5 10 1 3 6 14 11 7 15 2 4 12 13 5 9 10 15 6 9 10 7 6 2 1 11 3 14 16 12 13 5 1 11 6 8 2 10 0 7 5 10 0 3 14 1 15 6 7 5 4 0 12 14 9 10 16 14 8 10 11 13 7 9 0 13 3 9 1 13 3 5 2 16 14 4 15 0 6 4 15 0 13 12 8 10 1 3 4 15 2 14 3 5 11 16 13 6 15 10 2 12 6 4 9 12 6 15 10 1 7 8 11 2 14 13 5 10 1 12 8 5 0 14 8 3 4 16 6 15 11 2 1 8 3 4 0 14 7 4 2 6 14 7 11 10 1 9 13 2 14 12 3 5 10 0 9 13 11 8 12 7 16 10 0 3 5 10 0 3 13 11 15 12 13 11 8 3 13 15 9 12 7 2 8 0 4 6 15 9 3 6 15 14 12 5 15 8 0 3 6 2 0 1 6 13 0 3 6 2 11 9 4 2 0 10 4 2 0 10 4 13 0 10 4 13 0 1 15 2 12 1 15 2 0 11 15 6 13 1 15 4 16 14 11 4 16 14 10 4 16 14 11 5 2 13 9 3 4 6 1 11 7 2 16 15 12 4 5 1 12 10 6 13 9 4 2 5 15 3 10 5 15 12 11 16 1 14 7 2 13 9 4 10 6 8 14 3 2 15 7 4 0 6 8 4 0 6 7 12 3 16 6 1 4 16 15 9 12 3 5 13 8 11 0 6 1 4 16 2 7 14 10 5 13 12 3 0 9 8 11 10 6 1 14 16 2 7 11 0 9 1 4 3 5 13 12 3 5 13 4 11 10 6 8 15 11 5 8 4 11 5 13 15 3 6 8 15 11 16 2 7 12 3 5 8 4 14 16 2 15 12 0 5 13 4 14 3 2 4 14 3 8 10 1 16 6 13 4 7 5 13 10 16 11 9 2 12 3 5 6 10 1 7 0 15 12 14 8 2 10 3 11 9 15 16 7 13 2 12 3 13 2 0 16 14 5 6 16 14 5 15 4 1 11 9 6 10 14 2 0 16 14 9 6 12 14 8 4 10 9 0 13 1 12 8 15
Core #1: 1 5 ... <omitted rest for space but all are as long as Core #0>
===========================================================================

Issue: #5843
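The time-quantum rule described in this commit message can be sketched as a toy single-core model. Everything here is an assumption made for illustration: the class name, the round-robin choice of the next input, and the `next_is_instr` flag are hypothetical, and this is not the scheduler's real next_record() interface. It only shows the two ideas from the text: the start time is recorded when an input is scheduled, and a switch happens only at an instruction boundary once the quantum has elapsed.

```cpp
#include <cassert>
#include <cstdint>

namespace sketch {

class ToyQuantumScheduler {
public:
    ToyQuantumScheduler(int num_inputs, std::uint64_t quantum)
        : num_inputs_(num_inputs)
        , quantum_(quantum)
    {
        assert(num_inputs > 0);
    }

    // Called once per record with the caller's current time (assumed to be
    // non-decreasing). next_is_instr says whether the record about to be
    // delivered starts an instruction. Returns the ordinal of the input that
    // supplies the record.
    int next_record(std::uint64_t cur_time, bool next_is_instr)
    {
        if (!started_) {
            // First time the current input is scheduled: remember when.
            started_ = true;
            start_time_ = cur_time;
        } else if (next_is_instr && cur_time - start_time_ >= quantum_) {
            // Quantum expired and we are at an instruction boundary: switch
            // (round-robin here, purely for illustration) and restart the clock.
            cur_input_ = (cur_input_ + 1) % num_inputs_;
            start_time_ = cur_time;
        }
        return cur_input_;
    }

private:
    int num_inputs_;
    std::uint64_t quantum_;
    int cur_input_ = 0;
    bool started_ = false;
    std::uint64_t start_time_ = 0;
};

} // namespace sketch
```

With two inputs and a quantum of 10 time units, a call at time 12 for a non-instruction record stays on input 0, and the following instruction record at time 13 switches to input 1.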
Adds printing '.' for every record and '-' for waiting to the scheduler unit tests and udpates all the expected output. This makes it much easier to understand some of the results as now the lockstep timing all lines up. Adds -print_every to the launcher and switches to printing letters for a better output of what happened on each core (if #inputs<=26). Example: ``` $ clients/bin64/scheduler_launcher -trace_dir drmemtrace.*.dir/trace -num_cores 4 -sched_quantum 60000 -print_every 5000 Core #0: GGGGGGGGG,HH,F,B,G,I,A,CC,G,BB,A,FF,AA,GGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGG Core #1: D,C,D,B,H,FF,EE,CC,II,AA,C,D,G,HH,D,G,II,G,I,G,I,G,HH,BB,II,BB,C,H,I,AA,C,F,I,H,II,AA,C,H,A,H,F,CC,DD,C,BB,HH,CC,F,BB,C,D,H,BB,D,B,EE,I,E,DD,B,F,H,A,D,C,D,E,B,D,I,D,AA,E,DD,EE,CC,II,C,D,I,AA,DD,B,E,I,D,C,E,FF,E,BB,EE,FF,E,AA,D,E,DD,H,BB,HH,D,H,BB,I,AA,II,H,A,FF,H,I,HH,DD,I,H,F,DD,I,A,HH,AA,CC,BB,CC,BB,D,B,FF,H,F,D,I,DD,FF,C,A,C,AA,F,AA,EE,A,D,E,FF,AA,F,A,E,A,E,DD,EE,F,E,F,A Core #2: F,E,F,C,F,H,I,B,HH,II,FF,CC,G,H,DD,E,A,G,H,G,DD,G,F,D,A,H,I,FF,H,C,A,CC,II,A,FF,C,I,F,CC,B,FF,C,B,H,CC,B,D,B,DD,B,F,I,F,II,D,A,DD,I,D,H,E,H,I,D,HH,FF,BB,II,AA,EE,B,A,BB,E,II,A,BB,A,HH,E,AA,E,F,A,DD,HH,F,H,A,E,I,FF,I,B,F,II,A,FF,D,H,DD,I,AA,F,D,FF,AA,D,A,HH,A,H,F,A,FF,C,F,B,F,C,F,AA,B,FF,D,F,DD,B,C,H,CC,B,C,E,D,EE,C,E,D,EE,F,DD,E,F,D,A,DD,E,D,EE,D,E,D,AA,D,A,DD,F,D,C,D Core #3: 
E,A,F,A,D,I,DD,BB,AA,BB,DD,G,EE,AA,H,G,D,B,G,B,G,II,F,HH,B,AA,I,B,A,HH,CC,HH,F,A,FF,C,HH,BB,F,D,F,C,FF,H,C,FF,DD,AA,I,B,II,AA,I,A,B,A,F,A,C,I,B,H,A,F,C,A,C,EE,F,D,EE,CC,E,BB,E,DD,E,CC,B,EE,C,EE,B,I,E,D,E,II,H,B,EE,I,EE,B,II,F,EE,A,D,AA,DD,HH,F,A,F,HH,D,A,II,H,F,II,FF,CC,B,AA,F,A,C,FF,D,C,D,CC,B,C,DD,H,I,F,CC,A,F,C,FF,E,A,DD,E,D,A,FF,AA,EE,F,DD,FF,E,F,EE,FF,AA,EEEEEEEEEEEEEEEE ``` Issue: #5843
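One way such a per-core letter string could be produced is sketched below. This is a guess at the rendering rule based only on the sample output, not the launcher's actual code: it assumes that a comma marks a switch to a different input, that a letter is printed when an input starts running, and that another is printed after every `print_every` further records from the same input. The function name and `sketch` namespace are hypothetical.

```cpp
#include <cassert>
#include <string>
#include <vector>

namespace sketch {

// 'record_inputs' holds, for each record a core processed, the ordinal of the
// input it came from. Ordinals map to letters 'A'..'Z', hence the limit of 26.
inline std::string render_core(const std::vector<int> &record_inputs, int print_every)
{
    assert(print_every > 0);
    std::string out;
    int prev = -1;       // ordinal of the input currently running, -1 if none
    int since_print = 0; // records seen since the last letter was emitted
    for (int input : record_inputs) {
        assert(input >= 0 && input < 26);
        const char letter = static_cast<char>('A' + input);
        if (input != prev) {
            if (prev != -1)
                out += ','; // assumed meaning: switch to a different input
            out += letter;
            since_print = 0;
            prev = input;
        } else if (++since_print == print_every) {
            out += letter;
            since_print = 0;
        }
    }
    return out;
}

} // namespace sketch
```

For the record sequence 6, 6, 6, 6, 7, 7, 5 with `print_every` set to 2, this yields `GG,H,F`.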
When debugging i#6499 we noticed that drcachesim was producing 0 byte read/write records for some SVE load/store instructions:

```
ifetch 4 byte(s) @ 0x0000000000405b3c a54a4681 ld1w (%x20,%x10,lsl #2) %p1/z -> %z1.s
read 0 byte(s) @ 0x0000000000954e80 by PC 0x0000000000405b3c
read 0 byte(s) @ 0x0000000000954e84 by PC 0x0000000000405b3c
read 0 byte(s) @ 0x0000000000954e88 by PC 0x0000000000405b3c
read 0 byte(s) @ 0x0000000000954e8c by PC 0x0000000000405b3c
read 0 byte(s) @ 0x0000000000954e90 by PC 0x0000000000405b3c
read 0 byte(s) @ 0x0000000000954e94 by PC 0x0000000000405b3c
read 0 byte(s) @ 0x0000000000954e98 by PC 0x0000000000405b3c
read 0 byte(s) @ 0x0000000000954e9c by PC 0x0000000000405b3c
ifetch 4 byte(s) @ 0x0000000000405b4
```

This turned out to be due to drdecode being linked into drcachesim twice: once into the drcachesim executable, once into libdynamorio. drdecode uses a global variable to store the SVE vector length to use when decoding, so we end up with two copies of that variable, and only one was being initialized.

To fix this properly we would need to refactor the libraries so that there is only one copy of the sve_veclen global variable, or change the way that the decoder gets the vector length so it's no longer stored in a global variable. In the meantime we have a workaround which makes sure both copies of the variable get initialized and drcachesim produces correct results.

With that workaround in place, however, the results were still wrong. For expanded scatter/gather instructions in an offline trace, raw2trace doesn't have access to the load/store instructions from the expansion, only the original app scatter/gather instruction. It has to create the read/write records using only information from the original scatter/gather instruction, and it uses the size of the memory operand to determine the size of each read/write.

This works for x86 because the x86 IR uses the per-element data size for the memory operand of scatter/gather instructions. This doesn't work for AArch64 because the AArch64 codec uses the maximum data transferred (per-element data size * number of elements), like other SIMD load/store instructions.

We plan to make the AArch64 IR consistent with x86 by changing it to use the same convention as x86 for scatter/gather instructions, but in the meantime we can work around the inconsistency by fixing the size in raw2trace based on the instruction's opcode.

Issues: #6499, #5365, #5036
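The opcode-based size fix can be illustrated with a small sketch. This is not the raw2trace code: the table, the function names, and the use of mnemonic strings as keys are all hypothetical, and the 32-byte operand size assumes a 256-bit vector with every element active (which matches the eight 4-byte-spaced reads in the trace above). The per-element sizes themselves follow from the SVE mnemonics (b, h, w, d for 1, 2, 4, 8 bytes).

```python
# Bytes accessed per element, keyed by mnemonic (illustrative subset only).
PER_ELEMENT_BYTES = {
    "ld1b": 1, "ld1sb": 1, "st1b": 1,
    "ld1h": 2, "ld1sh": 2, "st1h": 2,
    "ld1w": 4, "ld1sw": 4, "st1w": 4,
    "ld1d": 8, "st1d": 8,
}


def per_element_size(opcode, operand_bytes):
    """Return the size to record for each expanded access.

    Falls back to the memory operand size for opcodes not in the table,
    which is the convention that already works when the operand size is
    the per-element size.
    """
    return PER_ELEMENT_BYTES.get(opcode, operand_bytes)


def make_read_records(opcode, operand_bytes, addresses):
    """Build (address, size) read records for one expanded instruction."""
    size = per_element_size(opcode, operand_bytes)
    return [(addr, size) for addr in addresses]


# ld1w with an assumed 32-byte operand (8 elements * 4 bytes).
addresses = [0x954E80 + 4 * i for i in range(8)]
for addr, size in make_read_records("ld1w", 32, addresses):
    print(f"read {size} byte(s) @ {addr:#018x}")
```

Without the table lookup, each of the eight records would carry the full operand size rather than 4 bytes, which is the inconsistency described above.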
Adds a new interface trace_analysis_tool::preferred_shard_type() to the drmemtrace framework to allow tools to request core-sharded operation.

The cache simulator, TLB simulator, and schedule_stats tools override the new interface to request core-sharded mode.

Unfortunately, it is not easy to detect core-sharded-on-disk traces in the launcher, so the user must now pass `-no_core_sharded` when using such traces with core-sharded-preferring tools to avoid the trace being re-scheduled yet again. Documentation for this is added, and it is turned into a fatal error since such re-scheduling is almost certainly user error.

In the launcher, if all tools prefer core-sharded, and the user did not specify -no_core_sharded, core-sharded (or core-serial) mode is enabled, with a -verbose 1+ message.

```
$ bin64/drrun -stderr_mask 0 -t drcachesim -indir ../src/clients/drcachesim/tests/drmemtrace.threadsig.x64.tracedir/ -verbose 1 -tool schedule_stats:cache_simulator
Enabling -core_serial as all tools prefer it
<...>
Schedule stats tool results:
Total counts:
  4 cores
  8 threads: 1257600, 1257602, 1257599, 1257603, 1257598, 1257604, 1257596, 1257601
  638938 instructions
<...>
Core #0 schedule: AEA_A_
Core #1 schedule: BH_
Core #2 schedule: CG
Core #3 schedule: DF_
<...>
Cache simulation results:
Core #0 (traced CPU(s): #0)
  L1I0 (size=32768, assoc=8, block=64, LRU) stats:
    Hits: 123,659
<...>
```

If at least one tool prefers core-sharded but others do not, a -verbose 1+ message suggests running with an explicit -core_sharded.

```
$ bin64/drrun -stderr_mask 0 -t drcachesim -indir ../src/clients/drcachesim/tests/drmemtrace.threadsig.x64.tracedir/ -verbose 1 -tool cache_simulator:basic_counts
Some tool(s) prefer core-sharded: consider re-running with -core_sharded or -core_serial enabled for best results.
```

Reduces the scheduler queue diagnostics by 5x as they seem too frequent in short runs.

Updates the documentation to mention the new defaults.

Updates numerous drcachesim test output templates. Keeps a couple of tests using thread-sharded by passing -no_core_serial.

Fixes #6949
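The launcher's decision described above can be summarized with a short sketch. It is a simplified model, not the launcher code: `ShardType`, `choose_sharding`, and the tri-state `user_core_sharded` argument are hypothetical names, and the sketch does not model the distinction between core-sharded and core-serial or the fatal error for already core-sharded traces.

```python
from enum import Enum


class ShardType(Enum):
    """Stand-in for the value a tool returns from preferred_shard_type()."""
    THREAD = "thread"
    CORE = "core"


def choose_sharding(preferences, user_core_sharded=None):
    """Decide whether to run core-sharded (hypothetical helper).

    preferences: one ShardType per tool.
    user_core_sharded: True for an explicit -core_sharded, False for
        -no_core_sharded, None if the user specified neither.
    Returns (core_sharded, message); the message models -verbose 1+ output.
    """
    if user_core_sharded is not None:
        # An explicit user choice always wins; no automatic switching.
        return user_core_sharded, None
    prefers_core = [p is ShardType.CORE for p in preferences]
    if prefers_core and all(prefers_core):
        return True, "Enabling core-sharded mode as all tools prefer it"
    if any(prefers_core):
        return False, (
            "Some tool(s) prefer core-sharded: consider re-running with "
            "-core_sharded or -core_serial enabled for best results."
        )
    return False, None


# Both tools prefer core-sharded, so the mode is enabled automatically.
print(choose_sharding([ShardType.CORE, ShardType.CORE]))
# Mixed preferences only produce a suggestion.
print(choose_sharding([ShardType.CORE, ShardType.THREAD]))
```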
From [email protected] on February 11, 2009 15:46:52
The makefiles are currently assuming that the VMware toolchain is present.
For a proprietary product, IMHO having the toolchain standardized and
either in the repository or on a server is the right way to go, in order to
properly build old versions.
However, for an open source project we want to support building on as wide
a variety of toolchains as possible, and we don't really need to fix bugs
in old versions (though we could support that by listing not only minimum
versions but also maximum in our Makefiles). (Plus, we can't exactly
commit, say, the Windows DDK into our repository.) So instead of only
supporting a single version, long-term we should try to expand support.
There are many issues there as different versions of a compiler bring up
different subtle issues.
Here is a short list of what we currently require:
Linux:
Windows:
Both:
We can split future work (such as switching to cmake, getting version #s
from top level instead of being hardcoded in certain places, parallelizing
the build, re-enabling dependency checking, etc.) off into separate Issues
but for now this issue covers getting things working for the initial
developers (we'll reduce the priority once that's covered).
Original issue: http://code.google.com/p/dynamorio/issues/detail?id=2