Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

API tests #154

Open
broskoTT opened this issue Oct 14, 2024 · 0 comments
Open

API tests #154

broskoTT opened this issue Oct 14, 2024 · 0 comments
Labels
api redesign Related to the ongoing push for redesigning UMDs api

Comments

@broskoTT
Copy link
Contributor

A plan is to have a set of API tests:

  • To showcase UMD usage to clients
  • Which is arch agnostic
  • Which showcases (like this simplest one) how cumbersome is the current minimal UMD setup.

A lot of stuff should be taken from existing tests.
At this point it seems harder to rewrite existing tests rather than starting a new one.

The tests should also somewhat follow code examples shown in this diagram: https://docs.google.com/drawings/d/1-m1azdsBqMA0A6ATYRMfkhyeuOJuGCEI62N5a96LXj0/edit

@broskoTT broskoTT added the api redesign Related to the ongoing push for redesigning UMDs api label Oct 14, 2024
@broskoTT broskoTT linked a pull request Oct 14, 2024 that will close this issue
@broskoTT broskoTT removed a link to a pull request Oct 14, 2024
broskoTT added a commit that referenced this issue Oct 15, 2024
A plan is to have a set of API tests:
- To showcase UMD usage to clients
- Which is arch agnostic
- Which showcases (like this simplest one) how cumbersome is the current
minimal UMD setup.

A first PR for #154
broskoTT added a commit that referenced this issue Oct 17, 2024
A continuation on working on #154 
The first test showcases the basic usage of cluster descriptor.

The second test tests whether ClusterDescriptor can manage multiple
clusters of cards, cluster meaning connected chips (through ethernet).
It actually shows explicitly that this is not currently supported. So
this test fails on a system with multiple unconnected clusters of cards
(simple 2xN150 setup would suffice). We don't have a machine like that
in UMD pool, so the test passes on CI.
broskoTT added a commit that referenced this issue Oct 17, 2024
This change adds minimal setup to read/write including remote chips.
I've tested this on a setup with 3 N300 cards.

Motivation was that I wanted to change wait_for_non_mmio_flush to
include chip_id. Then I wanted to write a test. Then it didn't make
sense writing a test for this if basic IO is not setup.

This test adds to the collection of API tests which we should work on
reducing. Related to #154 . Related to #157

Minor changes:
- get_soc_descriptor had two different definitions, consolidated those.
- added is_chip_remote for convenience
broskoTT added a commit that referenced this issue Oct 21, 2024
wait_for_non_mmio_flush should be per chip_id. It is used like this in
tt_metal, but API doesn't offer it per chip.
This is related to #157 and preparing UMD for changes in tt_metal.

Also related to #154 since I've added a test which shows how this flush
is used.

- Does not break existing workflows
- tt_metal corresponding PR:
tenstorrent/tt-metal#13949
- tt_debuda change: Not used
broskoTT added a commit that referenced this issue Oct 23, 2024
Currently tt_SiliconDevice accepts one map for everything. But that goes
against our effort to separate cluster vs chip responsibilities, related
to #157
This change includes:
- Adding chip_id to setup_core_to_tlb_map. There should be a map per
mmio chip
- Started a chip api tests file
- Added an api example/test on how TLB setup functions at the moment.
This also does some minor configuration testing, whether it throws or
not.
- Minor cosmetic changes to other API tests

Contributes to #154 since it adds more api tests.
This change will require tt_metal changes.

- Breaks existing usages of setup_core_to_tlb_map
- tt_metal corresponding PR:
tenstorrent/tt-metal#13949
- tt_debuda change: Not used
broskoTT added a commit that referenced this issue Oct 24, 2024
This attribute is tied to a chip. Although it could be argued that
currently we only support a cluster of chips of same architecture,
therefore this would be the same for all chips in a single cluster.
Still, this makes sense and further unblocks supporting multiple
architectures in a single cluster/driver. It will also allow it to be
lowered to TTDevice class which will is arch specific (currently
architecture_implementation).

Related to #157 , contributes to #154 .

- Breaks existing usages of get_pcie_base_addr_from_device
- tt_metal corresponding PR:
tenstorrent/tt-metal#13949
- tt_debuda change: Not used
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
api redesign Related to the ongoing push for redesigning UMDs api
Projects
None yet
Development

No branches or pull requests

1 participant