Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Hierarchical Placement Injection fails with a high number of variants #1083

Open
wants to merge 14 commits into
base: master
Choose a base branch
from

Conversation

stevenmburns
Copy link
Collaborator

The following results in a crash:

mkdir -p work/cascode_current_mirror_ota
cd work
export ALIGN_WORK_DIR=`pwd`
cd cascode_current_mirror_ota
schematic2layout.py ../../examples/cascode_current_mirror_ota/ -n 40

@854768750 I think this should work. The error is:

PnR.placer.SeqPair.SeqPair INFO : Enumerated search
PnR.placer.Placer.setPlacementInfoFromJson ERROR : instance_name: X_M1 concrete_template_name: PMOS_4T_86154205_X3_Y1 not found.
Segmentation fault

@854768750
Copy link
Collaborator

The following results in a crash:

mkdir -p work/cascode_current_mirror_ota
cd work
export ALIGN_WORK_DIR=`pwd`
cd cascode_current_mirror_ota
schematic2layout.py ../../examples/cascode_current_mirror_ota/ -n 40

@854768750 I think this should work. The error is:

PnR.placer.SeqPair.SeqPair INFO : Enumerated search
PnR.placer.Placer.setPlacementInfoFromJson ERROR : instance_name: X_M1 concrete_template_name: PMOS_4T_86154205_X3_Y1 not found.
Segmentation fault

Okay I can look into this.

@854768750
Copy link
Collaborator

How to use external placement information? It seems still using the placer. I followed #849 and used "--reference_placement_verilog_json" but not worked.

@stevenmburns
Copy link
Collaborator Author

@854768750 Every routing run now uses the placement injection. The placer is run using one PnRDB instance. The router is run using a different PnRDB instance.
If you want to just iterate on the placement injection you can try this:

mkdir -p work/cascode_current_mirror_ota
cd work
export ALIGN_WORK_DIR=`pwd`
cd cascode_current_mirror_ota
schematic2layout.py ../../examples/cascode_current_mirror_ota/ -n 40

then you can redo the router stage (were the error is happening) by running

schematic2layout.py ../../examples/cascode_current_mirror_ota/ -n 40 --flow_start 3_pnr:route

The main information carried between the placer and router is in: 3_pnr/__placer_dump__.json

@854768750
Copy link
Collaborator

At the second run (using external placement json file), the hiernode information is different from the first run. In the first run, the node "CASCODED_CMC_PMOS_31196023_PG0" block 3 has two variants while in the second run, there is only one variant.

@stevenmburns
Copy link
Collaborator Author

Yes, there is only one variant given for each internal hierarchy. There is a new map file where all the abstract template names for internal hierarchies have been replaced with a concrete template name. (Or put another way, we don't need to have more than one variant for each internal hiearchy.)

@854768750
Copy link
Collaborator

Yes, there is only one variant given for each internal hierarchy. There is a new map file where all the abstract template names for internal hierarchies have been replaced with a concrete template name. (Or put another way, we don't need to have more than one variant for each internal hiearchy.)

okay that is because the required variant is removed. I will use the new map file.

@stevenmburns
Copy link
Collaborator Author

You must already be using the new map file because it is what was used to generate the new PnRDB for the router.

@854768750
Copy link
Collaborator

854768750 commented Apr 28, 2022

After the first run, the node "CASCODED_CMC_PMOS_31196023_PG0" block 3 chooses the variant "PMOS_4T_86154205_X1_Y3". But the external placement json file uses "PMOS_4T_86154205_X3_Y1".
Also in the second run, there are 2 nodes called "CASCODED_CMC_PMOS_31196023_PG0" in the hiertree. However, only the second node (index 2) is used, because the its parent node has "child" value 2. And in the second run, the placer got 2 "CASCODED_CMC_PMOS_31196023_PG0" nodes. Should there be only one?

@854768750
Copy link
Collaborator

Also this example is time-consuming. When I set the placer_sa_iterations==100 or 1000, there is no error.

@stevenmburns
Copy link
Collaborator Author

You should use the --flow_start technique above to fix the long runtime. There error is when we run the router on stored data from the placer.

Let me look that the particular example above. What the placer did before is not very important. It needs to follow what is being specified in the placement verilog. I'll look through to make sure usage of CASCODED_CMC_PMOS_31196023_PG0 makes sense.

@stevenmburns
Copy link
Collaborator Author

After you running routing (and in this case get a crash) two files get created in 3_pnr/inputs. One called: CASCODE_CURRENT_MIRROR_OTA.abstract_verilog.json and a second called: CASCODE_CURRENT_MIRROR_OTA.scaled_placement_verilog.json.
These two files are used to specify the problem for the router. The first is the verilog file used to build the PnRDB model. The second has the particular placement and selection assignment for all the instances. There are two concrete instances of CASCODE_CMC_PMOS_31196023_PG0 in the .scaled_placement_verilog.json file because to there are two instances in the netlists (which came from topological identification on different transistors). The placer selected different implementations so that is why there are two. There were forty generated by the placer. Only the two that are used are specified in these two files now.

@854768750
Copy link
Collaborator

How to make the second PnRDB the corresponding to the selection assignment? Use scaled_placement_verilog.json to build PnRDB?

@854768750
Copy link
Collaborator

More findings:
During the second run of PnRDB constrcution, the first "CASCODED_CMC_PMOS_31196023_PG0" uses "PMOS_4T_86154205_X3_Y1", the second "CASCODED_CMC_PMOS_31196023_PG0" uses "PMOS_4T_86154205_X1_Y3". This looks the same as scaled_placement_verilog.json.
But at the beginning of the placer, the nodeVec has two "CASCODED_CMC_PMOS_31196023_PG0" with "PMOS_4T_86154205_X1_Y3"

@854768750
Copy link
Collaborator

I guess it may be some something between the second PnRDB and the second placer. Is the second PnRDB built based on scaled_placement_verilog.json?

@stevenmburns
Copy link
Collaborator Author

You need to build the PnRDB with:

CASCODE_CURRENT_MIRROR_OTA.abstract_verilog.json

Then you feed in the placement and selection information from

CASCODE_CURRENT_MIRROR_OTA.scaled_placement_verilog.json

This is all happening for you when you run:

schematic2layout.py ../../examples/cascode_current_mirror_ota/ -n 40 --flow_start 3_pnr:route

The python code also creates the map file (in memory) and the lef file. You should be able to debug just using this.

@stevenmburns
Copy link
Collaborator Author

@854768750 I don't know what you mean by "At the beginning of the placer". Placer variant numbers have been reassigned from the original numbering when the placer was run. Only the variant that are needed are included and if they where _25 and _36 during placement then will still be _0 and _1 in the ".abstract_verilog.json" and ".scaled_placement_verilog.json" files. We only need to build up the variants that are need for this particular design.

We want to be able to import design that were generated by other placers so we need functionality like this.

@854768750
Copy link
Collaborator

@@ -21,6 +21,7 @@ def _ReadVerilogJson( DB, j, add_placement_info=False):

temp_node = PnR.hierNode()
temp_node.name = module['name'] if 'name' in module else module['abstract_name']
temp_node.concrete_name = module['concrete_name'] if 'concrete_name' in module else module['name']
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

hiernode needs concrete name, so that the placer can match placement json file to each hiernode

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@stevenmburns It seems the PnRDB is build in this function, not in C++ code. Is it possible to parse the concrete name in this function?

@@ -161,7 +161,7 @@ def gen_abstract_verilog_d( verilog_d):
assert 'name' not in module
module['name'] = module['abstract_name']
del module['abstract_name']
del module['concrete_name']
#del module['concrete_name']
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The same here. However, this causes error in the other place. What I need is to make hiernode info match the placement json file.

@stevenmburns
Copy link
Collaborator Author

What should be do now for this? I'm not sure I understand what you need here. The concrete template names are in the scaled placement verilog files (the verilog that is being parsed by your C++ code) so you should be able to do what you need from that.
I don't think the concrete names should be in the "abstract verilog file" (the one being used to build the PnRDB).
They can be in a map file if you need it. Let me know if you want to meet.

@854768750
Copy link
Collaborator

What should be do now for this? I'm not sure I understand what you need here. The concrete template names are in the scaled placement verilog files (the verilog that is being parsed by your C++ code) so you should be able to do what you need from that. I don't think the concrete names should be in the "abstract verilog file" (the one being used to build the PnRDB). They can be in a map file if you need it. Let me know if you want to meet.

Okay, I think the C++ code needs to parse concrete template names (now it does not). Is this your point?

@854768750
Copy link
Collaborator

The PnRDB in the second run is built in align/pnr/manipulate_hierarchy.py ?

@stevenmburns
Copy link
Collaborator Author

Yes. It is there just to prepare the design for reloading into the router.
We can do it differently, but we want to allow other placement engines to inject a placement here.
That is why I generated the abstract_verilog and the scaled_placement_verilog from the data in the placer.
What is the exact problem? Doesn't the scaled_placement_verilog provide a legal assignment of hierarchical templates for the given abstract_verilog.

Do you want to generate a smaller failing testcase? I can probably do that if it is necessary (and if I know more about want exactly is causing the crash.)

@854768750
Copy link
Collaborator

Yes. It is there just to prepare the design for reloading into the router. We can do it differently, but we want to allow other placement engines to inject a placement here. That is why I generated the abstract_verilog and the scaled_placement_verilog from the data in the placer. What is the exact problem? Doesn't the scaled_placement_verilog provide a legal assignment of hierarchical templates for the given abstract_verilog.

Do you want to generate a smaller failing testcase? I can probably do that if it is necessary (and if I know more about want exactly is causing the crash.)

The problem is when building the Hiernode in function gen_abstract_verilog_d, some informarion such as concrete_name is not parsed. Because there are two nodes both have the same name CASCODED_CMC_PMOS_31196023_PG0, we need to have some other things to identify them.

@854768750
Copy link
Collaborator

This test case is a good one. No need to have another one at now.

@stevenmburns
Copy link
Collaborator Author

I looked through the abstract_verilog_json file and it seems that I'm giving you two things incorrectly.

  1. There are two copies of the module CASCODED_CMC_PMOS_31196023_PG0 There should only be one.
  2. The leaf cells have an abstract_template_name that seems to be a concrete_template_name. That should be changed to the abstract version.

I'll see if I can fix that up.

@stevenmburns
Copy link
Collaborator Author

This latest change makes the abstract names and concrete names always be the same when we set up the problem from the router.
The names in the abstract_verilog_json are names with "_0" and "_1" suffixes. Each module in this file should have a unique module in the PnRDB.

There is crash now probably due to not have the constraint files named correctly. I'll work on that next.

@854768750
Copy link
Collaborator

okay, just let me know when it is solved. Maybe I also need to do some changes in PnR.

@stevenmburns
Copy link
Collaborator Author

@854768750 Can you look at this test case now? It is crashing in topo ordering.
You should be able to ignore the warnings about constraint file missing.

@854768750
Copy link
Collaborator

854768750 commented May 5, 2022 via email

@stevenmburns
Copy link
Collaborator Author

You shouldn't need to do any parsing of the names. Concrete and abstract names should be the same.

@854768750
Copy link
Collaborator

node 0 is topcell and has name "CASCODE_CURRENT_MIRROR_OTA_0". Is it possible to remove "_0" for topcell?

@854768750
Copy link
Collaborator

topcell in "CASCODE_CURRENT_MIRROR_OTA.scaled_placement_verilog.json" still has "_0"

@stevenmburns
Copy link
Collaborator Author

No, it is important that the top level name include the "_0". This is to be consistent with the intermediate levels of hierarchy that definitely need the "_0" (or "_1").

Is there some other name that needs to be changed to include an "_0"? Something else in the call to create the PnRDB.

@854768750
Copy link
Collaborator

I think this works now.

@stevenmburns
Copy link
Collaborator Author

@854768750 This looks good. I need to fix up some naming issues in the checker but it looks like the placement creation is good.

@stevenmburns
Copy link
Collaborator Author

@854768750 I'm ready for you to look at this again. There are a number of failing testcases:

================================================================================ short test summary info =================================================================================
FAILED tests/constraints/test_constraint.py::test_donotroute - AssertionError: ckt_donotroute (CKT_DONOTROUTE_0_0):Number of DRC errorrs: 3
FAILED tests/constraints/test_constraint.py::test_netpriority - AssertionError: ckt_netpriority (CKT_NETPRIORITY_0_0):Number of DRC errorrs: 2
FAILED tests/constraints/test_constraint.py::test_placecloser - AssertionError: ckt_placecloser (CKT_PLACECLOSER_0_0):Number of DRC errorrs: 2
FAILED tests/pdk/finfet_pdk/test_placement_grid.py::test_scalings - AssertionError: ckt_scalings (CKT_SCALINGS_0_0):Number of DRC errors: 2
FAILED tests/pdk/finfet_pdk/test_placer.py::test_cmp_fast - AssertionError: ckt_cmp_fast (CKT_CMP_FAST_0_0):Number of DRC errors: 2
FAILED tests/pdk/finfet_pdk/test_placer.py::test_sub_1 - AssertionError: ckt_sub_1 (CKT_SUB_1_0_0):Number of DRC errors: 2
FAILED tests/pdk/finfet_pdk/test_placer.py::test_cmp_slow - AssertionError: ckt_cmp_slow (CKT_CMP_SLOW_0_0):Number of DRC errors: 2

They are all routing failures. A number of them are power routing related.

@854768750
Copy link
Collaborator

854768750 commented May 31, 2022 via email

@stevenmburns
Copy link
Collaborator Author

@854768750 @soneryaldiz All of them seem to not be doing power routing. (Or there are opens in the power and ground nets.)

@soneryaldiz What is test_donotroute suppose to check? There are three DRC errors now. I think there is supposed to be one, but I don't know how it gets passed the run_example check.

@soneryaldiz
Copy link
Collaborator

@stevenmburns , This test is supposed to check that a signal net is not routed given a DoNotRoute constraint. I will look into why it is failing.

@soneryaldiz
Copy link
Collaborator

@stevenmburns , when I run the test, it fails due to opens. However, the nets not to be routed should be excluded from the open check (see align/pnr/checkers.py line 300 and below. Debugging the code, hN.name is pointing to a concrete_name but not to the abstract name. This is breaking the code. Why is hN.name no longer the abstract name? hN also has concrete_name attribute.

@stevenmburns
Copy link
Collaborator Author

For routing there doesn't need to be a distinction between abstract and concrete name (that is a placement concept.)
They are being set to the same thing in the placement verilog file. I'll look at what you are doing and see why we need a distinction between the two.

@soneryaldiz
Copy link
Collaborator

soneryaldiz commented May 31, 2022

pnr_const_ds uses abstract names. This could explain some of the other failures as well (whenever there is a DoNotRoute).

@stevenmburns
Copy link
Collaborator Author

@soneryaldiz We are going to have to think about the right thing to do here. We might need to generate a second pnr_const_ds for the routing phase (separate from the placement phase.) [There are two different DBs now, and we are making changes to the design.] Then the names could be correct.

@854768750
Copy link
Collaborator

Do I need to do anything here now? Naming issue?

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants