You signed in with another tab or window. Reload to refresh your session.You signed out in another tab or window. Reload to refresh your session.You switched accounts on another tab or window. Reload to refresh your session.Dismiss alert
Hi VTR team,
I've been investigating the root cause of the inferior solution reported in issue #2726. After analyzing the VTR source code and performing some active debugging, I discovered that unrelated blocks with a high gain value are being unexpectedly added into a cluster, which is calculated in the vpr/src/pack/cluster_util.cpp file, specifically in the function get_molecule_gain. This issue occurs regardless of whether the “allow_unrelated_clustering” option is set to “auto” or “off.”
Expected Behaviour
The molecules condition_program_first2713^MIN~52-1[0] and condition_program_first2713^MIN~52-0[0] should not be added to the condition_program_first5685 cluster, as they have no logical connection to it.
Current Behaviour
I started by trying to localize the problem that leads to the block condition_program_first2713^MIN~52-1[0] being added to a cluster where all other blocks belong to condition_program_first5685. After some debugging, I focused on the following code snippet in the vpr/pack/cluster_util.cpp file:
for (j = pb->pb_stats->num_feasible_blocks - 1; j >= 0; j--) {
if (get_molecule_gain(pb->pb_stats->feasible_blocks[j], gain, cluster_att_grp, attraction_groups, num_molecule_failures) >
get_molecule_gain(molecule, gain, cluster_att_grp, attraction_groups, num_molecule_failures)) {
pb->pb_stats->feasible_blocks[j + 1] = pb->pb_stats->feasible_blocks[j];
} else {
pb->pb_stats->feasible_blocks[j + 1] = molecule;
break;
}
}
if (j < 0) {
pb->pb_stats->feasible_blocks[0] = molecule;
}
The value of gain determines whether the molecule should be given a higher priority. I added several VTR_LOG statements as breakpoints to observe the gain value of condition_program_first2713^MIN~52-1[0] in the function get_molecule_gain:
floatget_molecule_gain(t_pack_molecule* molecule, std::map<AtomBlockId, float>& blk_gain, AttractGroupId cluster_attraction_group_id, AttractionInfo& attraction_groups, int num_molecule_failures) {
... //declaration
num_introduced_inputs_of_indirectly_related_block = 0;
for (i = 0; i < get_array_size_of_molecule(molecule); i++) {
auto blk_id = molecule->atom_block_ids[i];
if (blk_id) {
if (blk_gain.count(blk_id) > 0) {
gain += blk_gain[blk_id];
VTR_LOG("gain += blk_gain[blk_id]: %s %s\n", atom_ctx.nlist.block_name(blk_id).c_str(), std::__cxx11::to_string(gain).c_str());
} else {
/* This block has no connection with current cluster, penalize molecule for having this block*/for (auto pin_id : atom_ctx.nlist.block_input_pins(blk_id)) {
auto net_id = atom_ctx.nlist.pin_net(pin_id);
VTR_ASSERT(net_id);
auto driver_pin_id = atom_ctx.nlist.net_driver(net_id);
VTR_ASSERT(driver_pin_id);
auto driver_blk_id = atom_ctx.nlist.pin_block(driver_pin_id);
num_introduced_inputs_of_indirectly_related_block++;
for (int iblk = 0; iblk < get_array_size_of_molecule(molecule); iblk++) {
if (molecule->atom_block_ids[iblk] && driver_blk_id == molecule->atom_block_ids[iblk]) {
//valid block which is driver (and hence not an input)
num_introduced_inputs_of_indirectly_related_block--;
VTR_LOG("num_introduced_inputs_of_indirectly_related_block: %s\n", std::__cxx11::to_string(num_introduced_inputs_of_indirectly_related_block).c_str());
break;
}
}
}
}
AttractGroupId atom_grp_id = attraction_groups.get_atom_attraction_group(blk_id);
if (atom_grp_id == cluster_attraction_group_id && cluster_attraction_group_id != AttractGroupId::INVALID()) {
float att_grp_gain = attraction_groups.get_attraction_group_gain(atom_grp_id);
gain += att_grp_gain;
VTR_LOG("gain += att_grp_gain: %s %s\n", atom_ctx.nlist.block_name(blk_id).c_str(), std::__cxx11::to_string(gain).c_str());
} elseif (cluster_attraction_group_id != AttractGroupId::INVALID() && atom_grp_id != cluster_attraction_group_id) {
gain -= attraction_group_penalty;
VTR_LOG("gain -= attraction_group_penalty: %s %s\n", atom_ctx.nlist.block_name(blk_id).c_str(), std::__cxx11::to_string(gain).c_str());
}
}
}
gain += molecule->base_gain * 0.0001; /* Use base gain as tie breaker TODO: need to sweep this value and perhaps normalize */VTR_LOG("gain += molecule->base_gain * 0.0001: %s\n", std::__cxx11::to_string(gain).c_str());
gain -= num_introduced_inputs_of_indirectly_related_block * (0.001);
VTR_LOG("gain -= num_introduced_inputs_of_indirectly_related_block * (0.001): %s\n", std::__cxx11::to_string(gain).c_str());
if (num_molecule_failures > 0 && attraction_groups.num_attraction_groups() > 0) {
gain -= 0.1 * num_molecule_failures;
VTR_LOG("gain -= 0.1 * num_molecule_failures: %s\n", std::__cxx11::to_string(gain).c_str());
}
VTR_LOG("------------------------------------\n");
return gain;
}
From the logs, I observed the following gain values before condition_program_first2713^MIN~52-1[0] was added to the cluster:
gain += blk_gain[blk_id]: condition_program_first2713^MIN~52-0[0] 0.000000
gain += blk_gain[blk_id]: condition_program_first2713^MIN~52-1[0] 0.000000
gain += molecule->base_gain * 0.0001: 0.003840
gain -= num_introduced_inputs_of_indirectly_related_block * (0.001): 0.003840
------------------------------------
gain += blk_gain[blk_id]: condition_program_first5685^lNOT~77 0.000000
gain += molecule->base_gain * 0.0001: 0.000100
gain -= num_introduced_inputs_of_indirectly_related_block * (0.001): 0.000100
------------------------------------
gain += blk_gain[blk_id]: condition_program_first2713^MIN~53-0[0] 0.000000
gain += blk_gain[blk_id]: condition_program_first2713^MIN~53-1[0] 0.000000
gain += molecule->base_gain * 0.0001: 0.003840
gain -= num_introduced_inputs_of_indirectly_related_block * (0.001): 0.003840
------------------------------------
gain += blk_gain[blk_id]: condition_program_first5685^lNOT~77 0.000000
gain += molecule->base_gain * 0.0001: 0.000100
gain -= num_introduced_inputs_of_indirectly_related_block * (0.001): 0.000100
------------------------------------
gain += blk_gain[blk_id]: condition_program_first2713^ADD~77-0[0] 0.000000
gain += blk_gain[blk_id]: condition_program_first2713^ADD~77-1[0] 0.000000
gain += molecule->base_gain * 0.0001: 0.003840
gain -= num_introduced_inputs_of_indirectly_related_block * (0.001): 0.003840
------------------------------------
gain += blk_gain[blk_id]: condition_program_first5685^lNOT~77 0.000000
gain += molecule->base_gain * 0.0001: 0.000100
gain -= num_introduced_inputs_of_indirectly_related_block * (0.001): 0.000100
------------------------------------
gain += blk_gain[blk_id]: condition_program_first5685^lNOT~76 0.000000
gain += molecule->base_gain * 0.0001: 0.000100
gain -= num_introduced_inputs_of_indirectly_related_block * (0.001): 0.000100
------------------------------------
gain += blk_gain[blk_id]: condition_program_first5685^lNOT~77 0.000000
gain += molecule->base_gain * 0.0001: 0.000100
gain -= num_introduced_inputs_of_indirectly_related_block * (0.001): 0.000100
------------------------------------
The molecule condition_program_first2713^MIN~52-1[0] was added to the cluster with a gain value of 0.003840, indicating that it had the highest priority among the current candidates. However, there were no blocks related to condition_program_first2713 in the cluster prior to this addition. From worst slack in the final FPGA solution, I confirmed that this inappropriate addition caused the inferior result.
Analysis
The root cause seems to stem from the gain computation process. Why did this unrelated block receive such a high gain value during clustering?
gain += blk_gain[blk_id]: condition_program_first2713^MIN~52-0[0] 0.000000
gain += blk_gain[blk_id]: condition_program_first2713^MIN~52-1[0] 0.000000
gain += molecule->base_gain * 0.0001: 0.003840
gain -= num_introduced_inputs_of_indirectly_related_block * (0.001): 0.003840
As shown in the report, neither condition_program_first2713^MIN~52-0[0] nor condition_program_first2713^MIN~52-1[0] initially received any gain, but both ended up with a gain value of 0.003840 due to the statement gain += molecule->base_gain * 0.0001.
Possible Solution
To solve the problem, I think some of following work should be done:
(1) Code Revision: Investigate whether the statement gain += molecule->base_gain * 0.0001 may introduce a potential bug. If so, the computation steps for gain should be revised to ensure accurate calculation.
(2) Improvement in Gain Calculation Strategy: The current gain calculation strategy may overemphasize certain clusters, leading to undesirable slack, critical path delay, and wire length in the final result. A more flexible and effective method to compute gain may be necessary.
(3) Optimization Process: The unrelated blocks were also ignored in the optimization steps. A stronger optimization process could potentially resolve this issue.
I am attempting to obtain a high-performance P&R solution but encountered an unexpected detour situation. I hope the findings above can help improve VTR's performance. Thank you!
Your Environment
VTR revision used: VTR master Oct 4th, 2024
Operating System and version: Ubuntu 20.04
Compiler version: gcc 9.4.0, cmake 3.20.0
The text was updated successfully, but these errors were encountered:
Hi VTR team,
I've been investigating the root cause of the inferior solution reported in issue #2726. After analyzing the VTR source code and performing some active debugging, I discovered that unrelated blocks with a high
gain
value are being unexpectedly added into a cluster, which is calculated in thevpr/src/pack/cluster_util.cpp
file, specifically in the functionget_molecule_gain
. This issue occurs regardless of whether the “allow_unrelated_clustering” option is set to “auto” or “off.”Expected Behaviour
The molecules
condition_program_first2713^MIN~52-1[0]
andcondition_program_first2713^MIN~52-0[0]
should not be added to the condition_program_first5685 cluster, as they have no logical connection to it.Current Behaviour
I started by trying to localize the problem that leads to the block
condition_program_first2713^MIN~52-1[0]
being added to a cluster where all other blocks belong tocondition_program_first5685
. After some debugging, I focused on the following code snippet in thevpr/pack/cluster_util.cpp
file:The value of gain determines whether the molecule should be given a higher priority. I added several VTR_LOG statements as breakpoints to observe the gain value of
condition_program_first2713^MIN~52-1[0]
in the functionget_molecule_gain
:From the logs, I observed the following gain values before
condition_program_first2713^MIN~52-1[0]
was added to the cluster:The molecule
condition_program_first2713^MIN~52-1[0]
was added to the cluster with a gain value of 0.003840, indicating that it had the highest priority among the current candidates. However, there were no blocks related tocondition_program_first2713
in the cluster prior to this addition. From worst slack in the final FPGA solution, I confirmed that this inappropriate addition caused the inferior result.Analysis
The root cause seems to stem from the gain computation process. Why did this unrelated block receive such a high gain value during clustering?
As shown in the report, neither
condition_program_first2713^MIN~52-0[0
] norcondition_program_first2713^MIN~52-1[0]
initially received any gain, but both ended up with a gain value of 0.003840 due to the statementgain += molecule->base_gain * 0.0001
.Possible Solution
To solve the problem, I think some of following work should be done:
(1) Code Revision: Investigate whether the statement
gain += molecule->base_gain * 0.0001
may introduce a potential bug. If so, the computation steps for gain should be revised to ensure accurate calculation.(2) Improvement in Gain Calculation Strategy: The current gain calculation strategy may overemphasize certain clusters, leading to undesirable slack, critical path delay, and wire length in the final result. A more flexible and effective method to compute gain may be necessary.
(3) Optimization Process: The unrelated blocks were also ignored in the optimization steps. A stronger optimization process could potentially resolve this issue.
Steps to Reproduce
Run the following command:
(
--allow_unrelated_clustering off
is optional)Packing_Devices_from_Two_Seperate_Parts_of_a_Netlist_into_a_Single CLB.zip
EArch_fixed_160_230_no_power.zip
The files related to problem above are:
cluster_util.cpp.txt
vpr_stdout_standard.log
vpr_stdout_allow_unrelated_clustering_off.log
Context
I am attempting to obtain a high-performance P&R solution but encountered an unexpected detour situation. I hope the findings above can help improve VTR's performance. Thank you!
Your Environment
The text was updated successfully, but these errors were encountered: