This repository has been archived by the owner on Mar 12, 2024. It is now read-only.

Reduce the space complexity of the HungarianMatcher module. #606

Open · wants to merge 1 commit into main

Conversation

aioaneid

The memory reduction factor of the cost matrix is sum(#target objects) / max(#target objects).

That is achieved by no longer computing or storing matching costs between predictions and targets at different positions within the batch. More precisely, the original cost matrix of shape [batch_size * queries, sum(#target objects)] is shrunk to a tensor of shape [batch_size, queries, max(#target objects)].
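The idea can be sketched as follows. This is a minimal numpy/scipy illustration of the reshaping described above, not the PR's actual PyTorch code; the function name, the L1-only cost, and the shapes are assumptions made for the example:

```python
# Sketch of the per-batch cost layout: instead of one large
# [batch * queries, sum(n_i)] cost matrix, build a padded
# [batch, queries, max(n_i)] tensor and run the Hungarian
# algorithm separately on each batch element's valid slice.
import numpy as np
from scipy.optimize import linear_sum_assignment

def match_per_batch(pred_boxes, target_boxes_list):
    """pred_boxes: [batch, queries, 4]; target_boxes_list: list of [n_i, 4]."""
    batch, queries, _ = pred_boxes.shape
    max_targets = max(t.shape[0] for t in target_boxes_list)
    # Padded cost tensor -- costs between predictions and targets at
    # *different* batch positions are never computed or stored.
    cost = np.zeros((batch, queries, max_targets))
    matches = []
    for b, tgt in enumerate(target_boxes_list):
        n = tgt.shape[0]
        # Example cost: L1 distance between each query of image b and
        # each target of image b only (shape [queries, n]).
        cost[b, :, :n] = np.abs(
            pred_boxes[b, :, None, :] - tgt[None, :, :]
        ).sum(-1)
        # Solve the assignment on the valid [queries, n] slice.
        row, col = linear_sum_assignment(cost[b, :, :n])
        matches.append((row, col))
    return matches
```

Since the Hungarian algorithm is run independently per image anyway, the off-diagonal blocks of the original matrix were never used; dropping them reduces the cost-matrix memory by a factor of sum(#target objects) / max(#target objects).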

Besides allowing much larger batch sizes, this change also yields speedups. Tested on the table structure recognition task using the Table Transformer (TATR) (125 queries, 7 classes) with PubMed data, it results in a) a small but meaningful speedup on CUDA at all batch sizes and on CPU at small batch sizes, and b) much larger speedups on CPU at larger batch sizes.

The processing time reduction computed as (1 - new_time / old_time) is shown below in various configurations:

| Batch size | CUDA  | CPU   |
|-----------:|------:|------:|
|          1 |  8.2% |  1.6% |
|          2 |  1.6% |  9.3% |
|          3 |  1.6% |  7.7% |
|          4 |  0.9% | 11.2% |
|          5 |  0.8% | 13.9% |
|          6 |  0.9% | 15.5% |
|          7 |  0.9% | 23.1% |
|          8 |       | 47.1% |
|         16 |       | 70.6% |
|         32 |       | 88.3% |
|         64 |       | 95.0% |

@facebook-github-bot added the CLA Signed label Sep 18, 2023
@aioaneid aioaneid changed the title Reduce HungarianMatcher's space complexity. Reduce space complexity of HungarianMatcher. Sep 18, 2023
@aioaneid aioaneid changed the title Reduce space complexity of HungarianMatcher. Reduce the space complexity of HungarianMatcher. Sep 18, 2023
@aioaneid aioaneid changed the title Reduce the space complexity of HungarianMatcher. Reduce the space complexity of the HungarianMatcher module. Sep 18, 2023
3 participants