Performance Papers
CPU performance papers, often Intel or x86 specific.
Non-Determinism and Overcount on Modern Hardware Performance Counter Implementations
This paper describes cases where x86 PMU counters are inexact, usually due to overcounting when external events occur.
Attack Directories, Not Caches: Side-Channel Attacks in a Non-Inclusive World
This paper describes in some detail the structure of the SKX (Skylake-SP, Skylake-X, etc.) non-inclusive L3 cache, including the snoop filter structure. It is interesting even apart from any cache side-channel possibilities.
Reverse Engineering of Cache Replacement Policies in Intel Microprocessors and Their Evaluation
This paper describes PLRU replacement models for the CPU cache, including experimental evaluation of the Core 2 Duo caches, with the result that three different strategies appear to be used across three models.
BlackjackBench: Portable Hardware Characterization with Automated Results Analysis
This paper presents micro-benchmarks designed to suss out hardware details such as cache sizes and instruction latencies, and describes how the results are interpreted automatically.
Learning to Superoptimize Real-world Programs
This paper applies AI to superoptimization and introduces a "Big Assembly" benchmark of 25K kernels extracted from open source projects.
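
For readers unfamiliar with the PLRU models mentioned in the cache replacement entry above, here is a minimal sketch of a generic tree-based pseudo-LRU for a single cache set. It is an illustration of one common PLRU variant only, not necessarily any of the policies the paper identifies; the class name `TreePLRU` and its methods are hypothetical.

```python
class TreePLRU:
    """Generic tree-based pseudo-LRU for one cache set (illustrative only)."""

    def __init__(self, ways):
        # The binary tree needs a power-of-two number of ways.
        assert ways >= 2 and ways & (ways - 1) == 0
        self.ways = ways
        # One bit per internal node: 0 means the next victim is in the
        # left subtree, 1 means it is in the right subtree.
        self.bits = [0] * (ways - 1)

    def touch(self, way):
        """Record an access: point every node on the path away from `way`."""
        node, lo, hi = 0, 0, self.ways
        while hi - lo > 1:
            mid = (lo + hi) // 2
            if way < mid:
                self.bits[node] = 1      # accessed left, so victim is on the right
                node, hi = 2 * node + 1, mid
            else:
                self.bits[node] = 0      # accessed right, so victim is on the left
                node, lo = 2 * node + 2, mid

    def victim(self):
        """Follow the bits from the root to the way that would be evicted."""
        node, lo, hi = 0, 0, self.ways
        while hi - lo > 1:
            mid = (lo + hi) // 2
            if self.bits[node] == 0:
                node, hi = 2 * node + 1, mid
            else:
                node, lo = 2 * node + 2, mid
        return lo


plru = TreePLRU(4)
for w in (0, 1, 2, 3):
    plru.touch(w)
first_victim = plru.victim()   # way 0, which is also the true LRU way here
plru.touch(0)
second_victim = plru.victim()  # way 2, although true LRU would pick way 1
```

The second eviction shows why PLRU is only an approximation: with 3 bits for 4 ways it cannot remember the full access order, so it can evict a way that is not the least recently used.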