Assembly Optimization Tips by Mark Larson

(masm32.com)

13 points | by htfy96 4 days ago

4 comments

  • ghaff 1 hour ago
    A fascinating peek into the fairly deep past (sigh) is Abrash's Zen of Assembly Language. Time pretty much overtook the planned Volume 2, but Volume 1 is still a fascinating read about a time when tuning for prefetch queues and the like was still a thing.
  • optymizer 22 minutes ago
    What's a good resource like this for modern CPUs (especially ARM)?
  • mshockwave 1 hour ago
    > (Intermediate)1. Adding to memory faster than adding memory to a register

    I'm not familiar with the Pentium, but my guess is that a memory store is relatively cheaper than a load on many modern (out-of-order) microarchitectures.

    > (Intermediate)14. Parallelization.

    I feel like this is where compilers come in handy, because juggling critical paths and resource pressure at the same time sounds like a nightmare to me.
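
    To make the critical-path point concrete, here is a minimal sketch in C rather than assembly. The function names `sum_serial` and `sum_4way` are made up for illustration and do not come from the article. The first loop forms one long dependency chain through a single accumulator; the second splits the work across four independent accumulators so that a superscalar core can, in principle, overlap the adds. Whether it is actually faster depends on the CPU and on what the compiler already does with the simple version.

```c
#include <stddef.h>
#include <stdint.h>

/* One accumulator: each add depends on the result of the previous add,
   so the loop's critical path is n dependent additions long. */
uint64_t sum_serial(const uint32_t *a, size_t n) {
    uint64_t s = 0;
    for (size_t i = 0; i < n; i++) {
        s += a[i];
    }
    return s;
}

/* Four accumulators: the four adds in the loop body do not depend on
   each other, only on their own accumulator from the previous iteration. */
uint64_t sum_4way(const uint32_t *a, size_t n) {
    uint64_t s0 = 0, s1 = 0, s2 = 0, s3 = 0;
    size_t i = 0;
    for (; i + 4 <= n; i += 4) {
        s0 += a[i];
        s1 += a[i + 1];
        s2 += a[i + 2];
        s3 += a[i + 3];
    }
    /* Handle the 0..3 leftover elements. */
    for (; i < n; i++) {
        s0 += a[i];
    }
    return s0 + s1 + s2 + s3;
}
```

    Integer addition is used here so that both versions return exactly the same result; with floating point, reordering the adds can change the rounding, which is one reason compilers will not always make this transformation on their own.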

    > (Advanced)4. Interleaving 2 loops out of sync

    Software pipelining!
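
    For readers who have not met the term, a rough sketch of software pipelining in C follows. The names `scale_plain` and `scale_pipelined` are hypothetical, and this is an illustration of the general idea, not the article's own example. The loop is restructured so that the load for iteration i+1 is issued before the compute and store of iteration i, with a prologue and epilogue to fill and drain the "pipeline". It assumes `x` and `y` do not overlap.

```c
#include <stddef.h>

/* Straightforward loop: load, compute, and store all belong to the
   same iteration. */
void scale_plain(const int *restrict x, int *restrict y, size_t n, int k) {
    for (size_t i = 0; i < n; i++) {
        y[i] = x[i] * k + 1;
    }
}

/* Hand-pipelined version: each trip through the loop mixes work from
   two different iterations of the original loop. */
void scale_pipelined(const int *restrict x, int *restrict y, size_t n, int k) {
    if (n == 0) {
        return;
    }
    int cur = x[0];                 /* prologue: load for iteration 0 */
    for (size_t i = 0; i + 1 < n; i++) {
        int next = x[i + 1];        /* load stage of iteration i+1 */
        y[i] = cur * k + 1;         /* compute/store stage of iteration i */
        cur = next;
    }
    y[n - 1] = cur * k + 1;         /* epilogue: finish the last iteration */
}
```

    On an out-of-order core the hardware already overlaps much of this by itself, and optimizing compilers can apply the transformation too, so hand-pipelining like this mostly matters on in-order or statically scheduled machines.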

  • fwip 2 hours ago
    Looks like this was written in 2004, or thereabouts.
    • nickelas 1 hour ago
      I was wondering why it said P4. That's an old processor.