
So, I face this question every day. When I write code, sometimes I just swap some conditionals around to clean up the spaghetti I make, or replace one datatype with another, and suddenly my code runs much faster.

I'm not quite experienced enough to understand why, but my guess is that most of the time this is due to the optimizer picking up some specific pattern in the code that it can optimize.

The thing is that I prefer to develop my applications on Linux, but I also target Windows. It's honestly a hassle testing on both platforms, so I just keep working on Linux and only test on Windows later.

That always leaves me with the question: "What if that change made it slower on MSVC?"

I wonder if there's any rule of thumb for when optimizations tend to generalize across compilers, or if the only way to know is by profiling.

kale hops
  • Changing the `if` condition around could also make the path easier to predict, and a more predictable path is often orders of magnitude faster. – user4581301 Jun 01 '22 at 18:30
  • My rule is to write for maintainability/readability, not performance. Easy to understand code means easy to optimize code – NathanOliver Jun 01 '22 at 18:32
  • Ultimately, you need to decide if you care this much about performance or not. If you do, you'll have to measure it on all compilers you care about, to be sure. If not, one less thing to worry about. – HolyBlackCat Jun 01 '22 at 18:41
  • If you have to rely on compiler optimizations to reach usable runtime speeds I guess you are really pushing the limits. I second maintainability. I don't think that you should rely on optimization always working the same. Since there is no standard for that, you could lose all runtime gains with a new compiler release optimizing your code a little differently. – PhilMasteG Jun 01 '22 at 18:41
  • Before by-hand fine-tune micro-optimizing, make sure you profile the before and the after. One of my coworkers micro-optimized a very lengthy routine, and made it about x10 larger, and about x100 slower, and x1000 less legible. He didn't believe me until I profiled the code before and the code after. (Both of us were fairly new to having an optimizing compiler, which back in the day were very expensive.) – Eljay Jun 01 '22 at 18:46
  • Compiler makers have an interest in making cross-platform code fast, and the principles are the same, so it usually works out. However, when in doubt, check with godbolt.org. You don't have to be great at assembly to see whether one version is vectorized while another is not or something similar – Homer512 Jun 01 '22 at 18:47
  • This would have been a reasonable question in 2006. 2022 > `The compiler is smarter than you!` In 1998 I could outsmart the Watson optimizing compiler. Today I don't stand a chance. – Captain Giraffe Jun 01 '22 at 18:56
  • Ok, so to be clear, I'm actually quite satisfied with the performance of my code, but in the process of clean up and refactoring I just kept getting more unexpected speedups from minor changes like moving an if from one place to another, which left me confused. – kale hops Jun 01 '22 at 19:20
  • If a change makes the source logic closer to what's efficient in asm on current CPUs, then it will often (but not always) help on most compilers. If not, then maybe it just happened to help one compiler see something clever, or influence its branch-likelihood guessing in one direction or the other for how it lays out the source. (Not-taken branches are often faster). – Peter Cordes Jun 02 '22 at 02:00
  • See [Why does C++ code for testing the Collatz conjecture run faster than hand-written assembly?](https://stackoverflow.com/a/40355466) and [What is the efficient way to count set bits at a position or lower?](https://stackoverflow.com/a/34410357) re: writing source to hand-hold the compiler in the direction of making more efficient asm. – Peter Cordes Jun 02 '22 at 02:01
  • If you're on a Skylake CPU specifically, one thing that can cause seemingly random variations in performance with minor source changes is the microcode mitigation for the JCC erratum, which introduced new performance potholes that compilers only specifically avoid if you tell them to. See [How can I mitigate the impact of the Intel jcc erratum on gcc?](https://stackoverflow.com/q/61256646) (also covers clang and MSVC) - if those options make both versions perform nearly equal, it was very likely that recent CPU change that stopped part of a tight loop from using the uop cache. – Peter Cordes Jun 02 '22 at 02:04

0 Answers