View Full Version : HPC - Branching Instruction Reduction (Inlining)

2012-Sep-22, 06:05 PM
This is a technique I would only recommend to someone who understands how their specific compiler actually works and which optimizations it performs itself. Some compilers (COBOL II and later, HLASM, etc.) already perform this one automatically, so there is no need with them; some, like C#, do not.

This optimization also goes against code reusability and maintainability principles. It should only be considered where performance is a must, and it is usually best to contain it to a one-time or limited-use situation.

Mainly it involves going through your code, removing function/method calls, and copying the called code back to where it's used, inline.
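
To make that concrete, here is a minimal sketch in C (my choice of language; the names scale_and_offset, sum_with_call and sum_inlined are made up for illustration). The first version calls a small helper on every loop iteration, the second has the helper's body pasted in by hand. Keep in mind that an optimizing C compiler will very likely inline a small static function like this on its own, so the manual version only matters where your compiler does not.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical helper: the small function we would otherwise call per element. */
static long scale_and_offset(long x, long scale, long offset)
{
    return x * scale + offset;
}

/* Version A: one function call per loop iteration. */
long sum_with_call(const long *xs, size_t n, long scale, long offset)
{
    assert(xs != NULL || n == 0);
    long total = 0;
    for (size_t i = 0; i < n; i++)
        total += scale_and_offset(xs[i], scale, offset);
    return total;
}

/* Version B: the helper's body copied into the loop by hand.
 * No call, no return, no parameter passing inside the loop. */
long sum_inlined(const long *xs, size_t n, long scale, long offset)
{
    assert(xs != NULL || n == 0);
    long total = 0;
    for (size_t i = 0; i < n; i++)
        total += xs[i] * scale + offset;
    return total;
}
```

Both versions return the same result; the only difference is whether the arithmetic sits behind a call. If the formula ever changes, version B has to be found and edited separately, which is the maintainability cost mentioned above.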

Its usefulness is limited to functions/methods that are called from frequently repeated loops or code.

This will not yield huge gains, but it will eliminate 2 to 4 instructions per function/method call converted to inline (it eliminates a branch and a return, plus the instructions involved in passing and returning parameters).

In some large loops this might be a benefit. But again, it's a very specific optimization that I would not normally recommend in day-to-day coding. For HPC, however, it might be something to consider if maintainability is less of an issue than speed of operation.

2012-Sep-24, 12:17 AM
One major disadvantage is that if it's done manually, it's going to introduce a source of bugs, as it is likely that one or more of the identical statements or blocks of code won't get changed when the others are. This is where #include statements are very handy, as cpp or its logical equivalents will do the copy & paste for you.
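
A rough sketch of letting the preprocessor do the pasting, in C. A single-file example can't show a separate #include'd fragment, so a function-like macro stands in for it here; the idea is the same, one definition that the preprocessor copies into each use site. The names SCALE_AND_OFFSET, sum_scaled and max_scaled are made up for illustration.

```c
#include <assert.h>
#include <stddef.h>

/* Single definition of the repeated code. The preprocessor pastes it
 * textually at every use, so there is no call at run time and only
 * one place to edit. Each argument appears once, so it is evaluated once. */
#define SCALE_AND_OFFSET(x, scale, offset) ((x) * (scale) + (offset))

/* First use site: sum of the transformed elements. */
long sum_scaled(const long *xs, size_t n, long scale, long offset)
{
    assert(xs != NULL || n == 0);
    long total = 0;
    for (size_t i = 0; i < n; i++)
        total += SCALE_AND_OFFSET(xs[i], scale, offset);
    return total;
}

/* Second use site: largest transformed element (requires n > 0). */
long max_scaled(const long *xs, size_t n, long scale, long offset)
{
    assert(xs != NULL && n > 0);
    long best = SCALE_AND_OFFSET(xs[0], scale, offset);
    for (size_t i = 1; i < n; i++) {
        long v = SCALE_AND_OFFSET(xs[i], scale, offset);
        if (v > best)
            best = v;
    }
    return best;
}
```

Changing the formula in the macro changes both loops at once, which avoids the "forgot to update one copy" bug that hand inlining invites.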

More generally, knowing how to break up a problem into functions or subroutines is not trivial. There is some overhead involved with function calls; how much is compiler dependent. Luckily, most of my high-intensity programming has been done in Fortran, whose compilers tend to have very good optimizers (optimized Fortran was beating the speed of the best hand-coded assembler by the late 1980s), so I've never had to worry about inlining. Usually those kinds of optimizations don't help enough to outweigh the downstream maintenance problems (z/OS didn't have a Fortran pre-processor, and cpp usually munges up Fortran code).

2012-Sep-24, 07:15 PM
Yes, including is definitely a big help with this when it's available. Likewise, most compilers do a decent job of inlining on their own.

In the case of .NET, the inlining is done at run time, when the MSIL is compiled to machine code. It keeps statistics on each function that's been JIT'd, and will inline based on those statistics when another function that uses the already converted ones is converted.

However, compilers do not do quite as well with inlining when calls cross .dll or .exe boundaries. This is the one case I can think of where inlining (or including) a function from a dll or callable program into the main one might be useful on a very rare occasion. I'm assuming that JITs do a better job of inlining across .dll boundaries, but I was not able to determine whether that is the case with a little research on my end.