At the moment I am compiling the KiCad release version to compare speed. Some of my source changes behave well; others affect speed in the opposite way to what I expect. Is there a simple way to see and/or set the gcc switches (probably -O3 for the release) through a make / cmake parameter? Is there anything prepared to produce a mixed source/assembly listing from a single file.cpp?
Another thing: there are warnings about unused variables, but I cannot see any warnings about type mismatches (-Wall?).
OK, everything went fine after editing flags.make in the correct directory. The downside was a long compile time and a huge listing for all files instead of only the single file of interest.
You can generally pass in compile flags through -DCMAKE_CXX_FLAGS when configuring, and you can switch between Debug and Release builds with the usual -DCMAKE_BUILD_TYPE=Debug parameter (default is Release).
We have quite a lot of type mismatches, most of them seem to be harmless, and the general policy seems to be to enable warnings by default only after cleaning them up (which is how -Wsuggest-override and -Werror=vla happened: someone enabled them locally, fixed all the problems, then submitted that and a default warning flag).
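To make the kind of mismatch under discussion concrete, here is a minimal sketch. The function names and the file name are made up for illustration and are not taken from the KiCad sources. With g++, -Wsign-compare is part of -Wall for C++ code, while -Wconversion is in neither -Wall nor -Wextra and has to be requested explicitly.

```cpp
// Compile with, for example (file name is hypothetical):
//   g++ -std=c++17 -O3 -Wall -Wextra -Wconversion -c mismatch.cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Sloppy version: valid C++, but with two silent type mismatches.
int sum_scaled_sloppy(const std::vector<int>& values, double factor)
{
    int sum = 0;
    // 'i' is a signed int, size() returns an unsigned type: -Wsign-compare.
    for (int i = 0; i < values.size(); ++i)
    {
        // The double product is truncated to int: reported by -Wconversion.
        int scaled = values[i] * factor;
        sum += scaled;
    }
    return sum;
}

// Clean version: same result, with the types and the truncation spelled out.
int sum_scaled_clean(const std::vector<int>& values, double factor)
{
    int sum = 0;
    for (std::size_t i = 0; i < values.size(); ++i)
        sum += static_cast<int>(values[i] * factor);
    return sum;
}

// Example use: both versions agree, only the diagnostics differ.
void example()
{
    const std::vector<int> values = {1, 2, 3};
    assert(sum_scaled_sloppy(values, 1.5) == sum_scaled_clean(values, 1.5));
}
```

Passing the extra flags through -DCMAKE_CXX_FLAGS as described above would surface the same diagnostics across the whole tree, which is likely to be noisy.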
What kind of mixed listing do you want? C and assembler code mixed can be extracted with objdump -drS on the object files (.cpp.o below CMakeFiles).
The flags.make file is generated by CMake from the CMAKE_CXX_FLAGS setting (and a few others).
I was looking at execution speed. Object files from the debug build behave very differently because they are not optimized with -O3, so I switched to CMAKE_BUILD_TYPE=RELEASE. Against my expectations, the float calculation executed 2-3 times (!) faster than the integer one. As I don't use a debugger, a look into the asm listing shed some light on this. This seems to hold for Haswell and later MPUs, but on Atom and below integer will probably be faster. While examining in the listing what gcc does with different integer types, I also stumbled across my own sloppy type mismatch.
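The measured code is not shown in this thread, so the following is only a rough sketch of how such an integer-versus-float comparison could be timed. The two kernels, the helper time_ns and the file name are all hypothetical, and the numbers will vary with CPU, compiler version and flags.

```cpp
// Build optimized, for example (file name is hypothetical):
//   g++ -std=c++17 -O3 -g -c bench.cpp
#include <cassert>
#include <chrono>
#include <cstdint>
#include <cstdio>
#include <vector>

// Hypothetical kernel: scale every value by num/den in integer arithmetic.
std::int64_t scale_int(const std::vector<std::int32_t>& in,
                       std::int32_t num, std::int32_t den)
{
    std::int64_t sum = 0;
    for (std::int32_t v : in)
        sum += (static_cast<std::int64_t>(v) * num) / den;
    return sum;
}

// Hypothetical kernel: the same scaling in floating point.
double scale_float(const std::vector<std::int32_t>& in, double factor)
{
    double sum = 0.0;
    for (std::int32_t v : in)
        sum += v * factor;
    return sum;
}

// Wall-clock duration of a single call to f, in nanoseconds.
template <typename F>
long long time_ns(F&& f)
{
    const auto start = std::chrono::steady_clock::now();
    f();
    const auto stop = std::chrono::steady_clock::now();
    return std::chrono::duration_cast<std::chrono::nanoseconds>(stop - start).count();
}

// Example use: time both kernels on the same data and print the results.
void report()
{
    const std::vector<std::int32_t> data(1000000, 30);
    std::int64_t int_sum = 0;
    double float_sum = 0.0;
    const long long int_ns = time_ns([&] { int_sum = scale_int(data, 3, 2); });
    const long long float_ns = time_ns([&] { float_sum = scale_float(data, 1.5); });
    // Both kernels should agree on this input.
    assert(int_sum == static_cast<std::int64_t>(float_sum));
    // Printing the sums keeps the results in use, so the work is not discarded.
    std::printf("int:   %lld ns (sum %lld)\n", int_ns, static_cast<long long>(int_sum));
    std::printf("float: %lld ns (sum %.1f)\n", float_ns, float_sum);
}
```

A single run like this is noisy; repeating the measurement several times and comparing the fastest runs gives a steadier picture.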
Starting from the Pentium generation in the 1990s, the FPU has been part of the processor pipeline; from the Pentium Pro onward, FPU instructions are executed out of order alongside integer math and have their own data dependency tracking.
What really slows things down is the requirement for Debug builds to update memory so the debugger can inspect values – this is implicit synchronization between floating and integer calculation.
Older CPUs benefit greatly from there being a lot more register space for floating point. Newer CPUs add more integer registers and at the same time add vector instructions (so even more floating point registers), so floating point being faster than integers is not a big surprise.
In any case, for KiCad it doesn’t really matter, speed is mostly determined by memory access times and how well the compiler can avoid memory accesses – we don’t calculate that much.
If you want to help speed things up, compile with profiling enabled and look where we spend lots of CPU time – string comparisons that could go through a hash table, nested loops where inner loops could operate on smaller numbers of elements, stuff like that.
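As an illustration of the "string comparisons that could go through a hash table" case, here is a minimal sketch. NamedItem and the function names are hypothetical and not taken from the KiCad sources; the only point is to contrast a per-lookup linear scan with an index that is built once.

```cpp
#include <cassert>
#include <cstddef>
#include <string>
#include <unordered_map>
#include <vector>

// Hypothetical record that is looked up by name.
struct NamedItem
{
    std::string name;
    int         value;
};

// Linear search: up to one string comparison per element on every lookup.
const NamedItem* find_linear(const std::vector<NamedItem>& items,
                             const std::string& name)
{
    for (const NamedItem& item : items)
        if (item.name == name)
            return &item;
    return nullptr;
}

// Build the index once: name -> position in 'items'.
std::unordered_map<std::string, std::size_t>
build_index(const std::vector<NamedItem>& items)
{
    std::unordered_map<std::string, std::size_t> index;
    index.reserve(items.size());
    for (std::size_t i = 0; i < items.size(); ++i)
        index.emplace(items[i].name, i); // on duplicate names the first entry wins
    return index;
}

// Hashed lookup: one hash computation plus, typically, one string comparison.
const NamedItem* find_hashed(const std::vector<NamedItem>& items,
                             const std::unordered_map<std::string, std::size_t>& index,
                             const std::string& name)
{
    const auto it = index.find(name);
    return it == index.end() ? nullptr : &items[it->second];
}

// Example use: both lookups return the same element.
void example_lookup()
{
    const std::vector<NamedItem> items = {{"GND", 0}, {"VCC", 1}, {"NET1", 2}};
    const auto index = build_index(items);
    assert(find_hashed(items, index, "VCC") == find_linear(items, "VCC"));
}
```

The index only pays off when the same collection is searched many times, and it has to be rebuilt or updated whenever the items change, so a profile should confirm the lookup is actually hot before restructuring code this way.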