Following the resounding success on ARMs, let's try LTO on AVRs,
too. The advantage isn't all that great: binary size increases by
462 bytes and even an additional byte of RAM is needed.
According to @Wurstnase's research, this size increase is pretty
unique to the config.h.Profiling configuration. All other
configurations he tried actually showed a size drop.
Anyway, each step now takes 15 to 17 clock cycles less, which is a
general stepping performance increase of about 7%.
  ATmega sizes            '168   '328(P)   '644(P)   '1280
  Program:  18078 bytes   127%       59%       29%     15%
  Data:      2176 bytes   213%      107%       54%     27%
  EEPROM:      32 bytes     4%        2%        2%      1%
  short-moves.gcode statistics:
    LED on occurences: 888.
    LED on time minimum: 202 clock cycles.
    LED on time maximum: 380 clock cycles.
    LED on time average: 232.092 clock cycles.
  smooth-curves.gcode statistics:
    LED on occurences: 23648.
    LED on time minimum: 220 clock cycles.
    LED on time maximum: 423 clock cycles.
    LED on time average: 255.22 clock cycles.
  triangle-odd.gcode statistics:
    LED on occurences: 1636.
    LED on time minimum: 220 clock cycles.
    LED on time maximum: 380 clock cycles.
    LED on time average: 245.575 clock cycles.
After researching this issue for the third time, I finally found
a proper solution: one can't keep an entire section without
rewriting the entire linker script, but one can keep individual
symbols. That's what we do now, so we can use --gc-sections when
linking with SimulAVR support.
The problem came up again because -flto drops unused symbols, too.
This commit changes binary size drastically (1654 bytes less), so
let's take a new performance measurement snapshot:
  ATmega sizes            '168   '328(P)   '644(P)   '1280
  Program:  17616 bytes   123%       58%       28%     14%
  Data:      2175 bytes   213%      107%       54%     27%
  EEPROM:      32 bytes     4%        2%        2%      1%
  short-moves.gcode statistics:
    LED on occurences: 888.
    LED on time minimum: 218 clock cycles.
    LED on time maximum: 395 clock cycles.
    LED on time average: 249.051 clock cycles.
  smooth-curves.gcode statistics:
    LED on occurences: 23648.
    LED on time minimum: 237 clock cycles.
    LED on time maximum: 438 clock cycles.
    LED on time average: 272.216 clock cycles.
  triangle-odd.gcode statistics:
    LED on occurences: 1636.
    LED on time minimum: 237 clock cycles.
    LED on time maximum: 395 clock cycles.
    LED on time average: 262.572 clock cycles.
Our standard performance test is to run these three G-code files
in SimulAVR and record step pulse timings. While this certainly
doesn't cover every possible performance measurement, it's a good
baseline for comparing code changes.
Current performance:
  ATmega sizes            '168   '328(P)   '644(P)   '1280
  Program:  19808 bytes   139%       65%       32%     16%
  Data:      2191 bytes   214%      107%       54%     27%
  EEPROM:      32 bytes     4%        2%        2%      1%
  short-moves.gcode statistics:
    LED on occurences: 888.
    LED on time minimum: 308 clock cycles.
    LED on time maximum: 729 clock cycles.
    LED on time average: 317.393 clock cycles.
  smooth-curves.gcode statistics:
    LED on occurences: 23648.
    LED on time minimum: 308 clock cycles.
    LED on time maximum: 726 clock cycles.
    LED on time average: 354.825 clock cycles.
  triangle-odd.gcode statistics:
    LED on occurences: 1636.
    LED on time minimum: 308 clock cycles.
    LED on time maximum: 719 clock cycles.
    LED on time average: 336.327 clock cycles.
What 'make size' previously reported was misleading, because
it didn't count the .data section as Flash usage. However, this
section is actually written to Flash.
The .data section holds the data needed for initialising
variables. As such, it counts towards both Flash and RAM usage.
Nice verification: reported 'Program' size now matches upload
size reported by avrdude exactly.
There's now the tool 'avr-size', which makes reading such stuff
much easier:
  avr-size -C build/teacup.elf
Example output:
  AVR Memory Usage
  ----------------
  Device: Unknown

  Program:   23704 bytes
  (.text + .data + .bootloader)

  Data:       1543 bytes
  (.data + .bss + .noinit)

  EEPROM:       32 bytes
  (.eeprom)
For now this is just a number of different configurations and a
makefile target, "make regressiontests", to build with them.
Further tests, e.g. using SimulAVR or the host-side simulator
to check actual behaviour of the firmware, are welcome.
It's also possible to do this by stringifying MCU, but this
requires two levels of macro indirection, which isn't easily
readable in a .c file. For stringification, see the bottom example at
https://gcc.gnu.org/onlinedocs/cpp/Stringification.html
- "Traumflug" and "Markus Hitter" are the same, mention him only
once.
- Add more common F_CPU choices in comments.
- Hint to another choice in Makefile-example.
Move builds for non-avr target (simulator) into a $(BUILD_FLAVOR)
build subdir (build/sim) to isolate it more completely and
cleanly from the AVR builds. This allows AVR and SIM to use common
build rules again.
Move newly common bits out of Makefile-{SIM,AVR} and into Makefile-common.
This shouldn't change the running binary at all, so it should do no
harm. However, it allows running Teacup inside SimulAVR and accessing
Teacup's serial line through the console/terminal.
For detailed instructions, see http://reprap.org/wiki/SimulAVR .
Also, some makefile cleanup:
- Remove obsolete 'depend' target.
- Move AVR-specific targets to AVR makefile.
- Add TARGET variable to identify target to make and to clean.
- Tidy up dependency make.
This code was accidentally removed long ago in a botched merge. This
patch recovers it and makes it build again. I've done minimal testing
and some necessary cleanup. It compiles and runs, but it probably still
has a few dust bunnies here and there.
I added registers and pin definitions to simulator.h and
simulator/simulator.c which I needed to match my Gen7-based config.
Other configs or non-AVR ports will need to define more or different
registers. Some registers are 16-bit, some are 8-bit, and some are just
constant values (enums). A more clever solution would read in the
chip-specific header and produce saner definitions which covered all
GPIOs. But this commit just takes the quick and easy path to support my
own hardware.
Most of this code originated in these commits:
commit cbf41dd4ad
Author: Stephan Walter <stephan@walter.name>
Date: Mon Oct 18 20:28:08 2010 +0200
document simulation
commit 3028b297f3
Author: Stephan Walter <stephan@walter.name>
Date: Mon Oct 18 20:15:59 2010 +0200
Add simulation code: use "make sim"
Additional tweaks:
Revert va_args processing for AVR, but keep 'int' generalization
for simulation. gcc wasn't lying. The sim really aborts without this.
Remove delay(us) from simulator (obsolete).
Improve the README.sim to demonstrate working pronterface connection
to sim. Also fix the build instructions.
Appease all stock configs.
Stub out intercom and shush usb_serial when building simulator.
Pretend to be all chip-types for config appeasement.
Replace sim_timer with AVR-simulator timer:
The original sim_timer and sim_clock provided direct replacements
for timer/clock.c in the main code. But when the main code changed,
simcode did not. The main clock.c was dropped and merged into timer.c.
Also, the timer.c now has movement calculation code in it in some
cases (ACCELERATION_TEMPORAL) and it would be wrong to teach the
simulator to do the same thing. Instead, teach the simulator to
emulate the AVR Timer1 functionality, reacting to values written to
OCR1A and OCR1B timer comparison registers.
Whenever OCR1A/B are changed, the sim_setTimer function needs to be
called. It is called automatically after a timer event, so changes
within the timer ISRs do not need to bother with this.
A C++ class could make this requirement go away by noticing the
assignment. On the other hand, a chip-agnostic timer.c would help
make the main code more portable. The latter cleanup is probably
better for us in the long run.
We now have Makefile-AVR (AVR-specific stuff) and
Makefile-common (common build instructions).
This effort is the beginning of preparing Teacup for ARM targets.
To build the target, copy or link Makefile-AVR or Makefile-ARM
(depending on your target) to Makefile.