These values were queued up just for finding out individual axis
speeds in dda_find_crossing_speed(). Let's do this calculation
with other available movement properties and save 16 bytes of RAM
per movement queue entry.
The first version of this commit forgot to take care of the feedrate
sign (prevF, currF). @Wurstnase spotted the omission and also
suggested tweaking the calculation of 'dv' to handle it.
Setting the sign immediately after calculating the absolute values
was tried, too, but that resulted in larger (= slower) code.
Binary size is down 132 bytes, two loops among that. RAM usage is
down 256 bytes for the standard test case:
    ATmega sizes               '168   '328(P)   '644(P)     '1280
    Program:  17944 bytes      126%       59%       29%       14%
       Data:   1920 bytes      188%       94%       47%       24%
     EEPROM:     32 bytes        4%        2%        2%        1%
We calculate a safe join speed in dda_join_moves using data from
two source DDA movements. We ensure the DDA values we use are sane
by atomically copying them to local variables before beginning our
calculation. But later we discard all our results if the DDA went
live in the meantime, as evidenced by changes in `DDA->live` or
`DDA->id`.
Since we will not use the results of our calculations if either of
these change, we can safely reference all the other DDA values
non-atomically. Change the ATOMIC section to protect only the
`DDA->id` values at the start.
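The pattern described above can be sketched roughly as follows. This is
a minimal illustration only, not the code of dda_join_moves(): the
trimmed-down DDA struct, the end_speed field, join_moves_sketch() and
the empty ATOMIC_START/ATOMIC_END placeholders are assumptions made for
this example, and wrapping the final check in an atomic section is an
assumption as well.

```c
#include <stdbool.h>
#include <stdint.h>

/* Placeholders: on the target these would disable and restore
   interrupts. They are left empty so the sketch builds anywhere. */
#ifndef ATOMIC_START
  #define ATOMIC_START
  #define ATOMIC_END
#endif

/* Hypothetical, trimmed-down stand-in for the real DDA struct. */
typedef struct {
  volatile uint8_t live;  /* set once the movement has started */
  volatile uint8_t id;    /* changes when the queue slot is reused */
  uint32_t end_speed;     /* illustrative field only */
} DDA;

/* Returns true if the calculated result was stored, false if it was
   discarded because one of the movements changed in the meantime. */
bool join_moves_sketch(DDA *prev, DDA *current, uint32_t crossing_speed) {
  uint8_t prev_id, curr_id;
  uint32_t new_end_speed;
  bool committed = false;

  /* Only the ids are snapshotted atomically. */
  ATOMIC_START
  prev_id = prev->id;
  curr_id = current->id;
  ATOMIC_END

  /* The (possibly long) calculation reads other fields without
     protection. A torn read is harmless: the result is thrown away
     below if the DDA changed underneath us. */
  new_end_speed = crossing_speed < prev->end_speed
                ? crossing_speed : prev->end_speed;

  /* Store the result only if nothing went live or got replaced. */
  ATOMIC_START
  if ( ! prev->live && prev->id == prev_id && current->id == curr_id) {
    prev->end_speed = new_end_speed;
    committed = true;
  }
  ATOMIC_END

  return committed;
}
```

The design choice is to keep interrupts disabled only for two byte
copies and the final compare-and-store, not for the whole calculation.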
Added by Traumflug: this costs a negligible 4 bytes binary size:
    ATmega sizes               '168   '328(P)   '644(P)     '1280
    Program:  18082 bytes      127%       59%       29%       15%
       Data:   2176 bytes      213%      107%       54%       27%
     EEPROM:     32 bytes        4%        2%        2%        1%
Nullmoves are movements which don't actually move a stepper, for
example because the command changes velocity only or because the
movement is shorter than a single motor step.
Not queueing them up removes the necessity to check for them,
which reduces code in critical areas. It also removes the
necessity to run dda_start() twice to get past a nullmove.
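A minimal sketch of the idea, assuming the step deltas of a movement
are already known when it is about to be queued. The names move_t,
is_nullmove() and enqueue_sketch(), as well as the tiny array queue,
are hypothetical and exist only for this example.

```c
#include <stdbool.h>
#include <stdint.h>

enum { AXIS_COUNT = 4 };  /* X, Y, Z, E */
enum { QUEUE_SIZE = 8 };

/* Hypothetical queue entry, holding only the step deltas. */
typedef struct {
  int32_t steps[AXIS_COUNT];
} move_t;

static move_t queue[QUEUE_SIZE];
static uint8_t queue_len = 0;

/* A nullmove steps no motor at all. */
static bool is_nullmove(const int32_t delta_steps[AXIS_COUNT]) {
  for (uint8_t i = 0; i < AXIS_COUNT; i++) {
    if (delta_steps[i] != 0)
      return false;
  }
  return true;
}

/* Returns true if the movement was queued. Nullmoves are dropped
   right here, so code reading the queue never has to test for them.
   A full queue also returns false to keep the sketch short. */
bool enqueue_sketch(const int32_t delta_steps[AXIS_COUNT]) {
  if (is_nullmove(delta_steps) || queue_len >= QUEUE_SIZE)
    return false;

  for (uint8_t i = 0; i < AXIS_COUNT; i++)
    queue[queue_len].steps[i] = delta_steps[i];
  queue_len++;
  return true;
}
```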
Best of all, it also makes lookahead perform better. Before,
a nullmove just changing speed interrupted the lookahead chain;
now it no longer does. See straight-speeds.gcode and
...-Fsep.gcode, which produced different timings before. Now
the results are identical.
Also update the function description for dda_create().
Performance increase is impressive: another 75 clock cycles off
the slowest step, only 36 bytes binary size increase:
    ATmega sizes               '168   '328(P)   '644(P)     '1280
    Program:  19652 bytes      138%       64%       31%       16%
       Data:   2175 bytes      213%      107%       54%       27%
     EEPROM:     32 bytes        4%        2%        2%        1%
short-moves.gcode statistics:
LED on occurences: 888.
LED on time minimum: 280 clock cycles.
LED on time maximum: 458 clock cycles.
LED on time average: 284.653 clock cycles.
smooth-curves.gcode statistics:
LED on occurences: 23648.
LED on time minimum: 272 clock cycles.
LED on time maximum: 501 clock cycles.
LED on time average: 307.275 clock cycles.
triangle-odd.gcode statistics:
LED on occurences: 1636.
LED on time minimum: 272 clock cycles.
LED on time maximum: 458 clock cycles.
LED on time average: 297.625 clock cycles.
Performance of straight-speeds{-Fsep}.gcode before:
straight-speeds.gcode statistics:
LED on occurences: 32000.
LED on time minimum: 272 clock cycles.
LED on time maximum: 586 clock cycles.
LED on time average: 298.75 clock cycles.
straight-speeds-Fsep.gcode statistics:
LED on occurences: 32000.
LED on time minimum: 272 clock cycles.
LED on time maximum: 672 clock cycles.
LED on time average: 298.79 clock cycles.
Now:
straight-speeds.gcode statistics:
LED on occurences: 32000.
LED on time minimum: 272 clock cycles.
LED on time maximum: 501 clock cycles.
LED on time average: 298.703 clock cycles.
straight-speeds-Fsep.gcode statistics:
LED on occurences: 32000.
LED on time minimum: 272 clock cycles.
LED on time maximum: 501 clock cycles.
LED on time average: 298.703 clock cycles.
In the -Fsep case this saves as much as 171 clock cycles :-)
Traumflug's note: if one uses #define LOOKAHEAD_DEBUG at line 177,
one should use the same symbol in line 321. Edited the commit to
do so.
This reduces binary size by 38 bytes and RAM usage by 4 bytes.
We calculate all steps from the fastest axis now. So X and Y
steps_per_m don't have to be the same anymore.
Traumflug's note: another 16 bytes program size off on AVR, same
size on LPC1114.
Point of this change is to allow using these functions for
writing to the display, too, without duplicating all the code.
To reduce confusion, functions were renamed (they're no longer
'serial', after all):
    serwrite_xxx() -> write_xxx()
    sersendf_P()   -> sendf_P()
To avoid changing all the existing code, a couple of macros
with the old names are provided. They might even be handy as
convenience macros.
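How a writer can serve both serial and display, with a macro keeping
the old name alive, might look like the sketch below. Passing the
destination as a function pointer is an assumption of this example,
and the buffer-backed serial_writechar() and display_writechar() are
hypothetical stand-ins for the real output routines.

```c
#include <stdint.h>

/* Hypothetical character sinks, writing into buffers for illustration. */
static char serial_buf[32];
static uint8_t serial_len = 0;
static char display_buf[32];
static uint8_t display_len = 0;

void serial_writechar(uint8_t c) {
  if (serial_len < sizeof(serial_buf) - 1)
    serial_buf[serial_len++] = (char)c;
}

void display_writechar(uint8_t c) {
  if (display_len < sizeof(display_buf) - 1)
    display_buf[display_len++] = (char)c;
}

/* Generic writer: the destination is passed as a function pointer,
   so the number formatting code exists only once. */
void write_uint8(void (*writechar)(uint8_t), uint8_t v) {
  char digits[3];
  uint8_t n = 0;

  do {
    digits[n++] = (char)('0' + v % 10);
    v /= 10;
  } while (v);

  while (n)
    writechar((uint8_t)digits[--n]);
}

/* Old name kept as a macro, so existing callers compile unchanged. */
#define serwrite_uint8(v) write_uint8(serial_writechar, (v))
```

Existing code keeps calling serwrite_uint8(42), while display code
would call write_uint8(display_writechar, 42).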
Nicely, this addition costs no additional RAM. Not surprisingly, it
costs quite some binary size: 278 bytes. Sizes now:
    Program:  24058 bytes      168%       79%       38%       19%
       Data:   1525 bytes      149%       75%       38%       19%
     EEPROM:     32 bytes        4%        2%        2%        1%
Regarding USB Serial: code was adjusted without testing on
hardware.
The Marlin firmware reportedly sends "Error" instead of "!!" to
indicate a machine failure which is followed by a full power-down. The
OctoPrint G-code sender therefore assumes a reported Error means the
print has failed and the machine has turned off.
In Teacup we report an "Error" when lookahead was too slow to join
movements, but OctoPrint interprets this as an emergency stop. It then
stops the job and leaves the printer idle with all the heaters running.
Change this "Error" to a "Notice" to avoid this problem. Add a comment
prefix while we're at it to fit the de facto standard better.
See http://reprap.org/wiki/G-code
All in one chunk, because it's all hardware-independent and doing
the changes one by one would amount to no more than a typing
exercise.
Compiles fine. For testing, remove if (DEBUG... for M114 in
gcode_process.c. Then one can see how the queue fills up when
sending movements and M114 repeatedly. This time with actual
coordinates.
No stepper movements, yet, because set_timer() is still empty.
Previously some features were excluded based on whether SIMULATOR
was defined. But in fact these should have been included when __AVR__
was defined. These used to be the same thing, but now with ARM coming
into the picture, they are not. Fix the situation so AVR includes are
truly only used when __AVR__ is defined.
The _crc16_update function appears to be specific to AVR; I've kept the
alternate implementation limited to AVR in that case in crc.c. I think
this is the right thing to do, but I am not sure. Maybe ARM has some
equivalent function in their libraries.
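For illustration, a guard along these lines keeps the avr-libc
function on AVR and falls back to plain C elsewhere. The fallback
follows the equivalent C code given in the avr-libc documentation for
_crc16_update() (polynomial 0xA001); the names crc16_step() and
crc_block_sketch(), and passing the start value as a parameter, are
assumptions made for this sketch, not the code in crc.c.

```c
#include <stddef.h>
#include <stdint.h>

#ifdef __AVR__
  #include <util/crc16.h>
  /* Optimised assembly version from avr-libc. */
  #define crc16_step(crc, b) _crc16_update((crc), (b))
#else
  /* Portable fallback, bit by bit, polynomial 0xA001. */
  static uint16_t crc16_step(uint16_t crc, uint8_t b) {
    crc ^= b;
    for (uint8_t i = 0; i < 8; i++) {
      if (crc & 1)
        crc = (uint16_t)((crc >> 1) ^ 0xA001);
      else
        crc = (uint16_t)(crc >> 1);
    }
    return crc;
  }
#endif

/* CRC over a block of data, starting from the given initial value. */
uint16_t crc_block_sketch(uint16_t crc, const uint8_t *data, size_t len) {
  while (len--)
    crc = crc16_step(crc, *data++);
  return crc;
}
```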
Fix:
dda_lookahead.c:327:17: warning: 'crossF' may be used
uninitialized in this function [-Wmaybe-uninitialized]
sersendf_P(PSTR("Initial crossing speed: %lu\n"), crossF);
^
Should be done for temptable in ThermistorTable.h, too, but this
would mess up existing users' configurations.
This tries to put emphasis on the fact that you have to read
these values with pgm_read_*() instead of just using the variable.
Unfortunately, the gcc compiler neither inserts PROGMEM reading
instructions automatically when reading data stored in flash,
nor does it complain or warn about the missing read instructions.
As such it's very easy to accidentally handle data stored in flash
just like normal data. It'll compile and work ... you just read
arbitrary data (often, but not always zeros) instead of what you
intend.
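A short sketch of the pitfall. The table speed_table_P and its _P
suffix are hypothetical names chosen for this example. The #else
branch is only a host fallback so the sketch also compiles off-target,
where the mistake would go unnoticed.

```c
#include <stdint.h>

#ifdef __AVR__
  #include <avr/pgmspace.h>
#else
  /* Host fallback: no separate flash address space here. */
  #define PROGMEM
  #define pgm_read_word(addr) (*(const uint16_t *)(addr))
#endif

/* Hypothetical table stored in flash. The suffix is a reminder that
   plain indexing is not enough. */
static const uint16_t speed_table_P[3] PROGMEM = { 100, 250, 1000 };

uint16_t speed_lookup(uint8_t i) {
  /* Wrong on AVR, yet compiles without any warning:
   *   return speed_table_P[i];
   * This reads RAM at the table's flash address. */
  return pgm_read_word(&speed_table_P[i]);
}
```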
Many places in the code use individual variables for int/uint values
for X, Y, Z, and E. A tip from a comment suggests making these into
arrays for scalability in the future. Replace the discrete variables
with arrays so the code can be simplified in the future.
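What this buys can be sketched as follows. The enum and the TARGET
struct here are simplified assumptions for illustration; the point is
merely that four copies of the same statement collapse into one loop.

```c
#include <stdint.h>

/* Axis indices. AXIS_COUNT doubles as the array size. */
enum axis_e { X = 0, Y, Z, E, AXIS_COUNT };

/* Hypothetical position type: one array instead of four members. */
typedef struct {
  int32_t axis[AXIS_COUNT];
} TARGET;

/* Before: out_x = to->X - from->X; out_y = ...; and so on for Z, E.
   After: a single loop which also covers axes added later. */
void delta_steps(const TARGET *from, const TARGET *to,
                 int32_t out[AXIS_COUNT]) {
  for (uint8_t i = X; i < AXIS_COUNT; i++)
    out[i] = to->axis[i] - from->axis[i];
}
```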
Previously, ramps were calculated with the combined speed,
which can differ from the speed of the fast axis by a factor of 2.
This solves part 2 of issue #68.
This is a preparation towards going through the existing movement
queue backwards with dda_join_moves() to allow higher feedrates
for lots of short movements.
There are three locations in the code that repeat a pattern of
"If z==0 then use 2d-approx(dx,dy), else if x==0 && y==0 then use dz,
else use 3d-approx".
Teach approx_distance_3 to detect these conditions for us and apply
the same logic. Replace the three call locations with a simple call
to approx_distance_3.
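The folded-in logic might look like this sketch. Note that isqrt64()
and the exact square roots below are stand-ins chosen to keep the
example short and verifiable; they are not the approximation the
firmware uses, and the _sketch names are hypothetical.

```c
#include <stdint.h>

/* Stand-in for the real approximation: exact integer square root. */
static uint32_t isqrt64(uint64_t n) {
  uint64_t r = 0;
  uint64_t bit = (uint64_t)1 << 62;

  while (bit > n)
    bit >>= 2;
  while (bit) {
    if (n >= r + bit) {
      n -= r + bit;
      r = (r >> 1) + bit;
    } else {
      r >>= 1;
    }
    bit >>= 2;
  }
  return (uint32_t)r;
}

static uint32_t approx_distance_2_sketch(uint32_t dx, uint32_t dy) {
  return isqrt64((uint64_t)dx * dx + (uint64_t)dy * dy);
}

/* The special cases formerly repeated at each call site now live
   inside the 3D function. */
uint32_t approx_distance_3_sketch(uint32_t dx, uint32_t dy, uint32_t dz) {
  if (dz == 0)
    return approx_distance_2_sketch(dx, dy);
  if (dx == 0 && dy == 0)
    return dz;
  return isqrt64((uint64_t)dx * dx + (uint64_t)dy * dy
                 + (uint64_t)dz * dz);
}
```

Callers then simply call the 3D function and no longer need to know
about the shortcuts.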
Binary size for the LOOKAHEAD case drops by almost 400 bytes:
    old: FLASH : 21242 bytes      149%       70%       34%       17%
    new: FLASH : 20844 bytes      146%       68%       33%       17%
The size for non-LOOKAHEAD drops by 40 bytes:
    old: FLASH : 16592 bytes      116%       55%       27%       13%
    new: FLASH : 16552 bytes      116%       54%       27%       13%
We can actually do a little better if we consider the zero-ness of all
three axes, but this does make the code a little bit bigger. Another
change will consider that option. This change simply tries to mimic
the existing functionality.
The new one solely looks at speed differences of individual axes.
This means individual jerks for each axis (good!) and relatively
simple maths (also good!).
For details and maths, see comments in the code and
https://github.com/Traumflug/Teacup_Firmware/issues/45 .
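As a rough illustration of limiting by per-axis speed differences,
consider the sketch below. It is one possible reading under stated
assumptions, not the algorithm in the code: the function name, the
signed per-axis velocities, the max_jerk table and the 8.8 fixed point
factor are all made up for this example. See the linked issue for the
actual maths.

```c
#include <stdint.h>

enum { AXIS_COUNT = 4 };  /* X, Y, Z, E */

/* prev_v and curr_v hold signed per-axis speeds at full feedrate.
   Returns the feedrate at which the two movements can be joined
   without any axis exceeding its allowed speed jump. */
uint32_t crossing_speed_sketch(const int32_t prev_v[AXIS_COUNT],
                               const int32_t curr_v[AXIS_COUNT],
                               const uint32_t max_jerk[AXIS_COUNT],
                               uint32_t feedrate) {
  uint32_t factor = 256;  /* 8.8 fixed point, 256 means 1.0 */

  for (uint8_t i = 0; i < AXIS_COUNT; i++) {
    int32_t d = curr_v[i] - prev_v[i];
    uint32_t dv = d < 0 ? (uint32_t)(-d) : (uint32_t)d;

    /* The axis with the worst jump determines the slowdown. */
    if (dv > max_jerk[i]) {
      uint32_t f = (uint32_t)(((uint64_t)max_jerk[i] << 8) / dv);
      if (f < factor)
        factor = f;
    }
  }
  return (uint32_t)(((uint64_t)feedrate * factor) >> 8);
}
```

With signed speeds, a direction reversal shows up as a large
difference without special handling.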
This is mostly a preparation for reverse walks through the movement queue,
where crossing speed calculation is done only once, while actually used
speeds can be raised successively with repeated walks.
This code was accidentally removed long ago in a botched merge. This
patch recovers it and makes it build again. I've done minimal testing
and some necessary cleanup. It compiles and runs, but it probably still
has a few dust bunnies here and there.
I added registers and pin definitions to simulator.h and
simulator/simulator.c which I needed to match my Gen7-based config.
Other configs or non-AVR ports will need to define more or different
registers. Some registers are 16-bit, some are 8-bit, and some are just
constant values (enums). A more clever solution would read in the
chip-specific header and produce saner definitions which covered all
GPIOs. But this commit just takes the quick and easy path to support my
own hardware.
Most of this code originated in these commits:

    commit cbf41dd4ad
    Author: Stephan Walter <stephan@walter.name>
    Date:   Mon Oct 18 20:28:08 2010 +0200

        document simulation

    commit 3028b297f3
    Author: Stephan Walter <stephan@walter.name>
    Date:   Mon Oct 18 20:15:59 2010 +0200

        Add simulation code: use "make sim"

Additional tweaks:
Revert va_args processing for AVR, but keep 'int' generalization
for simulation. gcc wasn't lying. The sim really aborts without this.
Remove delay(us) from simulator (obsolete).
Improve the README.sim to demonstrate working pronterface connection
to sim. Also fix the build instructions.
Appease all stock configs.
Stub out intercom and shush usb_serial when building simulator.
Pretend to be all chip-types for config appeasement.
Replace sim_timer with AVR-simulator timer:
The original sim_timer and sim_clock provided direct replacements
for timer/clock.c in the main code. But when the main code changed,
simcode did not. The main clock.c was dropped and merged into timer.c.
Also, the timer.c now has movement calculation code in it in some
cases (ACCELERATION_TEMPORAL) and it would be wrong to teach the
simulator to do the same thing. Instead, teach the simulator to
emulate the AVR Timer1 functionality, reacting to values written to
OCR1A and OCR1B timer comparison registers.
Whenever OCR1A/B are changed, the sim_setTimer function needs to be
called. It is called automatically after a timer event, so changes
within the timer ISRs do not need to bother with this.
A C++ class could make this requirement go away by noticing the
assignment. On the other hand, a chip-agnostic timer.c would help
make the main code more portable. The latter cleanup is probably
better for us in the long run.
This means modifying existing code to let the lookahead algorithms
do their work. It also means removing some unused code in
dda_lookahead.c and reordering some code to make it work with
LOOKAHEAD undefined.