This is a preparation for starting a move from non-zero speeds,
which is needed for look-ahead. Keeping both variables in
move_state and doing the calculations in dda_start() is possible
in principle, but might not fit the tight time budget we have when
going from one movement to the next at high step rates.
To deal with this, we have to pre-calculate n and c, so we have
to move them back into the DDA structure. They were there a year ago
already, but were moved into move_state to save RAM (move_state exists
only once, dda as often as there are movement queue entries).
His implementation was done on every step and as it turns out,
the very same maths works just fine in the clock interrupt.
The reason for using the clock interrupt: it allows about 3 times
higher step rates.
This strategy is not only substantially faster, but also
a bit smaller.
One funny anecdote: the acceleration initialisation value, C0,
was taken from elsewhere in the code as-is. Still it had to be
adjusted by a factor of sqrt(2) to now(!) match the physics
formulas and to get ramps reasonably matching the prediction
(and my pocket calculator). Apparently the code before
accumulated enough rounding errors to compensate for the
wrong formula.
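For reference, the sqrt(2) follows directly from constant-acceleration
kinematics. A short derivation, using symbols that are assumptions and
not names from the code: a is the acceleration in steps/s^2, f the timer
frequency in Hz, and t_1 the time of the first step when starting from
rest.

```latex
s = \tfrac{1}{2} a t^2, \quad s = 1\ \text{step}
\;\Rightarrow\; t_1 = \sqrt{\frac{2}{a}},
\qquad C_0 = f \cdot t_1 = f \sqrt{\frac{2}{a}} = \sqrt{2} \cdot \frac{f}{\sqrt{a}}
```

A value computed as f/sqrt(a), for example, would differ from this by
exactly that factor.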
This was a very interesting approach, but for the foreseeable
future it's unlikely the code will replace the current one.
However, many parts of it were already moved to the experimental
branch. It turns out the approach with recalculating acceleration
at a constant time interval is exactly right, but works much more
precisely when keeping maths step-based.
This doesn't matter much, as the timer overflows 300 times/second
worst case, so the very first step of a series of moves is
never delayed by more than about 3.3 milliseconds (1/300 s). Hardly
noticeable by a human.
Saves a nice 40 bytes and improves max step rate by several percent.
This 1/sqrt(x) implementation is a 12-bit fixed-point implementation
and a bit faster than a 32-bit divide (it takes about 11% less time
to complete) and could be even faster if one requires only 8 bits.
Also, precision starts getting poor for big values of n which are
likely to be required by small acceleration values.
Implementation by Roland Brochard <zuzuf86@gmail.com>.
Note: If you wonder how code doing multiplications can be faster than
code doing just shifts and increments: I've measured it. One million
square roots in 30 seconds with the new code instead of 220 seconds
with the old code on a Gen7 20 MHz. That's just 30 microseconds or
600 CPU cycles per root.
Code used for the measurement (by a stopwatch) in mendel.c:
...
#include "dda_maths.h"
#include "delay.h"
int main (void)
{
  uint32_t i, j;
  serial_init();
  sei();
  /* Start the stopwatch when "start" appears, stop it at "done". */
  serial_writestr_P(PSTR("start\n"));
  for (i = 0; i < 1000000; i++) {
    j = int_sqrt(i);
  }
  serial_writestr_P(PSTR("done\n"));
  delay_ms(20);
  cli();
  init();
...
--Traumflug
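The note above measures int_sqrt() without showing it. As a rough
illustration of how a multiplication-based integer square root can look,
here is a minimal sketch. It is a hypothetical stand-in, not the actual
int_sqrt() from dda_maths.c, and the name isqrt_mul() is made up for
this example. It decides the 16 result bits from the top down, with one
16x16 -> 32 bit multiplication per bit.

```c
#include <stdint.h>

/* Hypothetical stand-in, not Teacup's int_sqrt(): returns floor(sqrt(a)).
 * Each loop pass tentatively sets one result bit and keeps it only if
 * the squared trial value does not exceed the input. */
uint16_t isqrt_mul(uint32_t a) {
  uint16_t result = 0;
  uint16_t bit;

  for (bit = 0x8000; bit != 0; bit >>= 1) {
    uint16_t trial = result | bit;

    /* 65535 * 65535 still fits into 32 bits, so this cannot overflow. */
    if ((uint32_t)trial * trial <= a)
      result = trial;
  }
  return result;
}
```

The loop always runs exactly 16 times, so the run time is nearly
independent of the input value.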
Before, endstops were checked on every step, wasting precious time.
Checking them 500 times a second should be more than sufficient.
Additionally, a stop triggered by an endstop now properly decelerates the movement.
This is one important step towards handling accidental endstop hits
gracefully, as it avoids step losses in such situations.
As a sample application, use it in queue_empty().
There's also ATOMIC_BLOCK() coming with avr-libc, but this requires
a C99 compiler while Arduino IDE insists on running avr-gcc in C89 mode.
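A minimal sketch of what a C89-compatible alternative to ATOMIC_BLOCK()
can look like, assuming the usual pattern of saving SREG, disabling
interrupts and restoring SREG afterwards. The macro names ATOMIC_START
and ATOMIC_END, the variable atomic_sreg and the queue indices mb_head
and mb_tail are hypothetical, and queue_empty() is simplified for
illustration. Off-target, SREG and cli() are replaced by stand-ins so
the code compiles on a PC.

```c
#include <stdint.h>

#ifdef __AVR__
  #include <avr/io.h>
  #include <avr/interrupt.h>
#else
  /* Host-side stand-ins; bit 7 of SREG is the global interrupt enable. */
  static volatile uint8_t SREG = 0x80;
  static void cli(void) { SREG &= (uint8_t)~0x80; }
#endif

/* Hypothetical names: open a block, remember SREG, disable interrupts. */
#define ATOMIC_START { uint8_t atomic_sreg = SREG; cli();
/* Restore the previous interrupt state, whatever it was, close the block. */
#define ATOMIC_END   SREG = atomic_sreg; }

/* Hypothetical queue indices, changed by an interrupt in real firmware. */
static volatile uint8_t mb_head = 0, mb_tail = 0;

uint8_t queue_empty(void) {
  uint8_t result;

  ATOMIC_START
    result = (mb_tail == mb_head);
  ATOMIC_END

  return result;
}
```

Restoring SREG instead of calling sei() unconditionally keeps interrupts
disabled if they were already disabled before entering the block.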
Teacup handles motor on/off automatically, and if your
intention is to stop the printer, M2 is appropriate (and
conforming with the NIST G-code standard).
That said, M84 is kept as a synonym for M2 to enhance compatibility
with Slic3r's default end G-code.
This means modifying existing code to let the lookahead algorithms
do their work. It also means removing some unused code in
dda_lookahead.c and reordering some code to make it work with
LOOKAHEAD undefined.