REPRAP style acceleration broke quite a while ago, but no one noticed.
Maybe it's not being used, and therefore also not tested. But it should
at least compile while it remains an option.
The compiler complains that dda->n is not defined and that current_id is
never used. The first bug goes back to f0b9daeea0 in late 2013.
In the interest of supporting exploratory accelerations, fix this to
build when ACCELERATION_REPRAP is chosen.
Add a function axes_um_to_steps to convert from um to steps on all axes
respecting current kinematics setting.
Extend code_stepper_axescode_axes_to_stepper_axes to convert all axes,
including E-axis for consistency.
It seems like axes_um_to_steps could be simplified to something like
"apply_kinematics_axes()" which would just do the transformation math
in-place on some axes[] to move from 'Cartesian' to 'target-kinematics'.
Then the original um_to_steps and delta_um code could remain untouched
since 2014. But I'm not sure how this will work with scara or delta
configurations. I'm fairly certain they only work from absolute positions
anyway.
Fixes#216.
dda_clock() might be interrupted by dda_step(), and dda_step might
use or modify variables also being used in dda_clock(). It is
possible for dda to be modified when a new dda becomes live during
our dda_clock(). Check the dda->id to ensure it has not changed on
us before we actually write new calculated values into the dda.
Note by Traumflug: copied some of the explanation in the commit
message directly into the code.
dda_clock() might be interrupted by dda_step(), and dda_step might
use or modify variables also being used in dda_clock. In particular
dda->c is modified in both functions but it is done atomically in
dda_clock() to prevent dda_step() from interrupting during the
write. But dda->n is also modified in both places and it is not
protected in dda_clock().
Move updates to dda->n to the atomic section along with dda->c.
Note by Traumflug: good catch! It even makes the binary 14 bytes
smaller, so likely faster.
Point of this change is to allow using these functions for
writing to the display, too, without duplicating all the code.
To reduce confusion, functions were renamed (they're no longer
'serial', after all:
serwrite_xxx() -> write_xxx()
sersendf_P() -> sendf_P()
To avoid changing all the existing code, a couple of macros
with the old names are provided. They might even be handy as
convenience macros.
Nicely, this addition costs no additional RAM. Not surprising, it
costs quite some binary size, 278 bytes. Sizes now:
Program: 24058 bytes 168% 79% 38% 19%
Data: 1525 bytes 149% 75% 38% 19%
EEPROM: 32 bytes 4% 2% 2% 1%
Regarding USB Serial: code was adjusted without testing on
hardware.
Until this commit, the Z axis is disabled after each move and
only enabled when the Z axis will move. Now you can enable this
as a feature. Some printer axes are too heavy or have a high
pitch which are not self locking. In that case simply do nothing.
It's now off by default.
Previously some features were excluded based on whether SIMULATOR
was defined. But in fact these should have been included when __AVR__
was defined. These used to be the same thing, but now with ARM coming
into the picture, they are not. Fix the situation so AVR includes are
truly only used when __AVR__ is defined.
The _crc16_update function appears to be specific to AVR; I've kept the
alternate implementation limited to AVR in that case in crc.c. I think
this is the right thing to do, but I am not sure. Maybe ARM has some
equivalent function in their libraries.
The trick is to use doubles earlier. As these calculations are
optimised out anyways, binary size and performance is kept.
Verified to have an identical outcome on a few common steps/mm and
acceleration cases.
... instead of trying to fire an interrupt as quickly as possible.
This affects ACCELERATION_TEMPORAL only. It almost doubles the
achievable step rate. Measured maximum step rate (X axis only,
100 mm moves) is 40'000 steps/s on a 16 MHz electronics, so
approx. 50'000 steps/s on a 20 MHz controller, which is even
a bit faster than the ACCELERATION_RAMPING algorithm.
Tests with temporary test code were run and judging by these
tests, clock interrupts are now very reliable up to the point
where processing speed is simply exhaused.
Performance with ACCELERATION_RAMPING: this costs 10 bytes
binary size and exactly 2 clock cycles per step interrupt or
0.6% performance even. We could avoid this with a lot
of #ifdefs, but considering ACCELERATION_TEMPORAL will one
day be the default acceleration, skip these #ifdefs, also
for better code readability.
$ cd testcases
$ ./run-in-simulavr.sh short-moves.gcode smooth-curves.gcode triangle-odd.gcode
SIZES ATmega... '168 '328(P) '644(P) '1280
FLASH : 20528 bytes 144% 67% 33% 16%
RAM : 2188 bytes 214% 107% 54% 27%
EEPROM : 32 bytes 4% 2% 2% 1%
short-moves.gcode statistics:
LED on occurences: 838.
LED on time minimum: 304 clock cycles.
LED on time maximum: 715 clock cycles.
LED on time average: 310.717 clock cycles.
smooth-curves.gcode statistics:
LED on occurences: 8585.
LED on time minimum: 309 clock cycles.
LED on time maximum: 712 clock cycles.
LED on time average: 360.051 clock cycles.
triangle-odd.gcode statistics:
LED on occurences: 1636.
LED on time minimum: 304 clock cycles.
LED on time maximum: 710 clock cycles.
LED on time average: 332.32 clock cycles.
Performance for ACCELERATION_RAMPING unchanged:
$ cd testcases
$ ./run-in-simulavr.sh short-moves.gcode smooth-curves.gcode triangle-odd.gcode
[...]
SIZES ATmega... '168 '328(P) '644(P) '1280
FLASH : 20518 bytes 144% 67% 33% 16%
RAM : 2188 bytes 214% 107% 54% 27%
EEPROM : 32 bytes 4% 2% 2% 1%
short-moves.gcode statistics:
LED on occurences: 838.
LED on time minimum: 302 clock cycles.
LED on time maximum: 713 clock cycles.
LED on time average: 308.72 clock cycles.
smooth-curves.gcode statistics:
LED on occurences: 8585.
LED on time minimum: 307 clock cycles.
LED on time maximum: 710 clock cycles.
LED on time average: 358.051 clock cycles.
triangle-odd.gcode statistics:
LED on occurences: 1636.
LED on time minimum: 302 clock cycles.
LED on time maximum: 708 clock cycles.
LED on time average: 330.322 clock cycles.
Pure cosmetical change.
Performance check:
$ cd testcases
$ ./run-in-simulavr.sh short-moves.gcode smooth-curves.gcode triangle-odd.gcode
[...]
SIZES ATmega... '168 '328(P) '644(P) '1280
FLASH : 20518 bytes 144% 67% 33% 16%
RAM : 2188 bytes 214% 107% 54% 27%
EEPROM : 32 bytes 4% 2% 2% 1%
short-moves.gcode statistics:
LED on occurences: 838.
LED on time minimum: 302 clock cycles.
LED on time maximum: 713 clock cycles.
LED on time average: 308.72 clock cycles.
smooth-curves.gcode statistics:
LED on occurences: 8585.
LED on time minimum: 307 clock cycles.
LED on time maximum: 710 clock cycles.
LED on time average: 358.051 clock cycles.
triangle-odd.gcode statistics:
LED on occurences: 1636.
LED on time minimum: 302 clock cycles.
LED on time maximum: 708 clock cycles.
LED on time average: 330.322 clock cycles.
Forgotten in commit 74808610c7,
"DDA: Move axis calculations into loops, part 5.".
This and the previous commit makes ACCELERATION_TEMPORAL building
(and working!) again.
Next time, please at least try to compile the code section in
question when explicitely changing the section. In this case,
with ACCELERATION_TEMPORAL enabled. It didn't build.
Was broken with commit 95926a3f113809bde8ff0c84b94c55c73e398f67,
"DDA: Rename confusing variable name.".
It was certainly a good idea, but also always a suspect of
malfunctions and as such, almost never used. Newer code
organisation moves most of the code behind it to dda_clock()
anyways, so it also became mostly obsolete.
Rest In Peace, STEP_INTERRUPT_INTERRUPTIBLE, you were matter
of quite a number of interesting discussions and investigations.
Changes for Configtool by jbernardis <jeff.bernardis@gmail.com>
As we can always only move towards one end of an axis, one common
variable to count debouncing is sufficient.
Binary size 12 bytes smaller (and faster).
Previously, when backing off of X_MIN, X_MAX was also checked,
which of course was already open, so it signals endstop release
even while X_MIN is still closed. The issue exposed only when
endstops on both ends of an axis were defined, a more rare situation.
Essentially the fix simply makes a distinct endstop check case
for each side of each axis.
This even makes binary size 40 bytes smaller for the standard case.
This also introduces dda_kinematics.c/.h and a KINEMATICS definition,
which allows to do different distance calculations depending on the
bot kinematics in use. So far only KINEMATICS_STRAIGHT, which matches
what we had before, but other kinematics types are present in
comments already.
Goal is to calculate steps in a separate function to allow different
methods of steps calculation, which is neccessary for supporting
different kinematics types. Accordingly we have to calculate steps
for all axes before setting directions and such stuff.
This was the goal: to not bit-shift when calling setTimer(). Binary
size another 40 bytes off, about 1.2 % better performance:
SIZES ATmega... '168 '328(P) '644(P) '1280
FLASH : 20136 bytes 141% 66% 32% 16%
RAM : 2318 bytes 227% 114% 57% 29%
EEPROM: 32 bytes 4% 2% 2% 1%
short-moves.gcode statistics:
LED on occurences: 888.
LED on time minimum: 302 clock cycles.
LED on time maximum: 718 clock cycles.
LED on time average: 311.258 clock cycles.
smooth-curves.gcode statistics:
LED on occurences: 9124.
LED on time minimum: 307 clock cycles.
LED on time maximum: 708 clock cycles.
LED on time average: 357.417 clock cycles.
triangle-odd.gcode statistics:
LED on occurences: 1636.
LED on time minimum: 302 clock cycles.
LED on time maximum: 708 clock cycles.
LED on time average: 330.322 clock cycles.
Admittedly it looks like advancing in babysteps, but really
catching every bit shifting instance isn't trivial, sometimes
these shifts are already embedded in other calculations.
Still no binary size or performance change.
While this shifting meant to increase accuracy, there's no actual
use of it, other than that this value gets shifted back and forth.
Let's start to get rid of it.
Performance stays exactly the same:
SIZES ATmega... '168 '328(P) '644(P) '1280
FLASH : 20188 bytes 141% 66% 32% 16%
RAM : 2318 bytes 227% 114% 57% 29%
EEPROM: 32 bytes 4% 2% 2% 1%
short-moves.gcode statistics:
LED on occurences: 888.
LED on time minimum: 306 clock cycles.
LED on time maximum: 722 clock cycles.
LED on time average: 315.253 clock cycles.
smooth-curves.gcode statistics:
LED on occurences: 9124.
LED on time minimum: 311 clock cycles.
LED on time maximum: 712 clock cycles.
LED on time average: 361.416 clock cycles.
triangle-odd.gcode statistics:
LED on occurences: 1636.
LED on time minimum: 306 clock cycles.
LED on time maximum: 712 clock cycles.
LED on time average: 334.319 clock cycles.
This finally brings Z axis up to speed.
So far we always assumed the fastest axis to have the same steps/mm
as the X axis. In cases where this wasn't true, the movement
wouldn't do sufficient acceleration steps and, accordingly,
not reach the expected maximum speed. This was particularly visible
on a typical Mendel printer, where the Z axis would reach only a
6th of the commanded speed in some configurations.
'all_time' sounds like forever to me, but this variable really
tracks the last time we hit one of "all the axes". It sticks
out more now in looping, so rename it to make sense.
Clean up code to reduce duplication by consolidating code into
loops for per-axis actions.
Part 9 is, finally use this set_direction() thing. As a dessert
topping, it reduces binary size by another 122 bytes.
SIZES ATmega... '168 '328(P) '644(P) '1280
FLASH : 19988 bytes 140% 66% 32% 16%
RAM : 2302 bytes 225% 113% 57% 29%
EEPROM: 32 bytes 4% 2% 2% 1%
Clean up code to reduce duplication by consolidating code into
loops for per-axis actions.
Part 8 is, move remaining update_current_position() into a loop.
This makes the binary 134 bytes smaller. As it's not critical,
no performance test.
SIZES ATmega... '168 '328(P) '644(P) '1280
FLASH : 20134 bytes 141% 66% 32% 16%
RAM : 2302 bytes 225% 113% 57% 29%
EEPROM: 32 bytes 4% 2% 2% 1%
Clean up code to reduce duplication by consolidating code into
loops for per-axis actions.
Part 7 is, turn update_current_position() in dda.c partially into
a loop. Surprise, surprise, this changes neither binary size nor
performance. Looking into the generated assembly, the loop is
indeed completely unrolled. Apparently that's smaller than a
real loop.
SIZES ATmega... '168 '328(P) '644(P) '1280
FLASH : 20270 bytes 142% 66% 32% 16%
RAM : 2302 bytes 225% 113% 57% 29%
EEPROM: 32 bytes 4% 2% 2% 1%
short-moves.gcode
Statistics (assuming a 20 MHz clock):
LED on occurences: 888.
Sum of all LED on time: 279945 clock cycles.
LED on time minimum: 306 clock cycles.
LED on time maximum: 722 clock cycles.
LED on time average: 315.253 clock cycles.
smooth-curves.gcode
Statistics (assuming a 20 MHz clock):
LED on occurences: 9124.
Sum of all LED on time: 3297806 clock cycles.
LED on time minimum: 311 clock cycles.
LED on time maximum: 712 clock cycles.
LED on time average: 361.443 clock cycles.
triangle-odd.gcode
Statistics (assuming a 20 MHz clock):
LED on occurences: 1636.
Sum of all LED on time: 546946 clock cycles.
LED on time minimum: 306 clock cycles.
LED on time maximum: 712 clock cycles.
LED on time average: 334.319 clock cycles.
Clean up code to reduce duplication by consolidating code into
loops for per-axis actions.
Part 6c removes do_step(), but still tries to keep a loop. This
about the maximum of performance I (Traumflug) can think of.
Binary size is as good as with the former attempt, but performance
is actually pretty bad, 45% worse than without looping:
SIZES ATmega... '168 '328(P) '644(P) '1280
FLASH : 19876 bytes 139% 65% 32% 16%
RAM : 2302 bytes 225% 113% 57% 29%
EEPROM: 32 bytes 4% 2% 2% 1%
short-moves.gcode
Statistics (assuming a 20 MHz clock):
LED on occurences: 888.
Sum of all LED on time: 406041 clock cycles.
LED on time minimum: 448 clock cycles.
LED on time maximum: 864 clock cycles.
LED on time average: 457.253 clock cycles.
smooth-curves.gcode
Statistics (assuming a 20 MHz clock):
LED on occurences: 9124.
Sum of all LED on time: 4791132 clock cycles.
LED on time minimum: 453 clock cycles.
LED on time maximum: 867 clock cycles.
LED on time average: 525.113 clock cycles.
triangle-odd.gcode
Statistics (assuming a 20 MHz clock):
LED on occurences: 1636.
Sum of all LED on time: 800586 clock cycles.
LED on time minimum: 448 clock cycles.
LED on time maximum: 867 clock cycles.
LED on time average: 489.356 clock cycles.
Clean up code to reduce duplication by consolidating code into
loops for per-axis actions.
Part 6b moves do_step() from the "tidiest" place into where it's
currently used, dda.c. Binary size goes down another 34 bytes, to
a total savings of 408 bytes and performance is much better, but
still 16% lower than without using loops:
SIZES ATmega... '168 '328(P) '644(P) '1280
FLASH : 19874 bytes 139% 65% 32% 16%
RAM : 2302 bytes 225% 113% 57% 29%
EEPROM: 32 bytes 4% 2% 2% 1%
short-moves.gcode
Statistics (assuming a 20 MHz clock):
LED on occurences: 888.
Sum of all LED on time: 320000 clock cycles.
LED on time minimum: 351 clock cycles.
LED on time maximum: 772 clock cycles.
LED on time average: 360.36 clock cycles.
smooth-curves.gcode
Statistics (assuming a 20 MHz clock):
LED on occurences: 9124.
Sum of all LED on time: 3875874 clock cycles.
LED on time minimum: 356 clock cycles.
LED on time maximum: 773 clock cycles.
LED on time average: 424.8 clock cycles.
triangle-odd.gcode
Statistics (assuming a 20 MHz clock):
LED on occurences: 1636.
Sum of all LED on time: 640357 clock cycles.
LED on time minimum: 351 clock cycles.
LED on time maximum: 773 clock cycles.
LED on time average: 391.416 clock cycles.
Clean up code to reduce duplication by consolidating code into
loops for per-axis actions.
Part 6a is putting stuff inside the step interrupt into a loop,
too. do_step() is put into the "tidiest" place. Binary size goes
down a remarkable 374 bytes, but stepping performance suffers by
almost 30%.
Traumflug's performance measurements:
SIZES ATmega... '168 '328(P) '644(P) '1280
FLASH : 19908 bytes 139% 65% 32% 16%
RAM : 2302 bytes 225% 113% 57% 29%
EEPROM: 32 bytes 4% 2% 2% 1%
short-moves.gcode
Statistics (assuming a 20 MHz clock):
LED on occurences: 888.
Sum of all LED on time: 354537 clock cycles.
LED on time minimum: 390 clock cycles.
LED on time maximum: 806 clock cycles.
LED on time average: 399.253 clock cycles.
smooth-curves.gcode
Statistics (assuming a 20 MHz clock):
LED on occurences: 9124.
Sum of all LED on time: 4268896 clock cycles.
LED on time minimum: 395 clock cycles.
LED on time maximum: 807 clock cycles.
LED on time average: 467.875 clock cycles.
triangle-odd.gcode
Statistics (assuming a 20 MHz clock):
LED on occurences: 1636.
Sum of all LED on time: 706846 clock cycles.
LED on time minimum: 390 clock cycles.
LED on time maximum: 807 clock cycles.
LED on time average: 432.057 clock cycles.
Should be done for temptable in ThermistorTable.h, too, but this
would mess up an existing users' configuration.
This tries to put emphasis on the fact that you have to read
these values with pgm_read_*() instead of just using the variable.
Unfortunately, gcc compiler neither inserts PROGMEM reading
instructions automatically when reading data stored in flash,
nor does it complain or warn about the missing read instructions.
As such it's very easy to accidently handle data stored in flash
just like normal data. It'll compile and work ... you just read
arbitrary data (often, but not always zeros) instead of what you
intend.
Clean up code to reduce duplication by consolidating code into
loops for per-axis actions.
Part 5 is move ACCELERATION_TEMPORAL's step delay calculations
into loops. Not tested, binary size change unknown.
Clean up code to reduce duplication by consolidating code into
loops for per-axis actions.
Part 4 is move ACCELERATION_TEMPORAL's maximum feedrate limitation
into a loop. Not tested, binary size change unknown.
Clean up code to reduce duplication by consolidating code into
loops for per-axis actions.
Part 3 is moving fast axis detection into a loop.
Binary size 84 bytes smaller.
Clean up code to reduce duplication by consolidating code into
loops for per-axis actions.
Part 2 is moving maximum speed limit calculations into loops.
Binary size another 160 bytes smaller.
Clean up code to reduce duplication by consolidating code into
loops for per-axis actions.
Traumflug notes:
Split this once huge commit into smaller ones for ease of
reviewing and bisecting (in case something went wrong).
Part 1 is to put dda_create() distance calculations into loops.
This reduces binary size by another whopping 756 bytes.
This was contributed by Phil Hord as part of another commit.
It saves 168 bytes, to it more than outweights the overhead of
introducing a generic implementation already.
Many places in the code use individual variables for int/uint values
for X, Y, Z, and E. A tip from a comment suggests making these into
arrays for scalability in the future. Replace the discrete variables
with arrays so the code can be simplified in the future.
In preparation for more efficient and scalable code using axis-loops
for common operations, add two new array-types for signed and unsigned
32-bit values per axis. Make the TARGET type use this array instead of
its current X, Y, Z, and E variables.
Traumflug notes:
- Did the usual conversion to spaces for changed lines.
- Added X = 0 to the enum. Just for peace of mind.
- Excellent patch!
Initially I wanted to make the new array an anonymous union with the
old variables to allow accessing values both ways. This way it would
have been possible to do the transition in smaller pieces. But as
the patch worked so flawlessly and binary size is precisely the
same, I abandoned this idea. Maybe it's a good idea in other areas.
This macro is pretty expensive (700 bytes, well, stuff is now
calculated at runtime), so there's no chance to use it in multiple
places and we likely also need this in dda_lookahead.c to achieve
full 4 axis compatibility there.
For now this is for the initial rampup calculation, only, notably
for moving the Z axis (which else gets far to few rampup steps on
a typical mendel-like printer).
The used macro was verified with this test code (in mendel.c):
[...]
int main (void) {
init();
uint32_t speed, spm;
char string[128];
for (spm = 2000; spm < 4099000; spm <<= 1) {
for (speed = 11; speed < 65536; speed *= 8) {
sersendf_P(PSTR("spm = %lu speed %lu ==> macro %lu "),
spm, speed, ACCELERATE_RAMP_LEN_SPM(speed, spm));
delay_ms(10);
sprintf(string, "double %f\n",
(double)speed * (double)speed / ((double)7200000 * (double)ACCELERATION / (double)spm));
serial_writestr((uint8_t *)string);
delay_ms(10);
}
}
[...]
Note: to link the test code, this linker flag is required to add
the full printf library (which does print doubles):
LDFLAGS += -Wl,-u,vfprintf -lprintf_flt -lm
Keeping the hack causes the previous move to decelerate, which isn't
intended when movements are joined with lookahead.
Removing only the hack breaks endstop handling on those axes which
set a huuuge number of acceleration steps for the lack of a proper
calculation algorithm. We have this algorithm now, so we can stop
using this kludge.
Solves part 1 of issue #68.
This is a preparation towards going through the existing movement
queue backwards with dda_join_moves() to allow higher feedrates
for lots of short movements.
This obviously requires less place on the stack and accordingly a
few CPU cycles less, but more importantly, it lets decide
dda_start() whether a previous movement is to be taken into account
or not.
To make this decision more reliable, add a flag for movements done.
Else it could happen we'd try to join with a movement done long
before.
In AVR the labs() function takes a 32-bit signed int parameter. On
the PC it's at least 64-bits and maybe more. When we have a 32-bit
unsigned value we're taking the labs() of, coercing it to 32-bits
first turns our high-bit into a sign, but coercing it to 64-bits
does not. This causes all our negative values to appear to be
really big positive ones.
Create a new function abs32() which always coerces its argument to
a int32_t first before return the abs value. Use that function
whereved needed in dda.c.
This fixes a problem on the simulator which caused negative
direction movements to "never" end.
This code was accidentally removed long ago in a botched merge. This
patch recovers it and makes it build again. I've done minimal testing
and some necessary cleanup. It compiles and runs, but it probably still
has a few dust bunnies here and there.
I added registers and pin definitions to simulator.h and
simulator/simulator.c which I needed to match my Gen7-based config.
Other configs or non-AVR ports will need to define more or different
registers. Some registers are 16-bits, some are 8-bit, and some are just
constant values (enums). A more clever solution would read in the
chip-specific header and produce saner definitions which covered all
GPIOs. But this commit just takes the quick and easy path to support my
own hardware.
Most of this code originated in these commits:
commit cbf41dd4ad
Author: Stephan Walter <stephan@walter.name>
Date: Mon Oct 18 20:28:08 2010 +0200
document simulation
commit 3028b297f3
Author: Stephan Walter <stephan@walter.name>
Date: Mon Oct 18 20:15:59 2010 +0200
Add simulation code: use "make sim"
Additional tweaks:
Revert va_args processing for AVR, but keep 'int' generalization
for simulation. gcc wasn't lying. The sim really aborts without this.
Remove delay(us) from simulator (obsolete).
Improve the README.sim to demonstrate working pronterface connection
to sim. Also fix the build instructions.
Appease all stock configs.
Stub out intercom and shush usb_serial when building simulator.
Pretend to be all chip-types for config appeasement.
Replace sim_timer with AVR-simulator timer:
The original sim_timer and sim_clock provided direct replacements
for timer/clock.c in the main code. But when the main code changed,
simcode did not. The main clock.c was dropped and merged into timer.c.
Also, the timer.c now has movement calculation code in it in some
cases (ACCELERATION_TEMPORAL) and it would be wrong to teach the
simulator to do the same thing. Instead, teach the simulator to
emulate the AVR Timer1 functionality, reacting to values written to
OCR1A and OCR1B timer comparison registers.
Whenever OCR1A/B are changed, the sim_setTimer function needs to be
called. It is called automatically after a timer event, so changes
within the timer ISRs do not need to bother with this.
A C++ class could make this requirement go away by noticing the
assignment. On the other hand, a chip-agnostic timer.c would help
make the main code more portable. The latter cleanup is probably
better for us in the long run.
This is a preparation for starting a move from non-zero speeds,
which is needed for look-ahead. Keeping both variables in
move_state and doing the calculations in dda_start() is possible
in principle, but might not fit the tight time budget we have when
going from one movement to the next at high step rates.
To deal with this, we have to pre-calculate n and c, so we have
to move it back into the DDA structure. It was there a year ago
already, but moved into move_state to save RAM (move_state exists
only once, dda as often as there are movement queue entries).
His implementation was done on every step and as it turns out,
the very same maths works just fine in the clock interrupt.
Reason for the clock interrupt is: it allows about 3 times
higher step rates.
This strategy is not only substantially faster, but also
a bit smaller.
One funny anecdote: the acceleration initialisation value, C0,
was taken from elsewhere in the code as-is. Still it had to be
adjusted by a factor of sqrt(2) to now(!) match the physics
formulas and to get ramps reasonably matching the prediction
(and my pocket calculator). Apparently the code before
accumulated enough rounding errors to compensate for the
wrong formula.
Before, endstops were checked on every step, wasting precious time.
Checking them 500 times a second should be more than sufficient.
Additionally, an endstop stop now properly decelerates the movement.
This is one important step towards handling accidental endstop hits
gracefully, as it avoids step losses in such situations.
This means, modify existing code to let the lookahead algorithms
do their work. It also means to remove some unused code in
dda_lookahead.c and reordering some code to make it work with
LOOKAHEAD undefined.
The binary size impact is moderate, like 18 bytes plus
4 bytes per endstop defined.
The story is a follows:
The endstop logic can be used to use a touch probe with PCB
milling. Connect the (conductive) PCB surface to GND, the
spindle/mill bit to the signal line, turn the internal pullups
on and there you go.
However, doing so with pullups always enabled and while milling
under (conductive) water showed polished mill and drill bits to
become matte after a few hours of usage. Obviously, this small
0.5 mA current from the pullup resistors going through the
rotating mill bit is sufficient to get some spark erosion going.
That's bad, as spark erosion happening also means tools become
dull faster than neccessary.
With this patch, pullups are turned on while being used, only,
so this sparc erosion should go away.