145 Commits

Author SHA1 Message Date
Markus Hitter c7b022fb3e dda.c: some whitespace cleaning. 2014-10-18 21:02:17 +02:00
Markus Hitter 6c5809f0fa DDA: finally, don't bit-shift dda->c.
This was the goal: to not bit-shift when calling setTimer(). Binary
size another 40 bytes off, about 1.2 % better performance:

    SIZES             ATmega...  '168    '328(P)    '644(P)    '1280
    FLASH : 20136 bytes          141%       66%        32%       16%
    RAM   :  2318 bytes          227%      114%        57%       29%
    EEPROM:    32 bytes            4%        2%         2%        1%

short-moves.gcode statistics:
LED on occurences: 888.
LED on time minimum: 302 clock cycles.
LED on time maximum: 718 clock cycles.
LED on time average: 311.258 clock cycles.

smooth-curves.gcode statistics:
LED on occurences: 9124.
LED on time minimum: 307 clock cycles.
LED on time maximum: 708 clock cycles.
LED on time average: 357.417 clock cycles.

triangle-odd.gcode statistics:
LED on occurences: 1636.
LED on time minimum: 302 clock cycles.
LED on time maximum: 708 clock cycles.
LED on time average: 330.322 clock cycles.
2014-08-31 19:11:49 +02:00
Markus Hitter e098a96bac DDA: don't bit-shift move_c.
Next babystep, tiny enhancement: 8 bytes less binary size.
2014-08-31 19:11:30 +02:00
Markus Hitter b9c38051cc DDA: don't bit-shift dda->c_min.
Another babystep. First results: binary 4 bytes smaller. Yikes!
Ha ha.
2014-08-31 19:11:19 +02:00
Markus Hitter 6880f05f7e DDA: don't bit-shift dda->end_c either.
Next babystep. All changes in ACCELERATION_REPRAP, which isn't
part of current test procedures, so let's cross fingers it was
done right.
2014-08-31 19:11:12 +02:00
Markus Hitter 2541eaf335 DDA: don't bit-shift c_limit, c_limit_calc either.
Admittedly it looks like advancing in babysteps, but really
catching every bit shifting instance isn't trivial, sometimes
these shifts are already embedded in other calculations.

Still no binary size or performance change.
2014-08-31 19:11:04 +02:00
Markus Hitter 4fa755daef dda.c: don't bit-shift c0.
While this shifting was meant to increase accuracy, there's no actual
use of it, other than that this value gets shifted back and forth.
Let's start to get rid of it.
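
A minimal sketch of why such a shift buys nothing, assuming an 8-bit shift width and using hypothetical helper names (neither is taken from dda.c): a value shifted left for storage and shifted right again before each use comes back unchanged as long as the fractional bits are never populated, while the top bits of headroom are lost.

```c
#include <stdint.h>

/* Hypothetical illustration only; the 8-bit shift width is an assumption. */
#define FRAC_BITS 8

/* Store a step delay with FRAC_BITS (always zero) fractional bits. */
static uint32_t store_shifted(uint32_t c) {
  return c << FRAC_BITS;
}

/* Recover the integer step delay before using it. */
static uint32_t load_shifted(uint32_t c_shifted) {
  return c_shifted >> FRAC_BITS;
}

/* Store and load again: identity for c < 2^24, truncated above that. */
uint32_t round_trip(uint32_t c) {
  return load_shifted(store_shifted(c));
}
```

For any delay below 2^24 the round trip is the identity, so dropping both shifts leaves results unchanged and only removes instructions.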

Performance stays exactly the same:

    SIZES             ATmega...  '168    '328(P)    '644(P)    '1280
    FLASH : 20188 bytes          141%       66%        32%       16%
    RAM   :  2318 bytes          227%      114%        57%       29%
    EEPROM:    32 bytes            4%        2%         2%        1%

short-moves.gcode statistics:
LED on occurences: 888.
LED on time minimum: 306 clock cycles.
LED on time maximum: 722 clock cycles.
LED on time average: 315.253 clock cycles.

smooth-curves.gcode statistics:
LED on occurences: 9124.
LED on time minimum: 311 clock cycles.
LED on time maximum: 712 clock cycles.
LED on time average: 361.416 clock cycles.

triangle-odd.gcode statistics:
LED on occurences: 1636.
LED on time minimum: 306 clock cycles.
LED on time maximum: 712 clock cycles.
LED on time average: 334.319 clock cycles.
2014-08-31 19:10:56 +02:00
Markus Hitter 4f0a00c1a6 DDA: calculate acceleration for the actual fast axis.
This finally brings Z axis up to speed.

So far we always assumed the fastest axis to have the same steps/mm
as the X axis. In cases where this wasn't true, the movement
wouldn't do sufficient acceleration steps and, accordingly,
not reach the expected maximum speed. This was particularly visible
on a typical Mendel printer, where the Z axis would reach only a
6th of the commanded speed in some configurations.
2014-08-31 19:10:31 +02:00
Markus Hitter 5ee2aebbed DDA: remember number of the fast axis. 2014-08-31 19:10:23 +02:00
Markus Hitter 294f0eda26 DDA: have an acceleration constant for each axis individually.
For now, keep behaviour identical, i.e. still use STEPS_PER_M_X.
This is about to change soon.
2014-08-31 19:10:14 +02:00
Phil Hord 24f5416bba DDA: Rename confusing variable name.
'all_time' sounds like forever to me, but this variable really
tracks the last time we hit one of "all the axes".  It sticks
out more now in looping, so rename it to make sense.
2014-08-31 19:09:24 +02:00
Phil Hord bc4cf20341 Trivial cleanups.
Fix some formatting and hide a couple of variables when they're
not being used.
2014-08-31 19:09:15 +02:00
Phil Hord f9f068596d DDA: Move axis calculations into loops, part 9 (last part).
Clean up code to reduce duplication by consolidating code into
loops for per-axis actions.

Part 9 is, finally use this set_direction() thing. As a dessert
topping, it reduces binary size by another 122 bytes.

    SIZES             ATmega...  '168    '328(P)    '644(P)    '1280
    FLASH : 19988 bytes          140%       66%        32%       16%
    RAM   :  2302 bytes          225%      113%        57%       29%
    EEPROM:    32 bytes            4%        2%         2%        1%
2014-08-31 19:09:07 +02:00
Markus Hitter 96e9ae4dab dda.h: comment on these direction flags and other things. 2014-08-31 19:08:57 +02:00
Markus Hitter 41e76ca9fe dda.c: make update_current_position() even smaller.
Saves another 24 bytes.

    SIZES             ATmega...  '168    '328(P)    '644(P)    '1280
    FLASH : 20110 bytes          141%       66%        32%       16%
    RAM   :  2302 bytes          225%      113%        57%       29%
    EEPROM:    32 bytes            4%        2%         2%        1%

Using muldiv() would be more accurate, but unfortunately, the
compiler bails out:

   static const axes_uint32_t PROGMEM steps_per_mm_P = {
                                                           ^
dda.c:889:1: error: unable to find a register to spill in class ‘POINTER_REGS’
 }
 ^
dda.c:889:1: error: this is the insn:
(insn 81 80 83 6 (set (reg:SI 77 [ D.3086 ])
        (mem:SI (post_inc:HI (reg:HI 2 r2 [orig:103 ivtmp.106 ] [103])) [3 MEM[base: _82, offset: 0B]+0 S4 A8])) dda.c:881 94 {*movsi}
     (expr_list:REG_INC (reg:HI 2 r2 [orig:103 ivtmp.106 ] [103])
        (nil)))
dda.c:889: confused by earlier errors, bailing out

Another one is, calculating this:

   (int32_t)get_direction(dda, i) *
   move_state.steps[i] * 1000 / pgm_read_dword(&steps_per_mm_P[i]);

produces nonsense values for negative returns from get_direction().
Apparently, the compiler doesn't want to divide negative values?
Odd. Anyway, sufficient parentheses solve the problem.
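
A likely explanation, assuming move_state.steps[] is an unsigned 32-bit array (an assumption, not confirmed in this log): when a signed and an unsigned operand of the same width meet, C converts the signed one to unsigned, so a direction of -1 becomes a huge positive number before the division. The sketch below reproduces the effect with hypothetical function names and plain parameters instead of the real structures.

```c
#include <stdint.h>

/* Hypothetical stand-ins for get_direction(), move_state.steps[i] and
   steps_per_mm_P[i]; only the integer types matter here. */

/* Mirrors the failing expression: the int32_t direction is converted
   to uint32_t, so the whole calculation happens unsigned. */
int32_t position_um_naive(int32_t direction, uint32_t steps,
                          uint32_t steps_per_mm) {
  return (int32_t)(direction * steps * 1000 / steps_per_mm);
}

/* Parenthesised: do the unsigned magnitude first, convert it to
   signed, then apply the sign. */
int32_t position_um_fixed(int32_t direction, uint32_t steps,
                          uint32_t steps_per_mm) {
  return direction * (int32_t)(steps * 1000 / steps_per_mm);
}
```

With direction -1, 100 steps and 80 steps/mm, the parenthesised version yields -1250 while the naive one yields a large positive value.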
2014-08-31 19:08:49 +02:00
Phil Hord b552447789 DDA: Move axis calculations into loops, part 8.
Clean up code to reduce duplication by consolidating code into
loops for per-axis actions.

Part 8 is, move remaining update_current_position() into a loop.

This makes the binary 134 bytes smaller. As it's not critical,
no performance test.

    SIZES             ATmega...  '168    '328(P)    '644(P)    '1280
    FLASH : 20134 bytes          141%       66%        32%       16%
    RAM   :  2302 bytes          225%      113%        57%       29%
    EEPROM:    32 bytes            4%        2%         2%        1%
2014-08-31 19:08:42 +02:00
Phil Hord 80b29b727b DDA: Move axis calculations into loops, part 7.
Clean up code to reduce duplication by consolidating code into
loops for per-axis actions.

Part 7 is, turn update_current_position() in dda.c partially into
a loop. Surprise, surprise, this changes neither binary size nor
performance. Looking into the generated assembly, the loop is
indeed completely unrolled. Apparently that's smaller than a
real loop.

    SIZES             ATmega...  '168    '328(P)    '644(P)    '1280
    FLASH : 20270 bytes          142%       66%        32%       16%
    RAM   :  2302 bytes          225%      113%        57%       29%
    EEPROM:    32 bytes            4%        2%         2%        1%

short-moves.gcode
Statistics (assuming a 20 MHz clock):
LED on occurences: 888.
Sum of all LED on time: 279945 clock cycles.
LED on time minimum: 306 clock cycles.
LED on time maximum: 722 clock cycles.
LED on time average: 315.253 clock cycles.

smooth-curves.gcode
Statistics (assuming a 20 MHz clock):
LED on occurences: 9124.
Sum of all LED on time: 3297806 clock cycles.
LED on time minimum: 311 clock cycles.
LED on time maximum: 712 clock cycles.
LED on time average: 361.443 clock cycles.

triangle-odd.gcode
Statistics (assuming a 20 MHz clock):
LED on occurences: 1636.
Sum of all LED on time: 546946 clock cycles.
LED on time minimum: 306 clock cycles.
LED on time maximum: 712 clock cycles.
LED on time average: 334.319 clock cycles.
2014-08-31 19:08:34 +02:00
Markus Hitter cc9c9ff7b4 DDA: Revert move axis calculations into loops, part 6a-c.
Sad but true, this experiment didn't work out. Performance loss
due to looping in dda_step() is still at least 16% with the best
algorithm found.
2014-08-31 19:08:15 +02:00
Markus Hitter 1fc4a26ccd DDA: Move axis calculations into loops, part 6c.
Clean up code to reduce duplication by consolidating code into
loops for per-axis actions.

Part 6c removes do_step(), but still tries to keep a loop. This is
about the maximum performance I (Traumflug) can think of.
Binary size is as good as with the former attempt, but performance
is actually pretty bad, 45% worse than without looping:

    SIZES             ATmega...  '168    '328(P)    '644(P)    '1280
    FLASH : 19876 bytes          139%       65%        32%       16%
    RAM   :  2302 bytes          225%      113%        57%       29%
    EEPROM:    32 bytes            4%        2%         2%        1%

short-moves.gcode
Statistics (assuming a 20 MHz clock):
LED on occurences: 888.
Sum of all LED on time: 406041 clock cycles.
LED on time minimum: 448 clock cycles.
LED on time maximum: 864 clock cycles.
LED on time average: 457.253 clock cycles.

smooth-curves.gcode
Statistics (assuming a 20 MHz clock):
LED on occurences: 9124.
Sum of all LED on time: 4791132 clock cycles.
LED on time minimum: 453 clock cycles.
LED on time maximum: 867 clock cycles.
LED on time average: 525.113 clock cycles.

triangle-odd.gcode
Statistics (assuming a 20 MHz clock):
LED on occurences: 1636.
Sum of all LED on time: 800586 clock cycles.
LED on time minimum: 448 clock cycles.
LED on time maximum: 867 clock cycles.
LED on time average: 489.356 clock cycles.
2014-08-31 19:08:07 +02:00
Markus Hitter 808f5dcfca DDA: Move axis calculations into loops, part 6b.
Clean up code to reduce duplication by consolidating code into
loops for per-axis actions.

Part 6b moves do_step() from the "tidiest" place into where it's
currently used, dda.c. Binary size goes down another 34 bytes, to
a total savings of 408 bytes and performance is much better, but
still 16% lower than without using loops:

    SIZES             ATmega...  '168    '328(P)    '644(P)    '1280
    FLASH : 19874 bytes          139%       65%        32%       16%
    RAM   :  2302 bytes          225%      113%        57%       29%
    EEPROM:    32 bytes            4%        2%         2%        1%

short-moves.gcode
Statistics (assuming a 20 MHz clock):
LED on occurences: 888.
Sum of all LED on time: 320000 clock cycles.
LED on time minimum: 351 clock cycles.
LED on time maximum: 772 clock cycles.
LED on time average: 360.36 clock cycles.

smooth-curves.gcode
Statistics (assuming a 20 MHz clock):
LED on occurences: 9124.
Sum of all LED on time: 3875874 clock cycles.
LED on time minimum: 356 clock cycles.
LED on time maximum: 773 clock cycles.
LED on time average: 424.8 clock cycles.

triangle-odd.gcode
Statistics (assuming a 20 MHz clock):
LED on occurences: 1636.
Sum of all LED on time: 640357 clock cycles.
LED on time minimum: 351 clock cycles.
LED on time maximum: 773 clock cycles.
LED on time average: 391.416 clock cycles.
2014-08-31 19:07:59 +02:00
Phil Hord b83449d8c3 DDA: Move axis calculations into loops, part 6a.
Clean up code to reduce duplication by consolidating code into
loops for per-axis actions.

Part 6a is putting stuff inside the step interrupt into a loop,
too. do_step() is put into the "tidiest" place. Binary size goes
down a remarkable 374 bytes, but stepping performance suffers by
almost 30%.

Traumflug's performance measurements:

    SIZES             ATmega...  '168    '328(P)    '644(P)    '1280
    FLASH : 19908 bytes          139%       65%        32%       16%
    RAM   :  2302 bytes          225%      113%        57%       29%
    EEPROM:    32 bytes            4%        2%         2%        1%

short-moves.gcode
Statistics (assuming a 20 MHz clock):
LED on occurences: 888.
Sum of all LED on time: 354537 clock cycles.
LED on time minimum: 390 clock cycles.
LED on time maximum: 806 clock cycles.
LED on time average: 399.253 clock cycles.

smooth-curves.gcode
Statistics (assuming a 20 MHz clock):
LED on occurences: 9124.
Sum of all LED on time: 4268896 clock cycles.
LED on time minimum: 395 clock cycles.
LED on time maximum: 807 clock cycles.
LED on time average: 467.875 clock cycles.

triangle-odd.gcode
Statistics (assuming a 20 MHz clock):
LED on occurences: 1636.
Sum of all LED on time: 706846 clock cycles.
LED on time minimum: 390 clock cycles.
LED on time maximum: 807 clock cycles.
LED on time average: 432.057 clock cycles.
2014-08-31 19:07:51 +02:00
Markus Hitter 9a08675576 Rename all these new PROGMEM variables to end in _P.
Should be done for temptable in ThermistorTable.h, too, but this
would mess up existing users' configurations.

This tries to put emphasis on the fact that you have to read
these values with pgm_read_*() instead of just using the variable.
Unfortunately, the gcc compiler neither inserts PROGMEM reading
instructions automatically when reading data stored in flash,
nor does it complain or warn about the missing read instructions.

As such it's very easy to accidentally handle data stored in flash
just like normal data. It'll compile and work ... you just read
arbitrary data (often, but not always zeros) instead of what you
intend.
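
A short sketch of the convention, with made-up table values and a hypothetical accessor name: the _P suffix marks the array as living in flash, and every read goes through pgm_read_dword(). The fallback macros exist only so the example also compiles on a host and are not part of avr-libc.

```c
#include <stdint.h>

#ifdef __AVR__
  #include <avr/pgmspace.h>
#else
  /* Host-only fallback (an assumption for this sketch): flash and RAM
     share one address space, so a plain read is enough. */
  #define PROGMEM
  #define pgm_read_dword(addr) (*(const uint32_t *)(addr))
#endif

/* Values are invented for illustration. */
static const uint32_t PROGMEM steps_per_mm_P[4] = { 80, 80, 2560, 96 };

/* Correct: fetch from flash explicitly. */
uint32_t steps_per_mm(uint8_t axis) {
  return pgm_read_dword(&steps_per_mm_P[axis]);
}

/* Wrong on AVR, yet it compiles without a warning:
     return steps_per_mm_P[axis];
   This reads RAM at the same numeric address instead of flash. */
```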
2014-08-31 19:05:25 +02:00
Phil Hord 74808610c7 DDA: Move axis calculations into loops, part 5.
Clean up code to reduce duplication by consolidating code into
loops for per-axis actions.

Part 5 is move ACCELERATION_TEMPORAL's step delay calculations
into loops. Not tested, binary size change unknown.
2014-08-31 19:05:09 +02:00
Phil Hord 8d729d499d DDA: Move axis calculations into loops, part 4.
Clean up code to reduce duplication by consolidating code into
loops for per-axis actions.

Part 4 is move ACCELERATION_TEMPORAL's maximum feedrate limitation
into a loop. Not tested, binary size change unknown.
2014-08-31 19:05:00 +02:00
Phil Hord cd0155b5f4 DDA: Move axis calculations into loops, part 3.
Clean up code to reduce duplication by consolidating code into
loops for per-axis actions.

Part 3 is moving fast axis detection into a loop.
Binary size 84 bytes smaller.
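
Roughly what such a loop can look like, as a sketch only: the enum, AXIS_COUNT and find_fast_axis() are illustrative names, and the fast axis is taken to be the one with the most steps in the move.

```c
#include <stdint.h>

/* Assumed axis enumeration, for illustration only. */
enum axis_e { X = 0, Y, Z, E, AXIS_COUNT };

/* Return the index of the axis with the most steps in this move.
   On a tie, the lowest-numbered axis wins. */
uint8_t find_fast_axis(const uint32_t delta_steps[AXIS_COUNT]) {
  uint8_t fast = X;
  for (uint8_t i = Y; i < AXIS_COUNT; i++) {
    if (delta_steps[i] > delta_steps[fast])
      fast = i;
  }
  return fast;
}
```

One comparison inside a loop replaces a chain of per-axis if statements, which is where a size reduction of this kind would come from.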
2014-08-31 19:04:52 +02:00
Phil Hord d3beb21225 DDA: Move axis calculations into loops, part 2.
Clean up code to reduce duplication by consolidating code into
loops for per-axis actions.

Part 2 is moving maximum speed limit calculations into loops.
Binary size another 160 bytes smaller.
2014-08-31 19:04:42 +02:00
Phil Hord cec3c5f52e DDA: Move axis calculations into loops, part 1.
Clean up code to reduce duplication by consolidating code into
loops for per-axis actions.

Traumflug notes:

Split this once huge commit into smaller ones for ease of
reviewing and bisecting (in case something went wrong).

Part 1 is to put dda_create() distance calculations into loops.
This reduces binary size by another whopping 756 bytes.
2014-08-31 19:04:25 +02:00
Markus Hitter 1c19158bbc DDA: use new generic um_to_steps_* in dda_new_startpoint().
This was contributed by Phil Hord as part of another commit.

It saves 168 bytes, so it more than outweighs the overhead of
introducing a generic implementation already.
2014-08-31 19:04:17 +02:00
Phil Hord e2f793c2b3 DDA: Convert more axis variables to arrays.
Many places in the code use individual variables for int/uint values
for X, Y, Z, and E.  A tip from a comment suggests making these into
arrays for scalability in the future. Replace the discrete variables
with arrays so the code can be simplified in the future.
2014-08-31 19:03:31 +02:00
Phil Hord d3f49b3e95 DDA: Convert TARGET axis vars to array.
In preparation for more efficient and scalable code using axis-loops
for common operations, add two new array-types for signed and unsigned
32-bit values per axis. Make the TARGET type use this array instead of
its current X, Y, Z, and E variables.
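
A sketch of the idea, not the actual patch: AXIS_COUNT, the field names and target_delta() are assumptions made for illustration. With per-axis arrays indexed by an enum, one loop replaces four near-identical statements.

```c
#include <stdint.h>

/* Assumed enum and type names; the real definitions may differ. */
enum axis_e { X = 0, Y, Z, E, AXIS_COUNT };

typedef uint32_t axes_uint32_t[AXIS_COUNT];
typedef int32_t  axes_int32_t[AXIS_COUNT];

/* Simplified TARGET: one position per axis plus a feedrate. */
typedef struct {
  axes_int32_t axis;
  uint32_t     F;
} TARGET;

/* Per-axis distance between two targets, computed in one loop. */
void target_delta(const TARGET *from, const TARGET *to, axes_int32_t delta) {
  for (uint8_t i = X; i < AXIS_COUNT; i++)
    delta[i] = to->axis[i] - from->axis[i];
}
```

Code that previously read a dedicated X member would read axis[X] instead, and adding an axis would mean extending the enum rather than editing every calculation.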

Traumflug notes:

- Did the usual conversion to spaces for changed lines.

- Added X = 0 to the enum. Just for peace of mind.

- Excellent patch!

Initially I wanted to make the new array an anonymous union with the
old variables to allow accessing values both ways. This way it would
have been possible to do the transition in smaller pieces. But as
the patch worked so flawlessly and binary size is precisely the
same, I abandoned this idea. Maybe it's a good idea in other areas.
2014-08-31 19:03:17 +02:00
Markus Hitter f51e52e7fa dda.c: endstop stop more reliably. 2014-05-29 21:49:20 +02:00
David Forrest dd72f9c1d6 dda.c: Update links to 'Generate stepper-motor speed profiles in real time' David Austin 2004 Embedded article. 2014-05-29 21:48:11 +02:00
Markus Hitter 95a44e8777 DDA: clear flags of a queue entry earlier.
Formerly, once a wait for temp was given, this flag would stick
forever on this queue entry.

Spotted by Zungmann, thanks a lot!

http://forums.reprap.org/read.php?147,33082,280439#msg-280439
2014-03-04 19:58:06 +01:00
Phil Hord c7150445af Zungmann's fixes to compile simulator on Mac OS X, part 2.
Here: .bss section syntax is different.
2014-03-04 19:57:48 +01:00
Markus Hitter 6fae5a8b7c DDA: make macro ACCELERATE_RAMP_LEN_SPM() a function.
This macro is pretty expensive (700 bytes, well, stuff is now
calculated at runtime), so there's no chance to use it in multiple
places and we likely also need this in dda_lookahead.c to achieve
full 4 axis compatibility there.
2014-03-04 19:56:01 +01:00
Markus Hitter 9739382da9 dda.c: remember steps per m of the fast axis for rampup calculation.
For now this is for the initial rampup calculation, only, notably
for moving the Z axis (which else gets far too few rampup steps on
a typical mendel-like printer).

The used macro was verified with this test code (in mendel.c):

[...]
int main (void) {
  init();

  uint32_t speed, spm;
  char string[128];
  for (spm = 2000; spm < 4099000; spm <<= 1) {
    for (speed = 11; speed < 65536; speed *= 8) {
      sersendf_P(PSTR("spm = %lu  speed %lu ==> macro %lu  "),
                 spm, speed, ACCELERATE_RAMP_LEN_SPM(speed, spm));
      delay_ms(10);
      sprintf(string, "double %f\n",
              (double)speed * (double)speed / ((double)7200000 * (double)ACCELERATION / (double)spm));
      serial_writestr((uint8_t *)string);
      delay_ms(10);
    }
  }
[...]

Note: to link the test code, this linker flag is required to add
      the full printf library (which does print doubles):

  LDFLAGS += -Wl,-u,vfprintf -lprintf_flt -lm
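
The double-precision reference in the test code above can be restated in integer arithmetic, which makes the dependence on steps per metre easy to see. This is a sketch, not the ACCELERATE_RAMP_LEN_SPM() macro itself. The function name is hypothetical, and the units are assumed to be mm/min for speed and mm/s^2 for acceleration, with spm in steps per metre.

```c
#include <stdint.h>

/* steps = speed^2 * spm / (7200000 * acceleration)
   This is the same expression as the double reference in the test,
   rearranged. acceleration must be non-zero. */
uint32_t ramp_len_steps(uint32_t speed, uint32_t acceleration, uint32_t spm) {
  /* 64-bit intermediates: safe for the ranges exercised by the test
     above (speed < 65536, spm < 4099000). */
  uint64_t numerator = (uint64_t)speed * speed * spm;
  uint64_t denominator = (uint64_t)7200000 * acceleration;
  return (uint32_t)(numerator / denominator);
}
```

At 6000 mm/min and 1000 mm/s^2, 80000 steps/m gives 400 ramp steps and 320000 steps/m gives 1600. An axis with four times the steps per metre needs four times the ramp steps, so reusing the X axis figure for a finely geared Z axis comes out far too low.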
2014-03-04 19:55:53 +01:00
Markus Hitter 3da2363ac5 DDA: remember the fast axis micrometers and save their reconstruction.
No surprise, this saves a whopping 600 bytes.
2014-03-04 19:55:45 +01:00
Roland Brochard 297aa28dfd Lookahead: refactored code to compute everything in dda steps. 2014-03-04 19:55:36 +01:00
Markus Hitter 20686eb52c dda.c: remove the hack for too high rampup steps for lookahead.
Keeping the hack causes the previous move to decelerate, which isn't
intended when movements are joined with lookahead.

Removing only the hack breaks endstop handling on those axes which
set a huuuge number of acceleration steps for the lack of a proper
calculation algorithm. We have this algorithm now, so we can stop
using this kludge.

Solves part 1 of issue #68.
2014-03-04 19:55:28 +01:00
Markus Hitter 2f1142d461 dda.c: base rampup_steps calculation on the fast axis, too. 2014-03-04 19:55:20 +01:00
Markus Hitter 41ca1b7570 dda.c: actually reduce endpoint.F for overly fast movement attempts.
This fix was long overdue and is now unavoidable, as the hack with
limiting maximum speed to c_min in dda_clock() conflicts with
lookahead.
2014-03-04 19:55:03 +01:00
Markus Hitter 46548dba47 DDA: make a huge comment compact and move it to where it belongs.
Also clarify the acceleration formula.
2014-03-04 19:54:53 +01:00
Markus Hitter 42ad12fba3 DDA: store distance of each movement.
This is required to calculate speeds of individual axes. So far only
in dda_find_crossing_speed(), but soon also in dda_join_moves().
2014-03-04 19:54:41 +01:00
Markus Hitter 1b5682c01a dda.c: describe all lookahead cases in a comment. 2014-03-04 19:54:34 +01:00
Markus Hitter d10c0f3041 dda.c: clear /all/ lookahead variables when starting.
Caught it! :-)

This was causing occasional step losses when a movement with
zero crossF followed a pair with high crossF.
2014-03-04 19:54:23 +01:00
Markus Hitter da5339d163 DDA: use the already calculated distance for crossing speed calculation. 2014-03-04 19:52:42 +01:00
Markus Hitter 3ac26f0cab DDA: move dda_find_crossing_speed() to dda.c.
This is a preparation towards going through the existing movement
queue backwards with dda_join_moves() to allow higher feedrates
for lots of short movements.
2014-03-04 19:52:29 +01:00
Markus Hitter 1eaf711923 dda.c: review debug messages a bit. 2013-12-06 19:24:58 +01:00
Markus Hitter c594af3e19 dda.c: don't set timer twice when going from one move to the next.
This is a bug which existed, well, basically forever. Nobody noticed
until precision timings could be recorded with SimulAVR.
2013-12-06 19:24:58 +01:00
Markus Hitter f37a65ca36 dda_queue.c: run every queued thing through dda_create().
This allows dda_create() to track things queued but not being a movement.
Important for lookahead.
2013-12-06 19:24:58 +01:00