Teacup_Firmware

Author	SHA1	Message	Date
Markus Hitter	e098a96bac	DDA: don't bit-shift move_c. Next babystep, tiny enhancement: 8 bytes less binary size.	2014-08-31 19:11:30 +02:00
Markus Hitter	b9c38051cc	DDA: don't bit-shift dda->c_min. Another babystep. First results: binary 4 bytes smaller. Yikes! Ha ha.	2014-08-31 19:11:19 +02:00
Markus Hitter	6880f05f7e	DDA: don't bit-shift dda->end_c either. Next babystep. All changes in ACCELERATION_REPRAP, which isn't part of current test procedures, so let's cross fingers it was done right.	2014-08-31 19:11:12 +02:00
Markus Hitter	2541eaf335	DDA: don't bit-shift c_limit, c_limit_calc either. Admittedly it looks like advancing in babysteps, but really catching every bit shifting instance isn't trivial, sometimes these shifts are already embedded in other calculations. Still no binary size or performance change.	2014-08-31 19:11:04 +02:00
Markus Hitter	4fa755daef	dda.c: don't bit-shift c0. While this shifting was meant to increase accuracy, there's no actual use of it, other than that this value gets shifted back and forth. Let's start to get rid of it. Performance stays exactly the same: SIZES ATmega... '168 '328(P) '644(P) '1280 FLASH : 20188 bytes 141% 66% 32% 16% RAM : 2318 bytes 227% 114% 57% 29% EEPROM: 32 bytes 4% 2% 2% 1% short-moves.gcode statistics: LED on occurences: 888. LED on time minimum: 306 clock cycles. LED on time maximum: 722 clock cycles. LED on time average: 315.253 clock cycles. smooth-curves.gcode statistics: LED on occurences: 9124. LED on time minimum: 311 clock cycles. LED on time maximum: 712 clock cycles. LED on time average: 361.416 clock cycles. triangle-odd.gcode statistics: LED on occurences: 1636. LED on time minimum: 306 clock cycles. LED on time maximum: 712 clock cycles. LED on time average: 334.319 clock cycles.	2014-08-31 19:10:56 +02:00
Markus Hitter	ec937adde2	run-in-simulavr.sh: move statistics to the end. This is more convenient for obvious reasons, you no longer have to search all the output for these few lines.	2014-08-31 19:10:48 +02:00
Markus Hitter	157a5a966b	run-in-simulavr.sh: clean up tracein file after being done.	2014-08-31 19:10:39 +02:00
Markus Hitter	4f0a00c1a6	DDA: calculate acceleration for the actual fast axis. This finally brings Z axis up to speed. So far we always assumed the fastest axis to have the same steps/mm as the X axis. In cases where this wasn't true, the movement wouldn't do sufficient acceleration steps and, accordingly, not reach the expected maximum speed. This was particularly visible on a typical Mendel printer, where the Z axis would reach only a 6th of the commanded speed in some configurations.	2014-08-31 19:10:31 +02:00
Markus Hitter	5ee2aebbed	DDA: remember number of the fast axis.	2014-08-31 19:10:23 +02:00
Markus Hitter	294f0eda26	DDA: have an acceleration constant for each axis individually. For now, keep behaviour identical, like still use STEPS_PER_M_X. This is about to change soon.	2014-08-31 19:10:14 +02:00
Markus Hitter	2ad7517e27	preprocessor_math.h, SQRT(): take a better initial guess. Now results are apparently accurate across the whole uint32 range. At least, this test passes with all numbers being exact: #include "preprocessor_math.h" #include <math.h> ... in main() ... sersendf_P(PSTR("0: %lu %lu\n"), (uint32_t)SQRT(0), (uint32_t)sqrt(0)); sersendf_P(PSTR("1: %lu %lu\n"), (uint32_t)SQRT(1), (uint32_t)sqrt(1)); sersendf_P(PSTR("2: %lu %lu\n"), (uint32_t)SQRT(2), (uint32_t)sqrt(2)); sersendf_P(PSTR("3: %lu %lu\n"), (uint32_t)SQRT(3), (uint32_t)sqrt(3)); sersendf_P(PSTR("4: %lu %lu\n"), (uint32_t)SQRT(4), (uint32_t)sqrt(4)); sersendf_P(PSTR("5: %lu %lu\n"), (uint32_t)SQRT(5), (uint32_t)sqrt(5)); sersendf_P(PSTR("6: %lu %lu\n"), (uint32_t)SQRT(6), (uint32_t)sqrt(6)); sersendf_P(PSTR("7: %lu %lu\n"), (uint32_t)SQRT(7), (uint32_t)sqrt(7)); sersendf_P(PSTR("8: %lu %lu\n"), (uint32_t)SQRT(8), (uint32_t)sqrt(8)); sersendf_P(PSTR("9: %lu %lu\n"), (uint32_t)SQRT(9), (uint32_t)sqrt(9)); sersendf_P(PSTR("10: %lu %lu\n"), (uint32_t)SQRT(10), (uint32_t)sqrt(10)); sersendf_P(PSTR("20: %lu %lu\n"), (uint32_t)SQRT(20), (uint32_t)sqrt(20)); sersendf_P(PSTR("30: %lu %lu\n"), (uint32_t)SQRT(30), (uint32_t)sqrt(30)); sersendf_P(PSTR("40: %lu %lu\n"), (uint32_t)SQRT(40), (uint32_t)sqrt(40)); sersendf_P(PSTR("50: %lu %lu\n"), (uint32_t)SQRT(50), (uint32_t)sqrt(50)); sersendf_P(PSTR("60: %lu %lu\n"), (uint32_t)SQRT(60), (uint32_t)sqrt(60)); sersendf_P(PSTR("70: %lu %lu\n"), (uint32_t)SQRT(70), (uint32_t)sqrt(70)); sersendf_P(PSTR("80: %lu %lu\n"), (uint32_t)SQRT(80), (uint32_t)sqrt(80)); sersendf_P(PSTR("90: %lu %lu\n"), (uint32_t)SQRT(90), (uint32_t)sqrt(90)); sersendf_P(PSTR("100: %lu %lu\n"), (uint32_t)SQRT(100), (uint32_t)sqrt(100)); sersendf_P(PSTR("200: %lu %lu\n"), (uint32_t)SQRT(200), (uint32_t)sqrt(200)); sersendf_P(PSTR("300: %lu %lu\n"), (uint32_t)SQRT(300), (uint32_t)sqrt(300)); sersendf_P(PSTR("400: %lu %lu\n"), (uint32_t)SQRT(400), (uint32_t)sqrt(400)); 
sersendf_P(PSTR("500: %lu %lu\n"), (uint32_t)SQRT(500), (uint32_t)sqrt(500)); sersendf_P(PSTR("600: %lu %lu\n"), (uint32_t)SQRT(600), (uint32_t)sqrt(600)); sersendf_P(PSTR("700: %lu %lu\n"), (uint32_t)SQRT(700), (uint32_t)sqrt(700)); sersendf_P(PSTR("800: %lu %lu\n"), (uint32_t)SQRT(800), (uint32_t)sqrt(800)); sersendf_P(PSTR("900: %lu %lu\n"), (uint32_t)SQRT(900), (uint32_t)sqrt(900)); sersendf_P(PSTR("1000: %lu %lu\n"), (uint32_t)SQRT(1000), (uint32_t)sqrt(1000)); sersendf_P(PSTR("2000: %lu %lu\n"), (uint32_t)SQRT(2000), (uint32_t)sqrt(2000)); sersendf_P(PSTR("3000: %lu %lu\n"), (uint32_t)SQRT(3000), (uint32_t)sqrt(3000)); sersendf_P(PSTR("4000: %lu %lu\n"), (uint32_t)SQRT(4000), (uint32_t)sqrt(4000)); sersendf_P(PSTR("5000: %lu %lu\n"), (uint32_t)SQRT(5000), (uint32_t)sqrt(5000)); sersendf_P(PSTR("6000: %lu %lu\n"), (uint32_t)SQRT(6000), (uint32_t)sqrt(6000)); sersendf_P(PSTR("7000: %lu %lu\n"), (uint32_t)SQRT(7000), (uint32_t)sqrt(7000)); sersendf_P(PSTR("8000: %lu %lu\n"), (uint32_t)SQRT(8000), (uint32_t)sqrt(8000)); sersendf_P(PSTR("9000: %lu %lu\n"), (uint32_t)SQRT(9000), (uint32_t)sqrt(9000)); sersendf_P(PSTR("10000: %lu %lu\n"), (uint32_t)SQRT(10000), (uint32_t)sqrt(10000)); sersendf_P(PSTR("20000: %lu %lu\n"), (uint32_t)SQRT(20000), (uint32_t)sqrt(20000)); sersendf_P(PSTR("30000: %lu %lu\n"), (uint32_t)SQRT(30000), (uint32_t)sqrt(30000)); sersendf_P(PSTR("40000: %lu %lu\n"), (uint32_t)SQRT(40000), (uint32_t)sqrt(40000)); sersendf_P(PSTR("50000: %lu %lu\n"), (uint32_t)SQRT(50000), (uint32_t)sqrt(50000)); sersendf_P(PSTR("60000: %lu %lu\n"), (uint32_t)SQRT(60000), (uint32_t)sqrt(60000)); sersendf_P(PSTR("70000: %lu %lu\n"), (uint32_t)SQRT(70000), (uint32_t)sqrt(70000)); sersendf_P(PSTR("80000: %lu %lu\n"), (uint32_t)SQRT(80000), (uint32_t)sqrt(80000)); sersendf_P(PSTR("90000: %lu %lu\n"), (uint32_t)SQRT(90000), (uint32_t)sqrt(90000)); sersendf_P(PSTR("100000: %lu %lu\n"), (uint32_t)SQRT(100000), (uint32_t)sqrt(100000)); sersendf_P(PSTR("200000: %lu %lu\n"), 
(uint32_t)SQRT(200000), (uint32_t)sqrt(200000)); sersendf_P(PSTR("300000: %lu %lu\n"), (uint32_t)SQRT(300000), (uint32_t)sqrt(300000)); sersendf_P(PSTR("400000: %lu %lu\n"), (uint32_t)SQRT(400000), (uint32_t)sqrt(400000)); sersendf_P(PSTR("500000: %lu %lu\n"), (uint32_t)SQRT(500000), (uint32_t)sqrt(500000)); sersendf_P(PSTR("600000: %lu %lu\n"), (uint32_t)SQRT(600000), (uint32_t)sqrt(600000)); sersendf_P(PSTR("700000: %lu %lu\n"), (uint32_t)SQRT(700000), (uint32_t)sqrt(700000)); sersendf_P(PSTR("800000: %lu %lu\n"), (uint32_t)SQRT(800000), (uint32_t)sqrt(800000)); sersendf_P(PSTR("900000: %lu %lu\n"), (uint32_t)SQRT(900000), (uint32_t)sqrt(900000)); sersendf_P(PSTR("1000000: %lu %lu\n"), (uint32_t)SQRT(1000000), (uint32_t)sqrt(1000000)); sersendf_P(PSTR("2000000: %lu %lu\n"), (uint32_t)SQRT(2000000), (uint32_t)sqrt(2000000)); sersendf_P(PSTR("3000000: %lu %lu\n"), (uint32_t)SQRT(3000000), (uint32_t)sqrt(3000000)); sersendf_P(PSTR("4000000: %lu %lu\n"), (uint32_t)SQRT(4000000), (uint32_t)sqrt(4000000)); sersendf_P(PSTR("5000000: %lu %lu\n"), (uint32_t)SQRT(5000000), (uint32_t)sqrt(5000000)); sersendf_P(PSTR("6000000: %lu %lu\n"), (uint32_t)SQRT(6000000), (uint32_t)sqrt(6000000)); sersendf_P(PSTR("7000000: %lu %lu\n"), (uint32_t)SQRT(7000000), (uint32_t)sqrt(7000000)); sersendf_P(PSTR("8000000: %lu %lu\n"), (uint32_t)SQRT(8000000), (uint32_t)sqrt(8000000)); sersendf_P(PSTR("9000000: %lu %lu\n"), (uint32_t)SQRT(9000000), (uint32_t)sqrt(9000000)); sersendf_P(PSTR("10000000: %lu %lu\n"), (uint32_t)SQRT(10000000), (uint32_t)sqrt(10000000)); sersendf_P(PSTR("20000000: %lu %lu\n"), (uint32_t)SQRT(20000000), (uint32_t)sqrt(20000000)); sersendf_P(PSTR("30000000: %lu %lu\n"), (uint32_t)SQRT(30000000), (uint32_t)sqrt(30000000)); sersendf_P(PSTR("40000000: %lu %lu\n"), (uint32_t)SQRT(40000000), (uint32_t)sqrt(40000000)); sersendf_P(PSTR("50000000: %lu %lu\n"), (uint32_t)SQRT(50000000), (uint32_t)sqrt(50000000)); sersendf_P(PSTR("60000000: %lu %lu\n"), (uint32_t)SQRT(60000000), 
(uint32_t)sqrt(60000000)); sersendf_P(PSTR("70000000: %lu %lu\n"), (uint32_t)SQRT(70000000), (uint32_t)sqrt(70000000)); sersendf_P(PSTR("80000000: %lu %lu\n"), (uint32_t)SQRT(80000000), (uint32_t)sqrt(80000000)); sersendf_P(PSTR("90000000: %lu %lu\n"), (uint32_t)SQRT(90000000), (uint32_t)sqrt(90000000)); sersendf_P(PSTR("100000000: %lu %lu\n"), (uint32_t)SQRT(100000000), (uint32_t)sqrt(100000000)); sersendf_P(PSTR("200000000: %lu %lu\n"), (uint32_t)SQRT(200000000), (uint32_t)sqrt(200000000)); sersendf_P(PSTR("300000000: %lu %lu\n"), (uint32_t)SQRT(300000000), (uint32_t)sqrt(300000000)); sersendf_P(PSTR("400000000: %lu %lu\n"), (uint32_t)SQRT(400000000), (uint32_t)sqrt(400000000)); sersendf_P(PSTR("500000000: %lu %lu\n"), (uint32_t)SQRT(500000000), (uint32_t)sqrt(500000000)); sersendf_P(PSTR("600000000: %lu %lu\n"), (uint32_t)SQRT(600000000), (uint32_t)sqrt(600000000)); sersendf_P(PSTR("700000000: %lu %lu\n"), (uint32_t)SQRT(700000000), (uint32_t)sqrt(700000000)); sersendf_P(PSTR("800000000: %lu %lu\n"), (uint32_t)SQRT(800000000), (uint32_t)sqrt(800000000)); sersendf_P(PSTR("900000000: %lu %lu\n"), (uint32_t)SQRT(900000000), (uint32_t)sqrt(900000000)); sersendf_P(PSTR("1000000000: %lu %lu\n"), (uint32_t)SQRT(1000000000), (uint32_t)sqrt(1000000000)); sersendf_P(PSTR("2000000000: %lu %lu\n"), (uint32_t)SQRT(2000000000), (uint32_t)sqrt(2000000000)); sersendf_P(PSTR("3000000000: %lu %lu\n"), (uint32_t)SQRT(3000000000), (uint32_t)sqrt(3000000000)); sersendf_P(PSTR("4000000000: %lu %lu\n"), (uint32_t)SQRT(4000000000), (uint32_t)sqrt(4000000000));	2014-08-31 19:10:07 +02:00
Markus Hitter	6f83519a1d	Add preprocessor math. For now this is a square root function which should solve entirely in the preprocessor. Test results described in the file. Test code for runtime results, inserted right before the main loop in mendel.c: for (uint32_t i = 0; i < 10000000; i++) { uint32_t mathlib = (uint32_t)(sqrt(i) + .5); uint32_t preprocessor = (uint32_t)(SQRT(i) + .5); if (mathlib != preprocessor) { sersendf_P(PSTR("%lu: %lu %lu\n"), i, mathlib, preprocessor); break; } if ((i & 0x00001fff) == 0) sersendf_P(PSTR("%lu\n"), i); } sersendf_P(PSTR("Square root check done.\n")); Test code for compile time results: sersendf_P(PSTR("10000000: %lu\n"), (uint32_t)SQRT(10000000)); sersendf_P(PSTR("10000000: %lu\n"), (uint32_t)sqrt(10000000)); sersendf_P(PSTR("20000000: %lu\n"), (uint32_t)SQRT(20000000)); sersendf_P(PSTR("20000000: %lu\n"), (uint32_t)sqrt(20000000)); sersendf_P(PSTR("30000000: %lu\n"), (uint32_t)SQRT(30000000)); sersendf_P(PSTR("30000000: %lu\n"), (uint32_t)sqrt(30000000)); sersendf_P(PSTR("40000000: %lu\n"), (uint32_t)SQRT(40000000)); sersendf_P(PSTR("40000000: %lu\n"), (uint32_t)sqrt(40000000)); sersendf_P(PSTR("50000000: %lu\n"), (uint32_t)SQRT(50000000)); sersendf_P(PSTR("50000000: %lu\n"), (uint32_t)sqrt(50000000)); sersendf_P(PSTR("60000000: %lu\n"), (uint32_t)SQRT(60000000)); sersendf_P(PSTR("60000000: %lu\n"), (uint32_t)sqrt(60000000)); sersendf_P(PSTR("70000000: %lu\n"), (uint32_t)SQRT(70000000)); sersendf_P(PSTR("70000000: %lu\n"), (uint32_t)sqrt(70000000)); sersendf_P(PSTR("80000000: %lu\n"), (uint32_t)SQRT(80000000)); sersendf_P(PSTR("80000000: %lu\n"), (uint32_t)sqrt(80000000)); sersendf_P(PSTR("90000000: %lu\n"), (uint32_t)SQRT(90000000)); sersendf_P(PSTR("90000000: %lu\n"), (uint32_t)sqrt(90000000));	2014-08-31 19:09:59 +02:00
Phil Hord	76bf5ef75a	Datalog: show traced data as signed ints, not unsigned.	2014-08-31 19:09:37 +02:00
Phil Hord	24f5416bba	DDA: Rename confusing variable name. 'all_time' sounds like forever to me, but this variable really tracks the last time we hit one of "all the axes". It sticks out more now in looping, so rename it to make sense.	2014-08-31 19:09:24 +02:00
Phil Hord	bc4cf20341	Trivial cleanups. Fix some formatting and hide a couple of variables when they're not being used.	2014-08-31 19:09:15 +02:00
Phil Hord	f9f068596d	DDA: Move axis calculations into loops, part 9 (last part). Clean up code to reduce duplication by consolidating code into loops for per-axis actions. Part 9 is, finally use this set_direction() thing. As a dessert topping, it reduces binary size by another 122 bytes. SIZES ATmega... '168 '328(P) '644(P) '1280 FLASH : 19988 bytes 140% 66% 32% 16% RAM : 2302 bytes 225% 113% 57% 29% EEPROM: 32 bytes 4% 2% 2% 1%	2014-08-31 19:09:07 +02:00
Markus Hitter	96e9ae4dab	dda.h: comment on these direction flags and other things.	2014-08-31 19:08:57 +02:00
Markus Hitter	41e76ca9fe	dda.c: make update_current_position() even smaller. Saves another 24 bytes. SIZES ATmega... '168 '328(P) '644(P) '1280 FLASH : 20110 bytes 141% 66% 32% 16% RAM : 2302 bytes 225% 113% 57% 29% EEPROM: 32 bytes 4% 2% 2% 1% Using muldiv() would be more accurate, but unfortunately, the compiler bails out: static const axes_uint32_t PROGMEM steps_per_mm_P = { ^ dda.c:889:1: error: unable to find a register to spill in class ‘POINTER_REGS’ } ^ dda.c:889:1: error: this is the insn: (insn 81 80 83 6 (set (reg:SI 77 [ D.3086 ]) (mem:SI (post_inc:HI (reg:HI 2 r2 [orig:103 ivtmp.106 ] [103])) [3 MEM[base: _82, offset: 0B]+0 S4 A8])) dda.c:881 94 {movsi} (expr_list:REG_INC (reg:HI 2 r2 [orig:103 ivtmp.106 ] [103]) (nil))) dda.c:889: confused by earlier errors, bailing out Another one is, calculating this: (int32_t)get_direction(dda, i) * move_state.steps[i] * 1000 / pgm_read_dword(&steps_per_mm_P[i]); produces nonsense values for negative returns from get_direction(). Apparently, the compiler doesn't want to divide negative values??? Odd. Anyway, sufficient parentheses solve the problem.	2014-08-31 19:08:49 +02:00
Phil Hord	b552447789	DDA: Move axis calculations into loops, part 8. Clean up code to reduce duplication by consolidating code into loops for per-axis actions. Part 8 is, move remaining update_current_position() into a loop. This makes the binary 134 bytes smaller. As it's not critical, no performance test. SIZES ATmega... '168 '328(P) '644(P) '1280 FLASH : 20134 bytes 141% 66% 32% 16% RAM : 2302 bytes 225% 113% 57% 29% EEPROM: 32 bytes 4% 2% 2% 1%	2014-08-31 19:08:42 +02:00
Phil Hord	80b29b727b	DDA: Move axis calculations into loops, part 7. Clean up code to reduce duplication by consolidating code into loops for per-axis actions. Part 7 is, turn update_current_position() in dda.c partially into a loop. Surprise, surprise, this changes neither binary size nor performance. Looking into the generated assembly, the loop is indeed completely unrolled. Apparently that's smaller than a real loop. SIZES ATmega... '168 '328(P) '644(P) '1280 FLASH : 20270 bytes 142% 66% 32% 16% RAM : 2302 bytes 225% 113% 57% 29% EEPROM: 32 bytes 4% 2% 2% 1% short-moves.gcode Statistics (assuming a 20 MHz clock): LED on occurences: 888. Sum of all LED on time: 279945 clock cycles. LED on time minimum: 306 clock cycles. LED on time maximum: 722 clock cycles. LED on time average: 315.253 clock cycles. smooth-curves.gcode Statistics (assuming a 20 MHz clock): LED on occurences: 9124. Sum of all LED on time: 3297806 clock cycles. LED on time minimum: 311 clock cycles. LED on time maximum: 712 clock cycles. LED on time average: 361.443 clock cycles. triangle-odd.gcode Statistics (assuming a 20 MHz clock): LED on occurences: 1636. Sum of all LED on time: 546946 clock cycles. LED on time minimum: 306 clock cycles. LED on time maximum: 712 clock cycles. LED on time average: 334.319 clock cycles.	2014-08-31 19:08:34 +02:00
David Forrest	32481e2799	debug.h: Align M111 debug bit codes with Repetier-Host. No code changes, binary size and performance kept.	2014-08-31 19:08:26 +02:00
Markus Hitter	cc9c9ff7b4	DDA: Revert move axis calculations into loops, part 6a-c. Sad but true, this experiment didn't work out. Performance loss due to looping in dda_step() is still at least 16% with the best algorithm found.	2014-08-31 19:08:15 +02:00
Markus Hitter	1fc4a26ccd	DDA: Move axis calculations into loops, part 6c. Clean up code to reduce duplication by consolidating code into loops for per-axis actions. Part 6c removes do_step(), but still tries to keep a loop. This is about the maximum performance I (Traumflug) can think of. Binary size is as good as with the former attempt, but performance is actually pretty bad, 45% worse than without looping: SIZES ATmega... '168 '328(P) '644(P) '1280 FLASH : 19876 bytes 139% 65% 32% 16% RAM : 2302 bytes 225% 113% 57% 29% EEPROM: 32 bytes 4% 2% 2% 1% short-moves.gcode Statistics (assuming a 20 MHz clock): LED on occurences: 888. Sum of all LED on time: 406041 clock cycles. LED on time minimum: 448 clock cycles. LED on time maximum: 864 clock cycles. LED on time average: 457.253 clock cycles. smooth-curves.gcode Statistics (assuming a 20 MHz clock): LED on occurences: 9124. Sum of all LED on time: 4791132 clock cycles. LED on time minimum: 453 clock cycles. LED on time maximum: 867 clock cycles. LED on time average: 525.113 clock cycles. triangle-odd.gcode Statistics (assuming a 20 MHz clock): LED on occurences: 1636. Sum of all LED on time: 800586 clock cycles. LED on time minimum: 448 clock cycles. LED on time maximum: 867 clock cycles. LED on time average: 489.356 clock cycles.	2014-08-31 19:08:07 +02:00
Markus Hitter	808f5dcfca	DDA: Move axis calculations into loops, part 6b. Clean up code to reduce duplication by consolidating code into loops for per-axis actions. Part 6b moves do_step() from the "tidiest" place into where it's currently used, dda.c. Binary size goes down another 34 bytes, to a total savings of 408 bytes and performance is much better, but still 16% lower than without using loops: SIZES ATmega... '168 '328(P) '644(P) '1280 FLASH : 19874 bytes 139% 65% 32% 16% RAM : 2302 bytes 225% 113% 57% 29% EEPROM: 32 bytes 4% 2% 2% 1% short-moves.gcode Statistics (assuming a 20 MHz clock): LED on occurences: 888. Sum of all LED on time: 320000 clock cycles. LED on time minimum: 351 clock cycles. LED on time maximum: 772 clock cycles. LED on time average: 360.36 clock cycles. smooth-curves.gcode Statistics (assuming a 20 MHz clock): LED on occurences: 9124. Sum of all LED on time: 3875874 clock cycles. LED on time minimum: 356 clock cycles. LED on time maximum: 773 clock cycles. LED on time average: 424.8 clock cycles. triangle-odd.gcode Statistics (assuming a 20 MHz clock): LED on occurences: 1636. Sum of all LED on time: 640357 clock cycles. LED on time minimum: 351 clock cycles. LED on time maximum: 773 clock cycles. LED on time average: 391.416 clock cycles.	2014-08-31 19:07:59 +02:00
Phil Hord	b83449d8c3	DDA: Move axis calculations into loops, part 6a. Clean up code to reduce duplication by consolidating code into loops for per-axis actions. Part 6a is putting stuff inside the step interrupt into a loop, too. do_step() is put into the "tidiest" place. Binary size goes down a remarkable 374 bytes, but stepping performance suffers by almost 30%. Traumflug's performance measurements: SIZES ATmega... '168 '328(P) '644(P) '1280 FLASH : 19908 bytes 139% 65% 32% 16% RAM : 2302 bytes 225% 113% 57% 29% EEPROM: 32 bytes 4% 2% 2% 1% short-moves.gcode Statistics (assuming a 20 MHz clock): LED on occurences: 888. Sum of all LED on time: 354537 clock cycles. LED on time minimum: 390 clock cycles. LED on time maximum: 806 clock cycles. LED on time average: 399.253 clock cycles. smooth-curves.gcode Statistics (assuming a 20 MHz clock): LED on occurences: 9124. Sum of all LED on time: 4268896 clock cycles. LED on time minimum: 395 clock cycles. LED on time maximum: 807 clock cycles. LED on time average: 467.875 clock cycles. triangle-odd.gcode Statistics (assuming a 20 MHz clock): LED on occurences: 1636. Sum of all LED on time: 706846 clock cycles. LED on time minimum: 390 clock cycles. LED on time maximum: 807 clock cycles. LED on time average: 432.057 clock cycles.	2014-08-31 19:07:51 +02:00
Markus Hitter	ad82907b98	testcases: Add config.h. There's nothing special about this config.h, it's just the one I happened to use for first profiling investigations. To allow everybody else to do the very same profiling runs, I add it here. Doing profiling isn't too complicated: mv config.h config.h.backup ln -s testcases/config.h.Profiling config.h git checkout -b work git cherry-pick simulavr # add tweaks convenient for simulation runs make cd testcases ./run-in-simulavr.sh short-moves.gcode smooth-curves.gcode triangle-odd.gcode After being done you can restore your config.h and delete this work branch. Currently, performance is as follows (with convenience commit applied): SIZES ATmega... '168 '328(P) '644(P) '1280 FLASH : 20270 bytes 142% 66% 32% 16% RAM : 2302 bytes 225% 113% 57% 29% EEPROM: 32 bytes 4% 2% 2% 1% short-moves.gcode Statistics (assuming a 20 MHz clock): LED on occurences: 888. Sum of all LED on time: 279945 clock cycles. LED on time minimum: 306 clock cycles. LED on time maximum: 722 clock cycles. LED on time average: 315.253 clock cycles. smooth-curves.gcode Statistics (assuming a 20 MHz clock): LED on occurences: 9124. Sum of all LED on time: 3297806 clock cycles. LED on time minimum: 311 clock cycles. LED on time maximum: 712 clock cycles. LED on time average: 361.443 clock cycles. triangle-odd.gcode Statistics (assuming a 20 MHz clock): LED on occurences: 1636. Sum of all LED on time: 546946 clock cycles. LED on time minimum: 306 clock cycles. LED on time maximum: 712 clock cycles. LED on time average: 334.319 clock cycles.	2014-08-31 19:07:39 +02:00
David Forrest	003697ee0f	gcode_parse.c: Debug S with serwrite_int32.	2014-08-31 19:07:30 +02:00
David Forrest	f046c013e3	sermesg.c: Add documentation tag for variable floating point.	2014-08-31 19:07:21 +02:00
Markus Hitter	e7707ea275	config.*.h: extend DEBUG_LED_PIN comment to all config templates.	2014-08-31 19:07:13 +02:00
David Forrest	f356f64bdb	config.default.h: Add DEBUG_LED_PIN to the pinout section.	2014-08-31 19:07:01 +02:00
David Forrest	b12157cb6f	gcode_process.c: Add comment on units of P, I, and D parameters.	2014-08-31 19:06:52 +02:00
David Forrest	5b5c44b523	dda_lookahead.c: Eliminate debug crossF variable compile warning. Fix: dda_lookahead.c:327:17: warning: 'crossF' may be used uninitialized in this function [-Wmaybe-uninitialized] sersendf_P(PSTR("Initial crossing speed: %lu\n"), crossF); ^	2014-08-31 19:06:43 +02:00
David Forrest	2496a95c6f	dda_maths.h: Add comment on units of C0.	2014-08-31 19:06:34 +02:00
David Forrest	f3666fc43f	heater_sim.c: Note that the heater isn't implemented in the simulator.	2014-08-31 19:06:23 +02:00
Markus Hitter	fdfd202e5d	run-in-simulavr.sh: add statistics output for LED On Time. As it's still a bit cumbersome to go through the whole .vcd file to find the highest delay between On and Off, do this search automatically and output statistics. Can look like this: Statistics (assuming a 20 MHz clock): LED on occurences: 838. Sum of all LED on time: 262055 clock cycles. LED on time minimum: 306 clock cycles. LED on time maximum: 717 clock cycles. LED on time average: 312.715 clock cycles. This should give a reasonable overview of whether and roughly how much a particular code change makes your code slower or faster. It should also show up showstoppers, like occasionally huge delays. BTW, the above data was collected timing the step interrupt when running short-moves.gcode with the current firmware.	2014-08-31 19:06:13 +02:00
Markus Hitter	da08c35edd	run-in-simulavr.sh: add support for timing measurements. The idea is simple: if you want to time a portion of code precisely, turn on the Debug LED (see config.h for DEBUG_LED_PIN) at the start of sequence and turn it off when done. Running this in SimulAVR, you have two edges precise to the clock cycle which exactly reflect the time taken to run this code sequence. Ideally, you run this code in a loop to get a number of samples, if it doesn't run in a loop anyway. Time taken can then be measured in GTKWave. For convenience and for a better overview, run-in-simulavr.sh also extracts all the delays into its own signal, so it can be viewed as an ongoing number.	2014-08-31 19:06:05 +02:00
Markus Hitter	4389e670bd	run-in-simulavr.sh: start signals undefined. Also a few aesthetic corrections.	2014-08-31 19:05:56 +02:00
Markus Hitter	35c4949965	run-in-simulavr.sh: run SimulAVR a bit more verbose. SimulAVR doesn't always work exactly the way it should, so looking at the command line it's started with is a first debugging step.	2014-08-31 19:05:47 +02:00
Markus Hitter	6250dbb9e0	Configuration: move DEBUG_LED definition. Any debugging LEDs aren't part of the CPU, but part of the electronics. Accordingly, define it in config.*.h, not in arduino_*.h (which would be better named something like "atmega_*.h").	2014-08-31 19:05:38 +02:00
Markus Hitter	9a08675576	Rename all these new PROGMEM variables to end in _P. Should be done for temptable in ThermistorTable.h, too, but this would mess up existing users' configurations. This tries to put emphasis on the fact that you have to read these values with pgm_read_*() instead of just using the variable. Unfortunately, the gcc compiler neither inserts PROGMEM reading instructions automatically when reading data stored in flash, nor does it complain or warn about the missing read instructions. As such it's very easy to accidentally handle data stored in flash just like normal data. It'll compile and work ... you just read arbitrary data (often, but not always zeros) instead of what you intend.	2014-08-31 19:05:25 +02:00
Phil Hord	74808610c7	DDA: Move axis calculations into loops, part 5. Clean up code to reduce duplication by consolidating code into loops for per-axis actions. Part 5 is move ACCELERATION_TEMPORAL's step delay calculations into loops. Not tested, binary size change unknown.	2014-08-31 19:05:09 +02:00
Phil Hord	8d729d499d	DDA: Move axis calculations into loops, part 4. Clean up code to reduce duplication by consolidating code into loops for per-axis actions. Part 4 is move ACCELERATION_TEMPORAL's maximum feedrate limitation into a loop. Not tested, binary size change unknown.	2014-08-31 19:05:00 +02:00
Phil Hord	cd0155b5f4	DDA: Move axis calculations into loops, part 3. Clean up code to reduce duplication by consolidating code into loops for per-axis actions. Part 3 is moving fast axis detection into a loop. Binary size 84 bytes smaller.	2014-08-31 19:04:52 +02:00
Phil Hord	d3beb21225	DDA: Move axis calculations into loops, part 2. Clean up code to reduce duplication by consolidating code into loops for per-axis actions. Part 2 is moving maximum speed limit calculations into loops. Binary size another 160 bytes smaller.	2014-08-31 19:04:42 +02:00
Phil Hord	427d6637c3	dda_maths.h: remove now obsolete um_to_steps_[xyze].	2014-08-31 19:04:33 +02:00
Phil Hord	cec3c5f52e	DDA: Move axis calculations into loops, part 1. Clean up code to reduce duplication by consolidating code into loops for per-axis actions. Traumflug notes: Split this once huge commit into smaller ones for ease of reviewing and bisecting (in case something went wrong). Part 1 is to put dda_create() distance calculations into loops. This reduces binary size by another whopping 756 bytes.	2014-08-31 19:04:25 +02:00
Markus Hitter	1c19158bbc	DDA: use new generic um_to_steps_* in dda_new_startpoint(). This was contributed by Phil Hord as part of another commit. It saves 168 bytes, so it more than outweighs the overhead of introducing a generic implementation already.	2014-08-31 19:04:17 +02:00
Phil Hord	62bdbd86d6	DDA: convert um_to_steps_* to generic implementation. A generic implementation here will allow callers to pass the target axis in as a parameter so the callers can also be made more generic. Traumflug notes: Split out application of the new implementation in dda.c into its own commit. This actually costs 128 bytes, but as we can access axes from within a loop now, I expect to get more savings elsewhere. Interestingly, binary size is raised by another 18 bytes if um_to_steps(int32_t, enum axis_e) is changed to um_to_steps(enum axis_e, int32_t) even on the 8-bit ATmega. While putting the axis number to the front might be a bit more logical (think of additional parameters, the axis number position would move), NXP application note AN10963 states on page 10ff, 16-bit data should be 16-bit aligned and 32-bit data should be 32-bit aligned for best performance. Well, so let's do it this way.	2014-08-31 19:04:08 +02:00
Markus Hitter	84cbf2a42a	home.c: no need to turn off Z axis here. This is done in dda.c already, see dda.c, line 678.	2014-08-31 19:03:57 +02:00
Markus Hitter	94fa733ee8	home.c: don't move to zero after homing to max endstop. This can be counterproductive if the actual zero point is outside the available build volume. For example, if an additional bed probing is going to happen. It also costs quite some time on the Z axis. If you actually want this behaviour, send a simple G0 XYZ after homing.	2014-08-31 19:03:45 +02:00

1264 Commits (All Branches)
