On ARM we use only the 16-byte hardware buffer for sending and
receiving over the serial line, which is often too short for
debugging messages. This implementation nevertheless works fine, and
it neither blocks nor introduces delays for short messages.
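
The behaviour described above can be sketched as follows. This is only
an illustration, not the driver's actual code: sim_uart, sim_uart_drain
and sim_uart_write are hypothetical names, and the FIFO is simulated in
memory instead of being accessed through the real UART registers. A
message that fits into the 16-byte buffer is queued without waiting;
a longer one waits only when the buffer is full.

```c
#include <assert.h>
#include <stddef.h>

#define FIFO_SIZE 16u /* size of the hardware buffer named in the text */

/* Hypothetical stand-in for the UART transmit FIFO. Real hardware
 * would expose this through status and data registers instead. */
typedef struct {
    char   data[FIFO_SIZE];
    size_t used; /* bytes currently waiting in the FIFO */
} sim_uart;

/* Simulates the hardware shifting out everything that is queued. */
static void sim_uart_drain(sim_uart *u)
{
    u->used = 0;
}

/* Queues len bytes from buf and returns how often the writer had to
 * wait for the FIFO to empty. Short messages return 0 (no waiting). */
static size_t sim_uart_write(sim_uart *u, const char *buf, size_t len)
{
    size_t waits = 0;

    assert(u != NULL && (buf != NULL || len == 0));

    for (size_t i = 0; i < len; i++) {
        if (u->used == FIFO_SIZE) {
            /* Assumption: on hardware this would poll a status flag
             * until the transmitter has room again. */
            sim_uart_drain(u);
            waits++;
        }
        u->data[u->used++] = buf[i];
    }
    return waits;
}
```

With this sketch a 9-byte message is queued without any wait, while a
40-byte message has to wait twice for the buffer to empty.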
Removed the while-loop. It looks like we need a few more microseconds
than the LPC. With an additional 7 us we no longer lose characters.