Revised Tk Text Widget

Improved Char Segment Split Behavior

The old way of splitting character segments is wasteful. Use case: the user types a character at the end of the text (with no tagged text and no marks present). The last char segment is split into two segments (before the newline), and a new segment containing the typed character is linked between the two halves. Afterwards the cleanup (which processes the full segment chain!) joins the first segment with the second, then joins that result with the third segment, and destroys the segments that are no longer needed.

The revised version only reallocates the last segment (in this case) and inserts the character. No cleanup is required.

In the old version, nchars+1 bytes (one additional byte for the terminating NUL) are allocated for the chars of a text segment. When the user types another character, a new segment with one more byte is eventually allocated (after some splits and joins), and the old one is destroyed. In the revised version, ((nchars+8)/8)*8 bytes are always allocated for the chars. This does not increase the effective memory usage, because allocated memory is aligned to multiples of 8 bytes anyway. It can reduce the number of char segment allocations by up to a factor of 8, especially when the user is simply typing text, because a char segment now has some spare capacity to grow into.
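The capacity rule and the in-place append can be sketched as follows. This is a minimal illustration, not the widget's actual code: the struct DemoCharSeg and the functions DemoCapacity and DemoSegAppend are hypothetical names, and the real segment layout in Tk differs.

```c
#include <stdlib.h>
#include <string.h>

/* Hypothetical, simplified stand-in for a char segment; not Tk's real struct. */
typedef struct DemoCharSeg {
    size_t size;      /* bytes in use, excluding the trailing NUL */
    size_t capacity;  /* bytes allocated for chars */
    char *chars;
} DemoCharSeg;

/* Capacity rule from the text: nchars + 1 (room for NUL), rounded up
 * to the next multiple of 8. */
static size_t
DemoCapacity(size_t nchars)
{
    return ((nchars + 8) / 8) * 8;
}

/* Append n bytes at the end of the segment. Reallocates only when the
 * spare capacity is exhausted. Returns 0 on success, -1 on failure. */
static int
DemoSegAppend(DemoCharSeg *seg, const char *bytes, size_t n)
{
    if (seg->size + n + 1 > seg->capacity) {
        size_t newCap = DemoCapacity(seg->size + n);
        char *p = realloc(seg->chars, newCap);

        if (p == NULL) {
            return -1;
        }
        seg->chars = p;
        seg->capacity = newCap;
    }
    memcpy(seg->chars + seg->size, bytes, n);
    seg->size += n;
    seg->chars[seg->size] = '\0';
    return 0;
}
```

Starting from an empty segment, the first typed character allocates 8 bytes, and the next six characters fit without any further allocation. Only the eighth character triggers a reallocation to 16 bytes.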

Impact on speed: the new split behavior has a pro and a contra.

Pro: Normal left-to-right insertion (character by character) is significantly faster, by about 30%, because the split, join, and cleanup steps are gone. This is expected to be the most frequent case in practice.
Contra: Inserting from right to left (character by character) is significantly slower, by about 25%. Here the memmove() call, which the revised implementation uses for insertion at the left side, hurts performance, because it has to shift the entire existing content of the segment on every insertion.
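The asymmetry between the two cases can be illustrated with a small sketch. DemoInsertAt is a hypothetical helper, not a function of the widget, and it assumes the buffer already has enough spare capacity.

```c
#include <string.h>

/* Hypothetical helper: insert n bytes at offset pos into a NUL-terminated
 * buffer that currently holds size bytes. The caller must guarantee room
 * for at least size + n + 1 bytes. */
static void
DemoInsertAt(char *chars, size_t size, size_t pos, const char *bytes, size_t n)
{
    /* Shift the tail (including the NUL) to the right. For pos == size
     * this moves a single byte; for pos == 0 it moves the whole content. */
    memmove(chars + pos + n, chars + pos, size - pos + 1);
    memcpy(chars + pos, bytes, n);
}
```

When appending, pos equals size and only the terminating NUL is moved. When inserting at the left side, pos is 0 and all size + 1 bytes are moved on every keystroke.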

Another improvement:

Any mouse pointer movement over the widget constantly splits and joins char segments (function TkTextPickCurrent). This is useless work. In the revised implementation, insertion of the "current" mark segment is postponed until a real command is executed, so there are no more superfluous splits and joins.
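The idea of postponing the update can be sketched as follows. All names here (DemoText, DemoPointerMoved, DemoSyncCurrentMark) are hypothetical and only illustrate the deferral pattern; the actual implementation works on the segment chain and is not shown.

```c
#include <stdbool.h>

/* Hypothetical state; the names are illustrative, not Tk's. */
typedef struct DemoText {
    int currentIndex;   /* where the "current" mark segment is linked */
    int pendingIndex;   /* latest position seen under the pointer */
    bool currentDirty;  /* true if pendingIndex has not been applied yet */
    int relinkCount;    /* how often the segment chain was touched */
} DemoText;

/* Called on every pointer motion: only records the position. */
static void
DemoPointerMoved(DemoText *t, int index)
{
    t->pendingIndex = index;
    t->currentDirty = true;
}

/* Called before a real command runs: applies the recorded position once. */
static void
DemoSyncCurrentMark(DemoText *t)
{
    if (t->currentDirty) {
        t->currentIndex = t->pendingIndex;  /* stands in for split + link */
        t->relinkCount++;
        t->currentDirty = false;
    }
}
```

However many motion events arrive between two commands, the segment chain is touched at most once.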