The current implementation of tk::text has severe performance problems. My basis for testing are the Chess Database applications – Scid vs. PC (a powerful and widely used multi-platform chess toolkit) and, my own application, Scidb. With the standard tk::text, it can take upto a few minutes just to display a chess game with a 100 lines. Based on profiling the following functions are the top performance killer:
So it's clear what has to be done:
1+3+4+5: New technique how to handle tagged text.
2: Elimination of CleanupLine.
6+7: New technique how to map between byte positions and segments.
8+9: Caching of calculated line heights for all display lines, not only the height of the logical lines.
10: Caching the currently hovered chunk, and using a section structure for faster lookup.
The original implementation is already quite complex, especially in time. It's clear that in general complexity cannot be eliminated. So I shifted the complexity in time to a more complex implementation, especially more complex data structures. The new implementation is very efficient in time. The increase in memory usage is not very high (but a bit high), and despite this, in many cases, especially if many tags are used, and/or undo is enabled, the revised version is even decreasing the memory usage.
Of special interest is the difference in speed, and the effect on user handling (smooth scrolling, fast response time), in some cases the difference is enormous, see Comparison of Old vs. Revised Implementation. With the revised implementation the problem with long lines (polynomial time in old code) is completely solved (in general only O(n × log n) in revised version). So the following performance issues can be removed:
Very long text lines can be expensive, especially if they have many marks and tags within them.
One performance problem can arise if you have hundreds or thousands of different tags …
While examining the code I found some more weak stuff. Furthermore I used this opportunity to add a few features, some for performance reasons, and some for additional functionality. It was the right time for these additions, because the acceleration of the text widget caused a rewrite/redesign of the code anyway. Also some bugs have been eliminated.
The revised version is passing all test cases under Linux. The special test cases under Windoze and Mac are still open.
In revised version some test cases have been adjusted, because:
The revised version is optimizing the computation of display lines, so some traces have changed.
The revised version has fixed some bugs, and some commands/options have changed.
I've added a few test cases, but for the most new features test cases are not yet written.
Currently only these issues are known:
The code for the implementation has increased by more than 100%, and about 70% of the old code has been changed. The revised implementation needs more testing, the text widget is very complex, and bugs are expected. And a few additions are not yet well tested.
Function tk_textCopy is copying hidden (elided) text. This seems to be unexpected, but it's the behavior of the original implementation. Probably this is a bug and should be corrected.
Adding/deleting tags covering a large range of text is still quite time consuming.
The display line with the insert cursor is redrawn each time the cursor blinks, which causes a steady stream of graphics traffic. It would be desirable if the cursor update will be performed with a specialized and efficient redraw function.
If option -spacemode is set to trim, then get -displaychars should probably return trimmed spaces. Currently this command is not trimming spaces, so the result may not coincide with the visible text.
The search -regexp sub-command is still not yet fully implemented, see Tk documentation.
The revised widget still ignores modifying commands silently if state is not normal, this behavior is unreasonable, but conform to original version.
Currently the special index specifier begin has the lowest precedence, although it should have the same precedence as the special index special end (see section INDICES). In a future release this will be corrected. The current behavior is a workaround, avoiding that existing applications will break with the introduction of begin.
The implementation still contains some TODO's of minor issues.
Many thanks to Steven Atkinson (Scid vs. PC) for testing, and compiling the Mac version (with backport to tk8.5). He has found so many bugs in revised version, and this was very helpful.
The revised implementation is already used in applications Scid vs. PC and Scidb. Both applications are displaying normal text and chess data, and especially application Scidb has a very sophisticated text layout, using many features of the text widget.
Use the bug tracker of project Scidb for the report of bugs. You need a Sourceforge account. It's important to attach a script file for replication of the bug. And note that currently I cannot fix problems with Windoze or Mac, here a volunteer is needed who is testing and fixing the revised version under Windoze/Mac. But Windoze and Mac is pre-tested, many thanks to Steven Atkinson.
Use the discussion forum of project Scidb for questions, suggestions, and issues. Here also you need a Sourceforge account.
Visit Download and Installation for download of the whole package of implementation files. Below you may inspect single implementation files.