Revised Implementation of Tk Text Widget

Introduction
Comparison in Performance
Performance Improvements
Bugs/Issues in Original Implementation
Additional Functionality
Index of Commands
Index of Functions
Index of Widget Options
Index of Tag Options
Index of Options for Embedded Window
Index of Chapters
Test Cases
Issues in Revised Implementation
Incompatibilities to Standard Version
Deprecated Commands and Options
Some General Suggestions
Credits
Applications
Reporting Bugs/Issues/Questions/Suggestions
Source Code

Introduction

The current implementation of tk::text has severe performance problems. My basis for testing are the Chess Database applications – Scid vs. PC (a powerful and widely used multi-platform chess toolkit) and, my own application, Scidb. With the standard tk::text, it can take upto a few minutes just to display a chess game with a 100 lines. Based on profiling the following functions are the top performance killer:

TkBTreeGetTags
CleanupLine
TkTextIsElided
TkBTreeCharTagged
TkBTreeTag
TkTextIndexToSeg
TkTextMarkSegToIndex
MeasureUp
MeasureDown
TkTextPixelIndex

So it's clear what has to be done:

1+3+4+5: New technique how to handle tagged text.

The whole stuff concerning tagged text will be overworked.
2: Elimination of CleanupLine.

This functions is a result of sloppy programming and not needed. (Moreover this function is conceptually erroneous. It is used to traverse the whole line, joining succeeding char segments, and eliminating empty tag ranges. But this cleanup has to be done by the function which is doing the modifications, because this function knows exactly the relevant segments for cleanup. Traversing the whole line is unnecessary, and too expensive – the elimination of empty tag ranges has quadratic complexity. Moreover the revised version is using more clever techniques avoiding cleanups in many cases.)
6+7: New technique how to map between byte positions and segments.

New list structure Section will be introduced, accelerating the mapping.
8+9: Caching of calculated line heights for all display lines, not only the height of the logical lines.
10: Caching the currently hovered chunk, and using a section structure for faster lookup.

The original implementation is already quite complex, especially in time. It's clear that in general complexity cannot be eliminated. So I shifted the complexity in time to a more complex implementation, especially more complex data structures. The new implementation is very efficient in time. The increase in memory usage is not very high (but a bit high), and despite this, in many cases, especially if many tags are used, and/or undo is enabled, the revised version is even decreasing the memory usage.

Of special interest is the difference in speed, and the effect on user handling (smooth scrolling, fast response time), in some cases the difference is enormous, see Comparison of Old vs. Revised Implementation. With the revised implementation the problem with long lines (polynomial time in old code) is completely solved (in general only O(n × log n) in revised version). So the following performance issues can be removed:

Very long text lines can be expensive, especially if they have many marks and tags within them.

One performance problem can arise if you have hundreds or thousands of different tags …

While examining the code I found some more weak stuff. Furthermore I used this opportunity to add a few features, some for performance reasons, and some for additional functionality. It was the right time for these additions, because the acceleration of the text widget caused a rewrite/redesign of the code anyway. Also some bugs have been eliminated.

Comparison in Performance

Comparison of Old vs. Revised Implementation

Performance Improvements

Bugs/Issues in Original Implementation

Additional Functionality

Index of Commands

Index of Functions

tk_mergeRange

tk_textInsert

tk_textReplace

tk_textRebindMouseWheel

Index of Widget Options

Index of Tag Options

Index of Options for Embedded Image/Window

owner
tags

Index of Chapters

Test Cases

The revised version is passing all test cases under Linux. The special test cases under Windoze and Mac are still open.

In revised version some test cases have been adjusted, because:

The revised version is optimizing the computation of display lines, so some traces have changed.
The revised version has fixed some bugs, and some commands/options have changed.

I've added a few test cases, but for the most new features test cases are not yet written.

Issues in Revised Implementation

Currently only these issues are known:

The code for the implementation has increased by more than 100%, and about 70% of the old code has been changed. The revised implementation needs more testing, the text widget is very complex, and bugs are expected. And a few additions are not yet well tested.
Function tk_textCopy is copying hidden (elided) text. This seems to be unexpected, but it's the behavior of the original implementation. Probably this is a bug and should be corrected.
Adding/deleting tags covering a large range of text is still quite time consuming.
The display line with the insert cursor is redrawn each time the cursor blinks, which causes a steady stream of graphics traffic. It would be desirable if the cursor update will be performed with a specialized and efficient redraw function.
If option -spacemode is set to trim, then get -displaychars should probably return trimmed spaces. Currently this command is not trimming spaces, so the result may not coincide with the visible text.
The search -regexp sub-command is still not yet fully implemented, see Tk documentation.
The revised widget still ignores modifying commands silently if state is not normal, this behavior is unreasonable, but conform to original version.
Currently the special index specifier begin has the lowest precedence, although it should have the same precedence as the special index special end (see section INDICES). In a future release this will be corrected. The current behavior is a workaround, avoiding that existing applications will break with the introduction of begin.
The implementation still contains some TODO's of minor issues.

Some General Suggestions

Credits

Many thanks to Steven Atkinson (Scid vs. PC) for testing, and compiling the Mac version (with backport to tk8.5). He has found so many bugs in revised version, and this was very helpful.

Applications

The revised implementation is already used in applications Scid vs. PC and Scidb. Both applications are displaying normal text and chess data, and especially application Scidb has a very sophisticated text layout, using many features of the text widget.

Reporting Bugs/Issues/Questions/Suggestions

Use the bug tracker of project Scidb for the report of bugs. You need a Sourceforge account. It's important to attach a script file for replication of the bug. And note that currently I cannot fix problems with Windoze or Mac, here a volunteer is needed who is testing and fixing the revised version under Windoze/Mac. But Windoze and Mac is pre-tested, many thanks to Steven Atkinson.

Use the discussion forum of project Scidb for questions, suggestions, and issues. Here also you need a Sourceforge account.

Source Code

Visit Download and Installation for download of the whole package of implementation files. Below you may inspect single implementation files.