Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Support for right-to-left text direction. #3400

Closed
wants to merge 5 commits into from

Conversation

ashensis
Copy link
Contributor

@ashensis ashensis commented Sep 7, 2017

The must to have requirement to any text editor from the Arabic/Hebrew customer's perspective is support for base text direction.
The natural mode of typing & display for aforementioned languages is right-to-left.
[For detailed explanation of what base text direction term means please see, for instance, pages 5-9 of the following reference document.
https://docs.google.com/document/d/1dDrSwimrQbpbXybhMYDEiJXLeeqvelnY4cc8HeCP1Nk/edit?usp=sharing]

Virtually every editor (like those from MS Office or Open Office) or web based (like CKEditor, IBM Docs) text editors as well as large variety of program products and platforms including Windows system editors, web browser edit fields and areas allows the possibility to specify the text direction that impacts the Arabic/Hebrew text.

The current proposal (based on recently merged PR 'Support for Arabic/Hebrew/Persian languages' #3137) introduces the right-to-left text direction support for ACE text mode editors ('text' and 'plain text').
As in almost all editors it provides possibility to toggle between left-to-right (default) and right-to-left by pressing respectively CTRL+ALT+Shift+L(left) and CTRL+ALT+SHIFT+R(right) hot key combination.

The text direction is provided on per line basis, on switching to right-to-left paragraph text becomes right aligned.
Propagating the line by pressing enter at the 'end' position produces new right-to-left line.
It is possible to switch the few selected line between right-to-left and left-to-right simultaneously.

Both 'insert' and 'overwrite' typing modes are supported.

Paragraph level right-to-left text direction is serialize-able since it is achieved by inserting RLE UCC mark at position 0 of corresponding document text line (which can't be removed by user's accidental keyboard operations) and adding 'direction' and text-align CSS styles in order to achive right-to-left text direction and right alignment (including wrapped lines).

Tested on FF, IE, Chrome/Opera

@nightwing
Copy link
Member

nightwing commented Sep 7, 2017

Tests are failing because some of the changed files are indented with tabs, please use 4 spaces to get tests pass.

I have several questions about this pr:

  • Are there any code editors implementing this algorithm?
  • where is the code handling CTRL+ALT(left) and CTRL+ALT(right)? how would it interact with multiselect
  • how can this be generalized for other modes?
  • are you mainly interested in using ace in plaintext mode?
  • it is possible to select ltr/rtl mark in the line and delete it, is it the desired behavior?

@tomerm
Copy link

tomerm commented Sep 7, 2017

@nightwing would greatly appreciate your review / comments.

Just a couple of words on the motivation for the need to change direction between LTR and RTL.
As we know Arabic / Hebrew languages (actually scripts) are called bidirectional for a good reason. Different pieces of text flow in different direction.

#3137 resolved problem of text flow (also called basic reordering) on the text segments level (continuous segment which is associated with the same language flows in the same direction. For example, "hello world" appears as flowing from left to right, while "HELLO WORLD" should appear as flowing from right to left: "DLROW OLLEH".

This PR resolves the problem on the paragraph (line) level. Base text direction mentioned by @ashensis above affects relative position of segments of text (also called directional runs. for example "hello world" is a directional run with left to right direction, while "DLROW OLLEH" is a directional run with right to left direction) in a larger context (i.e. paragraph). For example:

  • "my name is TOMER" should be displayed with left to right direction: "my name is REMOT",
  • while "MY NAME IS tomer" should appear with right to left direction as: "tomer SI EMAN YM".

Why paragraph / line level is important ? Mainly for following reasons:

  1. Sentences formulated in Arabic / Hebrew quite often include numbers and segments in English thus creating bidirectionality. It is critical to display such sentences with right to left direction to assure their readability. If Arabic / Hebrew sentence including English word is displayed with left to right direction it will be unreadable: "SI EMAN YM tomer"

  2. In many cases in the same document there coexist paragraphs in English and paragraphs in Arabic (or/and Hebrew). One most natural example: as you may know Arabic / Hebrew / English are state languages in Israel and every single document (including street signs) includes text in all 3 languages. This means that we need the ability to display paragraphs in Arabic / Hebrew with right to left direction, while paragraphs in English with left to right direction.

@nightwing
Copy link
Member

@tomerm thanks for the clarification, could you also explain why is this needed in a code editor, it seems like this is going to be much more complicated for modes with highlighting, where we'd probably want to preserve direction of some code spans based on their syntactic meaning.

@tomerm
Copy link

tomerm commented Sep 7, 2017

@nightwing relating to several general questions (leaving it to @ashensis to reply to code related question).

Are there any code editors implementing this algorithm?

This is part of standard UBA (Unicode Bidi Algorithm). It is implemented by all editors. The difference is only:

  • how this change of direction can be invoked (different editors provide different ways to the end user to achieve that) and
  • the level on which text direction can be changed (i.e. entire editing area, paragraph, line etc.)

Several examples for some of popular editors:

  • Notepad allows interactive change of text direction of the entire editing area via keyboard shortcuts (CTRL+ALT(left) and CTRL+ALT(right)).
  • CKeditor / MS Office (web and desktop) / Star Office / Google Docs / TextBox.IO / IBM Docs allow control over text direction on the paragraph level (usually either via keyboard shortcut or/and toolbar buttons).
  • Plain text fields in web browser (assuming you set dir=auto) define text direction of each paragraph automatically based on simplest heuristic (also part of UBA)
  • On mobile platforms plain text fields have out of the box direction = auto (they also call it first strong or natural etc.). In addition they also allow interactive change of text direction on the paragraph level using flowing menu.

how can this be generalized for other modes?

Text direction is very important for text formulated in natural language. Thus it is natural for plain text mode. Obviously it can be applicable for comments. In other words editing of Java (or similar) code may require enforcing text direction of English comment to LTR while Arabic / Hebrew comments to RTL.

are you mainly interested in using ace in plaintext mode?

Indeed. This is the main usage of editor at the moment. Addressing Bidi requirements for other modes (i.e. Java / C / C++ / Jason / markdown etc. ) will require additional efforts. We need to assure that syntax of specific programming language is not violated on display by presence of Bidi text in constants / comments (more rarely in variables / function names). This specific requirement is addressed on display only (similar to coloring scheme : comments appear in green, variable in blue etc.). Color information is not serialized with edited source code. Similarly enforcement of code syntax happens on display only without affecting storage in any way.

@tomerm
Copy link

tomerm commented Sep 7, 2017

@nightwing

why is this needed in a code editor, it seems like this is going to be much more complicated for modes with highlighting, where we'd probably want to preserve direction of some code spans based on their syntactic meaning.

Indeed. If we wish to relate to text direction in the code editor we need to take care:

  1. First and foremost preserve syntax of programming language on the display. UBA does not do a good job here. UBA is not meant to handle any syntax or structure inside text.

  2. Second we may wish to apply text direction to comments / constants.

  3. By all means as you point it out the handling in such cases will be more complex since we need to preserve both coloring scheme - highlights - and direction of text using HTML markup / CSS styles.

@nightwing
Copy link
Member

Interesting.
Could you please clarify two more points:

  • What is the advantage of using Ace in plain text mode instead of using some other rich text editor? My first impression was that it may not very useful, but maybe i am wrong and it's the next big market for Ace :)
  • What is the advantage of manual switching over the dir auto mode?

@tomerm
Copy link

tomerm commented Sep 8, 2017

@nightwing

What is the advantage of using Ace in plain text mode instead of using some other rich text editor? My first impression was that it may not very useful, but maybe i am wrong and it's the next big market for Ace :)

No advantage. However if you have in your product contexts in which you edit some type of source codes and also contexts in which you edit plain / rich text, sometimes it is easier to just leverage the same editor for all contexts instead of having embedded two different editors. It is really a choice of software product owners.

We are not planning to turn Ace into full blown rich text editor (i.e. like CKEditor). We are adding some very basic capabilities only. They are really essential for bidi users.

What is the advantage of manual switching over the dir auto mode?

Clear advantage is to have full control instead of letting somebody else (UBA heuristic or web browser etc.) to make a decision for you. While UBA heuristic behind dir=auto is not bad it is still a heuristic and it fails to address a general case. For example Arabic / Hebrew sentences starting from Latin letters (i.e. ace IS A GREAT EDITOR) it will never display with RTL direction.

@ashensis
Copy link
Contributor Author

ashensis commented Sep 8, 2017

The majority of questions already had been answered by tomerm.
I will address the remainder here.

'where is the code handling CTRL+ALT(left) and CTRL+ALT(right)? how would it interact with multiselect'
In textinput.js, 'onKeyDown' handler. To my humble understanding there shouldn't any clash with multiselect.

'it is possible to select ltr/rtl mark in the line and delete it, is it the desired behavior?'
I tried hard to prevent this to happen. First of all there is no ltr mark, only rtl one since left-to-right mode is provided by default. End user may remove rtl mark by pressing CTRL+ALT(left) and insert it by CTRL+ALT(right). Besides this rtl mark always goes accompanied by corresponding 'derection' and 'text-alignment' CSS
The only way to actually select and delete rtl mark is to start selection on the previous line and end it at current rtl line and press delete/backspace. But this is may be qualified as 'line merging' and all should work as designed.

@@ -1135,6 +1135,11 @@ EditSession.$uid = 0;
*
**/
this.insert = function(position, text) {
if (text == this.$bidiHandler.LRE || text == this.$bidiHandler.RLE || this.$bidiHandler.isRtlLine()) {
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

it is better to use eventListener, instead of modifying session insert and remove, either the change event, or add a new event to https://github.com/ajaxorg/ace/blob/master/lib/ace/editor.js#L966.

@@ -201,6 +201,15 @@ var TextInput = function(parentNode, host) {
if (afterContextMenu)
afterContextMenu = false;
};

var onKeyDown = function(e) {
if(!inComposition && e.shiftKey && e.ctrlKey && host.session.$bidiHandler.hasRtlSupport()
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Should this be done on keyup instead, to not interfere with ctrl-shift- shortcuts?
Also note that some laptops (e.g. macbook) do not have right ctrl key, so this needs to be customizable.

@@ -204,6 +204,8 @@ var Text = function(parentEl) {
);
lineElement.style.height = config.lineHeight * this.session.getRowLength(row) + "px";
lineElement.innerHTML = html.join("");
if (this.session.$bidiHandler.hasRtlSupport())
this.setRtlDir({childNodes: [lineElement], textContent: lineElement.textContent});
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

would be better to do this without accessing textContent, because getting the same information from line is faster

@nightwing
Copy link
Member

Thanks for detailed explanations.
This pr mostly looks good, but still needs some changes to be mergeable.

  • text selection/marker drawing on wrapped lines is broken
  • the ctrl-shift keybinding should not be hardcoded
  • the code needs to be refactored into an extensions, and should interact with the editor via events, "change", "afterRender" etc.

@ashensis
Copy link
Contributor Author

Thank you @nightwing for prompt review.
However in order to avoid misunderstanding, could I kindly ask you to eleborate (at least in short terms) these 3 last comments of yours"

text selection/marker drawing on wrapped lines is broken
the ctrl-shift keybinding should not be hardcoded
the code needs to be refactored into an extensions, and should interact with the editor via events, "change", "afterRender" etc.

@nightwing
Copy link
Member

  1. On https://rawgit.com/ashensis/ace/268da25ff3b601c2362163661f8b14fa6bc02ebe/kitchen-sink.html
    selection ranges are wrong, see screenshot
    image
  2. The ctrl-shift needs to be a command similar to other commands, so that users can rebind the key.
  3. currently code for this feature is in several files, it would be better to move it into a separate file similar to https://github.com/ajaxorg/ace/blob/master/lib/ace/ext/language_tools.js or https://github.com/ajaxorg/ace/blob/master/lib/ace/ext/searchbox.js, and use event listeners to update lines after render, and modify input.

@ashensis
Copy link
Contributor Author

ashensis commented Oct 22, 2017

@nightwing
Here is the list of changes addressing your comments:

  1. Fixed text selection on wrapped lines.
  2. Replaced ctrl-shift by respectively ctrl-alt-shift-l and ctrl-alt-shift-r hot keys in order
    to not interfer with existing hot keys. Featured them as reconfugurable commands in default_commands.js
  3. Moved all right-to-left pertinent code from textinput.js, editor.js, edit_session.js into appropriate
    event handlers in bidihandler.js. Made handlers to be switched on only for modes for which right-to-left is supported.
  4. Moved all "bidiHandler.markAsDirty" calls (intoduced in PR Support for Arabic/Hebrew/Persian languages. #3137 Support for Arabic/Hebrew/Persian) into
    appropriate handlers in bidihandler.js.
  5. Refactored right-to-left related code in text.js by making it smaller, Ceased to use DOM 'textContent' property.
  6. Finally, on merging with your last changes I dropped down your code related to 'beenBidi' from BidiHandler since it disrupts the whole functionality.
    Please note that your only update this member on input and return false from 'isBidiRow' when this member is false, but 'isBidiRow' gets executed also when cursor changes the row (due to mouse or keyboard initiated re-positionin eturning old not updated 'false' value compromises the whole functionality.

@nightwing
Copy link
Member

@ashensis does seenBidi break code already on master or only the new code when switching direction of a line?
I think to fix it in the new version, we only need to set seenBidi to true when toggleRtlDirection is called.

@ashensis
Copy link
Contributor Author

@nightwing
Your stuff related to 'seenBidi' breaks the old code introduced in PR #3137 Support for Arabic/Hebrew.....
This 'isBidiRow' is called in many cases, in particular when caret position is calculated, and consequentially it should rebuild bidi map every time when cursor is moved to different screen line (either by mouse or keyboard arrows movement/selection).
Thus 'seenBidi' introduced by you and being set on typing only, completely disrupts the functionality.

@nightwing
Copy link
Member

@ashensis if document doesn't contain any bidi characters computing bidi map is not necessary.
It becomes necessary only when such characters are inserted by typing at which point seenBidi will be set to true and isBidiRow will work as it did before.
Could you please describe a sequence of actions that user needs to perform to see the issue?

@ashensis
Copy link
Contributor Author

@nightwing
You set seenBidi to 'true' value only on "insert" action. But if document containing Bidi characters is loaded (no typing has been performed yet" then seenBidi remains "false" and Bidi functionality is never invoked (bidi map have no chance to be created in 'isBidiRow')

@talgalili
Copy link

Hi @nightwing and @ashensis
I wanted to know if there is a chance to keep this patch forward. This would be helpful to many people.

Thanks,
Tal

@nightwing
Copy link
Member

Hi, i have rebased and refactored code from this pull request in #3658. And also fixed the bug with seenBidi.
Could you please verify that all the features you have added work as expected on that branch. Thanks!

@nightwing
Copy link
Member

merged as #3658

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants