Topic: cursor word motion and word selection

jporkkahtc

cursor word motion and word selection
« on: May 24, 2017, 09:43:27 pm »
Given text like this

the quick brown fox jumped over
ab  cd    ef    gh  ij     kl


WRT Motion:
Moving to the end of a word is really painful.
Starting at the end of the 2nd line, there is no keyboard way to jump to the end of each word.
Ctrl+Left jumps to the beginning of the previous word, but there is no way to get to the end of the previous word.

WRT Selection:
Using the keyboard it is painful to select words on the 2nd line without getting a bunch of extra padding.
Placing the cursor at the 3rd word, Ctrl+Shift+Right selects the 2 letters plus 4 spaces:

"ef    "


OTOH, double-clicking on "ef" selects just those two characters -- exactly what I want.

The behavior that I think I'd like to see here is:
Place cursor at start of 2nd line.
  • Ctrl+Shift+Right -- Selects "ab"
  • Ctrl+Shift+Right -- Selects "ab  cd"
  • Ctrl+Shift+Right -- Selects "ab  cd    ef"
  • Ctrl+Shift+Right -- Selects "ab  cd    ef    gh"
  • Ctrl+Shift+Right -- Selects "ab  cd    ef    gh  ij"
  • Ctrl+Shift+Right -- Selects "ab  cd    ef    gh  ij     kl"
Is there a keyboard way to do this?

Perhaps a new command "copy-to-clipboard-trim" -- which would copy the selection minus any leading/trailing white space.
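
The requested motion and selection can be sketched outside SlickEdit. The following is a minimal illustration in Python, not Slick-C and not SlickEdit's implementation; `next_word_end`, `prev_word_end` and `trimmed` are hypothetical names, and a "word" is assumed to be any run of non-whitespace characters.

```python
LINE = "ab  cd    ef    gh  ij     kl"


def next_word_end(line: str, col: int) -> int:
    """Return the column just past the end of the next word at or after col."""
    n = len(line)
    # Skip whitespace between the cursor and the next word.
    while col < n and line[col].isspace():
        col += 1
    # Advance through the word itself.
    while col < n and not line[col].isspace():
        col += 1
    return col


def prev_word_end(line: str, col: int) -> int:
    """Return the column just past the end of the word before the cursor."""
    # Step left out of the word the cursor is in (or just after).
    while col > 0 and not line[col - 1].isspace():
        col -= 1
    # Step left over the padding to reach the end of the previous word.
    while col > 0 and line[col - 1].isspace():
        col -= 1
    return col


def trimmed(selection: str) -> str:
    """What a 'copy-to-clipboard-trim' style command might copy (assumption)."""
    return selection.strip()


if __name__ == "__main__":
    col = 0
    while col < len(LINE):
        col = next_word_end(LINE, col)
        print(repr(LINE[:col]))  # the selection grows one word at a time
```

Each call to `next_word_end` extends the selection to the end of the next word without the trailing padding, which matches the list of selections above. `trimmed` covers the fallback case where the padding was already selected.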



Tim Kemp

Re: cursor word motion and word selection
« Reply #1 on: May 25, 2017, 11:06:38 am »
One of the great things about SlickEdit is that it's so flexible. If you want the behavior you describe, write a macro and bind it to whatever key you want. You don't have to stick with the default behavior. Let your freak flag fly!

Lee

  • SlickEdit Team Member
Re: cursor word motion and word selection
« Reply #2 on: May 25, 2017, 01:56:51 pm »
It depends on the emulation and on the setting for Tools > Options > Editing > General > Next Word Style.  Is it set to Begin or End of word?  The Visual Studio/Visual C++ emulation has its own custom word style, and so does Brief.  You can disable the custom styles and just use the SlickEdit implementation: for Visual Studio/C++ emulation, set-var def_vcpp_word 0; for Brief, set-var def_brief_word 0.  You can also enable the Brief word style outside Brief emulation with set-var def_brief_word 1, making sure the Visual C++ style is turned off.

There are subtle differences between the next/prev word implementations, so you should try all three word styles to see if one suits you better.

jporkkahtc

Re: cursor word motion and word selection
« Reply #3 on: May 25, 2017, 07:05:38 pm »
Hey, much better.
I've tried that before, but I was missing the def_vcpp_word setting ... it was set to 1.
Setting it to 0 plus Next Word Style to End seems to be what I want.
I'll try it for a while and see what happens.

WRT customizing the behavior: Yeah, yeah ... but this is a particularly complicated bit of machinery with a lot of corner cases to get right.

I tried looking in macros\*.e to see what def_brief_word and def_vcpp_word do, and I found a bug, I think.
In stdcmds.e, static _str skip_word(_str direction_option)
    Line 7770: ch=get_text(-1);
    Line 7821: ch=get_text();
I think line 7821 should also have the "-1" parameter - just to be consistent with the rest of the code.

The get_text docs say that -1 means "Get current unicode character or DBCS character".
But what is the alternative?

I found that with some unicode characters, get_text(-1) sometimes returns a different result than get_text().
get_text() appears to return the actual character, while get_text(-1) returns something else - the wrong thing.

Example: Unicode character 5B9A is returned as 59E0 by get_text(-1), but get_text() returns 5B9A.
Why?




Clark

  • SlickEdit Team Member
Re: cursor word motion and word selection
« Reply #4 on: May 30, 2017, 04:39:31 pm »
Quote from: jporkkahtc on May 25, 2017, 07:05:38 pm
Example: Unicode character 5B9A is returned as 59E0 by get_text(-1), but get_text() returns 5B9A.
Why?

get_text(-1) returns 0x5B9A for me. get_text() returns the first byte (0xE5) of the Utf-8 sequence which is not a valid Utf-8 character.
Code:
   _str ch;
   edit('not-found.xml');
   _insert_text(_UTF8Chr(0x5B9A));
   left();
   ch=get_text(-1);
   say(dec2hex(_UTF8Asc(ch)));  // Displays 0x5B9A
   ch=get_text();
   say(dec2hex(_asc(ch)));  // Displays 0xE5
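
The byte-versus-character distinction in the snippet above can be reproduced outside SlickEdit. This is a small Python illustration (not Slick-C) of how U+5B9A is laid out in UTF-8; it only uses the Python standard library and makes no claim about how get_text() is implemented.

```python
ch = chr(0x5B9A)              # the character from the example
encoded = ch.encode("utf-8")  # its UTF-8 byte sequence

print(encoded.hex(" "))       # e5 ae 9a
print(hex(encoded[0]))        # 0xe5: the lead byte alone is not a character
print(hex(ord(encoded.decode("utf-8"))))  # 0x5b9a: decoding all three bytes
```

Reading a single byte yields 0xE5, while decoding the whole three-byte sequence yields 0x5B9A, which lines up with the two results reported above.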

jporkkahtc

Re: cursor word motion and word selection
« Reply #5 on: May 30, 2017, 06:03:23 pm »
OK, I was just printing ch with say, then copying and pasting the say window into a buffer.
So what use is get_text() without the -1 parameter then?

I guess I assumed that the characters in _str were whole characters, and not bytes.
   _str s=get_text(3);
   _str ch;
   int i;
   for(i = 0;i<3;++i) {
       ch=substr(s, i+1,1);
       say("ch:"ch", Hex:"dec2hex(_asc(ch)));
   }

prints the UTF-8 bytes.

So how do you step thru a string one character at a time?

Hm....
So this is why column-oriented things have some problems with multi-byte UTF-8 characters.
format_cols, for example, seems to align things based on byte counts, not characters.

Column selection isn't perfect, but it is better than format columns.

Clark

  • SlickEdit Team Member
Re: cursor word motion and word selection
« Reply #6 on: June 01, 2017, 08:51:10 pm »
There isn't a function for stepping through a Utf-8 string one Utf-8 sequence at a time (something like i=_Utf8Next(s,i)). My guess is it's never been needed, so we never added one. Right now, you'd have to insert the text into a buffer (maybe a temp buffer).

Usually the data comes from a buffer to start with. Use right()/left() to cursor through a line of text. These functions work whether the underlying data is SBCS/DBCS or Utf-8. I think right() and left() traverse composite characters which may be multiple Utf-8 sequences. Currently, composite characters are only supported on Windows.

SlickEdit's native data formats are SBCS/DBCS and Utf-8. Using Utf-8 was definitely the best way to go. By default, when data is fetched from an SBCS/DBCS file, the text is converted to Utf-8. If SlickEdit stored data as Utf-16, that would be awful. Very few files are stored on disk as Utf-16, and Utf-16 requires surrogate handling, which coders often forget to implement. Utf-32 would allow the easiest character traversal but takes way too much memory. Utf-32 doesn't help with composite characters, but I don't think anyone would care if SlickEdit didn't support composite characters.

Keep in mind that there is no such thing as a fixed font when dealing with Unicode text. SlickEdit has a number of functions for doing pixel-to-column and column-to-pixel calculations.
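
For reference, stepping through a byte string one Utf-8 sequence at a time only needs the lead byte of each sequence. The sketch below is Python, not Slick-C, and `utf8_next` and `utf8_sequences` are hypothetical names modeled on the i=_Utf8Next(s,i) idea; no such function exists in SlickEdit according to this thread. It assumes well-formed input and does not group composite characters.

```python
from typing import Iterator


def utf8_next(data: bytes, i: int) -> int:
    """Return the index of the sequence that follows the one starting at i."""
    lead = data[i]
    if lead < 0x80:               # 0xxxxxxx: single byte (ASCII)
        length = 1
    elif lead >> 5 == 0b110:      # 110xxxxx: two-byte sequence
        length = 2
    elif lead >> 4 == 0b1110:     # 1110xxxx: three-byte sequence
        length = 3
    elif lead >> 3 == 0b11110:    # 11110xxx: four-byte sequence
        length = 4
    else:
        # A continuation byte (10xxxxxx) or an invalid lead byte.
        raise ValueError(f"byte {lead:#x} at index {i} is not a lead byte")
    return i + length


def utf8_sequences(data: bytes) -> Iterator[bytes]:
    """Yield each complete UTF-8 sequence in data, in order."""
    i = 0
    while i < len(data):
        j = utf8_next(data, i)
        yield data[i:j]
        i = j


if __name__ == "__main__":
    sample = "a\u5b9a\u00e9".encode("utf-8")
    for seq in utf8_sequences(sample):
        print(seq.hex(" "), "->", repr(seq.decode("utf-8")))
```

A loop such as `for(i = 0;i<3;++i)` over bytes would visit five positions for this three-character sample, whereas `utf8_sequences` visits three.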

jporkkahtc

Re: cursor word motion and word selection
« Reply #7 on: June 01, 2017, 10:14:00 pm »
Yeah, I'd imagine if one were writing an editor starting today probably the only native text format would be utf-8. Everything else is converted to that in memory. (Well, maybe binary too, but having just a single in memory data format has got to simplify things).

format_columns needs _Utf8Next().
Well, possibly, it needs _strUtf8. It grabs text from the buffer and does a bunch of string operations - like pos(), to find where the columns are.

What else in Slick needs this?
Probably the beautify code.
I'm a little surprised that word-wrap while typing seems to work fine.


format_columns on a UTF-8 buffer will count each byte of each character as a column.
This isn't about fonts at all since format_columns doesn't care.
See the attached file for examples.

Column selection works pretty well, though I noted a specific bug in that file.
« Last Edit: June 01, 2017, 10:17:47 pm by jporkkahtc »

Clark

  • SlickEdit Team Member
Re: cursor word motion and word selection
« Reply #8 on: June 02, 2017, 08:41:02 pm »
Format columns is limited to simple scenarios with fixed fonts. It does no font calculations. It will never work with all Unicode characters, even with a fixed font. When lining up source code, it's better not to assume a particular font. Sadly, there are just no perfect solutions.

Word wrap does font calculations. Block selections do font calculations.
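
The gap between bytes, characters and displayed columns discussed in this thread can be shown with a short Python sketch (not Slick-C). `byte_count`, `char_count`, `cell_width` and `pad_to` are hypothetical helpers. The cell estimate assumes a terminal-style grid where wide East Asian characters take two cells and combining marks take none; it is only an approximation and does no font calculations.

```python
import unicodedata


def byte_count(text: str) -> int:
    """Columns as counted by a byte-oriented formatter."""
    return len(text.encode("utf-8"))


def char_count(text: str) -> int:
    """Columns as counted per code point."""
    return len(text)


def cell_width(text: str) -> int:
    """Approximate display cells on a fixed-width grid (assumption, see above)."""
    width = 0
    for c in text:
        if unicodedata.combining(c):
            continue                      # combining marks add no width
        if unicodedata.east_asian_width(c) in ("W", "F"):
            width += 2                    # wide and fullwidth characters
        else:
            width += 1
    return width


def pad_to(text: str, width: int) -> str:
    """Pad with spaces so the text occupies roughly `width` cells."""
    return text + " " * max(0, width - cell_width(text))


if __name__ == "__main__":
    for sample in ("ab=1", "\u5b9a=1", "e\u0301=1"):
        print(repr(sample), byte_count(sample), char_count(sample), cell_width(sample))
```

The three counts agree only for plain ASCII, which is consistent with the point that byte-based alignment breaks on multi-byte text and that even character-based alignment cannot be right for every Unicode character.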