Topic: Incorrect replace using Perl regex

CyberZombie

  • Community Member
Incorrect replace using Perl regex
« on: September 14, 2016, 08:24:33 pm »
When I try to clear trailing whitespace, I do a global replace with the expression "\s+$". In B2 and B3 it goes one better: it also wipes out empty lines.

Also, I expect to need only "\n" to signify end-of-line. But for Windows 0x0D 0x0A terminators, I have to use "\n\r" - new to me from V15.

Clark

  • SlickEdit Team Member
  • Senior Community Member
« Reply #1 on: September 14, 2016, 08:39:14 pm »
v15 and v16 are the same with respect to these subtle differences. Both have the new, much more accurate Perl syntax support. There are a number of Perl regex things I'm not a fan of, but SlickEdit's Perl regex support needs to be very accurate.

SlickEdit regex doesn't have these issues, but it is very different from Perl syntax.

jporkkahtc

  • Senior Community Member
« Reply #2 on: September 14, 2016, 08:52:21 pm »
WRT line endings, use \R - it matches any line ending.
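
For anyone testing this outside the editor, here is a rough sketch of what \R buys you, written in Python. As far as I know Python's standard re module has no \R escape, so the alternation below is a hand-written stand-in that covers only the three common terminators (Perl's \R also matches a few rarer ones such as form feed and the Unicode line separators). The names ANY_NEWLINE and normalize_newlines are made up for this example.

```python
import re

# Hand-written stand-in for \R (assumption: only CRLF, CR and LF matter here).
# CRLF must come first so that "\r\n" is consumed as one terminator, not two.
ANY_NEWLINE = re.compile(r"\r\n|\r|\n")


def normalize_newlines(text: str) -> str:
    """Rewrite every line terminator as a single LF."""
    return ANY_NEWLINE.sub("\n", text)


if __name__ == "__main__":
    mixed = "a\r\nb\rc\nd"
    print(repr(normalize_newlines(mixed)))
```

The point is the same as with \R: one pattern handles Windows, old Mac, and Unix files, so the search expression does not have to change with the file's line endings.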

The Perl RE spec has some odd things in it. The one that bugs me the most:
    [^abc]
This matches any character other than a, b, or c, and that includes newlines.
So generally when using this you want to explicitly exclude newlines as well:
    [^abc\r\n]
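
A quick way to see the difference, sketched in Python (whose negated character classes behave the same way on this point). The helper first_run and the sample string are invented for the illustration.

```python
import re


def first_run(pattern: str, text: str) -> str:
    """Return the first match of pattern in text, or '' if there is none."""
    match = re.search(pattern, text)
    return match.group(0) if match else ""


sample = "xyz\n123"

# The negated class happily runs across the line break.
across_lines = first_run(r"[^abc]+", sample)

# Excluding \r and \n keeps the match on the first line.
one_line = first_run(r"[^abc\r\n]+", sample)

if __name__ == "__main__":
    print(repr(across_lines))
    print(repr(one_line))
```

With the newline characters added to the class, the match stops at the end of the first line instead of swallowing the next one.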


This is a useful RE resource: http://www.regular-expressions.info/refcharacters.html
It would be nice if they included Slick RE information too  :)

CyberZombie

  • Community Member
« Reply #3 on: September 14, 2016, 09:13:38 pm »
Yes - wrt \n\r I was relying on old incorrect behavior :) Thanks jporkkahtc for the \R pointer.

It still doesn't explain \s+$ removing empty lines as \n and \r aren't whitespace characters...

CyberZombie

  • Community Member
« Reply #4 on: September 14, 2016, 09:22:40 pm »
Follow-up - from the perldoc, there is an exception (http://perldoc.perl.org/perlrecharclass.html):
Quote
If the /a modifier is in effect ...

In all Perl versions, \s matches the 5 characters [\t\n\f\r ]; that is, the horizontal tab, the newline, the form feed, the carriage return, and the space. Starting in Perl v5.18, it also matches the vertical tab, \cK. See note [1] below for a discussion of this.

And from the regex modifiers (http://perldoc.perl.org/perlre.html#Character-set-modifiers):
Quote
The /a modifier, on the other hand, may be useful. Its purpose is to allow code that is to work mostly on ASCII data to not have to concern itself with Unicode.

Does this mean that I should be doing all my text editing using Unicode to get \s+ to not match newline?
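
The perldoc text quoted above already accounts for the vanishing empty lines: \n and \r are in \s with or without /a, so switching to Unicode would not change anything. "\s+$" can start on the trailing blanks, run over the line terminator, and still find an end-of-line position in front of the next terminator, which deletes the empty line. Restricting the class to horizontal whitespace avoids that. Below is a minimal sketch in Python rather than SlickEdit; the function names are made up, and the (?=\r?$) lookahead is my own addition so that CRLF files work in Python, where $ only matches before \n. In Perl itself \h is the horizontal-whitespace class; whether SlickEdit's Perl mode accepts \h is something I have not checked, but [ \t] should be safe anywhere.

```python
import re


def strip_trailing_greedy(text: str) -> str:
    """Reproduces the problem: \\s also matches \\r and \\n."""
    return re.sub(r"\s+$", "", text, flags=re.MULTILINE)


def strip_trailing_horizontal(text: str) -> str:
    """Removes only spaces and tabs that sit directly before a line end."""
    return re.sub(r"[ \t]+(?=\r?$)", "", text, flags=re.MULTILINE)


if __name__ == "__main__":
    sample = "foo  \n\nbar\n"
    print(repr(strip_trailing_greedy(sample)))      # empty line is lost
    print(repr(strip_trailing_horizontal(sample)))  # empty line is kept
```

In the first function the blank line between foo and bar is consumed along with the trailing spaces; in the second only the two spaces after foo are removed.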