Hi Folks,
While trying to search for an ABAP pattern across the whole ABAP repository with the interesting "ABAP_SOURCE_CODE_SCAN" report, I realized that SAP has built regexp pattern matching features into ABAP.
I had been waiting a long time for such a feature.
An overview of the underlying ABAP syntax can be found by looking at the demo source code, and it is quite simple:
IF nocase = 'X'.
  FIND REGEX regex IN TABLE result_it IGNORING CASE
       SUBMATCHES sub1 sub2 sub3 sub4 sub5 sub6.
  IF first = 'X'.
    REPLACE REGEX regex IN TABLE result_it
            WITH new_marked IGNORING CASE.
  ELSE. " all = 'X'
    REPLACE ALL OCCURRENCES OF REGEX regex IN TABLE result_it
            WITH new_marked IGNORING CASE.
  ENDIF.
ELSE. " case = 'X'
  FIND REGEX regex IN TABLE result_it
       SUBMATCHES sub1 sub2 sub3 sub4 sub5 sub6.
  IF first = 'X'.
    REPLACE REGEX regex IN TABLE result_it
            WITH new_marked.
  ELSE. " all = 'X'
    REPLACE ALL OCCURRENCES OF REGEX regex IN TABLE result_it
            WITH new_marked.
  ENDIF.
ENDIF.
Here, text lines in ABAP are represented as a "table of string data elements", which is quite common.
If you are interested in regexps, there is an interesting tutorial here:
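To make the "table of strings" idea concrete, here is a minimal sketch of searching such a table with a regexp. The report name, the variable names and the sample lines are made up for illustration; only the FIND ... IN TABLE statement with its MATCH additions is standard ABAP syntax.

```abap
REPORT z_regex_table_demo. " hypothetical report name

DATA: lt_lines TYPE TABLE OF string, " text lines, one string per row
      lv_line  TYPE i,               " row index of the match
      lv_off   TYPE i,               " offset of the match within that row
      lv_len   TYPE i.               " length of the matched text

" Sample content, invented for this sketch
APPEND 'DATA lv_count TYPE i.' TO lt_lines.
APPEND 'lv_count = lv_count + 1.' TO lt_lines.

" Look for something that resembles an assignment to an lv_ variable
FIND FIRST OCCURRENCE OF REGEX 'lv_\w+\s*=' IN TABLE lt_lines
     MATCH LINE   lv_line
     MATCH OFFSET lv_off
     MATCH LENGTH lv_len.

IF sy-subrc = 0.
  WRITE: / 'Match in line', lv_line, 'offset', lv_off, 'length', lv_len.
ELSE.
  WRITE: / 'No match found.'.
ENDIF.
```

The statement sets sy-subrc to 0 when a match is found, so checking it before reading the MATCH fields is the usual pattern.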
regexp tutorial
The syntax should be POSIX compliant. I tried some simple stuff and it works just fine.
SAP has released a nice toy test program, actually called
DEMO_REGEX_TOY
Here is a screenshot:
So what does \b(\w+)\s+\1\b mean?
\b is a word boundary, here the beginning of the word
\w+ means 1 or more word chars (alphanumerics or underscore)
\s+ means 1 or more whitespace chars
\1 is a backreference to whatever the first subpattern matched
\b is again a word boundary, here the end of the word in that position.
So basically you ask the regexp engine to match
the beginning of a word made of 1 or more word chars, followed by one or more whitespace chars, followed by the text matched by the first subpattern (the (\w+) ), followed by the end of a word. In other words, it finds a word that is immediately repeated, like "the the".
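If you want to try the pattern outside of the toy program, a small report along these lines should do. This is only a sketch: the report name, the variables and the sample sentence are my own assumptions, not taken from the SAP demo.

```abap
REPORT z_regex_doubled_word. " hypothetical report name

DATA: lv_text TYPE string,
      lv_word TYPE string, " receives the first subpattern, i.e. (\w+)
      lv_off  TYPE i,
      lv_len  TYPE i.

" Sample sentence with a repeated word, invented for this sketch
lv_text = 'This is is a test'.

FIND REGEX '\b(\w+)\s+\1\b' IN lv_text
     MATCH OFFSET lv_off
     MATCH LENGTH lv_len
     SUBMATCHES lv_word.

IF sy-subrc = 0.
  WRITE: / 'Repeated word:', lv_word, 'offset', lv_off, 'length', lv_len.
ELSE.
  WRITE: / 'No repeated word found.'.
ENDIF.
```

The SUBMATCHES addition is what hands back the content of the parenthesized group, the same mechanism the demo code uses with sub1 to sub6.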
Funny, isn't it?
But quite useful and powerful!