Reini Urban
2014-10-15 13:58:53 UTC
Parrot wants to change how it handles illegal escape sequences in 6.9.0
See https://github.com/parrot/parrot/issues/1103
Parrot and Rakudo smoked fine with this branch,
and it helps to find imcc parser bugs, esp. .lex quoting issues,
tracked in GH #1095, which was found via this perl6 ticket:
https://rt.perl.org/Public/Bug/Display.html?id=116643
Previously:
Silently ignore illegal escapes for a-zA-Z
and change the string from "foo\o" to "fooo"
Now:
Throw "Illegal escape sequence \o in foo\o"
The C standard requires such "invalid" escape sequences to be diagnosed
(i.e., the compiler must print an error message). Parrot behaved strangely
here, and hence imcc parser quoting bugs were never fixed.
Parrot_str_unescape()
Unescapes the specified C string. These sequences are covered:
  \xhh        1..2 hex digits
  \ooo        1..3 oct digits
  \cX         control char X
  \x{h..h}    1..8 hex digits
  \uhhhh      4 hex digits
  \Uhhhhhhhh  8 hex digits
  \a, \b, \t, \n, \v, \f, \r, \e
These sequences are not escaped: C<\\ \" \' \?>
All other escape sequences within C<[a-zA-Z]> are illegal.
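
To make the difference between the old and new behavior concrete, here is a minimal Python sketch of the rule above. It is not Parrot's C implementation: the names LEGAL_LETTERS, old_behavior and new_behavior are made up for illustration, and the sketch only models the legal/illegal decision for a backslash followed by a letter. It does not decode the legal escapes.

```python
import re

# Letters that may legally follow a backslash, per the list above:
# \x, \c, \u, \U and the single-character escapes \a \b \t \n \v \f \r \e.
# (Hypothetical helper names; this is an illustration, not Parrot code.)
LEGAL_LETTERS = set("xcuUabtnvfre")

# A backslash followed by any single character. Consuming the pair means an
# escaped backslash ("\\") never lets its second half start a new escape.
ESCAPE = re.compile(r"\\(.)", re.DOTALL)


def is_illegal(ch):
    """True if backslash + ch is illegal under the [a-zA-Z] rule."""
    return ch.isascii() and ch.isalpha() and ch not in LEGAL_LETTERS


def old_behavior(s):
    """Silently drop the backslash of an illegal escape; keep the rest as-is."""
    return ESCAPE.sub(
        lambda m: m.group(1) if is_illegal(m.group(1)) else m.group(0), s
    )


def new_behavior(s):
    """Raise on the first illegal escape; otherwise return s unchanged."""
    for m in ESCAPE.finditer(s):
        if is_illegal(m.group(1)):
            raise ValueError(
                "Illegal escape sequence \\%s in %s" % (m.group(1), s)
            )
    return s
```

With this sketch, old_behavior applied to foo\o returns fooo, while new_behavior raises with the message quoted under "Now:". Both leave foo\\o alone, because the escaped backslash is consumed as a pair.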
This printed ok 4 instead of ok 3:
.sub 'main' :main
    $S0 = 'bar\o'
    $P1 = box 'ok 1'
    set_global $S0, $P1
    $P2 = get_global 'bar\o'
    say $P2

    $S1 = "foo\\o"
    $P1 = box 'ok 2'
    set_global "foo\\o", $P1 # ok, parsed as "foo\\o"
    $P2 = get_global "foo\\o"
    say $P2

    $S2 = "foo\o"
    $P1 = box 'ok 3'
    $S3 = "fooo"
    $P2 = box 'ok 4'
    set_global "foo\o", $P1 # wrong, parsed as "fooo"
    set_global "fooo", $P2
    $P3 = get_global "foo\o"
    say $P3
    $P3 = get_global "fooo"
    say $P3
.end
But the real problem is with double-quoted .lex names.
What do you think? Should I allow illegal escape chars for one
deprecation cycle and just warn once?
Or change it right away? I'd go for right away.
--
Reini Urban
http://cpanel.net/ http://www.perl-compiler.org/