Discussion:
Pegex Parser for Inline::C
Ingy dot Net
2012-07-01 00:19:43 UTC
Greetings,

I got the 10 year itch to hack on Inline today. Nick Patch and I wrote an
Inline::C parser in Pegex. I committed it to the 'pegex' branch here:
https://github.com/ingydotnet/inline-pm/tree/pegex

It passes an initial test: C/xt/pegex.t

ParsePegex should be nearly as fast as ParseRegExp and is more beautiful
and maintainable than ParseRecDescent.

See:
https://github.com/ingydotnet/inline-pm/blob/pegex/C/lib/Inline/C/ParsePegex.pm#L35

As you can see Pegex separates the Grammar from the AST building code, both
of which are fairly simple. Nick and I paired on this today and we found a
couple bugs in Pegex that keep the grammar from being even more readable.
I'll try to fix them soon.

Looking over the code base, it's obvious that Inline could be brought up to
date with Modern Perl™. I'm willing to put some effort into that. Rob, do
you have the cycles to manage a new release over the next couple of months? (I
would probably do all my refactorings on branches, and let you integrate
and manage it).

Anyone who is interested, I started #inline on irc.perl.org to discuss it.

Cheers, Ingy

NOTE: This is the first Inline code contribution by Ingy döt Net! :D
Sisyphus
2012-07-02 00:42:40 UTC
----- Original Message -----
From: "Ingy dot Net" <ingy-***@public.gmane.org>
To: "inline" <inline-***@public.gmane.org>; "Nick Patch" <patch-***@public.gmane.org>
Sent: Sunday, July 01, 2012 10:19 AM
Subject: Pegex Parser for Inline::C
Post by Ingy dot Net
Greetings,
I got the 10 year itch to hack on Inline today.
And about time, too ;-)
Post by Ingy dot Net
Nick Patch and I wrote an
Inline::C parser in Pegex. I committed it to the 'pegex' branch here:
https://github.com/ingydotnet/inline-pm/tree/pegex
Been away all weekend - I'll have a look at it tonight. (More later, once
I've had a chance to peruse it.)
Post by Ingy dot Net
It passes an initial test: C/xt/pegex.t
ParsePegex should be nearly as fast as ParseRegExp and is more beautiful
and maintainable than ParseRecDescent.
https://github.com/ingydotnet/inline-pm/blob/pegex/C/lib/Inline/C/ParsePegex.pm#L35
As you can see Pegex separates the Grammar from the AST building code, both
of which are fairly simple. Nick and I paired on this today and we found a
couple bugs in Pegex that keep the grammar from being even more readable.
I'll try to fix them soon.
Looking over the code base, it's obvious that Inline could be brought up to
date with Modern Perl™. I'm willing to put some effort into that. Rob, do
you have the cycles to manage a new release over the next couple of months? (I
would probably do all my refactorings on branches, and let you integrate
and manage it).
No problem there. (If I get a bit slow on the uptake, just let me know when
you think it's right to go and 'twill be done.)

Cheers,
Rob
Sisyphus
2012-07-02 11:31:55 UTC
----- Original Message -----
From: "Sisyphus"
Post by Sisyphus
Been away all weekend - I'll have a look at it tonight. (More later, once
I've had a chance to peruse it.)
First thing I noticed is that we now have additional dependencies of (the
non-core modules) IO::String, IO::All, YAML::XS, Pegex and parent.

No big deal, I guess, but I get repeated warnings for all of the IO::All
test scripts:

Useless use of \E at lib/IO/All.pm line 76.
Useless use of \E at lib/IO/All.pm line 84.
Useless use of \E at lib/IO/All.pm line 85.

When I get to running xt/pegex.t I get:

C:\sisyphusion\Inline-pegex\C>perl -Mblib xt/pegex.t
1..2
Useless use of \E at C:/MinGW/perl516/site/lib/IO/All.pm line 76.
Useless use of \E at C:/MinGW/perl516/site/lib/IO/All.pm line 84.
Useless use of \E at C:/MinGW/perl516/site/lib/IO/All.pm line 85.
ok 1
ok 2 - parse worked
rm: cannot remove `_Inline/lib/auto/pegex_t_c4f8/pegex_t_c4f8.dll': Permission denied
# Looks like your test exited with 256 just after 2.

Why can't it remove that file? (It can do it in the BEGIN block, but not
the END block.)

Recently (Inline-0.50_01) I hacked ParseRegExp to accommodate the passing of
'void' as a function's single argument - so we can now do:

SV * foo(void) {
    return newSViv(-42);
}

That works only with ParseRegExp, btw. It still fails with ParseRecDescent
and, AIUI, will continue to do so until someone writes the appropriate patch
for Parse::RecDescent.
There's a long-standing bug report for this at
https://rt.cpan.org/Public/Bug/Display.html?id=5465
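
For illustration only, the special case amounts to recognizing a parameter
list that holds nothing but the keyword 'void' and treating it as empty. The
sketch below is in C and is not the actual ParseRegExp change (which lives in
Perl); the helper name is_empty_param_list is hypothetical.

```c
#include <ctype.h>
#include <string.h>

/* Hypothetical helper, not part of Inline: returns 1 if the text between a
 * function's parentheses is an empty parameter list, i.e. either nothing at
 * all or the lone keyword "void" with optional surrounding whitespace.
 * Returns 0 for anything else. */
int is_empty_param_list(const char *params)
{
    /* Skip leading whitespace. */
    while (*params && isspace((unsigned char)*params))
        params++;

    /* "()" : nothing between the parentheses. */
    if (*params == '\0')
        return 1;

    /* Otherwise the list must start with the exact keyword "void". */
    if (strncmp(params, "void", 4) != 0)
        return 0;
    params += 4;

    /* Only whitespace may follow: "void *p" and "voidx" are real arguments. */
    while (*params && isspace((unsigned char)*params))
        params++;
    return *params == '\0';
}
```

A parser that runs a check like this before splitting each argument into a
type and a name never hands a bare 'void' to the typemap lookup.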

I was curious to see how Pegex would handle that, so I tried this script
(xt/foo.t):

#############################
use strict;
use Test::More tests => 1;

BEGIN { system "rm _Inline* -fr" }
END { system "rm _Inline* -fr" }

use Inline C => <<'END', USING => 'ParsePegex';
SV* foo(void) {
    return newSViv(-42);
}
END

is foo(), -42;
#############################

I got:

#############################
C:\sisyphusion\Inline-pegex\C>perl -Mblib xt/foo.t
1..1
C:\MinGW\perl516\bin\perl.exe C:\MinGW\perl516\lib\ExtUtils\xsubpp -typemap "C:\MinGW\perl516\lib\ExtUtils\typemap" foo_t_0c81.xs > foo_t_0c81.xsc && C:\MinGW\perl516\bin\perl.exe -MExtUtils::Command -e mv -- foo_t_0c81.xsc foo_t_0c81.c
Could not find a typemap for C type 'voi'.
The following C types are mapped by the current typemap:
'AV *', 'Boolean', 'CV *', 'FILE *', 'FileHandle', 'HV *', 'I16', 'I32',
'I8', 'IV', 'InOutStream', 'InputStream', 'NV', 'OutputStream', 'PerlIO *',
'Result', 'STRLEN', 'SV *', 'SVREF', 'SysRet', 'SysRetLong', 'Time_t *',
'U16', 'U32', 'U8', 'UV', 'bool', 'bool_t', 'caddr_t', 'char', 'char *',
'char **', 'const char *', 'double', 'float', 'int', 'long', 'short',
'size_t', 'ssize_t', 'time_t', 'unsigned', 'unsigned char',
'unsigned char *', 'unsigned int', 'unsigned long', 'unsigned long *', 'unsigned short',
'void *', 'wchar_t', 'wchar_t *'
in foo_t_0c81.xs, line 16
dmake: Error code 129, while making 'foo_t_0c81.c'

A problem was encountered while attempting to compile and install your Inline
C code. The command that failed was:
dmake > out.make 2>&1

The build directory was:
C:\sisyphusion\Inline-pegex\C\_Inline\build\foo_t_0c81

To debug the problem, cd to the build directory, and inspect the output files.

at xt/foo.t line 7.
BEGIN failed--compilation aborted at xt/foo.t line 11.
#############################

It'd be nice if that could be fixed before we go to production - that would
leave ParseRecDescent as the odd man out.
(I'm sure there are other things to be fixed too ... I'm well aware that
this is still early days for 'ParsePegex'.)

Cheers,
Rob
David Oswald
2012-07-02 16:17:09 UTC
----- Original Message ----- From: "Sisyphus"
Post by Sisyphus
Been away all weekend - I'll have a look at it tonight. (More later, once
I've had a chance to peruse it.)
First thing I noticed is that we now have additional dependencies of (the
non-core modules) IO::String, IO::All, YAML::XS, Pegex and parent.
My .02: I really like the cleaner grammar of Pegex. The "FAIL" rates
of Pegex and YAML::XS are higher than the FAIL rate of any of
Inline::C's dependencies, including Inline::C itself. It's too bad
that CPAN install tools don't yet support enough of the
CPAN::Meta::Spec to allow (easily) for alternate dependency chains (at
least as far as I'm aware). If they did, one could specify to try the
Pegex chain first, and then the ParseRegExp chain, and then the
Parse::RecDescent chain (as a last resort).

Since that's still not easy to do in any reliable fashion I just
wonder if we could come together to work on the FAILs for those
dependency modules (and on up their chains), sort of as we did with
Inline::CPP.

I'm interested in this too, because it seems like it could also be
applicable to Inline::CPP, though I'm not sure that I have the time to
learn and write the Pegex grammar for Inline::CPP at the moment. But
still, it would be nice to have a more maintainable grammar than what
we've got for Inline::CPP currently. I'd love to see the future bring
a larger subset of C++ to Inline::CPP's understanding.

Dave
--
David Oswald
daoswald-***@public.gmane.org
Sisyphus
2012-07-03 09:38:07 UTC
----- Original Message -----
From: "Sisyphus"
Post by Sisyphus
C:\sisyphusion\Inline-pegex\C>perl -Mblib xt/pegex.t
1..2
Useless use of \E at C:/MinGW/perl516/site/lib/IO/All.pm line 76.
Useless use of \E at C:/MinGW/perl516/site/lib/IO/All.pm line 84.
Useless use of \E at C:/MinGW/perl516/site/lib/IO/All.pm line 85.
ok 1
ok 2 - parse worked
rm: cannot remove `_Inline/lib/auto/pegex_t_c4f8/pegex_t_c4f8.dll': Permission denied
# Looks like your test exited with 256 just after 2.
Why can't it remove that file? (It can do it in the BEGIN block, but not
the END block.)
No need to worry - that's just the way MS Windows works.
Having loaded the dll, it then can't be removed until *after* the script
(process) has exited.

Well .... if we really do need to clean everything up in the END block it
would first have to unload the dll. This could be done using a
Win32::FreeLibrary(HANDLE) call.
Assuming we can get a hold of the HANDLE, that would work.
Hopefully, in the finished product, there'll be no need to perform this
clean up.

Cheers,
Rob
