package Parse::YYLex;

=head1 NAME

Parse::YYLex - Version of Parse::Lex to be used by a byacc parser.

=head1 SYNOPSIS

I<Parse::Lex> requires this perl version:
  require 5.004;
  use Parse::YYLex;

If using a procedural parser:
  Parse::YYLex->create ...; # exports &yylex and $yylval
    # see Parse::Lex for the token table args <...>
  Parse::YYLex::lex->from(\*FH);
  require 'MyParser.pl';  # generated by byacc 
  yyparse();

If using an object-oriented parser:
  $lexer = new Parse::YYLex ...; 
    # see Parse::Lex for the token table args <...>
  use MyParser;  # generated by byacc5
  $parser = new MyParser($lexer->getyylex, \&yyerror, $debug);
                                  # you must write &yyerror 
  $lexer->from(\*STREAM);
  $parser->yyparse(*STREAM);

To get the token definitions from F<MyParser.ph> instead of 
F<y.tab.ph> or to change the skip regexp (default 
whitespace), do this before calling C<new> or C<create>:
  Parse::YYLex->ytab('MyParser.ph');
  Parse::YYLex->skip('');

=head1 DESCRIPTION

Often times you'd use a lexer in conjunction with a parser. 
And most of the time you'd want to generate that parser with 
a yacc parser generator. B<Parse::YYLex> is a derivative of 
I<Parse::Lex> compatible with yacc parsers, by adapting it 
to the byacc calling conventions:

=over 4

=item *

The parser wants to receive integer token types as defined 
in F<y.tab.ph> instead of the symbolic types that 
I<Parse::Lex> returns.

=item *

The parser wants its tokens as two components (type and 
value), whereas I<Parse::Lex> returns one object with these 
two components. Furthermore, a procedural parser wants the 
value stored in a variable C<$yylval>.

=item *

The parser wants to receive the tokens by calling a yylex 
B<function>, not an object B<method>. Thus we have to give 
the parser a curried form of the lexer function, where the 
self argument is fixed.

=back

=head2 Procedural Parsers

Yacc (and Bison) traditionally generate C or C++ parsers. 
Fortunately, Berkeley yacc has been modified to generate 
Perl, see 
  ftp://ftp.sterling.com/local/perl-byacc.tar.Z 

Byacc with the I<-P> option generates procedural perl code 
that is compatible with both perl4 and perl5. (However you 
cannot use B<Parse::YYLex> with perl4.) Use this variant for 
quick hacks, as it is more convenient than the one below. In 
this case C<S<Parse::YYLex->create>> instantiates a lexer 
and exports a C<&yylex> function (the lexer) and a 
C<$yylval> variable (the token value) to its caller's 
namespace (which should be the namespace of the parser). 

If you need to call any object methods of the created lexer 
(see I<Parse::Lex> for documentation), use the 
C<$Parse::YYLex::lex> variable.

=head2 Object-Oriented Parsers

Another byacc modification (I call it byacc5) generates 
object-oriented Perl5 code: 
  CPAN/authors/id/JAKE/perl5-byacc-patches-0.5.tar.gz

Use this variant if you need more than one parser, you need 
flexibility, or you simply like OO. In this case you need to 
use I<new>, and pass the return value of I<getyylex> (a 
reference to the curried lexing function) to the parser.
The lexing function returns a two-element array, the token 
type and value.

=head2 Numeric Token Table

Yacc parsers insist on using numeric token types, and define 
these in a file customarily named F<y.tab.ph>. That is where 
B<Parse::YYLex> will look by default, and the file has to be 
in the @INC path (which includes the current directory).

You can specify a different token table before calling 
C<new> or C<create>:
  Parse::YYLex->ytab('MyParser.ph');

=head1 LIMITATIONS

C<Parse::YYLex> is based on I<Parse::Lex> which requires 
perl 5.004 and will not work with earlier versions. A 
slightly different version, I<Parse::CLex>, works with 
earlier perl versions. It would be easy to allow a choice 
between I<Parse::Lex> and I<Parse::CLex>, but the latter has 
some limitations, and presently seems to have
some bugs.

=head1 AUTHOR

Vladimir Alexiev <vladimir@cs.ualberta.ca>

=head1 SEE ALSO

byacc(1), L<Parse::Lex>.

=cut

use strict;
use Parse::Lex;
use vars qw(@ISA $VERSION);

@ISA = qw(Parse::Lex);
$Parse::YYLex::VERSION = '2.19';

my ($TABLE,$yylval);
my $ytab = 'y.tab.ph';
use vars '$lex';                # global lexer

sub ytab {                      # class method
  my ($self,$tab) = @_;
  $ytab = $tab;
}

sub new {
  my $class = ref $_[0] || $_[0]; shift; 
  my ($path) = grep -f "$_/$ytab", @INC;
  $path or die "can't locate the file $ytab in \@INC\n"; 
  my %table;
  open(FILE,"$path/$ytab") or die "can't open $ytab: !$\n"; 
  for (<FILE>) {$table{$1}=$2 if /^\$(\w+)=(\d+);$/} 
  close(FILE);
  # Normally we would create the object, but Parse::Lex uses weird array-based 
  # attributes instead of the conventional hash-based attributes.
  my $self = bless new Parse::Lex(@_), $class; 
  # index of my slot in the Lex attr array
  if (!defined $TABLE) {$TABLE = scalar @$self;} 
  elsif ($TABLE != scalar @$self) 
    {die "Can't get a consistent slot for my TABLE attribute";}
  push @$self, \%table; 
  $self;
}

sub create {                    # class method
  $lex = shift->new(@_);        # get an instance 
  my $caller = caller;
  no strict 'refs';
  *{"${caller}::yylex"} = $lex->getyylex; 
  $yylval = \${"${caller}::yylval"};
}

sub yylex {                     # convert &next to yyparse calling conv
  my $self = shift;
  # my $stream = shift; $self->from($stream) if FIXME 
  # it's different from last time... 
  my $token = $self->next;
  my $text = $token->text;
  my $symbolic = $token->type;  # symbolic token type 
  my $numeric =                 # numeric token code
    $symbolic eq 'EOI' ? 0 :    # end of input
      $self->[$TABLE]{$symbolic} || # token defined in y.tab.ph
        (length $text==1 ? ord $text : # single-character token, eg '+'
         die "Please declare %token $symbolic (`$text') in the parser.\n");
  if ($yylval) {                # global lexer
    $$yylval = $text;
    return $numeric;
  } else {return ($numeric,$text);}
  # it doesn't hurt to return both in both cases, but let's play safe
}

sub getyylex {
  my $self = shift;
  sub{$self->yylex(@_)}
}

1;

