Sunday, March 05, 2006

Awk/Sed to Perl

When one of my pals asked whether there is a way to port an awk script to Perl, I remembered a tool I had never tried that converts awk scripts to Perl scripts. It is in the OpenBSD ports as well, so I quickly installed it and gave it a try.

A2P - Awk to Perl converter

I wrote a simple awk filter, awk-test, to try it out.

shell>cat awk-test

# Simple awk filter: print the first field of every line that does not start with a digit

!/^[0-9]/{ print $1 }

This is the file I want to filter, datafile:

shell>cat datafile
Bon Jovi 190
Lee 20
Ven 2000
Jack 100222
890 Lee

This is the result when I run the simple awk script against datafile.

shell>nawk -f awk-test datafile
Bon
Lee
Ven
Jack

Since it works, I then use a2p to convert it to Perl code.

shell>a2p awk-test
#!/usr/bin/perl
eval 'exec /usr/bin/perl -S $0 ${1+"$@"}'
if $running_under_some_shell;
# this emulates #! processing on NIH machines.
# (remove #! line above if indigestible)

eval '$'.$1.'$2;' while $ARGV[0] =~ /^([A-Za-z_0-9]+=)(.*)/ && shift;
# process any FOO=bar switches

$, = ' '; # set output field separator
$\ = "\n"; # set output record separator

while (<>) {
($Fld1) = split(' ', $_, 9999);
if (!/^[0-9]/) {
print $Fld1;
}
}

Once I have the output, I redirect it to aperl-test.

shell>a2p awk-test > aperl-test

shell>chmod +x aperl-test

Now I run the Perl script that was converted from awk, and it works :)

shell>./aperl-test datafile
Bon
Lee
Ven
Jack

Later I found out there is a sed to Perl converter too, and it felt like fun to check it out.

s2p - Sed to Perl Converter

shell>cat sed-test

# Simple Sed filter to search for : and replace with null globally
s/://g

I created a file called datafile2 to try out the filter.

shell>cat datafile2
Lee: 123
Tia: 456
Test: 789
god: 345
ghost: 098

I convert the sed script directly to a Perl script.

shell>s2p -f sed-test
#!/usr/bin/perl -w
eval 'exec /usr/bin/perl -S $0 ${1+"$@"}'
if 0;
$0 =~ s/^.*?(\w+)[\.\w+]*$/$1/;

use strict;
use Symbol;
use vars qw{ $isEOF $Hold %wFiles @Q $CondReg
$doAutoPrint $doOpenWrite $doPrint };
$doAutoPrint = 1;
$doOpenWrite = 1;
# prototypes
sub openARGV();
sub getsARGV(;\$);
sub eofARGV();
sub printQ();

# Run: the sed loop reading input and applying the script
#
sub Run(){
my( $h, $icnt, $s, $n );
# hack (not unbreakable :-/) to avoid // matching an empty string
my $z = "\000"; $z =~ /$z/;
# Initialize.
openARGV();
$Hold = '';
$CondReg = 0;
$doPrint = $doAutoPrint;
CYCLE:
while( getsARGV() ){
chomp();
$CondReg = 0; # cleared on t
BOS:;
# s/://g
{ $s = s /://sg;
$CondReg ||= $s;
}
EOS: if( $doPrint ){
print $_, "\n";
} else {
$doPrint = $doAutoPrint;
}
printQ() if @Q;
}

exit( 0 );
}
Run();

# openARGV: open 1st input file
#
sub openARGV(){
unshift( @ARGV, '-' ) unless @ARGV;
my $file = shift( @ARGV );
open( ARG, "<$file" )
|| die( "$0: can't open $file for reading ($!)\n" );
$isEOF = 0;
}

# getsARGV: Read another input line into argument (default: $_).
# Move on to next input file, and reset EOF flag $isEOF.
sub getsARGV(;\$){
my $argref = @_ ? shift() : \$_;
while( $isEOF || ! defined( $$argref = <ARG> ) ){
close( ARG );
return 0 unless @ARGV;
my $file = shift( @ARGV );
open( ARG, "<$file" )
|| die( "$0: can't open $file for reading ($!)\n" );
$isEOF = 0;
}
1;
}

# eofARGV: end-of-file test
#
sub eofARGV(){
return @ARGV == 0 && ( $isEOF = eof( ARG ) );
}

# makeHandle: Generates another file handle for some file (given by its path)
# to be written due to a w command or an s command's w flag.
sub makeHandle($){
my( $path ) = @_;
my $handle;
if( ! exists( $wFiles{$path} ) || $wFiles{$path} eq '' ){
$handle = $wFiles{$path} = gensym();
if( $doOpenWrite ){
if( ! open( $handle, ">$path" ) ){
die( "$0: can't open $path for writing: ($!)\n" );
}
}
} else {
$handle = $wFiles{$path};
}
return $handle;
}

# printQ: Print queued output which is either a string or a reference
# to a pathname.
sub printQ(){
for my $q ( @Q ){
if( ref( $q ) ){
# flush open w files so that reading this file gets it all
if( exists( $wFiles{$$q} ) && $wFiles{$$q} ne '' ){
open( $wFiles{$$q}, ">>$$q" );
}
# copy file to stdout: slow, but safe
if( open( RF, "<$$q" ) ){
while( defined( my $line = <RF> ) ){
print $line;
}
close( RF );
}
} else {
print $q;
}
}
undef( @Q );
}

I convert it again, this time writing the Perl code to sperl-test.

shell>s2p -f sed-test > sperl-test

shell>chmod +x sperl-test

I run the converted Perl script against datafile2, and it works: all the : characters are gone :)

shell>./sperl-test datafile2
Lee 123
Tia 456
Test 789
god 345
ghost 098

Most code monkeys will write their own code, but tools like these are all about easing code porting, especially for simple scripts where rewriting by hand would be overkill.

Cheers :]

2 comments:

Randal L. Schwartz said...

a2p and s2p have been part of core Perl since Perl version 1. No need to install a "port".

C.S.Lee said...

Randal,

That's my mistake :), thanks for the info. Seems I need to learn from you, especially your Perl script foo.