Perl Commands Explained


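I hear that the man from Zhangjiang has come drifting in from across the sea. When shall we seek out our old retreat and sit face to face as the Jade Hall opens? --白山头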

00 Main Text

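Perl is usually expanded as "Practical Extraction and Report Language". As the name suggests, it is very well suited to text processing. Perl was originally designed by Larry Wall, who released it on December 18, 1987. It borrows features from C, sed, awk, shell scripting, and many other programming languages. Its most important features are built-in regular expression support and CPAN, a huge repository of third-party code. In short, Perl is as powerful as C and as convenient as scripting tools such as awk and sed. Since version 1.0 was released in 1987, the number of Perl users has grown rapidly, and more and more programmers and software developers (and vendors) have taken part in Perl's development. Fast execution, support for arbitrarily complex data structures, a rich collection of modules, and shell-like convenience: taken together, these make it no accident that Perl is widely used in IC design.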

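This article recommends three commands that I personally find very useful. They can greatly improve programming efficiency and also make code more readable (Perl's readability is often criticized). They have counterparts among Linux command-line tools (grep and sort most obviously), but the Perl versions are more powerful. With them, there is no need to write clumsy foreach loops: a one-liner does the job.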

01 grep

Basic syntax:

grep BLOCK LIST
grep EXPR, LIST

Two ways to write it

my @foo = grep(!/^#/, @bar);

my @foo = grep { !/^#/ } @bar;

Both forms have the same effect. I personally prefer the second, which is closer to how grep is used in the shell.

grep vs loop



open(FILE, '<', 'myfile') or die "Can't open myfile: $!";
print grep /terrorism|nuclear/i, <FILE>;

is equivalent to



while ($line = <FILE>) {
    if ($line =~ /terrorism|nuclear/i) { print $line }
}

Anything grep can do can also be done with a loop. So why use grep at all? Because it reads more like the shell.

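If the result of grep is assigned to a scalar, you get the number of matching elements, that is, the size of the array grep would otherwise have returned.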



$num_apple = grep /^apple$/i, @fruits;
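To make the difference between list and scalar context concrete, here is a minimal sketch. The contents of @fruits are made-up sample data, not from the original example.

```perl
use strict;
use warnings;

# Hypothetical sample data for illustration only.
my @fruits = qw(apple Banana APPLE cherry pineapple Apple);

# List context: grep returns the matching elements themselves.
my @apples = grep { /^apple$/i } @fruits;

# Scalar context: grep returns how many elements matched.
my $num_apple = grep { /^apple$/i } @fruits;

print "matches: @apples\n";     # matches: apple APPLE Apple
print "count:   $num_apple\n";  # count:   3
```

Note that "pineapple" is not counted, because the pattern is anchored at both ends.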

Remove duplicates from a list (uniq)



@unique = grep { ++$count{$_} < 2 } 
               qw(a b a c d d e f g f h h);
print "@unique\n";
a b c d e f g h
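The same idea is often written with a lexical hash and the `!$seen{$_}++` idiom, so that the counts do not leak into other parts of a script. The following is a sketch of that variant; the name %seen is just a convention, not something from the original example.

```perl
use strict;
use warnings;

my @items = qw(a b a c d d e f g f h h);

# %seen is lexical, so it cannot collide with counters used elsewhere.
my %seen;

# $seen{$_}++ is false (0) the first time a value appears and true afterwards,
# so negating it keeps only the first occurrence of each value.
my @unique = grep { !$seen{$_}++ } @items;

print "@unique\n";   # a b c d e f g h
```

The original order of first occurrences is preserved, which a plain `keys %seen` would not guarantee.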

Find the elements that appear exactly twice in a list



@crops = qw(wheat corn barley rice corn soybean hay 
            alfalfa rice hay beets corn hay);
@dupes = grep { $count{$_} == 2 } 
              grep { ++$count{$_} > 1 } @crops;
print "@dupes\n";
rice

List the text files in a directory



@files = grep { -f and -T } glob '* .*';
print "@files\n";

Keep only the old files

my @files = glob "*.log";
my @old_files = grep { -M $_ > 365 } @files;
print join "\n", @old_files;

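-M $path_to_file returns the number of days since the file was last modified, measured from the time the script started. This example filters out files modified within the last 365 days and keeps the files that have not been modified for more than a year.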
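The -M filter can be tried without waiting a year by back-dating a file's modification time. The sketch below assumes a scratch directory created with File::Temp and two hypothetical file names, new.log and old.log; neither appears in the original example.

```perl
use strict;
use warnings;
use File::Temp qw(tempdir);

my $dir = tempdir(CLEANUP => 1);   # scratch directory, removed at exit

# Hypothetical files: one fresh, one back-dated by about 400 days.
my %age_in_days = ('new.log' => 0, 'old.log' => 400);
for my $name (sort keys %age_in_days) {
    my $path = "$dir/$name";
    open(my $fh, '>', $path) or die "Can't create $path: $!";
    close($fh);
    my $mtime = time - $age_in_days{$name} * 24 * 60 * 60;
    utime($mtime, $mtime, $path) or die "Can't set mtime on $path: $!";
}

my @files     = glob "$dir/*.log";
my @old_files = grep { -M $_ > 365 } @files;   # keep files older than a year

print "$_\n" for @old_files;   # prints only the path ending in old.log
```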

02 map

Basic syntax

map EXPR, LIST
map BLOCK LIST

Map filenames to file sizes



# Transform filenames to file sizes
@sizes = map { -s $_ } @file_names;

Find misspelled words



%dictionary = map { $_, 1 } qw(cat dog man woman hat glove);
@words = qw(dog kat wimen hat man glov);
foreach $word (@words) {
    if (not $dictionary{$word}) {   
        print "Possible misspelled word: $word\n";
    }
}
Possible misspelled word: kat
Possible misspelled word: wimen
Possible misspelled word: glov

Look up the index of an array value



@teams = qw(Miami Oregon Florida Tennessee Texas Oklahoma Nebraska LSU Colorado Maryland);
%rank = map { $teams[$_], $_ + 1 } 0 .. $#teams;
print "Colorado: $rank{Colorado}\n";
print "Texas: $rank{Texas} (hook 'em, Horns!)\n";




Colorado: 9
Texas: 5 (hook 'em, Horns!)

03 sort

sort SUBNAME LIST
sort BLOCK LIST
sort LIST

Sort in numeric order

@array = (8, 2, 32, 1, 4, 16);
print join(' ', sort { $a <=> $b } @array), "\n";
1 2 4 8 16 32
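The explicit `<=>` comparator matters because sort compares elements as strings by default. A short sketch of the difference, reusing the same numbers:

```perl
use strict;
use warnings;

my @array = (8, 2, 32, 1, 4, 16);

# Default sort compares elements as strings, so "16" sorts before "2".
my @as_strings = sort @array;

# The <=> comparator compares them as numbers.
my @as_numbers = sort { $a <=> $b } @array;

print "string order:  @as_strings\n";   # string order:  1 16 2 32 4 8
print "numeric order: @as_numbers\n";   # numeric order: 1 2 4 8 16 32
```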

Sort in ASCII order

@languages = qw(fortran lisp c c++ Perl python java);
print join(' ', sort @languages), "\n";
Perl c c++ fortran java lisp python

Sort in dictionary order



@array = qw(ASCII ascap at_large atlarge A ARP arp);
@sorted = sort { ($da = lc $a) =~ s/[\W_]+//g;
                 ($db = lc $b) =~ s/[\W_]+//g;
                 $da cmp $db;
               } @array;
print "@sorted\n";
A ARP arp ascap ASCII atlarge at_large

Reverse order



sort { $b <=> $a } @array;

is equivalent to



reverse sort { $a <=> $b } @array;
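A quick way to check that the two forms agree is to run both on the same data. This is a minimal sketch using the numbers from the numeric-sort example:

```perl
use strict;
use warnings;

my @array = (8, 2, 32, 1, 4, 16);

# Swap $a and $b in the comparator to sort in descending order...
my @desc_a = sort { $b <=> $a } @array;

# ...or sort ascending and reverse the resulting list.
my @desc_b = reverse sort { $a <=> $b } @array;

print "@desc_a\n";   # 32 16 8 4 2 1
print "@desc_b\n";   # 32 16 8 4 2 1
```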

Sort a hash by key



%hash = (Donald => 'Knuth', Alan => 'Turing', John => 'Neumann');
# +{ ... } forces an anonymous hash; a bare { ( ... ) } is parsed as a block
@sorted = map { +{ $_ => $hash{$_} } } sort keys %hash;
foreach $hashref (@sorted) {
    ($key, $value) = each %$hashref;
    print "$key => $value\n";
}
Alan => Turing
Donald => Knuth
John => Neumann

Sort a hash by value

%hash = ( Elliot  => 'Babbage',
          Charles => 'Babbage',
          Grace   => 'Hopper',
          Herman  => 'Hollerith'
        );
# +{ ... } forces an anonymous hash; a bare { ( ... ) } is parsed as a block
@sorted = map { +{ $_ => $hash{$_} } }
              sort { $hash{$a} cmp $hash{$b}
                     or $a cmp $b
                   } keys %hash;
foreach $hashref (@sorted) {
    ($key, $value) = each %$hashref;
    print "$key => $value\n";
}
Charles => Babbage
Elliot => Babbage
Herman => Hollerith
Grace => Hopper
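When only the ordering is needed, a simpler alternative is to sort the keys by their values and look each value up on demand, without building a list of hash references. The sketch below is that alternative, not the approach shown above; @keys_by_value is a name chosen for illustration.

```perl
use strict;
use warnings;

my %hash = (
    Elliot  => 'Babbage',
    Charles => 'Babbage',
    Grace   => 'Hopper',
    Herman  => 'Hollerith',
);

# Order keys by value first, then by key to break ties between equal values.
my @keys_by_value = sort { $hash{$a} cmp $hash{$b} or $a cmp $b } keys %hash;

print "$_ => $hash{$_}\n" for @keys_by_value;
# Charles => Babbage
# Elliot => Babbage
# Herman => Hollerith
# Grace => Hopper
```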

-end-

These three commands can be combined, for example:

@new_array = sort {BLOCK} map {BLOCK} grep {BLOCK}  @array
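As a concrete instance of this pattern, the sketch below drops comment lines, normalizes what is left, and sorts the result. The input lines and the normalization step (lowercasing and trimming whitespace) are assumptions chosen for illustration.

```perl
use strict;
use warnings;

# Hypothetical input lines, e.g. read from a config or report file.
my @lines = ("# header", "  Delta", "alpha  ", "# note", "Charlie", "bravo");

# The chain is read from bottom to top: grep runs first, sort runs last.
my @new_array =
    sort { $a cmp $b }                                   # 3. sort the results
    map  { my $s = lc $_; $s =~ s/^\s+|\s+$//g; $s }     # 2. lowercase and trim
    grep { !/^#/ }                                       # 1. drop comment lines
    @lines;

print "@new_array\n";   # alpha bravo charlie delta
```

Filtering with grep before map keeps the more expensive per-element work off the lines that would be thrown away anyway.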

As an introductory book on Perl, I recommend "Learning Perl". For more advanced material, read "Intermediate Perl".

In IC design work, once you have mastered these two books, you can handle all kinds of text files with ease.