Hidden features of Perl?

What are some really useful but esoteric language features in Perl that you've actually been able to employ to do useful work?

Guidelines:

Try to limit answers to the Perl core and not CPAN
Please give an example and a short description

Hidden Features also found in other languages' Hidden Features:

(These are all from Corion's answer)

C
- Duff's Device
- Portability and Standardness
C#
- Quotes for whitespace delimited lists and strings
- Aliasable namespaces
Java
- Static Initializers
JavaScript
- Functions are First Class citizens
- Block scope and closure
- Calling methods and accessors indirectly through a variable
Ruby
- Defining methods through code
PHP
- Pervasive online documentation
- Magic methods
- Symbolic references
Python
- One line value swapping
- Ability to replace even core functions with your own functionality

Other Hidden Features:

Operators:

Quoting constructs:

Syntax and Names:

Modules, Pragmas, and command-line options:

Variables:

Loops and flow control:

Regular expressions:

Other features:

Other tricks, and meta-answers:

See Also:

Reposted from: https://stackoverflow.com/questions/161872/hidden-features-of-perl

There are many non-obvious features in Perl.

For example, did you know that there can be a space after a sigil?

 $ perl -wle 'my $x = 3; print $ x'
 3

Or that you can give subs numeric names if you use symbolic references?

$ perl -lwe '*4 = sub { print "yes" }; 4->()' 
yes

There's also the "bool" quasi-operator (!!), which returns 1 for true expressions and the empty string for false ones:

$ perl -wle 'print !!4'
1
$ perl -wle 'print !!"0 but true"'
1
$ perl -wle 'print !!0'
(empty line)

Other interesting stuff: with use overload you can overload string literals and numbers (and for example make them BigInts or whatever).

Many of these things are actually documented somewhere, or follow logically from the documented features, but nonetheless some are not very well known.

Update: Another nice one. Below the q{...} quoting constructs were mentioned, but did you know that you can use letters as delimiters?

$ perl -Mstrict  -wle 'print q bJet another perl hacker.b'
Jet another perl hacker.

Likewise you can write regular expressions:

m xabcx
# same as m/abc/

Autovivification. AFAIK no other language has it.
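A minimal sketch of what autovivification means in practice; the variable names (%config, %seen) are made up for illustration. Assigning through a nested path creates the intermediate references for you, and, as a caveat, so does merely looking through one:

```perl
use strict;
use warnings;

my %config;    # starts out completely empty

# Assigning through a nested path creates every intermediate
# hash and array reference on demand.
$config{server}{ports}[1] = 8080;

print ref($config{server}), "\n";                   # HASH
print ref($config{server}{ports}), "\n";            # ARRAY
print scalar(@{ $config{server}{ports} }), "\n";    # 2 (index 0 is undef)

# Caveat: merely looking through a nested path also vivifies
# the intermediate levels.
my %seen;
my $missing = exists $seen{a}{b};    # false, but $seen{a} now exists
print exists $seen{a} ? "vivified\n" : "untouched\n";    # vivified
```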

while(/\G(\b\w*\b)/g) {
     print "$1\n";
}

the \G anchor. It's hot.

Let's start easy with the Spaceship Operator.

$a = 5 <=> 7;  # $a is set to -1
$a = 7 <=> 5;  # $a is set to 1
$a = 6 <=> 6;  # $a is set to 0
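Where the spaceship operator usually earns its keep is inside a sort block. A small sketch (the sample numbers are arbitrary):

```perl
use strict;
use warnings;

my @numbers = (10, 9, 100, 1);

# The default sort compares as strings, so "10" and "100" sort before "9".
my @as_strings = sort @numbers;                  # 1 10 100 9
my @ascending  = sort { $a <=> $b } @numbers;    # 1 9 10 100
my @descending = sort { $b <=> $a } @numbers;    # 100 10 9 1

print "@as_strings\n";
print "@ascending\n";
print "@descending\n";
```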

The m// operator has some obscure special cases:

If you use ? as the delimiter it only matches once unless you call reset.
If you use ' as the delimiter the pattern is not interpolated.
If the pattern is empty it uses the pattern from the last successful match.

A bit obscure is the tilde-tilde "operator" which forces scalar context.

print ~~ localtime;

is the same as

print scalar localtime;

and different from

print localtime;

Taint checking. With taint checking enabled, perl will die (or warn, with -t) if you try to pass tainted data (roughly speaking, data from outside the program) to an unsafe function (opening a file, running an external command, etc.). It is very helpful when writing setuid scripts or CGIs or anything where the script has greater privileges than the person feeding it data.

Magic goto. goto &sub does an optimized tail call.
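One way to see what goto &sub does, as a rough sketch: the helper subs below (depth, via_call, via_goto) are invented for this illustration. An ordinary call adds a stack frame, while goto &sub replaces the current frame and hands over the current @_:

```perl
use strict;
use warnings;

# Count how many sub frames are currently on the call stack.
sub depth {
    my $d = 0;
    $d++ while caller($d);
    return $d;
}

sub via_call { return depth(@_) }    # adds its own frame on top of depth's
sub via_goto { goto &depth }         # its frame is replaced by depth's

my $extra_frames = via_call() - via_goto();
print "$extra_frames\n";    # 1
```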

The debugger.

use strict and use warnings. These can save you from a bunch of typos.

One of my favourite features in Perl is using the boolean || operator to select between a set of choices.

 $x = $a || $b;

 # $x = $a, if $a is true.
 # $x = $b, otherwise

This means one can write:

 $x = $a || $b || $c || 0;

to take the first true value from $a, $b, and $c, or a default of 0 otherwise.

Perl 5.10 added the // (defined-or) operator, which returns the left-hand side if it's defined, and the right-hand side otherwise. The following selects the first defined value from $a, $b, $c, or 0 otherwise:

$x = $a // $b // $c // 0;

These can also be used with their short-hand forms, which are very useful for providing defaults:

$x ||= 0;   # If $x was false, it now has a value of 0.

$x //= 0;   # If $x was undefined, it now has a value of zero.
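The difference between the two matters most when a false value such as 0 or the empty string is a legitimate setting. A short sketch, with a made-up %opt hash standing in for real options:

```perl
use strict;
use warnings;

my %opt = (retries => 0, name => '');

my $retries_or  = $opt{retries} || 3;     # 3: the legitimate 0 is discarded
my $retries_dor = $opt{retries} // 3;     # 0: kept, because it is defined
my $timeout     = $opt{timeout} // 30;    # 30: the key is missing, so undef

print "$retries_or $retries_dor $timeout\n";    # 3 0 30
```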

Cheerio,

Paul

It's simple to quote almost any kind of strange string in Perl.

my $url = q{http://my.url.com/any/arbitrary/path/in/the/url.html};

In fact, the various quoting mechanisms in Perl are quite interesting. The Perl regex-like quoting mechanisms allow you to quote anything, specifying the delimiters. You can use almost any special character like #, /, or open/close characters like (), [], or {}. Examples:

my $var  = q#some string where the pound is the final escape.#;
my $var2 = q{A more pleasant way of escaping.};
my $var3 = q(Others prefer parens as the quote mechanism.);

Quoting mechanisms:

q : literal quote; the only character that needs to be escaped is the end character.

qq : an interpolating quote; processes variables and escape characters. Great for strings that themselves contain quote characters:

my $var4 = qq{This "$mechanism" is broken.  Please inform "$user" at "$email" about it.};

qx : Works like qq, but then executes the result as a system command, non-interactively. Returns everything the command writes to standard output (redirections, where the OS supports them, are honored). Backquotes (the ` character) do the same thing.

my $output  = qx{type "$path"};      # get just the output
my $moreout = qx{type "$path" 2>&1}; # get stuff on stderr too

qr : Interprets like qq, but then compiles it as a regular expression. Works with the various options on the regex as well. You can now pass the regex around as a variable:

sub MyRegexCheck {
    my ($string, $regex) = @_;
    if ($string)
    {
       return ($string =~ $regex);
    }
    return; # returns undef in scalar context, the empty list in list context
}

# \w+ (not a single \w) so that multi-character host names such as "myurl" match
my $regex = qr{http://\w+\.com/(\w+/)+};
my @results = MyRegexCheck(q{http://myurl.com/subpath1/subpath2/}, $regex);

qw : A very, very useful quote operator. Turns a quoted set of whitespace separated words into a list. Great for filling in data in a unit test.


   my @allowed = qw(A B C D E F G H I J K L M N O P Q R S T U V W X Y Z { });
   my @badwords = qw(WORD1 word2 word3 word4);
   my @numbers = qw(one two three four 5 six seven); # works with numbers too
   my @list = ('string with space', qw(eight nine), "a $var"); # works in other lists
   my $arrayref = [ qw(and it works in arrays too) ];

Use them whenever they make things clearer. For qx, qq, and q, I mostly use {} as delimiters. People using qw most commonly use (), but sometimes you also see qw//.

The "for" statement can be used the same way "with" is used in Pascal:

for ($item)
{
    s/&nbsp;/ /g;
    s/<.*?>/ /g;
    $_ = join(" ", split(" ", $_));
}

You can apply a sequence of s/// operations, etc. to the same variable without having to repeat the variable name.

The flip-flop operator is useful for skipping the first iteration when looping through the records (usually lines) returned by a file handle, without using a flag variable:

while(<$fh>)
{
  next if 1..1; # skip first record
  ...
}

Run perldoc perlop and search for "flip-flop" for more information and examples.
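The operands don't have to be line numbers. With two patterns, the flip-flop selects everything from the first match through the second. A small sketch on made-up input lines:

```perl
use strict;
use warnings;

my @lines = ("header", "BEGIN", "alpha", "beta", "END", "footer");

my @block;
for (@lines) {
    # True from the line matching /^BEGIN$/ through the line
    # matching /^END$/, inclusive; false everywhere else.
    push @block, $_ if /^BEGIN$/ .. /^END$/;
}

print "@block\n";    # BEGIN alpha beta END
```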

Not really hidden, but many everyday Perl programmers don't know about CPAN. This applies especially to people who aren't full-time programmers or don't program in Perl full time.

The operators ++ and unary - don't only work on numbers, but also on strings.

$_ = "a";
print -$_;

prints -a

print ++$_;

prints b

$_ = 'z';
print ++$_;

prints aa

This is a meta-answer, but the Perl Tips archives contain all sorts of interesting tricks that can be done with Perl. The archive of previous tips is on-line for browsing, and can be subscribed to via mailing list or atom feed.

Some of my favourite tips include building executables with PAR, using autodie to throw exceptions automatically, and the use of the switch and smart-match constructs in Perl 5.10.

Disclosure: I'm one of the authors and maintainers of Perl Tips, so I obviously think very highly of them. ;)

The null filehandle diamond operator <> has its place in building command line tools. It acts like <FH> to read from a handle, except that it magically selects whichever is found first: command line filenames or STDIN. Taken from perlop:

while (<>) {
...         # code for each line
}

rename("$_.part", $_) for "data.txt";

renames data.txt.part to data.txt without having to repeat myself.

As Perl has almost all "esoteric" parts from the other lists, I'll tell you the one thing that Perl can't:

The one thing Perl can't do is have bare arbitrary URLs in your code, because the // operator is used for regular expressions.

Just in case it wasn't obvious to you what features Perl offers, here's a selective list of the maybe not totally obvious entries:

Duff's Device - in Perl

Portability and Standardness - There are likely more computers with Perl than with a C compiler

A file/path manipulation class - File::Find works on even more operating systems than .Net does

Quotes for whitespace delimited lists and strings - Perl allows you to choose almost arbitrary quotes for your list and string delimiters

Aliasable namespaces - Perl has these through glob assignments:

*My::Namespace:: = \%Your::Namespace::;

Static initializers - Perl can run code in almost every phase of compilation and object instantiation, from BEGIN (code parse) to CHECK (after code parse) to import (at module import) to new (object instantiation) to DESTROY (object destruction) to END (program exit)

Functions are First Class citizens - just like in Perl

Block scope and closure - Perl has both

Calling methods and accessors indirectly through a variable - Perl does that too:

my $method = 'foo';
my $obj = My::Class->new();
$obj->$method( 'baz' ); # calls $obj->foo( 'baz' )

Defining methods through code - Perl allows that too:

*foo = sub { print "Hello world" };

Pervasive online documentation - Perl documentation is online and likely on your system too

Magic methods that get called whenever you call a "nonexisting" function - Perl implements that in the AUTOLOAD function
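A minimal sketch of AUTOLOAD catching calls to methods that were never defined; the Greeter class is hypothetical and exists only for this example:

```perl
use strict;
use warnings;

package Greeter;    # hypothetical class, for illustration only

our $AUTOLOAD;      # Perl stores the fully qualified name of the missing sub here

sub new { return bless {}, shift }

# Called for any method that Greeter does not define.
sub AUTOLOAD {
    my ($self, @args) = @_;
    my $name = $AUTOLOAD;
    $name =~ s/.*:://;               # strip the package qualifier
    return if $name eq 'DESTROY';    # do not intercept object destruction
    return "$name(@args)";
}

package main;

my $g = Greeter->new;
print $g->hello('world'), "\n";    # hello(world)
print $g->anything(1, 2), "\n";    # anything(1 2)
```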

Symbolic references - you are well advised to stay away from these. They will eat your children. But of course, Perl allows you to offer your children to blood-thirsty demons.

One line value swapping - Perl allows list assignment
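For instance (a tiny sketch with throwaway variables):

```perl
use strict;
use warnings;

my ($x, $y) = (1, 2);
($x, $y) = ($y, $x);            # swap without a temporary variable
print "$x $y\n";                # 2 1

my @items = qw(a b c);
@items[0, 2] = @items[2, 0];    # the same idea works on array slices
print "@items\n";               # c b a
```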

Ability to replace even core functions with your own functionality

use subs 'unlink'; 
sub unlink { print 'No.' }

BEGIN{
    *CORE::GLOBAL::unlink = sub {print 'no'}
};

unlink($_) for @ARGV

Add support for compressed files via magic ARGV:

s{ 
    ^            # make sure to get whole filename
    ( 
      [^'] +     # at least one non-quote
      \.         # extension dot
      (?:        # now either suffix
          gz
        | Z 
       )
    )
    \z           # through the end
}{gzcat '$1' |}xs for @ARGV;

(The quotes around $1 are necessary to handle filenames containing shell metacharacters.)

Now the <> feature will decompress any @ARGV files that end with ".gz" or ".Z":

while (<>) {
    print;
}

The quoteword operator is one of my favourite things. Compare:

my @list = ('abc', 'def', 'ghi', 'jkl');

and

my @list = qw(abc def ghi jkl);

Much less noise, easier on the eye. Another really nice thing about Perl, that one really misses when writing SQL, is that a trailing comma is legal:

print 1, 2, 3, ;

That looks odd, but not if you indent the code another way:

print
    results_of_foo(),
    results_of_xyzzy(),
    results_of_quux(),
    ;

Adding an additional argument to the function call does not require you to fiddle around with commas on previous or trailing lines. The single line change has no impact on its surrounding lines.

This makes it very pleasant to work with variadic functions. This is perhaps one of the most under-rated features of Perl.

The ability to parse data directly pasted into a DATA block. No need to save to a test file to be opened in the program or similar. For example:

my @lines = <DATA>;
for (@lines) {
    print if /bad/;
}

__DATA__
some good data
some bad data
more good data 
more good data

New Block Operations

I'd say the ability to expand the language, creating pseudo block operations is one.

You declare the prototype for a sub indicating that it takes a code reference first:

sub do_stuff_with_a_hash (&\%) {
    my ( $block_of_code, $hash_ref ) = @_;
    while ( my ( $k, $v ) = each %$hash_ref ) { 
        $block_of_code->( $k, $v );
    }
}

You can then call it in the body like so

use feature 'say';    # say() needs Perl 5.10 or later
use Data::Dumper;

do_stuff_with_a_hash {
    local $Data::Dumper::Terse = 1;
    my ( $k, $v ) = @_;
    say qq(Hey, the key   is "$k"!);
    say sprintf qq(Hey, the value is "%s"!), Dumper( $v );

} %stuff_for
;

(Data::Dumper::Dumper is another semi-hidden gem.) Notice how you don't need the sub keyword in front of the block, or the comma before the hash. It ends up looking a lot like: map { } @list

Source Filters

Also, there are source filters, where Perl passes you the source code so you can manipulate it before it is compiled. Both this and the block operations are pretty much don't-try-this-at-home type of things.

I have done some neat things with source filters, for example like creating a very simple language to check the time, allowing short Perl one-liners for some decision making:

perl -MLib::DB -MLib::TL -e 'run_expensive_database_delete() if $hour_of_day < AM_7';

Lib::TL would just scan for both the "variables" and the constants, create them and substitute them as needed.

Again, source filters can be messy, but are powerful. But they can mess debuggers up something terrible--and even warnings can be printed with the wrong line numbers. I stopped using Damian's Switch because the debugger would lose all ability to tell me where I really was. But I've found that you can minimize the damage by modifying small sections of code, keeping them on the same line.

Signal Hooks

It's often enough done, but it's not all that obvious. Here's a die handler that piggy backs on the old one.

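my $old_die_handler = $SIG{__DIE__};
$SIG{__DIE__}
    = sub {
        say q(Hey! I'm DYIN' over here!);
        # chain to the previous handler only if one was actually installed
        goto &$old_die_handler if ref $old_die_handler eq 'CODE';
    }
    ;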

That means whenever some other module in the code wants to die, they gotta come to you (unless someone else does a destructive overwrite on $SIG{__DIE__}). And you can be notified that somebody thinks something is an error.

Of course, for enough things you can just use an END { } block, if all you want to do is clean up.

`overload::constant`

You can inspect literals of a certain type in packages that include your module. For example, if you use this in your import sub:

overload::constant 
    integer => sub { 
        my $lit = shift;
        return $lit > 2_000_000_000 ? Math::BigInt->new( $lit ) : $lit 
    };

it will mean that every integer greater than 2 billion in the calling packages will get changed to a Math::BigInt object. (See overload::constant).

Grouped Integer Literals

While we're at it: Perl lets you break up large numbers with underscores (conventionally into groups of three digits) and still get a parsable integer out of it. Note 2_000_000_000 above for 2 billion.

Binary "x" is the repetition operator:

print '-' x 80;     # print row of dashes

It also works with lists:

print for (1, 4, 9) x 3; # print 149149149

My vote would go for the (?{}) and (??{}) groups in Perl's regular expressions. The first executes Perl code, ignoring the return value, the second executes code, using the return value as a regular expression.
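A rough sketch of both constructs, assuming a reasonably modern Perl (5.18 or later, where these code blocks close over lexicals properly). The first counts loop iterations with (?{}); the second uses (??{}) to let a pattern refer to itself and match nested parentheses. The names $count and $balanced are just for this example:

```perl
use strict;
use warnings;

# (?{ ... }) runs code whenever the regex engine passes that point.
my $count = 0;
"aaa" =~ /\A(?:a(?{ $count++ }))*\z/;
print "$count\n";    # 3

# (??{ ... }) runs code and uses the result as a sub-pattern, which
# permits recursion: $balanced refers to itself to match nested parens.
my $balanced;
$balanced = qr/\((?:[^()]+|(??{ $balanced }))*\)/;

for my $text ('(a(b)c)', '(a(b c)') {
    my $verdict = $text =~ /\A$balanced\z/ ? 'balanced' : 'unbalanced';
    print "$text is $verdict\n";
}
```

Because the engine may backtrack over a (?{}) block, side effects like the counter above are only this predictable in patterns that match without backtracking.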

Based on the way the "-n" and "-p" switches are implemented in Perl 5, you can write a seemingly incorrect program including }{:

ls |perl -lne 'print $_; }{ print "$. Files"'

which is converted internally to this code:

LINE: while (defined($_ = <ARGV>)) {
    print $_; }{ print "$. Files";
}

Special code blocks such as BEGIN, CHECK and END. They come from Awk, but work differently in Perl, because it is not record-based.

The BEGIN block can be used to specify some code for the parsing phase; it is also executed when you do the syntax-and-variable-check perl -c. For example, to load in configuration variables:

BEGIN {
    eval {
        require 'config.local.pl';
    };
    if ($@) {
        require 'config.default.pl';
    }
}

map - not only because it makes one's code more expressive, but because it gave me an impulse to read a little bit more about this "functional programming".
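A few typical uses, as a small sketch on a made-up word list:

```perl
use strict;
use warnings;

my @words = qw(apple banana cherry);

my @lengths = map { length $_ } @words;              # 5 6 6
my %is_word = map { $_ => 1 } @words;                # lookup table
my @shouted = map { uc $_ } grep { /an/ } @words;    # BANANA

print "@lengths\n";
print join(",", sort keys %is_word), "\n";
print "@shouted\n";
```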

tie, the variable tying interface.
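A minimal sketch of a tied scalar; the UpperCase class is hypothetical and exists only for this example. Every assignment to the variable goes through STORE and every read through FETCH:

```perl
use strict;
use warnings;

package UpperCase;    # hypothetical class name, for illustration only

sub TIESCALAR { my ($class) = @_; my $value = ''; return bless \$value, $class }
sub FETCH     { my ($self) = @_; return $$self }
sub STORE     { my ($self, $new) = @_; $$self = uc $new }

package main;

tie my $shout, 'UpperCase';
$shout = 'hello';    # calls UpperCase::STORE
print "$shout\n";    # calls UpperCase::FETCH and prints HELLO
```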

The continue clause on loops. It is executed at the end of every iteration, even those that are cut short with next.

while( <> ){
  print "top of loop\n";
  chomp;

  next if /next/i;
  last if /last/i;

  print "bottom of loop\n";
}continue{
  print "continue\n";
}

The "desperation mode" of Perl's loop control constructs which causes them to look up the stack to find a matching label allows some curious behaviors which Test::More takes advantage of, for better or worse.

SKIP: {
    skip() if $something;

    print "Never printed";
}

sub skip {
    no warnings "exiting";
    last SKIP;
}

There's the little known .pmc file. "use Foo" will look for Foo.pmc in @INC before Foo.pm. This was intended to allow compiled bytecode to be loaded first, but Module::Compile takes advantage of this to cache source filtered modules for faster load times and easier debugging.

The ability to turn warnings into errors.

local $SIG{__WARN__} = sub { die @_ };
$num = "two";
$sum = 1 + $num;
print "Never reached";
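The warnings pragma can do the same thing lexically, without touching $SIG{__WARN__}. A short sketch that promotes only the 'numeric' category and traps the resulting exception with eval:

```perl
use strict;
use warnings FATAL => 'numeric';    # promote just this warning category

my $num = "two";
my $sum = eval { 1 + $num };        # dies inside the eval instead of warning
my $error = $@;

print defined $sum ? "sum is $sum\n" : "died: $error";
```

Unlike the signal handler, this only affects code in the enclosing lexical scope, so warnings raised elsewhere keep their usual behaviour.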

That's what I can think of off the top of my head that hasn't been mentioned.

The goatse operator*:

$_ = "foo bar";
my $count =()= /[aeiou]/g; #3

sub foo {
    return @_;
}

$count =()= foo(qw/a b c d/); #4

It works because list assignment in scalar context yields the number of elements in the list being assigned.

* Note, not really an operator