云代码 - Perl code library

artDownload

2012-11-30 Author: 铁士代诺

[perl] code library

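#!/usr/bin/perl

use strict;
use warnings;

use LWP;            # loads LWP::UserAgent, used for the page requests
use LWP::Simple;    # provides getstore() for saving image files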

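# On Ctrl-C, delete any partially downloaded file before exiting.
$SIG{INT} = \&get_out;

my $url = 'http://www.airenti.org/Html/Type/1_1.html';    # index page listing the galleries
my $url_girls = 'http://www.airenti.org/Html/';           # replaces a leading "../" in gallery links
my $local_path = '/cygdrive/d/Downloads/art/';            # download root; must end with '/'
my $crt_file = '';                                        # file currently being downloaded
my $tmp_dir = '.art';                                     # suffix marking an unfinished gallery directory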

# NOTE: these headers are never passed to any request below, and the Host
# value does not match the site being fetched.
my @HEAD = (
    'Host' => 'processbase.neusoft.com',
    'User-Agent' => 'Mozilla/5.0 (X11; Ubuntu; Linux x86_64; rv:10.0.1) Gecko/20100101 Firefox/10.0.1',
    'Accept' => 'text/html,application/xhtml+xml,application/xml;q=0.9,*/*;q=0.8',
    'Accept-Language' => 'en-us,en;q=0.5',
    'Accept-Encoding' => 'gzip, deflate',
    'Connection' => 'keep-alive',
);

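my $browser = LWP::UserAgent->new();
my $response = $browser->get($url);
die "Cannot fetch $url: ".$response->status_line."\n" unless $response->is_success;
my $index_page = $response->content;

# Take the first link on the index page whose alt text is repeated later on
# the same line, then follow each gallery's "next gallery" link in turn.
if ($index_page =~ m{<a\s+href="(.*?)".*?alt="(.*?)".*?\2}) {
    my $girl_url = $1;
    $girl_url =~ s/\.\.\//$url_girls/;

    # get_girl_pics() returns the next gallery URL, or 0 when there is none.
    while ($girl_url) {
        $girl_url = get_girl_pics($girl_url);
    }
}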

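# Download every picture of the gallery that starts at $url.
# Returns the URL of the next gallery, or 0 when there is none.
sub get_girl_pics {
    my $url = shift;
    my $girl_res = $browser->get($url);
    if (!$girl_res->is_success) {
        warn "Cannot fetch $url: ".$girl_res->status_line."\n";
        return 0;
    }
    my $page_content = $girl_res->content;

    # Gallery number: the digits before the underscore in the file name.
    my $girl_no = $url;
    $girl_no =~ s/.*\/(\d+)_.*/$1/;

    # URL of the directory that holds the gallery pages.
    my $pageindex = $url;
    $pageindex =~ s{/[^/]+$}{};

    if ($page_content =~ /<(title)>(.*?)<\/\1>/) {
        my $title = $2;

        print $title."\n";
        # A directory without the $tmp_dir suffix marks a finished gallery.
        if (!-d $local_path.$title) {
            my $tmp_path = $local_path.$title.$tmp_dir;
            -d $tmp_path or mkdir($tmp_path) or die "Cannot create $tmp_path: $!\n";
            get_girl_pic($page_content, $title);
            # Follow the numbered links to the remaining pages of this gallery.
            while ($page_content =~ m{href="(\Q$girl_no\E_(\d+)\.html)">\2}g) {
                get_girl_pic($browser->get($pageindex.'/'.$1)->content, $title);
            }

            # Drop the suffix once the whole gallery has been fetched.
            rename($tmp_path, $local_path.$title)
                or warn "Cannot rename $tmp_path: $!\n";
        }
    }

    # Link text used by the site: '下一组:' is "next gallery:" and
    # '下一组:没有了' is "next gallery: none".
    my $no_next = '下一组:没有了';
    my $next = '下一组:';
    return 0 if $page_content =~ /$no_next/;
    return $pageindex.'/'.$1 if $page_content =~ /.*<a href="(.*?)">$next/;
    return 0;
}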

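# Save every image on $page whose alt text equals the gallery title.
sub get_girl_pic {
    my $page = shift;
    my $title = shift;

    # \Q...\E keeps regex metacharacters in the title from being interpreted.
    while ($page =~ m{<img\s+src="(.*?)"\s+alt="\Q$title\E"}g) {
        my $pic_file = $1;
        $pic_file =~ s/^\s+//;
        $pic_file =~ s/\s+$//;

        # Store the image under its own file name in the temporary directory.
        my $local_file = $pic_file;
        $local_file =~ s/.*\///;
        $local_file = $local_path.$title.$tmp_dir.'/'.$local_file;

        # Skip files that already exist.
        next if -e $local_file;

        print "\t".$pic_file."\n";
        print "\t => ".$local_file."\n";
        $crt_file = $local_file;    # lets get_out() remove a partial download
        my $status = LWP::Simple::getstore($pic_file, $local_file);
        $crt_file = '';
        warn "\tdownload failed with status $status\n"
            unless HTTP::Status::is_success($status);
    }
}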

# SIGINT handler: remove the file being downloaded, if any, then exit.
sub get_out {
    unlink($crt_file) if $crt_file;
    exit;
}

