I heard that wget handles CSS differently depending on its version, so I checked.

http://ja.wikipedia.org/wiki/GNU_Wget
Wget 1.12 (released September 2009) added parsing of URLs from CSS on the web and handling of Internationalized Resource Identifiers (IRIs).

http://wget.addictivecode.org/FrequentlyAskedQuestions?action=show&redirect=Faq

5.2. Can Wget download links found in CSS?
Thanks to code supplied by Ted Mielczarek, Wget can now parse embedded CSS stylesheet data
and text/css files to find additional links for recursion, as of version 1.12.
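
As a rough illustration of what "finding links in CSS" involves, here is a minimal shell sketch that pulls the targets of url(...) out of a stylesheet. The helper name css_urls and the selectors in the sample are made up for illustration, and this regex approach is only an approximation, not Wget's actual parser (it ignores comments, escapes, and @import with a bare string). It assumes a grep that supports -o and a sed that supports -E.

```shell
# css_urls: print every URL referenced via url(...) in the CSS read from stdin.
# Hypothetical helper for illustration only; not how Wget itself parses CSS.
css_urls() {
    grep -oE 'url\([^)]*\)' |
        sed -E \
            -e 's/^url\([[:space:]]*//' \
            -e 's/[[:space:]]*\)$//' \
            -e "s/^[\"']//" \
            -e "s/[\"']\$//"
}

# Sample input: the two declarations are taken from the stylesheets examined
# later in this post; the selectors (#header, .attention) are invented.
css_urls <<'EOF'
#header { background: url(http://img.kakaku.com/images/home/home_header_bg.gif) repeat-x left bottom; }
.attention { background:url(http://img.kakaku.com/images/home/attention_arrow.gif) no-repeat left top; }
EOF
```

The two sed expressions at the end strip optional single or double quotes, since CSS allows both url(x) and url("x").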

■Comparison
——————————————————————————–

[root@colinux tools]# wget --version
GNU Wget 1.10.2 (Red Hat modified)
[root@colinux 1.10.2]# wget -p -H http://kakaku.com/

FINISHED --10:09:00--
Downloaded: 519,822 bytes in 76 files
[root@colinux 1.10.2]#

[root@colinux wget-1.13]# wget --version
GNU Wget 1.13 built on linux-gnu.
[root@colinux 1.13]# wget -p -H http://kakaku.com/

FINISHED --2011-12-10 10:15:15--
Total wall clock time: 4.6s
Downloaded: 121 files, 601K in 0.3s (2.35 MB/s)

[root@colinux test]# wget -p -H -e robots=off http://kakaku.com/

FINISHED --2011-12-10 11:09:12--
Total wall clock time: 6.1s
Downloaded: 136 files, 727K in 0.3s (2.54 MB/s)
[root@colinux test]#

——————————————————————————–


wget 1.13 downloads every image referenced in the CSS.
Does that mean it also downloads images that are not needed to render the HTML?

[root@colinux wget_compare]# ls -l 1.10.2/ 1.13/
1.10.2/:
total 24
drwxr-xr-x 3 root root 4096 2011-12-10 10:23 image.akiba.kakaku.com
drwxr-xr-x 5 root root 4096 2011-12-10 10:23 img.kakaku.com
drwxr-xr-x 3 root root 4096 2011-12-10 10:23 img2.kakaku.k-img.com
drwxr-xr-x 4 root root 4096 2011-12-10 10:23 kakaku.com
drwxr-xr-x 2 root root 4096 2011-12-10 10:23 notice.kakaku.com
drwxr-xr-x 2 root root 4096 2011-12-10 10:23 www.googleadservices.com

1.13/:
total 24
drwxr-xr-x 3 root root 4096 2011-12-10 10:28 image.akiba.kakaku.com
drwxr-xr-x 5 root root 4096 2011-12-10 10:28 img.kakaku.com
drwxr-xr-x 3 root root 4096 2011-12-10 10:28 img2.kakaku.k-img.com
drwxr-xr-x 4 root root 4096 2011-12-10 10:28 kakaku.com
drwxr-xr-x 2 root root 4096 2011-12-10 10:28 notice.kakaku.com
drwxr-xr-x 2 root root 4096 2011-12-10 10:28 www.googleadservices.com
[root@colinux wget_compare]#

[root@colinux wget_compare]# cat wget_1.10.log | egrep -i http:// | awk '{print $2}' > wget_1.10_http.log
[root@colinux wget_compare]# cat wget_1.13.log | egrep -i http:// | awk '{print $3}' > wget_1.13_http.log
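
The awk field differs between the two commands ($2 vs $3) because the timestamp on a request line is one word in 1.10.2 and two words in 1.13. A minimal sketch of a version-independent alternative is to print the last field of each request line. The function name requested_urls is hypothetical, and the log format described in the comments is an assumption based on the timestamps visible in this post.

```shell
# requested_urls: print the URL from each request line of a wget log on stdin.
# Assumption: request lines start with "--" and end with the URL, e.g.
#   --10:09:00--  http://kakaku.com/              (1.10.2 style)
#   --2011-12-10 10:15:15--  http://kakaku.com/   (1.13 style)
requested_urls() {
    awk '/^--/ && $NF ~ /^https?:\/\// { print $NF }'
}
```

It could then be used as `requested_urls < wget_1.13.log > wget_1.13_http.log` for either version's log.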

[root@colinux wget_compare]# diff wget_1.10_http.log wget_1.13_http.log
77a78,122
> http://img.kakaku.com/images/home/home_header_bg.gif
> http://img.kakaku.com/images/icon_login.gif
> http://img.kakaku.com/images/icon_guide.gif
> http://img.kakaku.com/images/icon_register.gif
> http://img.kakaku.com/images/icon_mypage.gif
> http://img.kakaku.com/images/icon_history.gif
> http://img.kakaku.com/images/h1_btm.gif
> http://img.kakaku.com/images/h1bg.gif
> http://img.kakaku.com/images/itemview/item/bm_tweetn-ja.png
> http://img.kakaku.com/images/itemview/item/icon_guide.gif
> http://img.kakaku.com/images/dot_999999.gif
> http://img.kakaku.com/images/itemview/item/arrow_pagetop.gif
> http://img.kakaku.com/images/article/pickup/template/link_bk.jpg
> http://img.kakaku.com/images/home/arrow_next01.gif
> http://img.kakaku.com/images/itemview/item/tab_bar_default.gif
> http://img.kakaku.com/images/balloonhelp/balloon_tp.png
> http://img.kakaku.com/images/balloonhelp/balloon_tp2.png
> http://img.kakaku.com/images/balloonhelp/balloon_tp3.png
> http://img.kakaku.com/images/balloonhelp/balloon_md.png
> http://img.kakaku.com/images/balloonhelp/balloon_bt.png
> http://img.kakaku.com/images/balloonhelp/balloon_bt2.png
> http://img.kakaku.com/images/balloonhelp/balloon_bt3.png
> http://img.kakaku.com/images/category/btn_search_sub.gif
> http://img.kakaku.com/images/itemlist/btn_search.gif
> http://img.kakaku.com/images/home/icon_all.png
> http://img.kakaku.com/images/home/box_bg_in.png
> http://img.kakaku.com/images/home/box_bg.png
> http://img.kakaku.com/images/home/h2_top_all.png
> http://img.kakaku.com/images/home/dotline01.gif
> http://img.kakaku.com/images/home/bg_search.png
> http://img.kakaku.com/images/home/bg_category.png
> http://img.kakaku.com/images/home/home_icon_category.png
> http://img.kakaku.com/images/home/home_icon_group.png
> http://img.kakaku.com/images/home/icon_all.gif
> http://img.kakaku.com/images/home/icon_slider.png
> http://img.kakaku.com/images/home/h2_sub_all.png
> http://img.kakaku.com/images/home/icon_reviewall.gif
> http://img.kakaku.com/images/home/icon_mag.png
> http://img.kakaku.com/images/home/icon_akiba_all.png
> http://img.kakaku.com/images/home/trendnews_category.png
> http://img.kakaku.com/images/home/icon_tv.png
> http://img.kakaku.com/images/home/menu_boxall_h2.png
> http://img.kakaku.com/images/home/dotline02.gif
> http://img.kakaku.com/images/home/attention_arrow.gif
> http://img.kakaku.com/images/home/menu_group_h2.gif
[root@colinux wget_compare]#
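
diff works here because both runs fetched the shared URLs in the same order. If the order differed, an order-independent comparison would be safer. The following is a sketch using sort and comm; the function name only_in_second is made up for this example.

```shell
# only_in_second OLD NEW: print lines that appear in NEW but not in OLD,
# ignoring order and duplicates. Hypothetical helper for illustration.
only_in_second() {
    old_sorted=$(mktemp) || return 1
    new_sorted=$(mktemp) || return 1
    sort -u "$1" > "$old_sorted"
    sort -u "$2" > "$new_sorted"
    # comm -13 suppresses column 1 (only in OLD) and column 3 (in both),
    # leaving only the lines unique to NEW.
    comm -13 "$old_sorted" "$new_sorted"
    rm -f "$old_sorted" "$new_sorted"
}

# Usage (file names as in the transcript above):
#   only_in_second wget_1.10_http.log wget_1.13_http.log
```

Note that the output is in sorted order, not download order.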

[root@colinux css]# cat global_new.css | grep home_header_bg.gif
background: url(http://img.kakaku.com/images/home/home_header_bg.gif) repeat-x left bottom;
[root@colinux css]#

[root@colinux css]# cat home_common.css | grep attention_arrow.gif
background:url(http://img.kakaku.com/images/home/attention_arrow.gif) no-repeat left top;
[root@colinux css]#

This confirms that the behavior differs between wget versions: 1.13 follows url() references in CSS, while 1.10.2 does not.

■Installation
——————————————————————————–

[root@colinux wget-1.13]# wget http://ftp.gnu.org/gnu/wget/wget-1.13.tar.gz
[root@colinux wget-1.13]# tar zxvf wget-1.13.tar.gz
[root@colinux wget-1.13]# cd wget-1.13
[root@colinux wget-1.13]# ./configure --with-ssl=openssl
[root@colinux wget-1.13]# make
[root@colinux wget-1.13]# make install
[root@colinux wget-1.13]# whereis wget
wget: /usr/local/bin/wget
[root@colinux wget-1.13]# ln -s /usr/local/bin/wget /usr/bin/wget
[root@colinux wget-1.13]# wget --version
GNU Wget 1.13 built on linux-gnu.
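
Since CSS parsing is only available from 1.12, a script that depends on it may want to check the version first. Below is a minimal sketch, assuming the first line of `wget --version` has the form "GNU Wget X.Y..." as in the output above. The function name supports_css_parsing is hypothetical.

```shell
# supports_css_parsing LINE: succeed if LINE (the first line of
# `wget --version`) reports version 1.12 or later.
# Returns 1 for older versions and 2 if the line cannot be parsed.
supports_css_parsing() {
    ver=$(printf '%s\n' "$1" |
        sed -n 's/^GNU Wget \([0-9][0-9]*\)\.\([0-9][0-9]*\).*/\1 \2/p')
    [ -n "$ver" ] || return 2
    # Split "MAJOR MINOR" into $1 and $2 (local to this function).
    set -- $ver
    [ "$1" -gt 1 ] || { [ "$1" -eq 1 ] && [ "$2" -ge 12 ]; }
}

# Usage:
#   supports_css_parsing "$(wget --version | head -n 1)" && echo "CSS links will be followed"
```

Only the major and minor numbers are compared, so "1.10.2" is treated as 1.10.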

■Uninstalling the old wget
———————————————————————————

[root@colinux wget-1.13]# yum list installed | grep wget
wget.i386 1.10.2-15.fc7 installed
[root@colinux wget-1.13]# yum remove wget.i386
Setting up Remove Process
fedora 100% |=========================| 2.1 kB 00:00
updates 100% |=========================| 2.3 kB 00:00
Resolving Dependencies
--> Running transaction check
---> Package wget.i386 0:1.10.2-15.fc7 set to be erased
--> Finished Dependency Resolution

Transaction Test Succeeded
Running Transaction
Erasing : wget ######################### [1/1]

Removed: wget.i386 0:1.10.2-15.fc7
Complete!
[root@colinux wget-1.13]# yum list installed | grep wget
[root@colinux wget-1.13]#

■Other packages that had to be installed beforehand
———————————————————————————

[root@colinux wget-1.13]# yum search openssl | grep dev
openssl-devel.i386 : Files for development of applications which will use OpenSSL
openssl-devel.i386 : Files for development of applications which will use OpenSSL
xmlsec1-openssl-devel.i386 : OpenSSL crypto plugin for XML Security Library
tcltls-devel.i386 : Header files for the OpenSSL extension for Tcl
[root@colinux wget-1.13]# yum install openssl-devel.i386 xmlsec1-openssl-devel.i386 tcltls-devel.i386
Setting up Install Process
Parsing package install arguments
Resolving Dependencies
--> Running transaction check
---> Package tcltls-devel.i386 0:1.5.0-11.fc6 set to be updated
--> Processing Dependency: tcltls = 1.5.0-11.fc6 for package: tcltls-devel
---> Package xmlsec1-openssl-devel.i386 0:1.2.9-8.1 set to be updated
--> Processing Dependency: libxslt-devel >= 1.1.0 for package: xmlsec1-openssl-devel
--> Processing Dependency: libxml2-devel >= 2.6.0 for package: xmlsec1-openssl-devel
--> Processing Dependency: xmlsec1 = 1.2.9 for package: xmlsec1-openssl-devel
--> Processing Dependency: xmlsec1-devel = 1.2.9 for package: xmlsec1-openssl-devel
--> Processing Dependency: libxmlsec1-openssl.so.1 for package: xmlsec1-openssl-devel
--> Processing Dependency: xmlsec1-openssl = 1.2.9 for package: xmlsec1-openssl-devel
---> Package openssl-devel.i386 0:0.9.8b-15.fc7 set to be updated
--> Processing Dependency: zlib-devel for package: openssl-devel
--> Processing Dependency: krb5-devel for package: openssl-devel
--> Running transaction check
---> Package libxslt-devel.i386 0:1.1.24-1.fc7 set to be updated
--> Processing Dependency: libgcrypt-devel for package: libxslt-devel
---> Package libxml2-devel.i386 0:2.6.31-1.fc7 set to be updated
---> Package xmlsec1.i386 0:1.2.9-8.1 set to be updated
---> Package tcltls.i386 0:1.5.0-11.fc6 set to be updated
---> Package krb5-devel.i386 0:1.6.1-9.fc7 set to be updated
--> Processing Dependency: e2fsprogs-devel for package: krb5-devel
---> Package xmlsec1-devel.i386 0:1.2.9-8.1 set to be updated
---> Package zlib-devel.i386 0:1.2.3-10.fc7 set to be updated
---> Package xmlsec1-openssl.i386 0:1.2.9-8.1 set to be updated
--> Running transaction check
---> Package e2fsprogs-devel.i386 0:1.40.2-3.fc7 set to be updated
---> Package libgcrypt-devel.i386 0:1.2.4-1 set to be updated
--> Processing Dependency: libgpg-error-devel for package: libgcrypt-devel
--> Running transaction check
---> Package libgpg-error-devel.i386 0:1.4-2 set to be updated
--> Finished Dependency Resolution

Dependencies Resolved

=============================================================================
Package Arch Version Repository Size
=============================================================================
Installing:
tcltls-devel i386 1.5.0-11.fc6 fedora 4.3 k
xmlsec1-openssl-devel i386 1.2.9-8.1 fedora 87 k
Installing for dependencies:
e2fsprogs-devel i386 1.40.2-3.fc7 updates 627 k
krb5-devel i386 1.6.1-9.fc7 updates 1.1 M
libgcrypt-devel i386 1.2.4-1 fedora 274 k
libgpg-error-devel i386 1.4-2 fedora 17 k
libxml2-devel i386 2.6.31-1.fc7 updates 2.1 M
libxslt-devel i386 1.1.24-1.fc7 updates 324 k
openssl-devel i386 0.9.8b-15.fc7 updates 1.8 M
tcltls i386 1.5.0-11.fc6 fedora 28 k
xmlsec1 i386 1.2.9-8.1 fedora 176 k
xmlsec1-devel i386 1.2.9-8.1 fedora 667 k
xmlsec1-openssl i386 1.2.9-8.1 fedora 71 k
zlib-devel i386 1.2.3-10.fc7 fedora 81 k

Transaction Summary
=============================================================================
Install 14 Package(s)
Update 0 Package(s)
Remove 0 Package(s)

Total download size: 7.4 M
Is this ok [y/N]:

Other useful URLs
Compare cURL Features with Other Download Tools
http://curl.haxx.se/docs/comparison-table.html