I want to download URLs recursively,
starting from http://code.google.com/apis/maps/,
but only those URLs that
match this pattern:
http://code.google.com/apis/maps/*
I tried
wget -r -D http://code.google.com/apis/maps/ http://code.google.com/apis/maps/
but it downloads only index.html and stops.
I tried a few other options, but they didn't work as intended either.
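The pattern in the question amounts to a prefix test on every link a recursive downloader discovers, which is the kind of restriction wget's --no-parent option is meant to express. As a minimal sketch of that idea only (not how wget or w3mir is implemented), the Python below resolves each link on a page against the page's own URL and keeps only those under the prefix. The names in_scope, LinkCollector and links_to_follow are made up for this illustration; only the prefix comes from the question.

```python
from html.parser import HTMLParser
from urllib.parse import urljoin, urldefrag

# Prefix taken from the question; every other name here is illustrative.
PREFIX = "http://code.google.com/apis/maps/"


def in_scope(url, prefix=PREFIX):
    """Return True if url lies at or below the prefix."""
    return url.startswith(prefix)


class LinkCollector(HTMLParser):
    """Collect the href values of <a> tags from one HTML page."""

    def __init__(self):
        super().__init__()
        self.hrefs = []

    def handle_starttag(self, tag, attrs):
        if tag == "a":
            for name, value in attrs:
                if name == "href" and value:
                    self.hrefs.append(value)


def links_to_follow(page_url, html, prefix=PREFIX):
    """Resolve each link against page_url and keep the in-scope ones."""
    parser = LinkCollector()
    parser.feed(html)
    kept = []
    for href in parser.hrefs:
        # Make the link absolute and drop any #fragment part.
        absolute, _fragment = urldefrag(urljoin(page_url, href))
        if in_scope(absolute, prefix) and absolute not in kept:
            kept.append(absolute)
    return kept
```

A crawler built on this would call links_to_follow on each downloaded page and queue only the returned URLs, so links that climb to a parent directory or point at another part of the site are never fetched.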
Answer:
Hm... no Linux... OK, here is another alternative: w3mir. It's Perl-based and not restricted to Linux. I actually tried it on Windows and it works as expected.
http://www.langfeldt.net/w3mir/
Download w3mir, unpack it, and read the file INSTALL.w32. Basically, these are the steps to "install" it on Windows:
Get and install WinZip from http://www.winzip.com/
Get and install ActivePerl (now Build 509) from http://www.activeperl.com/
Get nmake.exe from ftp://ftp.microsoft.com/Softlib/MSLFILES/nmake15.exe
After installing the tools above, run these commands in the unpacked w3mir directory:
perl makefile.pl
nmake
nmake install
After that, w3mir is installed in the default path of your Perl installation. You can check that it runs with:
w3mir -h
Here is a sample config file for your problem, w3mir.cfg:
# Retrieve the Google Maps API documentation recursively:
Options: recurse
#
# This is the one-argument form of URL:. It names the starting point of the mirror.
URL: http://code.google.com/apis/maps/documentation/
Fetch-RE: m/flash/
cd: d:\mirror
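The Fetch-RE line above holds a Perl match expression. As a rough illustration of what the pattern m/flash/ matches on its own (not of how w3mir combines it with its other rules, for which the w3mir documentation is the authority), here is an equivalent test in Python. The candidate URLs and the name matches_flash are hypothetical and serve only to show the match.

```python
import re

# Python counterpart of the Perl match m/flash/:
# true when "flash" occurs anywhere in the string.
FLASH = re.compile(r"flash")


def matches_flash(url):
    """Return True if the URL contains the substring 'flash'."""
    return FLASH.search(url) is not None


# Hypothetical URLs, used only to demonstrate the pattern.
candidates = [
    "http://code.google.com/apis/maps/documentation/flash/intro.html",
    "http://code.google.com/apis/maps/documentation/index.html",
]
selected = [url for url in candidates if matches_flash(url)]
```

Because the pattern is unanchored, it matches "flash" at any position in the URL, not only as a whole path segment.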
Then run w3mir like this:
mkdir d:\mirror
w3mir -cfgfile w3mir.cfg