How to handle secure cookies with a web crawler [closed]

I have some tasks on a PHP site served by nginx that I'm trying to automate. I am able to log in, but subsequent requests to the rest of the site fail because of a bunch of cookies I'm not able to capture. When I grab the response headers, it's as if they don't exist. All I get is PHPSESSID and SERVERID; I'm missing 5 others, although I can see them in my browser cookies. I think only one of them is being used as a persistent authentication token. I've tried using Jsoup, Java URL, and LWP/Mechanize in Perl. I should be able to get them, since Burp is written in Java.

http: REMOVED
POST /authenticate.php HTTP/1.1
Host: REMOVED
User-Agent: Mozilla/5.0 (Windows; U; Windows NT 6.1; en-US; rv:1.9.2.23) Gecko/20110920 Firefox/3.6.23
Accept: text/html,application/xhtml+xml,application/xml;q=0.9,*/*;q=0.8
Accept-Language: en-us,en;q=0.5
Accept-Encoding: gzip,deflate
Accept-Charset: ISO-8859-1,utf-8;q=0.7,*;q=0.7
Keep-Alive: 115
Proxy-Connection: keep-alive
Referer: REMOVED
Cookie: __utma=35782181.1596497020.1319574836.1319750878.1319821717.7; __utmv=35782181.|1=SignupDate=2011-OCT-24=1;uid="MTU5MTY4Ng==|1319649169|e4db70a9171742176a944f4fdc3613fd963b1b7e";username="dGVzdF9sb2dpbg==|1319649169|b82e24618b06d6b14d7ea64600c84a2d20c3de73"; defaultstat1=10; defaultstat3=10; SERVERID=ww4; PHPSESSID=53a7cd9acbb71ed7e7cc7be680e6c99c; __utmb=35782181.1.10.1319821717; __utmc=35782181; mode=full
Content-Type: application/x-www-form-urlencoded
Content-Length: 57

username=test_login&password=login123&btnLogin=Login

HTTP/1.0 302 Moved Temporarily
Server: nginx
Date: Fri, 28 Oct 2011 17:09:08 GMT
Content-Type: text/html
Expires: Thu, 19 Nov 1981 08:52:00 GMT
Cache-Control: no-store, no-cache, must-revalidate, post-check=0, pre-check=0
Pragma: no-cache
Set-Cookie: secret=99ba70c185973be0cd25e0f12dd1ea72; path=/
Location: REMOVED
X-Cache: MISS from REMOVED
Via: 1.0 REMOVED (http_scan/4.0.2.6.19)
Proxy-Connection: close

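Jsoup:

import java.util.Map;
import org.jsoup.Connection;
import org.jsoup.Jsoup;

// POST the login form and keep the raw response so its cookies can be read
Connection.Response res = Jsoup.connect(url)
    .data("username", username)
    .data("password", password)
    .method(Connection.Method.POST)
    .execute();

// res.cookies() returns the cookies as a name -> value map
Map<String, String> cookies = res.cookies();

cookies only contains PHPSESSID and SERVERID.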

The __utma, __utmb, __utmc, and __utmv cookies in your sample are Google Analytics cookies, and they're set via JavaScript. Unless the crawler you're writing can execute JavaScript, those cookies will simply never get set in the crawler.

What you see in your browser is irrelevant for fixing this; what counts is what the crawler sees, receives, and can do.
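To make the distinction concrete, here is a minimal sketch using only the Java standard library. It is not the asker's code. It builds a cookie jar from the Set-Cookie headers the server actually sent, then lists which cookie names from the browser's Cookie header the crawler never received. The class name `CookieSketch` and its helper methods are hypothetical. Only the `secret` Set-Cookie line comes from the trace; the PHPSESSID and SERVERID Set-Cookie lines are assumed for illustration, since the trace shows those two only in the request's Cookie header.

```java
import java.net.HttpCookie;
import java.util.ArrayList;
import java.util.Arrays;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

class CookieSketch {

    /** Collects name/value pairs from raw Set-Cookie header values. */
    static Map<String, String> fromSetCookieHeaders(List<String> headerValues) {
        Map<String, String> jar = new LinkedHashMap<>();
        for (String value : headerValues) {
            for (HttpCookie cookie : HttpCookie.parse(value)) {
                jar.put(cookie.getName(), cookie.getValue());
            }
        }
        return jar;
    }

    /** Formats the jar as the value of a Cookie request header. */
    static String toCookieHeader(Map<String, String> jar) {
        StringBuilder sb = new StringBuilder();
        for (Map.Entry<String, String> entry : jar.entrySet()) {
            if (sb.length() > 0) {
                sb.append("; ");
            }
            sb.append(entry.getKey()).append('=').append(entry.getValue());
        }
        return sb.toString();
    }

    /** Names present in the browser's Cookie header but absent from the jar. */
    static List<String> namesMissingFromJar(String browserCookieHeader,
                                            Map<String, String> jar) {
        List<String> missing = new ArrayList<>();
        for (String pair : browserCookieHeader.split(";")) {
            String trimmed = pair.trim();
            int eq = trimmed.indexOf('=');
            if (eq <= 0) {
                continue; // skip empty or malformed fragments
            }
            String name = trimmed.substring(0, eq);
            if (!jar.containsKey(name)) {
                missing.add(name);
            }
        }
        return missing;
    }

    public static void main(String[] args) {
        // Assumed Set-Cookie values; only "secret" appears as Set-Cookie in the trace.
        Map<String, String> jar = fromSetCookieHeaders(Arrays.asList(
                "PHPSESSID=53a7cd9acbb71ed7e7cc7be680e6c99c; path=/",
                "SERVERID=ww4; path=/",
                "secret=99ba70c185973be0cd25e0f12dd1ea72; path=/"));

        // Abbreviated copy of the browser's Cookie header from the trace.
        String browserHeader = "__utma=35782181.1596497020.1319574836.1319750878.1319821717.7; "
                + "SERVERID=ww4; PHPSESSID=53a7cd9acbb71ed7e7cc7be680e6c99c; "
                + "__utmc=35782181; mode=full";

        System.out.println("Cookie: " + toCookieHeader(jar));
        System.out.println("Never sent by the server: "
                + namesMissingFromJar(browserHeader, jar));
    }
}
```

Any name reported as missing was either set by JavaScript in the browser or set by a response the crawler never saw, such as an intermediate redirect. A crawler that does not run scripts cannot obtain the first kind.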