Background
import java.util.List;

import us.codecraft.webmagic.Page;
import us.codecraft.webmagic.Site;
import us.codecraft.webmagic.Spider;
import us.codecraft.webmagic.processor.PageProcessor;
import us.codecraft.webmagic.selector.Html;

public class TestPageProcessor implements PageProcessor {

    // Crawl settings: retry 3 times, wait 3000 ms between requests, send a Firefox User-Agent.
    private Site site = Site.me().setRetryTimes(3).setSleepTime(3000)
            .setUserAgent("Mozilla/5.0 (Windows NT 10.0; Win64; x64; rv:74.0) Gecko/20100101 Firefox/74.0");

    @Override
    public void process(Page page) {
        Html html = page.getHtml();
        // Select every <a> element inside elements with class "nav".
        List<String> list = html.$(".nav").$("a").all();
    }

    @Override
    public Site getSite() {
        return site;
    }

    public static void main(String[] args) {
        Spider.create(new TestPageProcessor()).addUrl("http://hbda.gov.cn/").thread(5).run();
        //Spider.create(new TestPageProcessor()).addUrl("http://www.longhoo.net/").thread(5).run();
    }
}

The console prints the following:

19:51:40-[INFO] us.codecraft.webmagic.Spider Spider hbda.gov.cn started!
19:51:40-[INFO] us.codecraft.webmagic.downloader.HttpClientDownloader downloading page success http://hbda.gov.cn/
19:51:40-[INFO] us.codecraft.webmagic.Spider page status code error, page http://hbda.gov.cn/ , code: 412
19:51:43-[INFO] us.codecraft.webmagic.Spider Spider hbda.gov.cn closed! 1 pages downloaded.