Jsoup Usage Examples


This article lists some common jsoup examples, such as getting the title, links, images, and metadata of a URL or an HTML document.

1. Get the title from a URL

Document doc = Jsoup.connect("http://www.yiibai.com").get();  
String title = doc.title();

2. Get the title from an HTML file

Document doc = Jsoup.parse(new File("e:\\register.html"),"utf-8"); // assumes register.html is on the E: drive  
String title = doc.title();
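
Both snippets above end with doc.title(); only the source of the Document differs. As a minimal sketch that runs without a network connection or a file on disk, the same call can be tried on an in-memory HTML string. The class name TitleExample, the helper titleOf, and the sample markup are illustrative assumptions, not part of the original examples.

```java
import org.jsoup.Jsoup;
import org.jsoup.nodes.Document;

class TitleExample {
    // Hypothetical helper: parses an HTML string and returns its <title> text.
    // Document.title() returns an empty string when the page has no <title>.
    static String titleOf(String html) {
        Document doc = Jsoup.parse(html);
        return doc.title();
    }

    public static void main(String[] args) {
        // Sample markup invented for this sketch.
        String html = "<html><head><title>Register</title></head><body></body></html>";
        System.out.println(titleOf(html));
    }
}
```

Note that Jsoup.connect(...).get() and Jsoup.parse(File, String) both throw IOException, so the snippets in sections 1 and 2 need a try/catch or a throws clause when placed in a real method.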

3. Get the links from a URL

Document doc = Jsoup.connect("http://www.yiibai.com").get();  
Elements links = doc.select("a[href]");  
for (Element link : links) {  
    System.out.println("\nlink : " + link.attr("href"));  
    System.out.println("text : " + link.text());  
}
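
The loop above prints each href exactly as it is written in the page, so relative links such as /java/ stay relative. The sketch below shows one way to resolve them with Element.absUrl, which relies on the document's base URI. The class name AbsoluteLinksExample, the helper absoluteLinks, and the sample markup are assumptions made for illustration.

```java
import java.util.ArrayList;
import java.util.List;
import org.jsoup.Jsoup;
import org.jsoup.nodes.Document;
import org.jsoup.nodes.Element;

class AbsoluteLinksExample {
    // Hypothetical helper: returns every link in the HTML as an absolute URL.
    // baseUri is what relative hrefs are resolved against.
    static List<String> absoluteLinks(String html, String baseUri) {
        Document doc = Jsoup.parse(html, baseUri);
        List<String> urls = new ArrayList<>();
        for (Element link : doc.select("a[href]")) {
            urls.add(link.absUrl("href"));
        }
        return urls;
    }

    public static void main(String[] args) {
        // Sample markup invented for this sketch: one relative and one absolute link.
        String html = "<a href=\"/java/\">Java</a><a href=\"http://example.com/x\">X</a>";
        for (String url : absoluteLinks(html, "http://www.yiibai.com/")) {
            System.out.println(url);
        }
    }
}
```

When the document comes from Jsoup.connect(url).get(), the base URI is set from the fetched URL, so link.absUrl("href") can be used directly inside the original loop.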

4. Get the meta information from a URL

Document doc = Jsoup.connect("http://www.yiibai.com").get();  
String keywords = doc.select("meta[name=keywords]").first().attr("content");  
System.out.println("Meta keyword : " + keywords);  
String description = doc.select("meta[name=description]").get(0).attr("content");  
System.out.println("Meta description : " + description);
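
Not every page declares both meta tags. When a selector matches nothing, first() returns null and get(0) throws an IndexOutOfBoundsException. A hedged sketch of a null-safe lookup follows; the class name MetaExample, the helper metaContent, and the sample markup are hypothetical, and selectFirst assumes jsoup 1.11 or later.

```java
import org.jsoup.Jsoup;
import org.jsoup.nodes.Document;
import org.jsoup.nodes.Element;

class MetaExample {
    // Hypothetical helper: returns the content of <meta name="..."> or ""
    // when the page has no such tag.
    static String metaContent(String html, String name) {
        Document doc = Jsoup.parse(html);
        Element meta = doc.selectFirst("meta[name=" + name + "]");
        return meta == null ? "" : meta.attr("content");
    }

    public static void main(String[] args) {
        // Sample markup invented for this sketch: keywords present, description absent.
        String html = "<html><head><meta name=\"keywords\" content=\"java, jsoup\"></head></html>";
        System.out.println("Meta keyword : " + metaContent(html, "keywords"));
        System.out.println("Meta description : " + metaContent(html, "description"));
    }
}
```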

5. Get the images from a URL

Document doc = Jsoup.connect("http://www.yiibai.com").get();  
Elements images = doc.select("img[src~=(?i)\\.(png|jpe?g|gif)]");  
for (Element image : images) {  
    System.out.println("src : " + image.attr("src"));  
    System.out.println("height : " + image.attr("height"));  
    System.out.println("width : " + image.attr("width"));  
    System.out.println("alt : " + image.attr("alt"));  
}

6. Get form parameters

Document doc = Jsoup.parse(new File("e:\\register.html"),"utf-8");  
Element registerForm = doc.getElementById("registerform");  

Elements inputElements = registerForm.getElementsByTag("input");  
for (Element inputElement : inputElements) {  
    String key = inputElement.attr("name");  
    String value = inputElement.attr("value");  
    System.out.println("Param name: "+key+" \nParam value: "+value);  
}
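
The loop above prints every input, including ones without a name attribute (a bare submit button, for example), and it fails with a NullPointerException if the form id is not found. The following sketch collects only named inputs into a map and returns an empty map when the form is missing. The class name FormParamsExample, the helper formParams, and the sample form markup are assumptions for illustration.

```java
import java.util.LinkedHashMap;
import java.util.Map;
import org.jsoup.Jsoup;
import org.jsoup.nodes.Document;
import org.jsoup.nodes.Element;

class FormParamsExample {
    // Hypothetical helper: maps each named <input> inside the form to its value.
    // Inputs without a value attribute map to an empty string.
    static Map<String, String> formParams(String html, String formId) {
        Document doc = Jsoup.parse(html);
        Map<String, String> params = new LinkedHashMap<>();
        Element form = doc.getElementById(formId);
        if (form == null) {
            return params; // form not found: nothing to collect
        }
        for (Element input : form.select("input[name]")) {
            params.put(input.attr("name"), input.attr("value"));
        }
        return params;
    }

    public static void main(String[] args) {
        // Sample form invented for this sketch.
        String html = "<form id=\"registerform\">"
                + "<input name=\"user\" value=\"alice\">"
                + "<input type=\"submit\" value=\"Go\">"
                + "<input name=\"email\">"
                + "</form>";
        formParams(html, "registerform").forEach((key, value) ->
                System.out.println("Param name: " + key + " \nParam value: " + value));
    }
}
```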