This article shows how to use regular expressions in Java to extract the value of a specified attribute from a specified HTML tag. It is shared here for your reference:
Sometimes you need to get the value of a particular attribute of a particular tag from an HTML page. A third-party parsing library can do this, but it is relatively troublesome.
With a regular expression it becomes simple. The code is as follows:
package com.mmq.regex;

import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

/**
 * @use Get the value of the specified attribute of the specified HTML tag
 * @ProjectName stuff
 * @Author mikan
 * @FullName com.mmq.regex.MatchHtmlElementAttrValue.java
 * @JDK 1.6.0
 * @Version 1.0
 */
public class MatchHtmlElementAttrValue {

    /**
     * Get the value of the specified attribute of the specified HTML tag
     *
     * @param source source text to match
     * @param element tag name
     * @param attr attribute name of the tag
     * @return list of attribute values
     */
    public static List<String> match(String source, String element, String attr) {
        List<String> result = new ArrayList<String>();
        // Matches <element ... attr=value ...>; group 1 lazily captures the value
        // after an optional opening quote. Note: the capture ends at the first
        // whitespace, so a value that contains spaces is cut after its first word.
        String reg = "<" + element + "[^<>]*?\\s" + attr + "=['\"]?(.*?)['\"]?(\\s.*?)?>";
        Matcher m = Pattern.compile(reg).matcher(source);
        while (m.find()) {
            String r = m.group(1);
            result.add(r);
        }
        return result;
    }

    public static void main(String[] args) {
        String source = "<a title=China Sports News href=''>aaa</a><a title='Beijing Daily' href=''>bbb</a>";
        List<String> list = match(source, "a", "title");
        System.out.println(list);
    }
}

PS: Here are two convenient regular expression tools for your reference:
JavaScript regular expression online testing tool:
http://tools.VeVB.COM/regex/javascript
Regular expression online generation tool:
http://tools.VeVB.COM/regex/create_reg
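Returning to the match method above: its pattern stops capturing at the first whitespace, so a quoted value such as 'Beijing Daily' comes back as Beijing. The following is a minimal sketch of one way to keep quoted values whole, not the original author's method. The class name QuotedAttrMatcher is hypothetical, and the choices to escape the tag and attribute names with Pattern.quote, to ignore case, and to treat an unquoted value as ending at the first whitespace are assumptions of this sketch. It is still a regular expression, not an HTML parser, so it can be misled by a `>` or an attribute-like string inside a quoted value.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

public class QuotedAttrMatcher {

    /**
     * Collects the value of attribute attr from every element tag in source.
     * Double-quoted, single-quoted and unquoted values use separate groups,
     * so whitespace inside quotes is preserved.
     */
    public static List<String> match(String source, String element, String attr) {
        List<String> result = new ArrayList<String>();
        String reg = "<" + Pattern.quote(element)
                + "(?=[\\s/>])"            // tag name must end here, so "a" does not match <abbr>
                + "[^<>]*?\\s"             // skip earlier attributes inside the same tag
                + Pattern.quote(attr)
                + "\\s*=\\s*"
                + "(?:\"([^\"]*)\""        // group 1: double-quoted value
                + "|'([^']*)'"             // group 2: single-quoted value
                + "|([^\\s\"'<>]+))";      // group 3: unquoted value, ends at whitespace
        Matcher m = Pattern.compile(reg, Pattern.CASE_INSENSITIVE).matcher(source);
        while (m.find()) {
            // Exactly one of the three groups took part in the match.
            for (int g = 1; g <= 3; g++) {
                if (m.group(g) != null) {
                    result.add(m.group(g));
                    break;
                }
            }
        }
        return result;
    }

    public static void main(String[] args) {
        String source = "<a title=China Sports News href=''>aaa</a><a title='Beijing Daily' href=''>bbb</a>";
        System.out.println(match(source, "a", "title")); // prints [China, Beijing Daily]
    }
}
```

The unquoted title in the first tag still yields only China, because an unquoted HTML attribute value cannot contain spaces; Sports and News are read as separate attributes.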
I hope this article helps with your Java programming.