[java] Code library
// Parses one CC-CEDICT entry held in the String variable `line`.
// Format reference: http://cc-cedict.org/wiki/start
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Groups: 1 = traditional, 2 = simplified, 3 = [pinyin], 4 = /definitions/
Pattern line_pattern = Pattern.compile("(\\S+)\\s(\\S+)\\s(\\[.+\\])\\s(/.+/)");
Matcher matcher = line_pattern.matcher(line);
// find() resumes after the previous match, so no manual offset handling is needed.
while (matcher.find()) {
    // Character range of the match, then group 0 (whole match) and each capture group.
    System.out.println(matcher.start() + "-" + matcher.end());
    for (int i = 0; i <= matcher.groupCount(); i++) {
        System.out.println(i + ":" + matcher.group(i));
    }
}
Input line:
T字帳 T字帐 [T zi4 zhang4] /T-account (accounting)/
Output:
0-47
0:T字帳 T字帐 [T zi4 zhang4] /T-account (accounting)/
1:T字帳
2:T字帐
3:[T zi4 zhang4]
4:/T-account (accounting)/
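
The following is a minimal, self-contained sketch of the same idea, assuming each entry occupies exactly one line. The class name `CedictLineDemo` and the helpers `parse` and `definitions` are hypothetical names introduced for illustration; they are not part of CC-CEDICT or of the snippet above. Unlike the pattern above, this variant keeps the square brackets and the outer slashes outside the capture groups and uses `matches()` so that the whole line must fit the format.

```java
import java.util.Arrays;
import java.util.Collections;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical demo class; names are assumptions made for this sketch.
class CedictLineDemo {
    // Groups: 1 = traditional, 2 = simplified, 3 = pinyin (without brackets),
    // 4 = definitions (without the leading and trailing slash).
    private static final Pattern ENTRY =
            Pattern.compile("(\\S+)\\s(\\S+)\\s\\[([^\\]]+)\\]\\s/(.+)/");

    // Returns {traditional, simplified, pinyin, definitions},
    // or null when the line does not have the expected shape.
    static String[] parse(String line) {
        Matcher m = ENTRY.matcher(line.trim());
        if (!m.matches()) {
            return null;
        }
        return new String[] { m.group(1), m.group(2), m.group(3), m.group(4) };
    }

    // Splits the definitions field on '/' into individual glosses.
    static List<String> definitions(String line) {
        String[] fields = parse(line);
        if (fields == null) {
            return Collections.emptyList();
        }
        return Arrays.asList(fields[3].split("/"));
    }

    public static void main(String[] args) {
        String line = "T字帳 T字帐 [T zi4 zhang4] /T-account (accounting)/";
        // prints: T字帳 | T字帐 | T zi4 zhang4 | T-account (accounting)
        System.out.println(String.join(" | ", parse(line)));
        // prints: [T-account (accounting)]
        System.out.println(definitions(line));
    }
}
```

Returning null for a non-matching line is only one possible choice; a caller that reads a whole dictionary file might prefer to skip such lines or log them instead.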