org.hfbk.util.HTMLUtils Class Reference


Static Public Member Functions

static String clean (String html)
static String removePunctuation (String text)

Static Package Functions

static String decodeEntities (String encoded)

Static Package Attributes

static Matcher punctuationMatcher = Pattern.compile("(^[^\\p{N}\\p{L}])|([^\\p{N}\\p{L}]$)").matcher("")
static Matcher entityMatcher = Pattern.compile("&[^;]+;").matcher("")

Detailed Description

Utilities for working with HTML-formatted text.


Definition at line 13 of file

Member Function Documentation

static String org.hfbk.util.HTMLUtils.clean (String html) [static]

Cleans HTML to plain text.

Strips all tags, tabs and newlines. Decodes hex entities to UTF-8.

Returns:
    the HTML as plain text

Definition at line 25 of file

References org.hfbk.util.HTMLUtils.decodeEntities().
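The tag and whitespace stripping described above might look roughly like the following sketch. The source of clean() is not reproduced in this reference, so the class name CleanSketch, the method name stripMarkup and both patterns are assumptions made for illustration; entity decoding is left out because clean() delegates it to decodeEntities().

```java
import java.util.regex.Pattern;

class CleanSketch {
    // Hypothetical patterns: the expressions actually used by HTMLUtils.clean()
    // are not shown in this reference.
    private static final Pattern TAG = Pattern.compile("<[^>]*>");
    private static final Pattern TABS_AND_NEWLINES = Pattern.compile("[\\t\\r\\n]");

    // Removes anything that looks like a tag, then drops tabs and line breaks.
    static String stripMarkup(String html) {
        String withoutTags = TAG.matcher(html).replaceAll("");
        return TABS_AND_NEWLINES.matcher(withoutTags).replaceAll("");
    }

    public static void main(String[] args) {
        // prints "Hello world"
        System.out.println(stripMarkup("<p>Hello <b>world</b></p>\n"));
    }
}
```

A regular expression of this kind does not handle comments, scripts or a '>' inside an attribute value, so it is only suitable for simple, well-behaved input.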


static String org.hfbk.util.HTMLUtils.removePunctuation (String text) [static]

Removes punctuation around words.

Definition at line 41 of file

References org.hfbk.util.HTMLUtils.punctuationMatcher.
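To illustrate what the punctuationMatcher expression matches, here is a minimal sketch that applies the same pattern with replaceAll. How removePunctuation() actually uses the matcher is not shown in this reference, so the replaceAll call and the class name PunctuationSketch are assumptions.

```java
import java.util.regex.Pattern;

class PunctuationSketch {
    // Same expression as punctuationMatcher: one character that is neither a
    // Unicode letter nor a Unicode digit, at the very start or the very end.
    private static final Pattern EDGE =
            Pattern.compile("(^[^\\p{N}\\p{L}])|([^\\p{N}\\p{L}]$)");

    // Assumed behaviour: delete every match of the pattern.
    static String removePunctuation(String text) {
        return EDGE.matcher(text).replaceAll("");
    }

    public static void main(String[] args) {
        System.out.println(removePunctuation("\"hello,"));  // prints: hello
        System.out.println(removePunctuation("x-ray"));     // prints: x-ray
        System.out.println(removePunctuation("(hello),"));  // prints: hello)
    }
}
```

Under this assumption only one character is removed at each end, and punctuation inside a word is kept. The sketch creates a fresh Matcher per call; a single shared static Matcher, as declared in this class, is not safe for concurrent use from several threads.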

static String org.hfbk.util.HTMLUtils.decodeEntities (String encoded) [static, package]

Definition at line 48 of file

References org.hfbk.util.HTMLUtils.entityMatcher.

Referenced by org.hfbk.util.HTMLUtils.clean().
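This member has no description of its own. Based on the entityMatcher expression and the note in clean() that hex entities are decoded, a decoder could be sketched as follows. The class name EntityDecodeSketch, the method name decodeHexEntities and the choice to leave all other entities untouched are assumptions, not the documented behaviour.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class EntityDecodeSketch {
    // Same expression as entityMatcher: '&', then anything up to the next ';'.
    private static final Pattern ENTITY = Pattern.compile("&[^;]+;");

    // Replaces hex character references such as "&#xE9;" with the character
    // they denote. Everything else the pattern matches is copied unchanged
    // (an assumption: the real method may also handle named entities).
    static String decodeHexEntities(String encoded) {
        Matcher m = ENTITY.matcher(encoded);
        StringBuffer out = new StringBuffer();
        while (m.find()) {
            String entity = m.group();
            String replacement = entity;
            if (entity.startsWith("&#x") || entity.startsWith("&#X")) {
                try {
                    String hex = entity.substring(3, entity.length() - 1);
                    int codePoint = Integer.parseInt(hex, 16);
                    replacement = new String(Character.toChars(codePoint));
                } catch (IllegalArgumentException e) {
                    // Malformed number or invalid code point: keep the text as is.
                }
            }
            // quoteReplacement guards against '$' and '\' in the replacement.
            m.appendReplacement(out, Matcher.quoteReplacement(replacement));
        }
        m.appendTail(out);
        return out.toString();
    }

    public static void main(String[] args) {
        // prints: café &amp; té
        System.out.println(decodeHexEntities("caf&#xE9; &amp; t&#xE9;"));
    }
}
```

Note that the pattern also matches spans that are not entities at all, for example "& b;" in "a & b; c", so a decoder built on it has to tolerate unrecognised matches.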


Member Data Documentation

Matcher org.hfbk.util.HTMLUtils.punctuationMatcher = Pattern.compile("(^[^\\p{N}\\p{L}])|([^\\p{N}\\p{L}]$)").matcher("") [static, package]

Definition at line 39 of file

Referenced by org.hfbk.util.HTMLUtils.removePunctuation().

Matcher org.hfbk.util.HTMLUtils.entityMatcher = Pattern.compile("&[^;]+;").matcher("") [static, package]

Definition at line 46 of file

Referenced by org.hfbk.util.HTMLUtils.decodeEntities().

The documentation for this class was generated from the following file:
Generated on Tue Apr 7 17:57:46 2009 for visclient by doxygen 1.5.1