jTokeniser is a Java library for tokenising
strings into a list of tokens. A variety of
tokenisers are available, including a very basic
whitespace tokeniser, a more flexible
StringTokeniser, a couple of regular expression tokenisers,
and a tokeniser that utilises Java's BreakIterator, which provides more complex, locale-dependent tokenisation. More recently, a tokeniser that breaks text into its constituent sentences has been added. All are very simple to use.
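
To give a feel for the kind of locale-dependent word and sentence splitting that Java's BreakIterator offers, here is a minimal sketch that uses only the JDK's java.text.BreakIterator class directly. It does not show jTokeniser's own API: the class name BreakIteratorSketch and the helper methods split, words and sentences are hypothetical names chosen for this illustration, and jTokeniser's tokenisers may behave differently.

```java
import java.text.BreakIterator;
import java.util.ArrayList;
import java.util.List;
import java.util.Locale;

// Hypothetical illustration class; not part of jTokeniser.
public class BreakIteratorSketch {

    // Collects the trimmed text between successive boundaries reported by the iterator.
    // When skipNonWords is true, pieces that do not start with a letter or digit
    // (punctuation, for example) are dropped.
    static List<String> split(BreakIterator it, String text, boolean skipNonWords) {
        List<String> out = new ArrayList<>();
        it.setText(text);
        int start = it.first();
        for (int end = it.next(); end != BreakIterator.DONE; start = end, end = it.next()) {
            String piece = text.substring(start, end).trim();
            if (piece.isEmpty()) {
                continue; // whitespace-only segment
            }
            if (skipNonWords && !Character.isLetterOrDigit(piece.codePointAt(0))) {
                continue; // punctuation or symbol segment
            }
            out.add(piece);
        }
        return out;
    }

    // Word-level tokens, using the word boundary rules for the given locale.
    static List<String> words(String text, Locale locale) {
        return split(BreakIterator.getWordInstance(locale), text, true);
    }

    // Sentence-level tokens, using the sentence boundary rules for the given locale.
    static List<String> sentences(String text, Locale locale) {
        return split(BreakIterator.getSentenceInstance(locale), text, false);
    }

    public static void main(String[] args) {
        String text = "Tokenisers split text. This one uses the JDK.";
        System.out.println(words(text, Locale.ENGLISH));
        System.out.println(sentences(text, Locale.ENGLISH));
    }
}
```

Passing a different Locale to words or sentences selects that locale's boundary rules, which is what makes this style of tokenisation locale-dependent.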