EDIT:
I finally hacked together a working but very inelegant implementation of your algorithm to remove all non-letter chars. (If you wanted to include spaces in the result, you could add an "or" clause using Character.isSpaceChar or Character.isWhitespace.) I wrote it using an instance var, but you could just as easily make it static: public static String removeNonLetters( String inputString ) { (...) }
public String removeNonLetters( ) {
    // returns copy of instance var inputString with only letters
    StringBuilder result = new StringBuilder( );
    for( int i=0; i < inputString.length( ); i++ ) {
        // append avoids creating a new String on every pass through the loop
        if( Character.isLetter( inputString.charAt( i ) ) )
            result.append( inputString.charAt( i ) );
    }
    return result.toString( );
}
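For what it's worth, here is roughly how the static version with the optional "or" clause might look. Treat it as a sketch only: the class name LetterFilter and the keepWhitespace flag are names I made up for illustration, and it collects the characters in a StringBuilder.

```java
// Hypothetical helper class; LetterFilter and keepWhitespace are illustrative names only.
class LetterFilter {

    // Returns a copy of inputString containing only letters
    // (plus whitespace characters when keepWhitespace is true).
    static String removeNonLetters(String inputString, boolean keepWhitespace) {
        StringBuilder result = new StringBuilder(inputString.length());
        for (int i = 0; i < inputString.length(); i++) {
            char c = inputString.charAt(i);
            if (Character.isLetter(c) || (keepWhitespace && Character.isWhitespace(c))) {
                result.append(c);
            }
        }
        return result.toString();
    }

    public static void main(String[] args) {
        System.out.println(removeNonLetters("Hello, World!", false)); // prints HelloWorld
        System.out.println(removeNonLetters("Hello, World!", true));  // prints Hello World
    }
}
```

Being static, it can be called without creating an object first, e.g. LetterFilter.removeNonLetters( someString, false ).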
StringTokenizer seems to me like a great class to use, especially since you can give it a list of which chars to consider delimiters (as you can with C's strtok). However, the API ( http://download.oracle.com/javase/6/docs/api/index.html ) describes it as a legacy class whose use is discouraged in new code, and says to use the split method of String or the java.util.regex package instead.
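In case you still want to try the tokenizer, here is a minimal sketch of passing it a delimiter list through the two-argument constructor StringTokenizer( String str, String delim ). The class name TokenizerDemo and the tokenize helper are just names I picked for the example.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.StringTokenizer;

// Hypothetical demo class; TokenizerDemo and tokenize are illustrative names only.
class TokenizerDemo {

    // Breaks text into tokens, treating every character in delims as a delimiter.
    static List<String> tokenize(String text, String delims) {
        List<String> tokens = new ArrayList<>();
        StringTokenizer st = new StringTokenizer(text, delims);
        while (st.hasMoreTokens()) {
            tokens.add(st.nextToken());
        }
        return tokens;
    }

    public static void main(String[] args) {
        // Comma, semicolon and space are all delimiters here.
        System.out.println(tokenize("one, two;three", ",; ")); // prints [one, two, three]
    }
}
```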
From the sample use of split() in the API (in, oddly enough, the page for StringTokenizer, not for String), it looks as if you can designate your delimiter chars:
String[] result = "this is a test".split("\\s");
for (int x=0; x<result.length; x++)
    System.out.println(result[x]);
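To designate several delimiter chars at once with split(), one option is to list them inside brackets in the regular expression. A small sketch, where the class name SplitDemo, the method name and the particular delimiter set (comma, semicolon, whitespace) are my own choices for illustration:

```java
import java.util.Arrays;

// Hypothetical demo class; SplitDemo and splitOnDelimiters are illustrative names only.
class SplitDemo {

    // Splits on one or more commas, semicolons or whitespace characters in a row.
    static String[] splitOnDelimiters(String text) {
        return text.split("[,;\\s]+");
    }

    public static void main(String[] args) {
        String[] parts = splitOnDelimiters("one, two;three  four");
        System.out.println(Arrays.toString(parts)); // prints [one, two, three, four]
    }
}
```

The + after the brackets makes a run of delimiters (such as a comma followed by a space) count as a single separator.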
A couple of brief examples are on the String page of the API (find String[] split(String regex) in the method summary and click on it at http://download.oracle.com/javase/6/docs/api/java/lang/String.html )
"Splits this string around matches of the given regular expression (...)
The string "boo:and:foo", for example, yields the following results with these expressions:
Regex     Result
=======   ==========
:         { "boo", "and", "foo" }
o         { "b", "", ":and:f" }
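If you want to check those results yourself, something like the sketch below should do it. The class name BooSplit and the helper method are made up for the example. The second helper uses the two-argument split( regex, limit ) with a negative limit, which keeps the trailing empty strings that the one-argument version drops.

```java
import java.util.Arrays;

// Hypothetical demo class; BooSplit and its method names are illustrative only.
class BooSplit {

    // One-argument split: trailing empty strings are removed from the result.
    static String[] splitDefault(String text, String regex) {
        return text.split(regex);
    }

    // Negative limit: the pattern is applied as often as possible and
    // trailing empty strings are kept.
    static String[] splitKeepTrailing(String text, String regex) {
        return text.split(regex, -1);
    }

    public static void main(String[] args) {
        System.out.println(Arrays.toString(splitDefault("boo:and:foo", ":")));      // [boo, and, foo]
        System.out.println(Arrays.toString(splitDefault("boo:and:foo", "o")));      // [b, , :and:f]
        System.out.println(Arrays.toString(splitKeepTrailing("boo:and:foo", "o"))); // [b, , :and:f, , ]
    }
}
```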
Your ideas seem fine to me, at least as much of them as you were able to type in (Yahoo length limits?).
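EDIT: The below did not work as I first wrote it. I had put the ^ outside the brackets and ignored the String that replaceAll returns; both are corrected here.
You could use Java String's matches( regexp ) to check for letters and replaceAll to strip out everything else. Java's regular expressions follow much the same conventions as UNIX for this, so it would be
if (!testString.matches( "[a-zA-Z]*" ) ) {
    /* remove non-letters; Strings are immutable, so keep the copy that replaceAll returns */
    testString = testString.replaceAll( "[^a-zA-Z]", "" );
}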
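In regular expressions, [] signifies a set of characters (a character class).
[a-zA-Z] designates any single character in the range a-z or the range A-Z.
Putting a * after it matches 0 or more of these things, so [a-zA-Z]* matches 0 or more occurrences of lower case or upper case letters. Note that matches( ) only succeeds if the entire string fits the pattern.
^ means "not" only when it is the first character inside the brackets, so [^a-zA-Z] designates any character not in the designated set (here, any character that is not one of those letters). Outside the brackets, ^ instead anchors the match to the start of the string, so ^[a-zA-Z] would only match a letter at the very beginning.
String's replaceAll method replaces each substring that matches the regular expression in the first argument with the given replacement (in this case, the empty string). It returns the result as a new String and leaves the original unchanged, so the result has to be assigned to something.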
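Putting those pieces together, a regex-based version might look like the sketch below. The class and method names are made up for the example. The second method is an assumption on my part about what you might want: it uses \P{L} (any character that is not a Unicode letter), which is closer to what Character.isLetter accepts than the ASCII-only [^a-zA-Z].

```java
// Hypothetical demo class; RegexLetterFilter and its method names are illustrative only.
class RegexLetterFilter {

    // Removes every character that is not an ASCII letter a-z or A-Z.
    static String removeNonAsciiLetters(String input) {
        // replaceAll returns a new String; input itself is not modified.
        return input.replaceAll("[^a-zA-Z]", "");
    }

    // Removes every character that is not a Unicode letter.
    static String removeNonLetters(String input) {
        return input.replaceAll("\\P{L}", "");
    }

    public static void main(String[] args) {
        System.out.println(removeNonAsciiLetters("Hello, World 42!")); // prints HelloWorld
        System.out.println(removeNonLetters("Hello, World 42!"));      // prints HelloWorld
    }
}
```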
EDIT: I couldn't get my first attempt at the regular expression to work, so I tried another approach (the character loop at the top of this post).