In message <[EMAIL PROTECTED]>, "Malcolm Davis" writes:
>Is there any time to use Regular Expressions
>when the format of the stream doesn't change?

Probably not if all you're doing is tokenizing based on a set of delimiters. The equivalent of strtok() is O(n). A specific, well-constructed regular expression may also be O(n), but it will have much higher associated overheads (both a and b are greater in a*n+b).

>In other words, if I already know the format,
>why use Regular Expressions?

Probably convenience, if performance isn't a factor. It's analogous to using compiler-compilers. Hand-crafted parsers are faster, but take more effort to write.

I'll bring this up in a separate thread at some point, but jakarta-oro is showing its age and needs some reimplementation work to take advantage of the latest generation of JITs. Basically, there are some things that made it faster when JITs didn't exist or were bad. For example, strings are converted to character arrays to do the matching. This is now a performance killer rather than a boon. The overhead of calling charAt() was enormous in the past, but the latest HotSpot dynamically inlines the call. That's why, if you read your input, or at least your lines, into character arrays instead of using readLine() and generating a string, you should see some performance improvement.

I hate to say it, but if you really find regular expressions more convenient, and you can use JDK 1.4 for your project, try using java.util.regex, because it benefits from foreknowledge of JIT performance improvements rather than doing undesirable things to work around former performance deficiencies the way jakarta-oro does.

I'll stop now before this turns into that new thread.

daniel
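
To make the tokenizing comparison above concrete, here is a minimal sketch, not a benchmark and not anything taken from jakarta-oro. It puts a hand-written, strtok()-style single-pass scan next to a java.util.regex split for the same delimiter set. The class name, method names, the ",;" delimiter set, and the sample input are all made up for illustration, and the code uses current Java syntax (generics, String.isEmpty()) rather than what JDK 1.4 offered.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Pattern;

// Hypothetical illustration class; none of these names come from the original post.
class TokenizeSketch {
    // Compile the pattern once and reuse it; recompiling per line adds avoidable cost.
    private static final Pattern DELIMS = Pattern.compile("[,;]");

    // Hand-written single pass over the input, similar in spirit to strtok():
    // runs of delimiters are skipped, so no empty tokens are produced.
    static List<String> scanTokens(String line, String delims) {
        List<String> tokens = new ArrayList<>();
        int start = -1; // start index of the current token, or -1 when between tokens
        for (int i = 0; i < line.length(); i++) {
            boolean isDelim = delims.indexOf(line.charAt(i)) >= 0;
            if (isDelim) {
                if (start >= 0) {
                    tokens.add(line.substring(start, i));
                    start = -1;
                }
            } else if (start < 0) {
                start = i;
            }
        }
        if (start >= 0) {
            tokens.add(line.substring(start)); // trailing token with no closing delimiter
        }
        return tokens;
    }

    // Regex-based version using java.util.regex; empty strings produced by
    // adjacent delimiters are dropped to match the scan above.
    static List<String> regexTokens(String line) {
        List<String> tokens = new ArrayList<>();
        for (String t : DELIMS.split(line)) {
            if (!t.isEmpty()) {
                tokens.add(t);
            }
        }
        return tokens;
    }

    public static void main(String[] args) {
        String line = "alpha,beta;;gamma";
        System.out.println(scanTokens(line, ",;"));
        System.out.println(regexTokens(line));
    }
}
```

Both methods return the same tokens for this input. The regex version is shorter to write and easier to change when the format changes, while the scan does only one charAt() call and one indexOf() lookup per character. Which one is actually faster on a given JVM would have to be measured.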
