How fast can a BufferedReader read lines in Java?


In an earlier post, I asked how fast the getline function in C++ could run through the lines in a text file. The answer was about 2 GB/s. That is slower than some of the best disk drives and network connections. If you take into account that software rarely needs to “just” access the lines, it is easy to build a system where text-file processing is processor-bound, as opposed to disk-bound or network-bound.

What about Java? In Java, the standard way to access lines in a text file is to use a BufferedReader. To avoid system calls, I create a large string containing many lines of text, and then I call a very simple processing function that merely records the length of the strings…

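// data is a large in-memory string containing many lines of text
StringReader fr = new StringReader(data);
BufferedReader bf = new BufferedReader(fr);
// visit each line (terminators stripped) and hand it to parseLine
bf.lines().forEach(s -> parseLine(s));

// elsewhere:
// volume is a numeric field declared elsewhere; it accumulates the line lengths
public void parseLine(String s) {
  volume += s.length();
}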
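A minimal self-contained sketch of this kind of benchmark might look as follows. The class name LineBenchmark, the synthetic input, and the number of trials are assumptions made for illustration; this is not the exact source code behind the numbers in this post.

```java
import java.io.BufferedReader;
import java.io.StringReader;

public class LineBenchmark {
    // Sum of the lengths of all lines in data, excluding line terminators.
    public static long volume(String data) {
        long[] volume = {0}; // one-element array so the lambda can update it
        BufferedReader reader = new BufferedReader(new StringReader(data));
        reader.lines().forEach(s -> volume[0] += s.length());
        return volume[0];
    }

    public static void main(String[] args) {
        // Hypothetical input: one million short ASCII lines held in memory.
        StringBuilder sb = new StringBuilder();
        for (int i = 0; i < 1_000_000; i++) {
            sb.append("line number ").append(i).append('\n');
        }
        String data = sb.toString();

        long best = Long.MAX_VALUE;
        long total = 0;
        // Repeat the measurement and keep the fastest run, which also gives
        // the JIT compiler time to warm up.
        for (int trial = 0; trial < 10; trial++) {
            long start = System.nanoTime();
            total = volume(data);
            best = Math.min(best, System.nanoTime() - start);
        }
        // Characters per nanosecond equals GB/s when each character is one
        // byte, which holds for this ASCII-only input.
        System.out.printf("volume=%d, %.2f GB/s%n", total, (double) data.length() / best);
    }
}
```

The reported speed depends on the machine, the JVM, and the shape of the input, so treat any figure it prints as indicative only.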

The result is that Java is four times slower than C++, on the same system, for this benchmark:

BufferedReader.lines    0.5 GB/s

This is not the best that Java can do: Java can ingest data much faster. However, my results suggest that on modern systems, Java file parsing might frequently be processor-bound rather than disk-bound or network-bound. That is, you can buy much better disks and network cards, and your system won’t go any faster. Unless, of course, you have really good Java engineers.
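As one illustration of where such engineering effort could go, the sketch below sums line lengths by scanning raw bytes for newline characters, so that no String is created per line. This is an assumed alternative, not the approach benchmarked in this post, and it has not been measured here. The class name ByteLineScanner is hypothetical, the code only recognizes '\n' as a terminator, and it counts bytes rather than characters.

```java
import java.nio.charset.StandardCharsets;

public class ByteLineScanner {
    // Sum of line lengths in bytes, excluding '\n' terminators.
    // Assumption: lines end with '\n' only; a '\r' would be counted as content.
    public static long volume(byte[] data) {
        long volume = 0;
        int lineStart = 0;
        for (int i = 0; i < data.length; i++) {
            if (data[i] == '\n') {
                volume += i - lineStart; // length of the line that just ended
                lineStart = i + 1;
            }
        }
        // Account for a final line that has no trailing newline.
        volume += data.length - lineStart;
        return volume;
    }

    public static void main(String[] args) {
        byte[] data = "first line\nsecond line\n".getBytes(StandardCharsets.UTF_8);
        System.out.println(volume(data));
    }
}
```

Whether this helps in practice depends on what the real processing needs: as soon as each line must become a String, much of the saving may disappear.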

Many firms probably just throw more hardware at the problem.

My source code is available.
