Wednesday, 14 January 2015

Java8 Parallel Streams


Recently found great article on when to use Parallel Stream by Doug lea and team. In this blog i will share some performance result of different data structure.

Splittable data structure
This is one of the most important factor that needs to be consider. All the parallel stream operation are executed using default fork join pool and fork join pool uses Divide and conquer algorithms to split stream in small chunks and apply function on small chunks, so splitting is important factor and if splitting is going to take more time then all computation is going to choke!

Types of datastructure

-Array
Array based data structure are most efficient data structure from splitting perspective because each element can be randomly accessed , so splitting is very quick operation.
Some of the example of array based collections are ArrayList & open addressing based Set/Map.

-Tree
Balanced tree based collection like TreeMap or ConcurrentHashMap. It is easy to chop the collection into 2 parts, but if tree is not balanced then splitting will be not that efficient because task load will be not equal for each thread.
Another thing to consider is that memory access pattern for tree based data structure is random , so there will be memory latency cost when elements are accessed.

-LinkedList
This type of data structure gives worst splitting performance because each element must be visited to split it. Some of the samples from JDK are LinkedList,Queues

- Others
This for all others type of datastructure for eg I/O based , BufferedReader.lines which returns stream but splitting operation is very expensive.

Some performance numbers
All performance tuning guess must be backed by experiment, so lets have look at some numbers.

Some code snippet
Array Stream
public static Stream<Integer> arrayStream() {
if(values==null) {
values = new Integer[COUNT];
for (int c = 0; c < COUNT; c++) {
values[c] = c;
}
}
return Stream.of(values);
}
view raw ArrayStream hosted with ❤ by GitHub
ArrayList Stream
public static Stream<Integer> arrayListStream() {
if(alvalues==null) {
alvalues = new ArrayList<>();
for (int c = 0; c < COUNT; c++) {
alvalues.add(c);
}
}
return alvalues.stream();
}
view raw ArrayListStream hosted with ❤ by GitHub
Set Stream
public static Stream<Integer> setStream() {
if(setValues==null) {
setValues = new HashSet<>();
for (int c = 0; c < COUNT; c++) {
setValues.add(c);
}
}
return setValues.stream();
}
view raw SetStream hosted with ❤ by GitHub
LinkedList Stream

public static Stream<Integer> linkedListStream() {
if(linkedListValues==null) {
linkedListValues = new LinkedList<>();
for(int c=0;c<COUNT;c++) {
linkedListValues.add(c);
}
}
return linkedListValues.stream();
}
view raw LinkedList hosted with ❤ by GitHub

Measurement Code

public static void main(String...args) {
System.out.println("Array " + sumValues( () -> arrayStream()) + " sec");
System.out.println("ArrayList " + sumValues(() -> arrayListStream()) + " sec");
System.out.println("set " + sumValues(() -> setStream()) + " sec");
System.out.println("linkedlist " + sumValues(() -> linkedListStream()) + " sec");
}
public static long sumValues(Supplier<Stream<Integer>> numberSupplier) {
int result = 0;
long fastest = Integer.MAX_VALUE;
for(int cnt=0;cnt<10;cnt++) {
long start = System.nanoTime();
result=numberSupplier.get().parallel().reduce(Integer::sum).get();
long total = System.nanoTime() - start;
fastest = Math.min(fastest, total / 1_000_000);
}
return fastest;
}
view raw MeasurementCode hosted with ❤ by GitHub

Result 

Array 31 sec
ArrayList 31 sec
set 52 sec
linkedlist 106 sec

Result confirms the guess that Array based collections are fastest, so choose right datastructure before you start using parallel streams.

Code used in blog is available @ github

No comments:

Post a Comment