Important Concepts:
- groupingBy: A collector that groups elements by a classifier function, producing a Map.
- Classifier: A function that determines the key for each element (e.g., String::length).
- Downstream Collectors: Optional collectors applied to grouped elements (e.g., toList(), counting()).
- Multi-level Grouping: Nesting groupingBy for hierarchical grouping.
- Partitioning: A special case of grouping with a boolean classifier, using partitioningBy.
Collectors.groupingBy Variants
The groupingBy collector has three overloads:
- groupingBy(classifier): Groups elements into a Map<K, List<T>>.
- groupingBy(classifier, downstream): Groups elements and applies a downstream collector to each group.
- groupingBy(classifier, mapFactory, downstream): Specifies a custom Map implementation (e.g., TreeMap).
Basic Grouping Examples
Group by a Property (Single-Level)
Groups strings by their length, with each value being a List of strings:
List<String> names = List.of("Alice", "Bob", "Charlie", "David");
Map<Integer, List<String>> namesByLength = names.stream()
.collect(Collectors.groupingBy(String::length));
// Result: {3=[Bob], 5=[Alice, David], 7=[Charlie]}
Count Elements per Group
Groups strings by their length and counts the strings in each group:
Map<Integer, Long> countByLength = names.stream()
.collect(Collectors.groupingBy(String::length, Collectors.counting()));
// Result: {3=1, 5=2, 7=1}
Group into a Sorted Map (TreeMap)
Groups strings by their length into a TreeMap, so the keys are kept in ascending order:
Map<Integer, List<String>> sortedNamesByLength = names.stream()
.collect(Collectors.groupingBy(
String::length,
TreeMap::new,
Collectors.toList()
));
// Result: {3=[Bob], 5=[Alice, David], 7=[Charlie]} (keys sorted)
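The map factory argument accepts any Map supplier, not only TreeMap. As a minimal sketch, the hypothetical class below passes LinkedHashMap::new so that keys appear in the order each length is first encountered in the stream. The class and method names are illustrative assumptions, not part of the original article.

```java
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;
import java.util.stream.Collectors;

// Hypothetical demo class; the names are made up for illustration.
final class EncounterOrderGrouping {

    // Groups names by length; keys keep the order in which each length is first seen.
    static Map<Integer, List<String>> byLengthInEncounterOrder(List<String> names) {
        return names.stream()
                .collect(Collectors.groupingBy(
                        String::length,
                        LinkedHashMap::new,
                        Collectors.toList()));
    }

    public static void main(String[] args) {
        List<String> names = List.of("Charlie", "Bob", "Alice", "David");
        System.out.println(byLengthInEncounterOrder(names));
        // prints {7=[Charlie], 3=[Bob], 5=[Alice, David]}
    }
}
```

A TreeMap would order these keys as 3, 5, 7. The LinkedHashMap keeps 7, 3, 5 because that is the order in which the lengths first occur.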
Multi-Level Grouping
You can nest groupingBy for hierarchical grouping. For example, group by length, then by first letter:
List<String> names = List.of("Alice", "Bob", "Charlie", "David", "Amy");
Map<Integer, Map<Character, List<String>>> namesByLengthThenFirstLetter = names.stream()
.collect(Collectors.groupingBy(
String::length,
Collectors.groupingBy(name -> name.charAt(0))
));
// Result: {3={A=[Amy], B=[Bob]}, 5={A=[Alice], D=[David]}, 7={C=[Charlie]}}
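The inner groupingBy can take its own downstream collector, so the innermost values need not be lists. The sketch below (a hypothetical class, with names chosen only for illustration) counts names per length and first letter instead of listing them.

```java
import java.util.List;
import java.util.Map;
import java.util.stream.Collectors;

// Hypothetical demo class; the names are made up for illustration.
final class NestedCountingSketch {

    // Outer key: name length. Inner key: first letter. Value: number of matching names.
    static Map<Integer, Map<Character, Long>> countByLengthThenFirstLetter(List<String> names) {
        return names.stream()
                .collect(Collectors.groupingBy(
                        String::length,
                        Collectors.groupingBy(
                                name -> name.charAt(0),
                                Collectors.counting())));
    }

    public static void main(String[] args) {
        List<String> names = List.of("Alice", "Bob", "Charlie", "David", "Amy");
        // Length 3 holds A=1 and B=1, length 5 holds A=1 and D=1, length 7 holds C=1.
        // Both levels are HashMaps here, so iteration order is not guaranteed.
        System.out.println(countByLengthThenFirstLetter(names));
    }
}
```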
Common Downstream Collectors
Downstream collectors transform the grouped elements. Popular ones include:
- toList(): Collects elements into a List (default).
- toSet(): Collects elements into a Set (removes duplicates).
- counting(): Counts elements in each group.
- mapping(): Transforms elements before collecting (e.g., extract a property).
- reducing(): Reduces elements in each group (e.g., sum, min, max).
- summarizingInt/Long/Double(): Computes statistics (count, sum, min, average, max).
Examples:
Extract a Property with mapping
Group by length, collecting only the first letter of each name:
Map<Integer, List<Character>> firstLettersByLength = names.stream()
.collect(Collectors.groupingBy(
String::length,
Collectors.mapping(name -> name.charAt(0), Collectors.toList())
));
// Result (with the five-name list from the multi-level example): {3=[B, A], 5=[A, D], 7=[C]}
Sum Values with reducing
Group by length, summing the lengths of the names in each group:
Map<Integer, Integer> totalLengthByLength = names.stream()
.collect(Collectors.groupingBy(
String::length,
Collectors.reducing(0, String::length, Integer::sum)
));
// Result (with the five-name list from the multi-level example): {3=6, 5=10, 7=7}
Statistics with summarizingInt
Group by first letter, computing length statistics:
Map<Character, IntSummaryStatistics> statsByFirstLetter = names.stream()
.collect(Collectors.groupingBy(
name -> name.charAt(0),
Collectors.summarizingInt(String::length)
));
// Result (with the five-name list from the multi-level example): {A=IntSummaryStatistics{count=2, sum=8, min=3, average=4.000000, max=5}, ...}
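Of the downstream collectors listed earlier, toSet() has no example of its own. As a rough sketch, the hypothetical class below combines mapping() with toSet() to collect the distinct name lengths for each first letter. The class name, method name, and sample data are assumptions made for illustration.

```java
import java.util.List;
import java.util.Map;
import java.util.Set;
import java.util.stream.Collectors;

// Hypothetical demo class; the names are made up for illustration.
final class DistinctLengthsSketch {

    // For each first letter, collects the distinct lengths of the names starting with it.
    static Map<Character, Set<Integer>> distinctLengthsByFirstLetter(List<String> names) {
        return names.stream()
                .collect(Collectors.groupingBy(
                        name -> name.charAt(0),
                        Collectors.mapping(String::length, Collectors.toSet())));
    }

    public static void main(String[] args) {
        List<String> names = List.of("Alice", "Amy", "Aaron", "Bob");
        // 'A' maps to the lengths 3 and 5 (5 is stored once although two names have it),
        // 'B' maps to the single length 3.
        System.out.println(distinctLengthsByFirstLetter(names));
    }
}
```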
Partitioning (Special Case of Grouping)
Collectors.partitioningBy splits elements into two groups (true and false) based on a predicate. It’s like groupingBy with a boolean classifier, except that the resulting map always contains both keys, even when one group is empty.
Example:
List<String> names = List.of("Alice", "Bob", "Charlie");
Map<Boolean, List<String>> longNames = names.stream()
.collect(Collectors.partitioningBy(name -> name.length() > 3));
// Result: {false=[Bob], true=[Alice, Charlie]}
With a downstream collector:
Map<Boolean, Long> countLongNames = names.stream()
.collect(Collectors.partitioningBy(
name -> name.length() > 3,
Collectors.counting()
));
// Result: {false=1, true=2}
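One practical difference from groupingBy with a boolean classifier shows up when one side is empty: partitioningBy still returns both keys, while groupingBy only creates keys it actually encounters. The sketch below illustrates this with a hypothetical class; the names are illustrative assumptions.

```java
import java.util.List;
import java.util.Map;
import java.util.stream.Collectors;

// Hypothetical demo class; the names are made up for illustration.
final class PartitionVsGroupSketch {

    // partitioningBy: the result always has entries for both false and true.
    static Map<Boolean, List<String>> partitionLong(List<String> names) {
        return names.stream()
                .collect(Collectors.partitioningBy(name -> name.length() > 3));
    }

    // groupingBy with a boolean classifier: only keys that occur are present.
    static Map<Boolean, List<String>> groupLong(List<String> names) {
        return names.stream()
                .collect(Collectors.groupingBy(name -> name.length() > 3));
    }

    public static void main(String[] args) {
        List<String> names = List.of("Bob", "Amy"); // no name is longer than 3
        System.out.println(partitionLong(names)); // prints {false=[Bob, Amy], true=[]}
        System.out.println(groupLong(names));     // prints {false=[Bob, Amy]}
    }
}
```

With partitioningBy, get(true) returns an empty list here. With groupingBy it returns null, so callers would need a null check or getOrDefault.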
When to Use Grouping
- When you need to categorize data based on one or more criteria.
- For aggregating data (e.g., counting, summing, or computing statistics).
- When building hierarchical or summarized views of data.
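The use cases above can be sketched on a small domain type rather than plain strings. The Order record, its fields, and the sample data below are hypothetical and exist only for illustration. The example assumes Java 16 or later for record support.

```java
import java.util.List;
import java.util.Map;
import java.util.TreeMap;
import java.util.stream.Collectors;

// Hypothetical demo class; Order and its fields are made up for illustration.
final class OrderSummarySketch {

    record Order(String customer, String category, int amount) {}

    // Aggregation: total amount per category, with categories sorted by name.
    static Map<String, Integer> totalByCategory(List<Order> orders) {
        return orders.stream()
                .collect(Collectors.groupingBy(
                        Order::category,
                        TreeMap::new,
                        Collectors.summingInt(Order::amount)));
    }

    // Hierarchical view: number of orders per category, then per customer.
    static Map<String, Map<String, Long>> countByCategoryThenCustomer(List<Order> orders) {
        return orders.stream()
                .collect(Collectors.groupingBy(
                        Order::category,
                        Collectors.groupingBy(Order::customer, Collectors.counting())));
    }

    public static void main(String[] args) {
        List<Order> orders = List.of(
                new Order("ann", "books", 20),
                new Order("bob", "books", 15),
                new Order("ann", "games", 40));
        System.out.println(totalByCategory(orders)); // prints {books=35, games=40}
        System.out.println(countByCategoryThenCustomer(orders));
    }
}
```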
Grouping in Java Streams is a flexible way to categorize data. With Collectors.groupingBy(), you can group elements by any classifier and apply downstream collectors for further aggregation. This keeps data analysis and transformation code concise and readable, even for multi-level or summarized views.