The RuleBasedCollator class, a concrete subclass of Collator in java.text, lets you define custom string comparison rules. Collator.getInstance() applies predefined rules for a locale, whereas constructing a RuleBasedCollator from your own rule string gives you explicit control over the ordering.
This is useful when:
- Locale rules are insufficient or undesired
- You need to sort based on domain-specific conventions (e.g., chemical names, custom dictionaries)
Key Features:
- User-defined sorting rules (e.g., “b” < “c” < “ch” < “d”)
- Supports accent and case distinctions (secondary and tertiary differences in the rule syntax) as well as symbol placement
- Overrides locale default sorting behavior
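To make the accent and case handling concrete, here is a minimal sketch of how rule operators interact with collator strength. In the java.text rule syntax, "<" declares a primary (letter) difference and "," a tertiary (case) difference. The class name StrengthSketch, the helper compareAt, and the three-letter rule string are illustrative assumptions, not part of the original examples.

```java
import java.text.Collator;
import java.text.ParseException;
import java.text.RuleBasedCollator;

class StrengthSketch {
    // Hypothetical helper: builds a collator from an illustrative rule string
    // and compares two strings at the requested strength.
    static int compareAt(int strength, String left, String right) {
        try {
            // '<' = primary difference, ',' = tertiary (case) difference
            RuleBasedCollator collator = new RuleBasedCollator("< a, A < b, B < c, C");
            collator.setStrength(strength);
            return collator.compare(left, right);
        } catch (ParseException e) {
            throw new IllegalArgumentException("Invalid collation rules", e);
        }
    }

    public static void main(String[] args) {
        // At PRIMARY strength, case is ignored and the strings compare as equal.
        System.out.println(compareAt(Collator.PRIMARY, "abc", "ABC") == 0);
        // At TERTIARY strength, the lowercase form sorts first under these rules.
        System.out.println(compareAt(Collator.TERTIARY, "abc", "ABC") < 0);
        // A primary difference (b vs. c) decides the order even when case is ignored.
        System.out.println(compareAt(Collator.PRIMARY, "ab", "AC") < 0);
    }
}
```

Lowering the strength is a common way to get case-insensitive comparison without changing the rules themselves.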
Commonly Used Methods
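Three methods cover most everyday use: compare(String, String) for direct comparison, getRules() to read back the rule string, and getCollationKey(String) to precompute keys when the same strings are compared many times. The sketch below exercises each of them; the class name CollatorMethodsSketch, the helper build, and the sample strings are assumptions made for illustration.

```java
import java.text.CollationKey;
import java.text.ParseException;
import java.text.RuleBasedCollator;

class CollatorMethodsSketch {
    // Hypothetical helper: wraps the checked ParseException for brevity.
    static RuleBasedCollator build() {
        try {
            return new RuleBasedCollator("< a < b < c < ch < d < e");
        } catch (ParseException e) {
            throw new IllegalArgumentException("Invalid collation rules", e);
        }
    }

    public static void main(String[] args) {
        RuleBasedCollator collator = build();

        // getRules(): the rule string this collator was built from.
        System.out.println(collator.getRules());

        // compare(): negative because "c" sorts before the unit "ch".
        System.out.println(collator.compare("cab", "chab") < 0);

        // getCollationKey(): keys can be compared repeatedly without
        // re-applying the rules, which helps when sorting large lists.
        CollationKey left = collator.getCollationKey("cab");
        CollationKey right = collator.getCollationKey("chab");
        System.out.println(left.compareTo(right) < 0);
    }
}
```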

Simple Program
import java.text.ParseException;
import java.text.RuleBasedCollator;
import java.util.Arrays;

public class SimpleRuleBasedCollatorExample {
    public static void main(String[] args) throws ParseException {
        // "ch" is declared as a single collation unit that sorts after "c"
        String rule = "< a < b < c < ch < d < e";
        RuleBasedCollator collator = new RuleBasedCollator(rule);
        String[] words = { "charm", "cut", "camel" };
        Arrays.sort(words, collator);
        for (String word : words) {
            System.out.println(word);
        }
    }
}
/*
camel
cut
charm
*/
Because the rule treats “ch” as one unit that sorts after “c”, “cut” comes before “charm”.
Problem Statement
Paani and Mahesh are building a Sanskrit Dictionary App. Sanskrit has unique sorting rules, such as:
- “k” < “kh” < “g” < “gh” < “ṅ”
- “c” < “ch” < “j” < “jh” < “ñ”
Default locale-based sorting does not follow this order, and it treats digraphs such as “kh” as two separate letters. They decide to use RuleBasedCollator to enforce custom Sanskrit collation.
import java.text.ParseException;
import java.text.RuleBasedCollator;
import java.util.Arrays;

public class SanskritDictionarySorter {
    public static void main(String[] args) throws ParseException {
        // Vowels first, then the consonant groups; digraphs such as "kh" are single units
        String rules =
            "< a < ā < i < ī < u < ū < ṛ < ṝ < ḷ < ḹ"
            + " < k < kh < g < gh < ṅ"
            + " < c < ch < j < jh < ñ"
            + " < ṭ < ṭh < ḍ < ḍh < ṇ"
            + " < t < th < d < dh < n"
            + " < p < ph < b < bh < m"
            + " < y < r < l < v < ś < ṣ < s < h";
        RuleBasedCollator sanskritCollator = new RuleBasedCollator(rules);
        String[] words = {
            "ghosha", "khaga", "gita", "katha", "ñana", "chitra", "bhakti", "karma"
        };
        Arrays.sort(words, sanskritCollator);
        System.out.println("Sanskrit Dictionary Sorted Words:");
        for (String word : words) {
            System.out.println(word);
        }
    }
}
/*
Sanskrit Dictionary Sorted Words:
katha
karma
khaga
gita
ghosha
chitra
ñana
bhakti
*/
Note that “katha” precedes “karma” because the rules place “th” before “r”.
The RuleBasedCollator is ideal when default locale-based sorting doesn’t meet the needs of your domain:
- Sorting is based on custom alphabet orders
- You’re dealing with scripts or transliteration schemes that need their own order (such as Devanagari or romanized Sanskrit)
- Locale-based sorting (Collator) isn’t accurate enough
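When a locale's ordering is nearly right, one option is to extend its rules rather than write a full rule string. The sketch below appends a tailoring to the rules of the US English collator, following the pattern shown in the RuleBasedCollator Javadoc. The class name TailoredCollatorSketch and the helper tailored are hypothetical, and the code assumes Collator.getInstance returns a RuleBasedCollator, which is typical but not guaranteed.

```java
import java.text.Collator;
import java.text.ParseException;
import java.text.RuleBasedCollator;
import java.util.Locale;

class TailoredCollatorSketch {
    // Hypothetical helper: appends extra rules to a locale's existing rules.
    static Collator tailored(Locale locale, String extraRules) {
        Collator base = Collator.getInstance(locale);
        if (!(base instanceof RuleBasedCollator)) {
            // Assumption: without a rule string there is nothing to extend.
            return base;
        }
        String rules = ((RuleBasedCollator) base).getRules() + extraRules;
        try {
            return new RuleBasedCollator(rules);
        } catch (ParseException e) {
            throw new IllegalArgumentException("Invalid collation rules", e);
        }
    }

    public static void main(String[] args) {
        Collator standard = Collator.getInstance(Locale.US);
        // "& C < ch, ..." resets at "C" and places the unit "ch" after it.
        Collator custom = tailored(Locale.US, "& C < ch, cH, Ch, CH");

        // Standard English order compares letter by letter: "charm" before "cut".
        System.out.println(standard.compare("charm", "cut") < 0);
        // With the tailoring, every "c" word precedes every "ch" word.
        System.out.println(custom.compare("cut", "charm") < 0);
    }
}
```

Tailoring keeps the locale's handling of every other character, so only the letters you name change position.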
