The Normalizer class (java.text.Normalizer, available since Java 6, together with its nested enum java.text.Normalizer.Form) transforms Unicode text into a normalized form. This is crucial when comparing, storing, or searching strings that look identical on screen but have different internal Unicode representations (e.g., an accented character stored either as one precomposed code point or as a base letter plus a combining mark).
Key Features:
- Ensures consistent Unicode representation
- Supports multiple normalization forms (NFC, NFD, NFKC, NFKD)
- Makes string comparison and storage reliable across languages and inputs
Commonly Used Methods
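The class exposes two static methods: normalize(CharSequence, Normalizer.Form), which returns the text in the requested form, and isNormalized(CharSequence, Normalizer.Form), which reports whether the text is already in that form. The following is a minimal sketch of both; the class name and the helpers toNfc and isNfc are illustrative names, not part of the API.

```java
import java.text.Normalizer;

public class NormalizerMethodsSketch {

    // Hypothetical helper: returns the text composed into NFC.
    static String toNfc(CharSequence text) {
        return Normalizer.normalize(text, Normalizer.Form.NFC);
    }

    // Hypothetical helper: true if the text is already in NFC.
    static boolean isNfc(CharSequence text) {
        return Normalizer.isNormalized(text, Normalizer.Form.NFC);
    }

    public static void main(String[] args) {
        // 'e' followed by U+0301 COMBINING ACUTE ACCENT (two chars)
        String decomposed = "e\u0301";

        System.out.println(isNfc(decomposed));            // false
        String composed = toNfc(decomposed);
        System.out.println(composed.length());            // 1
        System.out.println(composed.equals("\u00e9"));    // true
        System.out.println(isNfc(composed));              // true
    }
}
```

Calling isNormalized first can be a cheap way to skip normalization for text that is already in the target form.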

Normalization Forms (Normalizer.Form)
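Normalizer.Form defines four constants: NFD (canonical decomposition), NFC (canonical decomposition followed by canonical composition), NFKD (compatibility decomposition), and NFKC (compatibility decomposition followed by canonical composition). As a rough illustration, the sketch below runs one string through every form and prints the resulting code points. The sample string (the "ﬁ" ligature U+FB01 followed by "ancé") and the helper codePoints are assumptions chosen for the demonstration.

```java
import java.text.Normalizer;

public class NormalizationFormsSketch {

    // Hypothetical helper: renders each code point as U+XXXX, space-separated.
    static String codePoints(String s) {
        StringBuilder sb = new StringBuilder();
        s.codePoints().forEach(cp -> sb.append(String.format("U+%04X ", cp)));
        return sb.toString().trim();
    }

    public static void main(String[] args) {
        // U+FB01 (fi ligature) + "anc" + U+00E9 (precomposed e-acute): 5 chars
        String input = "\uFB01anc\u00e9";

        for (Normalizer.Form form : Normalizer.Form.values()) {
            String out = Normalizer.normalize(input, form);
            System.out.println(form + " (" + out.length() + " chars): " + codePoints(out));
        }
    }
}
```

For this input, NFC leaves the string at 5 chars, NFD splits only the é (6 chars), NFKC expands only the ligature into "f" and "i" (6 chars), and NFKD does both (7 chars). The compatibility forms change what the text contains, so they suit search keys better than stored display text.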

Simple Program
import java.text.Normalizer;

public class SimpleNormalizerExample {
    public static void main(String[] args) {
        // Precomposed form: a single code point, U+00E9
        String accented = "\u00e9";
        // Decomposed form: U+0065 ('e') + U+0301 (combining acute accent)
        String decomposed = "e\u0301";

        // equals() compares char by char, so the two representations differ
        System.out.println("Are the strings equal before normalization? "
                + accented.equals(decomposed));

        // NFC composes the base letter and the combining mark into U+00E9
        String normalized = Normalizer.normalize(decomposed, Normalizer.Form.NFC);
        System.out.println("Are they equal after normalization? "
                + accented.equals(normalized));
    }
}

/*
Are the strings equal before normalization? false
Are they equal after normalization? true
*/
Problem Statement
Paani and Mahesh are building a search engine for international names. Users might input names using keyboards that produce decomposed characters (e followed by a combining acute accent, U+0301, instead of the precomposed é). To ensure all names match correctly, the system uses Normalizer to unify representations before storing and searching.
import java.text.Normalizer;
import java.util.*;

public class NameSearchEngine {
    // Names are normalized to NFC once, when they are stored
    private static final List<String> database = Arrays.asList(
            Normalizer.normalize("José", Normalizer.Form.NFC),
            Normalizer.normalize("Renée", Normalizer.Form.NFC),
            Normalizer.normalize("Björk", Normalizer.Form.NFC)
    );

    public static void main(String[] args) {
        List<String> userInputs = Arrays.asList(
                "Jose\u0301",   // decomposed form of José
                "Rene\u0301e",  // decomposed form of Renée
                "Bj\u00f6rk"    // composed Björk
        );

        for (String input : userInputs) {
            // Normalize the query to the same form as the stored names
            String normalizedInput = Normalizer.normalize(input, Normalizer.Form.NFC);
            System.out.println("\nSearching for: " + input);

            boolean found = false;
            for (String name : database) {
                if (name.equals(normalizedInput)) {
                    System.out.println("Match found: " + name);
                    found = true;
                    break;
                }
            }
            if (!found) {
                System.out.println("No match found.");
            }
        }
    }
}

/*

Searching for: José
Match found: José

Searching for: Renée
Match found: Renée

Searching for: Björk
Match found: Björk
*/
The Normalizer class is essential for text normalization, especially in globalized applications:
- Comparing strings from different sources (e.g., user input vs. DB)
- Storing names, text, or identifiers that involve accents or diacritics
- Building search engines, authentication systems, or data deduplication tools
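As one illustration of the deduplication use case, the sketch below collapses names that differ only in their Unicode representation. It assumes NFC as the canonical stored form; the class name, the deduplicate helper, and the sample names are hypothetical.

```java
import java.text.Normalizer;
import java.util.ArrayList;
import java.util.Arrays;
import java.util.LinkedHashSet;
import java.util.List;
import java.util.Set;

public class NameDeduplicationSketch {

    // Hypothetical helper: normalizes every name to NFC and drops duplicates,
    // keeping the order in which names first appear.
    static List<String> deduplicate(List<String> names) {
        Set<String> unique = new LinkedHashSet<>();
        for (String name : names) {
            unique.add(Normalizer.normalize(name, Normalizer.Form.NFC));
        }
        return new ArrayList<>(unique);
    }

    public static void main(String[] args) {
        List<String> raw = Arrays.asList(
                "Jos\u00e9",    // composed
                "Jose\u0301",   // decomposed, same name
                "Bj\u00f6rk",   // composed
                "Bjo\u0308rk",  // decomposed, same name
                "Ren\u00e9e"
        );

        List<String> unique = deduplicate(raw);
        System.out.println(raw.size() + " inputs, " + unique.size() + " unique names");
        // prints "5 inputs, 3 unique names"
    }
}
```

Normalizing before inserting into a hash-based collection matters because equals and hashCode on String operate on the raw chars, so two representations of the same name would otherwise be stored as separate entries.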