The BreakIterator class (in java.text) locates boundaries in text: characters, words, sentences, and possible line breaks. It is locale-sensitive and useful in text-processing applications such as editors, search tools, and word counters.
Key Features:
- Locale-aware breaking of text into words, sentences, lines, and characters
- Supports internationalized text segmentation
- Abstract class with factory methods for different break types
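The four factory methods mentioned above can be compared side by side. The following is a minimal sketch, not a definitive implementation: the class name BreakTypesDemo and the helper segments are hypothetical names introduced for illustration, while the getCharacterInstance, getWordInstance, getSentenceInstance, and getLineInstance calls are the standard java.text.BreakIterator factories.

```java
import java.text.BreakIterator;
import java.util.ArrayList;
import java.util.List;
import java.util.Locale;

public class BreakTypesDemo {

    // Hypothetical helper: collects the text between each pair of
    // consecutive boundaries reported by the given iterator.
    public static List<String> segments(BreakIterator iterator, String text) {
        iterator.setText(text);
        List<String> parts = new ArrayList<>();
        int start = iterator.first();
        for (int end = iterator.next(); end != BreakIterator.DONE; start = end, end = iterator.next()) {
            parts.add(text.substring(start, end));
        }
        return parts;
    }

    public static void main(String[] args) {
        String text = "Hi there. Bye!";
        Locale locale = Locale.ENGLISH;

        // Same text, four different notions of "boundary"
        System.out.println("Characters: " + segments(BreakIterator.getCharacterInstance(locale), text));
        System.out.println("Words:      " + segments(BreakIterator.getWordInstance(locale), text));
        System.out.println("Sentences:  " + segments(BreakIterator.getSentenceInstance(locale), text));
        System.out.println("Lines:      " + segments(BreakIterator.getLineInstance(locale), text));
    }
}
```

Note that the word iterator also returns whitespace and punctuation as separate segments, so callers usually filter those out. The line iterator reports positions where a line may legally wrap, not existing newline characters.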
Commonly Used Methods
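As a rough illustration of the navigation methods used most often, the sketch below calls setText, first, next, current, last, previous, isBoundary, following, and preceding on a word iterator. The class name BreakIteratorMethodsDemo and the helper wordAt are hypothetical and exist only for this example. The helper assumes the offset lies inside the text.

```java
import java.text.BreakIterator;
import java.util.Locale;

public class BreakIteratorMethodsDemo {

    // Hypothetical helper: returns the segment (word, space, or punctuation)
    // that covers the character at the given offset.
    public static String wordAt(String text, int offset) {
        if (offset < 0 || offset >= text.length()) {
            throw new IllegalArgumentException("offset out of range: " + offset);
        }
        BreakIterator words = BreakIterator.getWordInstance(Locale.ENGLISH);
        words.setText(text);
        int end = words.following(offset); // first boundary strictly after offset
        int start = words.previous();      // boundary immediately before 'end'
        return text.substring(start, end);
    }

    public static void main(String[] args) {
        String text = "Java is fun";
        BreakIterator words = BreakIterator.getWordInstance(Locale.ENGLISH);
        words.setText(text);                      // attach the text to scan

        System.out.println(words.first());        // first boundary (start of the text)
        System.out.println(words.next());         // boundary after the current one
        System.out.println(words.current());      // most recently returned boundary
        System.out.println(words.last());         // last boundary (end of the text)
        System.out.println(words.previous());     // boundary before the current one
        System.out.println(words.isBoundary(4));  // whether offset 4 is a boundary
        System.out.println(words.following(5));   // first boundary after offset 5
        System.out.println(words.preceding(5));   // last boundary before offset 5
        System.out.println(wordAt(text, 5));      // segment covering offset 5
    }
}
```

Every navigation method returns a boundary offset or BreakIterator.DONE when no further boundary exists, so loops over boundaries should compare against DONE, not against the text length.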

Simple Program
import java.text.BreakIterator;
import java.util.Locale;

public class SimpleBreakIteratorExample {
    public static void main(String[] args) {
        String text = "LotusJavaPrince is teaching Java. They love it!";

        // Create a sentence iterator for English and attach the text to it
        BreakIterator iterator = BreakIterator.getSentenceInstance(Locale.ENGLISH);
        iterator.setText(text);

        // Walk consecutive boundaries; each [start, end) pair is one sentence
        int start = iterator.first();
        for (int end = iterator.next(); end != BreakIterator.DONE; start = end, end = iterator.next()) {
            System.out.println("Sentence: " + text.substring(start, end));
        }
    }
}
/*
Sentence: LotusJavaPrince is teaching Java.
Sentence: They love it!
*/
Problem Statement
Paani and Mahesh are creating an AI-based writing assistant. To provide suggestions and feedback, they need to analyze each word and sentence in a given article. The system should split the text into sentences and then into words, using BreakIterator for accurate boundary detection that works across different locales.
import java.text.BreakIterator;
import java.util.Locale;

public class TextAnalyzer {
    public static void main(String[] args) {
        String article = "Mahesh loves Java. LotusJavaPrince teaches him well. They build amazing programs.";

        // Sentence segmentation
        BreakIterator sentenceIterator = BreakIterator.getSentenceInstance(Locale.ENGLISH);
        sentenceIterator.setText(article);
        int sentenceStart = sentenceIterator.first();
        for (int sentenceEnd = sentenceIterator.next(); sentenceEnd != BreakIterator.DONE;
                sentenceStart = sentenceEnd, sentenceEnd = sentenceIterator.next()) {
            String sentence = article.substring(sentenceStart, sentenceEnd);
            System.out.println("\nSentence: " + sentence.trim());

            // Word segmentation within the sentence
            BreakIterator wordIterator = BreakIterator.getWordInstance(Locale.ENGLISH);
            wordIterator.setText(sentence);
            int wordStart = wordIterator.first();
            for (int wordEnd = wordIterator.next(); wordEnd != BreakIterator.DONE;
                    wordStart = wordEnd, wordEnd = wordIterator.next()) {
                String word = sentence.substring(wordStart, wordEnd).trim();
                if (!word.isEmpty() && Character.isLetterOrDigit(word.charAt(0))) {
                    System.out.println(" Word: " + word);
                }
            }
        }
    }
}
/*

Sentence: Mahesh loves Java.
 Word: Mahesh
 Word: loves
 Word: Java

Sentence: LotusJavaPrince teaches him well.
 Word: LotusJavaPrince
 Word: teaches
 Word: him
 Word: well

Sentence: They build amazing programs.
 Word: They
 Word: build
 Word: amazing
 Word: programs
*/
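For feedback such as flagging overly long sentences, the same two-level segmentation can return numbers instead of printing. The sketch below is one possible approach under stated assumptions: the class name SentenceStats and the method wordsPerSentence are hypothetical, and a segment is counted as a word only if it starts with a letter or digit, mirroring the filter used in TextAnalyzer.

```java
import java.text.BreakIterator;
import java.util.ArrayList;
import java.util.List;
import java.util.Locale;

public class SentenceStats {

    // Hypothetical helper: returns the number of words in each sentence, in order.
    public static List<Integer> wordsPerSentence(String text, Locale locale) {
        BreakIterator sentences = BreakIterator.getSentenceInstance(locale);
        BreakIterator words = BreakIterator.getWordInstance(locale); // reused for every sentence
        sentences.setText(text);

        List<Integer> counts = new ArrayList<>();
        int sentenceStart = sentences.first();
        for (int sentenceEnd = sentences.next(); sentenceEnd != BreakIterator.DONE;
                sentenceStart = sentenceEnd, sentenceEnd = sentences.next()) {
            String sentence = text.substring(sentenceStart, sentenceEnd);
            words.setText(sentence); // re-point the same iterator at the new sentence

            int count = 0;
            int wordStart = words.first();
            for (int wordEnd = words.next(); wordEnd != BreakIterator.DONE;
                    wordStart = wordEnd, wordEnd = words.next()) {
                // Skip whitespace and punctuation segments
                if (Character.isLetterOrDigit(sentence.codePointAt(wordStart))) {
                    count++;
                }
            }
            counts.add(count);
        }
        return counts;
    }

    public static void main(String[] args) {
        String article = "Mahesh loves Java. LotusJavaPrince teaches him well. They build amazing programs.";
        System.out.println(wordsPerSentence(article, Locale.ENGLISH));
    }
}
```

Creating the word iterator once and calling setText for each sentence avoids constructing a new iterator per sentence, which is the main design difference from the program above.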
The BreakIterator class is vital for text boundary analysis, especially for internationalized and language-aware applications. Consider it when you are:
- Building editors, translators, summarizers, or NLP tools
- Implementing locale-sensitive sentence or word segmentation
- Handling user input, search indexing, or grammar checks