Michael Gearon

Using the Intl.Segmenter

For a recent prototype, I needed to break up a chunk of text into its sentences so that the user could interact with each one. For example, they could copy a sentence to their clipboard, view its comments or leave a comment about it.

I’m not that familiar with JavaScript, something I’m trying to improve in 2025, but luckily I came across this post from Stefan Judis about how to split JavaScript strings into sentences, words or graphemes with “Intl.Segmenter”.

The code

With the Intl.Segmenter object in hand, I went about coding this. The output I was looking for was each sentence wrapped in a span tag with the class sentence. With that class name in place, I then wanted to add some CSS so that hovering over a sentence gives it a background colour, highlighting which sentence the pointer is on.

HTML

In this instance the HTML is straightforward: I’m just putting some Lorem Ipsum text inside a paragraph element which has an id of text. This id will then be used by the JavaScript to find the paragraph we want to break into sentences.

<div class="box">
  <p id="text">Lorem ipsum dolor sit amet, consectetur adipiscing elit. Nunc vel justo vitae mauris lacinia malesuada. Praesent sed quam consequat, varius ex et, tempor nisl. Cras eget aliquet ligula. Sed nec tristique dolor, sed posuere enim. Aenean nec faucibus justo, molestie sodales ex. Suspendisse ullamcorper ullamcorper lorem, eu pulvinar orci sodales eget. Curabitur id enim dolor. Ut ullamcorper, ligula eget hendrerit posuere, nulla odio consequat urna, sit amet finibus tortor tortor eu libero. Duis fringilla quam nunc, a mollis est vulputate quis. Nunc pharetra turpis mauris, ac malesuada ante congue sit amet. Nam at nisl quis mauris aliquet maximus.</p>
</div>

JavaScript

In the JavaScript I’ve created a function called splitSentence. Inside it, a const declaration creates an Intl.Segmenter object, specifying that our text is English and that we want to break the text into sentences (the granularity could instead be word, for example).
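To make the granularity option a bit more concrete, here is a minimal sketch that runs the same text through the sentence, word and grapheme settings. The sample strings and the segmentText helper are made up for illustration and are not part of the prototype.

```javascript
// Sample text, made up for this illustration.
const sample = 'Hello world. How are you?';

// Hypothetical helper: returns the raw segments for a given granularity.
function segmentText(text, granularity, locale = 'en') {
  const segmenter = new Intl.Segmenter(locale, { granularity });
  return Array.from(segmenter.segment(text), (part) => part.segment);
}

// Sentence segments keep their trailing whitespace,
// so joining them back together reproduces the input.
const sentences = segmentText(sample, 'sentence');

// Word granularity also yields spaces and punctuation as segments;
// the isWordLike flag lets us keep only the actual words.
const wordSegmenter = new Intl.Segmenter('en', { granularity: 'word' });
const words = Array.from(wordSegmenter.segment(sample))
  .filter((part) => part.isWordLike)
  .map((part) => part.segment);

// Grapheme granularity keeps "e" plus a combining accent together
// as one user-perceived character.
const graphemes = segmentText('e\u0301!', 'grapheme');

console.log(sentences);
console.log(words);
console.log(graphemes.length);
```

One thing to notice is that word granularity needs the extra isWordLike filter, whereas sentence granularity can be used as-is.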

Then I created a new const called element, which looks in the document for an element with the ID text. If the code finds it, it breaks that paragraph up into sentences and wraps each sentence in a span element with a class of sentence.

If it can’t find the element, it writes a warning to the console saying that no matching element was found.

document.addEventListener("DOMContentLoaded", function () {
  function splitSentence() {
    
    const segmenter = new Intl.Segmenter('en', { granularity: 'sentence' });

    // Get the element by ID
    const element = document.getElementById('text');

    if (element) {
      // Use the segmenter to split the text content of the element into sentences
      const text = element.textContent;
      const segments = Array.from(segmenter.segment(text)).map(segment => segment.segment);

      // Wrap each sentence in a span with a class for styling
      const wrappedSegments = segments.map(sentence => `<span class="sentence">${sentence}</span>`);

      // Replace the element's HTML content with the wrapped sentences joined with spaces
      element.innerHTML = wrappedSegments.join(' ');
    } else {
      console.warn("No element with ID 'text' found.");
    }
  }

  // Call the function to process the element
  splitSentence();
});
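One thing I’d flag about the approach above: it reads textContent and writes the result back through innerHTML, so any < or & in the text would be treated as markup. Each segment also keeps its trailing space, which then gets doubled up by the join. Here is a rough sketch of a variation that trims and escapes each sentence first. The escapeHtml and wrapSentences helpers are hypothetical names of my own, not something Intl.Segmenter provides.

```javascript
// Hypothetical helper: escapes characters that have special meaning in HTML.
function escapeHtml(text) {
  return text
    .replace(/&/g, '&amp;')
    .replace(/</g, '&lt;')
    .replace(/>/g, '&gt;')
    .replace(/"/g, '&quot;');
}

// Hypothetical helper: returns markup with each sentence wrapped in a span.
function wrapSentences(text, locale = 'en') {
  const segmenter = new Intl.Segmenter(locale, { granularity: 'sentence' });
  return Array.from(segmenter.segment(text), ({ segment }) => segment.trim())
    .filter((sentence) => sentence.length > 0) // skip whitespace-only segments
    .map((sentence) => `<span class="sentence">${escapeHtml(sentence)}</span>`)
    .join(' ');
}

// In the browser this could replace the innerHTML line, for example:
//   element.innerHTML = wrapSentences(element.textContent);
console.log(wrapSentences('One < two. Three & four.'));
```

Keeping the string work in a plain function like this also means it can be tried outside the browser, for example in Node.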

CSS

.sentence {
    display: inline;
    margin: 8px 0;
    cursor: pointer;
    box-sizing: border-box;
    margin-right: 7px;
}

.sentence:hover {
    background-color: #ffdd00;
}
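With the spans in place and styled, the copy-to-clipboard interaction mentioned at the start could be wired up along these lines. This is only a sketch and not part of the CodePen: copySentence is a hypothetical helper, and it assumes the page runs in a secure context where navigator.clipboard is available.

```javascript
// Hypothetical helper: copies the text of the .sentence span that was clicked.
// `clipboard` defaults to the browser's navigator.clipboard but can be injected.
async function copySentence(target, clipboard = globalThis.navigator?.clipboard) {
  const span = target?.closest?.('.sentence');
  if (!span || !clipboard) return false; // not a sentence, or no Clipboard API
  await clipboard.writeText(span.textContent.trim());
  return true;
}

// Browser-only wiring: one delegated listener on the paragraph with id "text",
// so it keeps working after the paragraph's contents are replaced with spans.
if (typeof document !== 'undefined') {
  document.addEventListener('DOMContentLoaded', () => {
    document.getElementById('text')?.addEventListener('click', (event) => {
      copySentence(event.target).catch((error) => console.warn(error));
    });
  });
}
```

Listening on the paragraph rather than on every span means there is only one listener to manage, however many sentences the text contains.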

Output

And here is the output on CodePen:

See the Pen “Using the Intl.Segmenter” by Michael Gearon (@michaelgearon) on CodePen.

Written by

Michael Gearon

Senior Interaction Designer and Co-Author of Tiny CSS Projects