I want to receive a lot of text (e.g. a book chapter), and create an array of the sentences.
My current code is:
text.match( /[^\.!\?]+[\.!\?]+["']?/g );
This only works when the text ends with one of [. ! ?]. If the final sentence has no punctuation at the end, it's lost.
How do I split my text into sentences, allowing for the final sentence to not have punctuation?
\n, i.e. a new line.
– SaidbakR, Dec 4, 2016 at 11:30
Use $ to match the end of the string:
/[^\.!\?]+[\.!\?]+["']?|.+$/g
Or maybe you want to allow whitespace characters at the end:
/[^\.!\?]+[\.!\?]+["']?|\s*$/g
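To see how the `.+$` alternative behaves, here is a minimal sketch that wraps the first pattern above in a helper. The name `splitSentences`, the trimming, and the empty-string filter are assumptions added for illustration, not part of the answer.

```javascript
// Sketch only: `splitSentences` is a hypothetical helper name.
function splitSentences(text) {
  // First alternative: a punctuated sentence, optionally followed by a closing quote.
  // Second alternative: whatever remains at the very end of the string.
  const matches = text.match(/[^.!?]+[.!?]+["']?|.+$/g) || []; // match() returns null when nothing matches
  // Trim each piece and drop whitespace-only leftovers (an added assumption).
  return matches.map((s) => s.trim()).filter((s) => s.length > 0);
}

console.log(splitSentences("One potato. Two potato! Three"));
// → [ 'One potato.', 'Two potato!', 'Three' ]
```

One caveat: `.` does not match line breaks, so if the unpunctuated tail spans several lines, only its last line is captured. Replacing `.+$` with `[\s\S]+$` is one possible way around that.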
It depends on the characters in the text, but
text.match( /[^\.!\?]+[\.!\?]+|[^\.!\?]+/g );
can do the job.
(If it doesn't work, could you provide a few sentences that it fails to match?)
If you don't need the sentence punctuation in your result, you can just use split:
var txt = "One potato. Two Potato. Three";
txt.split( /[\.!\?]+/ );
// [ 'One potato', ' Two Potato', ' Three' ]
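The split approach leaves leading spaces on each piece, and it produces a trailing empty string when the text ends with punctuation. A small sketch of one way to tidy that up follows; `splitWithoutPunctuation` is a hypothetical name used only for this example.

```javascript
// Sketch only: `splitWithoutPunctuation` is a hypothetical helper name.
function splitWithoutPunctuation(text) {
  return text
    .split(/[.!?]+/)               // cut at each run of sentence punctuation
    .map((s) => s.trim())          // remove the whitespace left around each piece
    .filter((s) => s.length > 0);  // drop the empty string after final punctuation
}

console.log(splitWithoutPunctuation("One potato. Two Potato. Three."));
// → [ 'One potato', 'Two Potato', 'Three' ]
```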
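You can just use [^\.!\?]+; you don't need the rest:
const text = "Mr. Brown Fox. hello world. hi again! hello one more time";
console.log(text.match(/[^\.!\?]+/g));
// [ 'Mr', ' Brown Fox', ' hello world', ' hi again', ' hello one more time' ]
// Note: the punctuation is dropped, and "Mr." is split off as its own piece.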