Thursday, 3 September 2020

PHP Regex function to extract text between tags for a synopsis or intro text

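The following method prepares a synopsis from the body text. It returns the first list item if the body contains one, and otherwise the text before the first bullet-style line. The second part contains the unit tests: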

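/**
 * Returns a short synopsis extracted from the body.
 *
 * @param string|null $bodyString Body string
 * @return string|null
 */
public function getSynopsis(?string $bodyString): ?string
{
    if (null === $bodyString) {
        return null;
    }

    // Prefer the first <li>...</li> element if the body contains one.
    $matchFound = \preg_match(
        '/(<li>\s*(.*?)\s*<\s*\/li>)/mu',
        $bodyString,
        $matches
    );
    if (0 < $matchFound) {
        // current() returns $matches[0], the full match including the tags.
        return \strip_tags(\trim(\current($matches)));
    }

    // Otherwise split at the first line that starts with a bullet-like marker
    // and keep the first non-empty piece.
    $parts = \preg_split('/^\s?\W\s+/mu', $bodyString, 2, \PREG_SPLIT_NO_EMPTY);
    if (false === $parts || [] === $parts) {
        // Nothing usable came out of the split: fall back to the raw body.
        return $bodyString;
    }

    return \strip_tags(\trim(\current($parts)));
}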

public function dpTestSynopsis(): array
{
    return [
        // Bullet-style body: the leading "* " marker is split off and the tags are stripped.
        [
            'value' => "* <b><h1>First sentence for synopsis</b>\n * 2nd sentence from the body\n",
            'expect' => 'First sentence for synopsis',
        ],
        // List only: the first <li> element is returned.
        [
            'value' => "<ul><li>list_1 sentence</li><li>List 2 sentence</li></ul>",
            'expect' => 'list_1 sentence',
        ],
        // Text followed by a list: the first <li> element still takes priority.
        [
            'value' => "First sentence for synopsis \n <ul><li>list_1 sentence</li><li>List 2 sentence</li></ul>",
            'expect' => 'list_1 sentence',
        ],
    ];
}

/**
 * @dataProvider dpTestSynopsis
 */
public function testGetSynopsis($value, $expect)
{
    self::assertEquals($expect, $this->sut->getSynopsis($value));
}
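
To try the idea outside a class and without PHPUnit, here is a minimal standalone sketch of the same two-step logic. The function name synopsisSketch and the sample inputs are made up for illustration. It also reads the inner capture group instead of calling current() on the matches, so treat it as one possible variant and not as the method above.

```php
<?php
declare(strict_types=1);

/**
 * Standalone sketch of the synopsis logic (hypothetical helper, not the original method).
 */
function synopsisSketch(?string $body): ?string
{
    if (null === $body) {
        return null;
    }

    // Step 1: the first <li>...</li> element wins; group 1 holds its inner text.
    if (1 === preg_match('/<li>\s*(.*?)\s*<\s*\/li>/u', $body, $matches)) {
        return trim(strip_tags($matches[1]));
    }

    // Step 2: split at the first line that starts with a bullet-like marker
    // (optional whitespace, one non-word character, whitespace) and keep the first piece.
    $parts = preg_split('/^\s?\W\s+/mu', $body, 2, PREG_SPLIT_NO_EMPTY);
    if (false === $parts || [] === $parts) {
        return $body;
    }

    return trim(strip_tags($parts[0]));
}

// Sample inputs (assumed for illustration).
echo synopsisSketch('<ul><li>list_1 sentence</li><li>List 2 sentence</li></ul>'), PHP_EOL;
echo synopsisSketch("<b>Intro sentence</b>\n * second point\n"), PHP_EOL;
var_export(synopsisSketch(null));
echo PHP_EOL;
```

The first call goes through the list branch and the second through the split branch, because the second input has no list item but has a bullet-style second line. A null body is passed straight through as null.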
Hope it helps!