PHP - Remove unopened or unclosed HTML tags.
Posted July 31st 2014, 8:51pm
I need to use PHP to remove HTML tags that either have an opening with no closing or a closing with no opening. For example, I would want <div><span></div> to convert to <div></div> and I would want <div></span></div> to also convert to <div></div>.

I've found a way to do this using the following code:

$domDoc = new DOMDocument();
$domDoc->loadHTML($htmlCodeString);
$htmlCodeString = $domDoc->saveHTML();


There are several reasons that I do not want to do this. They are:
1.) This method does catch tags that are not properly closed, but it raises the following warning:
DOMDocument::loadHTML(): Unexpected end tag
2.) The HTML it generates is a complete standalone page, but I only need a fragment. It wraps the variable in <html></html> tags and throws <!DOCTYPE html PUBLIC "-//W3C//DTD HTML 4.0 Transitional//EN" "http://www.w3.org/TR/REC-html40/loose.dtd"> in for good measure.

How can I do what I want without doing something I don't want at the same time? I suppose I could always just use the string length function to hack my way around this, but I would like to avoid doing so if possible.
φ
Posted August 1st 2014, 9:17am
For your first issue, try prepending the @ error-suppression operator to the call that raises the warning:

@$domDoc->loadHTML($htmlCodeString);

For your second issue, you might want to take a look at the user comments on this page:

http://php.net/manual/en/domdocument.savehtml.php
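
Putting both suggestions together, here is a rough sketch of one way it could look. It is only an illustration under a few assumptions: the function name balanceHtmlFragment is made up for this post, it relies on saveHTML() accepting a node argument (PHP 5.3.6 or later), and it uses libxml_use_internal_errors() in place of @ so the parser warnings are collected quietly instead of being printed.

```php
<?php
// Hypothetical helper (name invented for this example): returns the fragment
// as the parser repaired it, without the <!DOCTYPE>, <html> or <body> wrappers.
function balanceHtmlFragment($html)
{
    // Collect libxml parse warnings internally instead of emitting them.
    $previous = libxml_use_internal_errors(true);

    $doc = new DOMDocument();
    // Wrapping the input guarantees a <body> element exists to read back from.
    // Note: loadHTML() assumes ISO-8859-1 unless the markup declares a charset.
    $doc->loadHTML('<html><body>' . $html . '</body></html>');

    // Discard the collected warnings and restore the previous error mode.
    libxml_clear_errors();
    libxml_use_internal_errors($previous);

    $body = $doc->getElementsByTagName('body')->item(0);
    $out = '';
    if ($body !== null) {
        // Serialize only the children of <body>, one node at a time.
        foreach ($body->childNodes as $child) {
            $out .= $doc->saveHTML($child);
        }
    }
    return $out;
}

echo balanceHtmlFragment('<div></span></div>');
```

One caveat: as far as I know, the parser drops a stray closing tag such as </span>, but it closes a dangling opening tag rather than removing it, so <div><span></div> will most likely come back with the span closed instead of stripped. If the empty element really has to go, that would need an extra pass over the DOM.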
φ