import - How can I prevent the WordPress Importer from munging double-newline paragraph breaks to a single newline?

When you import posts with the standard WordPress Importer plugin, you will likely find that it changes the standard paragraph break sequence of \n\n to a single \n. When the post is displayed later, this prevents wpautop() from wrapping consecutive paragraphs in <p>...</p> tags; it replaces the break between them with a <br /> instead.
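
To make the symptom concrete, here is a minimal sketch of the difference. Note that simple_autop() is a hypothetical, heavily simplified stand-in written for this illustration; it models only the two rules that matter here (blank line means new paragraph, single newline means <br />) and is not the real wpautop() implementation.

```php
<?php
// Hypothetical, simplified stand-in for wpautop(). It only models the two
// rules relevant to this question and is NOT the real WordPress code.
function simple_autop( string $text ): string {
    $text = str_replace( array( "\r\n", "\r" ), "\n", trim( $text ) );

    // Two or more consecutive newlines separate paragraphs.
    $paragraphs = preg_split( '/\n{2,}/', $text, -1, PREG_SPLIT_NO_EMPTY );

    $html = '';
    foreach ( $paragraphs as $p ) {
        // A single newline inside a paragraph becomes a line break.
        $html .= '<p>' . str_replace( "\n", "<br />\n", $p ) . "</p>\n";
    }
    return $html;
}

$original = "First paragraph.\n\nSecond paragraph."; // as exported
$munged   = "First paragraph.\nSecond paragraph.";   // after the importer

echo simple_autop( $original ); // two separate <p> elements
echo simple_autop( $munged );   // one <p> containing a <br />
```

With the double newline intact, each paragraph gets its own <p> element; once it has collapsed to a single newline, both end up in one <p> joined by a <br />.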

When you import posts with the standard WP Importer plugin, most users are going to find that it changes the standard paragraph break sequence of \n\n to a single \n. When it comes time to display the post later, this will prevent wpautop() from wrapping consecutive paragraphs in <p>...</p> tags, instead replacing the break between them with a <br />.

Asked Dec 29, 2018 at 0:21 by scott8035

1 Answer

I banged my head on the wall for hours before I realized there was a simple solution: pre-process the import XML file before running WP Importer. Just run the file through a home-grown filter program that applies wpautop() to the content before the plugin has a chance to munge it.

Since getting the content blocks out of the input XML is a little tedious, I decided to share my code with the community to kick-start your next import. The code and a lengthier explanation are at "WordPress import with wpautop". I'll include the PHP code here for preservation within StackExchange:

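// wpautop() must already be defined, e.g. by loading WordPress first (see below).
$open   = '<content:encoded><![CDATA[';
$close  = ']]></content:encoded>';
$accum  = false; // true while inside a multi-line content block
$buffer = '';

// Compare against false so that a line containing only "0" does not end the loop.
while ( false !== ( $line = fgets( STDIN ) ) ) {
    // Normalize line endings to \n.
    $line = str_replace( array( "\r\n", "\r" ), "\n", $line );

    $start = false;
    $end   = false;
    if ( preg_match( '/^\s*<content:encoded><!\[CDATA\[/', $line ) ) {
        $line  = preg_replace( '/^\s*<content:encoded><!\[CDATA\[/', '', $line );
        $start = true;
    }
    if ( preg_match( '/\]\]><\/content:encoded>\s*$/i', $line ) ) {
        $line = preg_replace( '/\]\]><\/content:encoded>\s*$/i', '', $line );
        $end  = true;
    }

    if ( $start && $end ) {
        // Single-line content: the markers were stripped above, so re-wrap it.
        echo $open . wpautop( $line ) . $close . "\n";
    } elseif ( $start ) {
        $accum  = true;
        $buffer = $line;
    } elseif ( $end ) {
        $accum   = false;
        $buffer .= $line;
        echo $open . wpautop( $buffer ) . $close . "\n";
    } elseif ( $accum ) {
        $buffer .= $line;
    } else {
        echo $line; // Outside content blocks: pass through unchanged.
    }
}

exit( 0 );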

This technique relies on being able to run wpautop() from inside a plain ol' command line program. See "Using WordPress in a non-interactive 'batch' CLI process" for details on how to do that.
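
For orientation only, one common way to make wpautop() available in a CLI script is to require the wp-load.php file from the root of a WordPress install. The sketch below assumes that approach; it is not necessarily what the linked article does. The helper name find_wp_load() and the convention of passing the install directory as the first argument are made up for this example.

```php
<?php
// Hypothetical helper: return the path to wp-load.php inside $wp_root,
// or null if that directory does not look like a WordPress install.
function find_wp_load( string $wp_root ): ?string {
    $candidate = rtrim( $wp_root, '/' ) . '/wp-load.php';
    return is_file( $candidate ) ? $candidate : null;
}

// Assumption: the WordPress directory is passed as the first CLI argument;
// fall back to the current directory otherwise.
$wp_root = $argv[1] ?? '.';
$wp_load = find_wp_load( $wp_root );

if ( null === $wp_load ) {
    fwrite( STDERR, "wp-load.php not found in {$wp_root}\n" );
} else {
    // Loaded at file scope so WordPress sets up its globals normally.
    require_once $wp_load;
    // From here on, wpautop() and the rest of the WordPress API are callable.
}
```

Depending on the site's configuration, loading WordPress this way may need extra setup, so treat this as a starting point rather than a drop-in solution.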
