Category Archives: WP-CLI

Cleaning up hacked WordPress spam, content injection and defacement using WP-CLI regex search

This post shows how to use the built-in search-replace command in WP-CLI with regex matching to batch-remove harmful content from your WordPress site. This means you can remove hundreds or thousands of injections in a matter of seconds instead of going through content and database dumps manually.

Example of post_content injection:

...
Lorem ipsum dolor sit amet. <script src="https://nameserverdom.tk/assdhdfer" type="text/javascript"></script><script src="https://nameserverdom.tk/assdhdfer" type="text/javascript"> </script>Lorem ipsum dolor sit amet.
...

Go to https://regex101.com/, pick the PCRE (PHP) flavor, since WP-CLI runs the pattern through PHP’s regex functions, and work out a regex that fits your type of defacement.

For the defacement above, I settled on:

<script.*?tk.*?<\/script>

Now it’s time to run WP-CLI to remove the defacement.

It’s always good to test first with the --dry-run flag, which simulates a run but doesn’t actually do any replacements.

wp search-replace '<script.*?tk.*?<\/script>' '' --all-tables --dry-run --report-changed-only --precise --regex --regex-delimiter='/'

WP-CLI will tell you how many replacements it expects to make in each table and column. Once the numbers look right, take a backup with wp db export, then remove --dry-run to get the final command:

wp search-replace '<script.*?tk.*?<\/script>' '' --all-tables --report-changed-only --precise --regex --regex-delimiter='/'
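Because the real run rewrites the database in place, it can help to tie the backup and the replacement together. The following is a minimal sketch of that idea, not part of WP-CLI itself: the function name remove_injection and the backup file name before-cleanup.sql are hypothetical, while the wp db export and wp search-replace calls use the same flags as the commands above.

```shell
# Hypothetical helper: back up the database, then run the regex replacement.
# Pass --dry-run as the second argument to preview without writing anything.
remove_injection() {
  local pattern="$1"
  local mode="${2:-}"                 # empty, or "--dry-run" to preview
  local backup="before-cleanup.sql"   # hypothetical backup file name
  local dry_run_flag=""

  if [ "$mode" = "--dry-run" ]; then
    dry_run_flag="--dry-run"
  else
    # Export first so a bad replacement can be rolled back with `wp db import`.
    wp db export "$backup" || return 1
  fi

  # $dry_run_flag is left unquoted on purpose: it expands to nothing on a real run.
  wp search-replace "$pattern" '' --all-tables $dry_run_flag \
    --report-changed-only --precise --regex --regex-delimiter='/'
}

# Usage:
#   remove_injection '<script.*?tk.*?<\/script>' --dry-run
#   remove_injection '<script.*?tk.*?<\/script>'
```

If the result is not what you expected, wp db import before-cleanup.sql restores the exported state.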

Migrating a WordPress Site From HTTP to HTTPS with WP-CLI

It’s as easy as:

Take a backup

wp db export

Test rewriting from HTTP to HTTPS

wp search-replace 'http://example.com' 'https://example.com' --dry-run

If everything is looking good, it’s time to rewrite:

wp search-replace 'http://example.com' 'https://example.com'

Some tutorials recommend adding the --skip-columns=guid flag. The reasoning is that changing GUIDs makes RSS readers show old posts as new ones. I don’t think this is a big problem, and it’s much cleaner not to keep non-SSL URLs in your database. Leaving the old GUIDs in place can also cause issues if you use the GUID field to look up image URLs or IDs for attachments (a very popular way of mapping a URL to an attachment ID). With that in mind, I find it best to omit this flag.
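If you are unsure which way to go, one option is to compare the two variants before deciding. The following is a minimal sketch under the assumption that a side-by-side dry run is enough for that; the function name preview_guid_impact is hypothetical, and both calls use --dry-run, so nothing is written.

```shell
# Hypothetical helper: preview the migration with and without the guid column.
# Both calls use --dry-run, so the database is left untouched.
preview_guid_impact() {
  local old_url="$1"
  local new_url="$2"

  echo "== Including guid =="
  wp search-replace "$old_url" "$new_url" --dry-run --report-changed-only

  echo "== Skipping guid =="
  wp search-replace "$old_url" "$new_url" --dry-run --report-changed-only \
    --skip-columns=guid
}

# Usage:
#   preview_guid_impact 'http://example.com' 'https://example.com'
```

The difference between the two reports is the number of guid values that would keep the old http:// URL if you added the flag.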