Posts

Showing posts with the label Find and Replace

LibreOffice is Not Responding when doing massive Find and Replace ... or when doing other things. The Fix.

We recommend you bookmark this blog. Searching forums for helpful information usually just ends in frustration and anger. This blog does not speculate or guess. If we post it, we've tried it and it works. If you don't want to read all the intro stuff, skip to the section in yellow below and heed those instructions only.

Scenario

I do a lot of conversions of public domain books from PDF to .odt format. In doing so, I work from .txt versions of the books. That process requires a ton of clean-up in the .odt files, including massive Find and Replace tasks. For example, a file will be full of repeated spaces between words where there should be only one. To fix that, I search for (space, space) and replace it with (space). And herein lies the rub: in a 600-page document, there can be upwards of 100,000 replacements. Naturally, it takes a long time to do all those replacement...
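As a side illustration of the space clean-up described above, here is a minimal Python sketch that does the same job on a plain .txt file before it ever reaches LibreOffice. It is not the method from this post; the function name `collapse_spaces` and the sample text are made up for the example. It uses the regular expression ` {2,}` (a space followed by `{2,}`), which matches any run of two or more spaces, so a single pass handles triple and longer runs that a plain (space, space) search would need several passes to clear.

```python
import re


def collapse_spaces(text: str) -> str:
    """Replace every run of two or more spaces with a single space.

    Hypothetical helper for illustration; it only touches the space
    character, so tabs and line breaks are left alone.
    """
    return re.sub(r" {2,}", " ", text)


if __name__ == "__main__":
    sample = "It  was   the best    of times."
    print(collapse_spaces(sample))  # prints: It was the best of times.
```

The same pattern should also work in LibreOffice's Find & Replace dialog with the "Regular expressions" option ticked, though that is an assumption worth testing on a copy of your document first.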

Common Regex Find & Replace strings used by this Blogger.

As I have stated in other posts, I do a lot of document conversions from PDF, txt and epub to .docx, .odt and Google Docs. Most of the documents are between 5 and 500 pages, occasionally stretching to 700 or 1,000+ pages. I work in a Linux environment, so my most common conversion tools are Okular, Tesseract (command line), G Suite, and Calibre, in that order, with Okular being the most-often used. In some cases, the documents I am converting have already been scanned into plain text (txt), and I simply download them and convert to the final format (usually .odt or .docx) so that I can add images, footnotes, comments, etc., often reconverting the final product back to PDF and/or epub. All that is more fully explained in this post. Below are some of the more common (and not-so-common) Regex and non-Regex hacks I use. Again, I work strictly in a Linux OS environment, but much of my work is converted to MS Word and Google Docs, so I use LibreOffice's built-in Fi...