Posts

Converting an image pdf file to a searchable text pdf file in a Linux environment

Okay, so that's a really long title for a blog post, but sometimes it takes many words to explain what you are actually doing. I learned that lesson by spending a lot of time on mostly worthless forums, where people have very little ability to write a subject line that has anything to do with their issue. At any rate, some background. I love downloading (mostly) public-domain books and documents, but they are often scanned as image files. As I am a writer and want to quote from the pdf, it is much easier to convert the picture pdf to a text pdf so I can copy and paste rather than retype the quoted material. There are lots of ways to go about this conversion, but they often require buying conversion software or paying to play in the cloud. I hate spending money on work stuff, so here's my simple, quick solution. Install gscan2pdf. In Ubuntu, you can do that from the Ubuntu Software Centre or the Gnome Software Center. If you are int...

OKULAR WARNING: Don't Lose Your Bookmarks

Here at Luke's LO Hacks we love Okular, a free, open-source pdf program that is one of our main companion programs to LibreOffice Writer. It's great, but not perfect, which means that from time to time we either find a glitch or have a suggestion for an improvement. In truth, glitches are few, but one significant glitch we discovered was the loss of bookmarks after renaming a pdf file that we had bookmarked. Here's the sequence: Open the pdf. Add bookmarks (in this case, quite a few). Save the file and close it. In Nautilus (the file manager for Ubuntu Linux), rename the file to append the word "Bookmarked." Re-open the file. Result: The bookmarks disappeared, and we had to re-bookmark everything. Bug Report: A bug report was submitted (Okular version 1.10.0). WHAT YOU SHOULD DO: If you have experienced the same problem, make sure you rename your file BEFORE adding bookmarks; and once you have added bookmarks, don't change the file name unless you don't mind re-bookmark...

Fixing pasted or imported text that won't break properly and causes large white-spaces between words

The first passage below shows text copied and pasted "without formatting" into a new document. Note that some words break at the end of the line, but without hyphenation. Note also that the type is not justified, even though the paragraph style is fully justified. If you try to fix this the easy, normal way, you might find that it cannot be done. I'm not going to go into detail about everything that won't work; instead I will jump straight to a fix that does work. This fix uses the LibreOffice add-on "Alternative Find & Replace." Click the link to install it, then save your work, close LO and re-open it. Now you will see the Alternative Find & Replace icon in the menu bar. --Frustrating, wrongly-formatted text after copy and paste. (Be sure to select the text you want to correct before you do the Find and Replace.) After you install the add-on, click the icon to launch "Alternative Find & Replace." Then make it look li...

Footnotes - Size and length of Divider Bar

Finding this in the Toolbar is not at all intuitive, so here's how you can change the length and thickness of that little bar that separates a footnote from normal text on a page. Select Format, Page Style, Footnote, then make any desired changes in the dialog box that pops up and click "OK". That's it. You're done.

Create and Modify Numbered Paragraph Styles

I know, our motto on this blog is "forums suck," but that is not an absolute. Occasionally the answers on a forum are spot-on, but even then they tend to be bloated. We found a solution to "Create and Modify Numbered Paragraph Styles" on the LibreOffice Forum. While the solution works, it looks fairly bloated (perhaps of necessity). Anyway, you can find it here. We will test the solution and see if we can improve the answer by making it shorter and easier to follow. If and when we accomplish that, we will update this post with our findings.

Make Your Numbering Numbers Bold in LibreOffice Writer

It seems like a lot of people struggle with this task, which should be fairly simple. Let's solve it. Problem: In LibreOffice Writer, the numbers in your numbered list are not bold, and you cannot figure out how to make them bold. You tried all sorts of ways to make the numbers bold, with no luck. That's because there is a simple but somewhat obscure way to make the numbers bold in LO Writer. Once you learn it, you won't forget. [Image: Numbered list, but list number itself is not bold] Solution: After you have created your numbered list, right-click in the first numbered paragraph. Select "Bullets and Numbering," then "Bullets and Numbering" again in the drop-down. In the "Bullets and Numbering" dialog box that pops up, select the "Customize" tab. Click on the "Character Style" drop-down box and select "Strong Emphasis," then click "OK." Numbered list, list ...

Common Regex Find & Replace strings used by this Blogger.

As I have stated in other posts, I do a lot of document conversions from pdf, txt and epub to .docx, .odt and Google Docs. Most of the documents are between 5 and 500 pages, occasionally stretching to 700 or 1,000+ pages. I work in a Linux environment, so my most common means of converting documents are Okular, Tesseract (command line), G Suite, and Calibre, in that order, with Okular being the most often used. In some cases, the documents I am converting have already been scanned into plain text (txt), and I am simply downloading them, then converting to the final format (usually .odt or .docx) so that I can add images, footnotes, comments, etc., often reconverting the final product back to pdf and/or epub formats. All that is more fully explained in this post. Below are some of the more common (and not-so-common) Regex and non-Regex hacks I use. Again, I work strictly in a Linux OS environment, but much of my work is converted to MS Word and Google Docs, so I use LibreOffice's built-in Fi...
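
For readers who prefer to script this kind of cleanup, here is a minimal Python sketch of a few generic regex passes that are often useful on text converted from pdf. These are not the strings from the post itself; the function name `clean_converted_text` and the three patterns are assumptions chosen for illustration, so test them on a copy of your document first.

```python
import re


def clean_converted_text(text: str) -> str:
    """Apply a few generic cleanup passes to text extracted from a pdf.

    Hypothetical helper for illustration; the patterns are assumptions,
    not the blogger's own Find & Replace strings.
    """
    # Rejoin words that were hyphenated across a line break:
    # "conver-\nsion" becomes "conversion".
    text = re.sub(r"(\w)-\n(\w)", r"\1\2", text)
    # Turn a single line break inside a paragraph into a space,
    # leaving blank lines (paragraph breaks) untouched.
    text = re.sub(r"(?<!\n)\n(?!\n)", " ", text)
    # Collapse runs of spaces or tabs into one space.
    text = re.sub(r"[ \t]{2,}", " ", text)
    return text.strip()


if __name__ == "__main__":
    sample = "A conver-\nsion   test\nline.\n\nNext  paragraph."
    print(clean_converted_text(sample))
```

Note that the first pass will also join genuine compounds that happen to break at a line end (for example "well-" followed by "known"), so it may be worth reviewing those by hand.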