PDF size too large

Posted: Tue Jun 19, 2018 2:40 am
by solti79forum
Dear Daphnis,

Hello. I have a question regarding scans and PDFs. While looking for answers on IMSLP, I found the option to ask you about it.

I have recently created some PDFs with the purpose of uploading them to the website. However, I notice these are quite large compared to other PDFs I have downloaded from IMSLP. For example, a 2-page score that I downloaded from the website is 311k, while one that I created is 2.9 MB.

I looked into reducing the size without losing quality, but the two options I tried did not work. I tried the Reduce File Size option in Preview on the Mac, but the quality dropped considerably, making the file somewhat blurry and hard to read. I also tried an option in ColorSync Utility, but even though I followed the steps listed in the process, the size remained unchanged.

I was just wondering if you had encountered this kind of problem before, and if you could perhaps offer a solution on how to reduce the size of a PDF without compromising its quality.

Any assistance you could provide me in this matter would be greatly appreciated.

Thank you in advance for your help,

Carlos

Re: PDF size too large

Posted: Tue Jun 19, 2018 9:16 am
by coulonnus
Is your scan color, grayscale or monochrome?

Re: PDF size too large

Posted: Tue Jun 19, 2018 12:23 pm
by solti79forum
My guess is that it is grayscale, which is why it is so much larger than it should be. I have to apologize, my know-how in scanning is very limited. How can I find out for sure if it is grayscale, and is there a way to convert it to monochrome?
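One way to answer the "color, grayscale or monochrome?" question is to look at the pixel values themselves. The sketch below is only an illustration: `classify_pixels` is a hypothetical helper, and it assumes the page has already been loaded as (red, green, blue) tuples by some imaging library, which is not shown here.

```python
def classify_pixels(pixels):
    """Classify an iterable of (r, g, b) tuples with values 0..255.

    Returns "color" if any pixel has unequal channels, "monochrome" if
    only pure black and pure white occur, and "grayscale" otherwise.
    """
    levels = set()
    for r, g, b in pixels:
        if not (r == g == b):
            return "color"
        levels.add(r)
    if levels <= {0, 255}:
        return "monochrome"
    return "grayscale"


# Tiny hand-made samples standing in for real scan data.
print(classify_pixels([(0, 0, 0), (255, 255, 255)]))
print(classify_pixels([(0, 0, 0), (128, 128, 128), (255, 255, 255)]))
print(classify_pixels([(200, 180, 150), (255, 255, 255)]))
```

A page that looks black and white can still be classified as grayscale here, because intermediate gray values along the edges of the notes are enough to change the result.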

Thank you in advance for your help!

Re: PDF size too large

Posted: Tue Jun 19, 2018 5:44 pm
by coulonnus
Compare the versions on http://imslp.org/wiki/Sous_les_palmiers ... C3%89mile) Compare the sizes!

If the program that goes with your scanner does not offer monochrome, there are http://imagemagick.org/script/index.php and https://www.irfanview.com/. Both have their forums.

Re: PDF size too large

Posted: Thu Jun 21, 2018 4:20 am
by coulonnus
Does your scanner provide a .pdf file directly? Otherwise what is the extension of the pages it provides?

Re: PDF size too large

Posted: Fri Jun 22, 2018 5:11 pm
by solti79forum
Thank you for your concern.

Yes, the scanner I used provided a .pdf file (this was a while back, and regrettably I don't have a chance of re-scanning these files at the moment).

I then used Preview on the Mac to save that .pdf file as an image (PNG) at a resolution of 600 pixels/inch. I then edited the file and saved it as a JPG at the best quality. I changed it to JPG because the program I have on my Mac can only create PDFs from JPG files. However, I do have Nitro PDF creator on a PC laptop, which accepts other formats for creating PDFs, and which I could also use if necessary.

Following your suggestion, I went into

http://imslp.org/wiki/IMSLP:Image_Conversion

and downloaded IrfanView. I have gone through the steps listed on the page, although I was doubtful about the part where it asks to set a new size of 366%. This was making the page larger than 8.5 x 11 inches, so I kept it at 100%.

I do get a much smaller file (108k), but the quality also dropped considerably. Sallen112 (an IMSLP administrator) looked at the file and agrees that the quality is not up to par. I have been experimenting with other formats (PNG, TIFF), changing the resolution in Preview, and trying other compression options in IrfanView. The best result so far is a 234k file obtained by using the "Huffman" option. It is better than before, although I wouldn't say that the result is ideal.

If you have any suggestions of other things that I could try, either in Preview, IrfanView, or both (or even some other program), I would really appreciate them.

Thank you in advance for your help!

Re: PDF size too large

Posted: Sat Jun 23, 2018 8:51 pm
by coulonnus
What is the page size when it is in the TIFF format? What size does IrfanView report when you click on Image, Information?

Re: PDF size too large

Posted: Sun Jul 01, 2018 3:04 pm
by cheap imitation
The best results come from converting the grayscale PDF file to individual black-and-white (1-bit) TIFF images, and then recombining those images into a new PDF. CCITT G4 offers lossless compression of 1-bit images, so that a 600 dpi page comes out to about 100-200 kB. There is no freeware I know of that can do this on OS X unless you are sufficiently comfortable with the command line to use ImageMagick. Also, Acrobat still costs $450 and will slowly destroy your sanity if you try to use it.
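To put those numbers in context, here is a small arithmetic sketch of how much raw pixel data a single page holds before any compression. It assumes a US Letter page (8.5 x 11 inches) at 600 dpi; `raw_page_bytes` is a hypothetical helper name, and real files will differ because of headers and compression.

```python
def raw_page_bytes(width_in, height_in, dpi, bits_per_pixel):
    """Uncompressed size in bytes of one scanned page."""
    width_px = round(width_in * dpi)
    height_px = round(height_in * dpi)
    # Each row is padded up to a whole number of bytes.
    row_bytes = (width_px * bits_per_pixel + 7) // 8
    return row_bytes * height_px


for label, bits in [("1-bit", 1), ("8-bit gray", 8), ("24-bit colour", 24)]:
    size = raw_page_bytes(8.5, 11, 600, bits)
    print(f"{label}: {size / 1_000_000:.1f} MB uncompressed")
```

Under these assumptions a 1-bit page is about 4.2 MB raw, against roughly 33.7 MB for 8-bit gray and 101 MB for 24-bit colour. A 100-200 kB G4 file is therefore a further reduction of roughly 20 to 40 times on top of the 1-bit saving.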

My tactic was always to open the grayscale PDF in GIMP as images, use the threshold tool (Colors > Threshold) to reduce each image to two colours at whatever set point looks least bad, convert it to indexed 1-bit (Image > Mode > Indexed), save the image to TIFF CCITT G4, repeat for the other 57 images etc.... also taking the opportunity to clean up as many artifacts as possible.... and then reboot into Windows to convert the folder of TIFFs into a PDF with IrfanView. So obviously they all had to be saved on an external drive for this purpose. It was quite cumbersome.
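For anyone who prefers the command-line route, the repetitive part of this workflow could be scripted. The following is a rough sketch that swaps GIMP for ImageMagick, not the procedure described above. It assumes the pages have already been exported as PNG files, that ImageMagick 7 is installed as `magick` (version 6 uses `convert` instead), and that a 50% threshold is acceptable, which needs checking by eye on real scans. The function names are made up for illustration.

```python
import subprocess
from pathlib import Path


def bilevel_tiff_command(src, dst, threshold_percent=50, magick="magick"):
    """Build an ImageMagick command that thresholds one image to 1-bit
    and writes it as a TIFF with CCITT Group 4 compression."""
    return [
        magick, str(src),
        "-colorspace", "Gray",                  # drop any colour information
        "-threshold", f"{threshold_percent}%",  # pure black or pure white
        "-type", "bilevel",                     # store as a 1-bit image
        "-compress", "Group4",                  # CCITT G4, lossless for 1-bit
        str(dst),
    ]


def batch_commands(src_dir, dst_dir, pattern="*.png"):
    """One command per matching file in src_dir, in sorted order."""
    src_dir, dst_dir = Path(src_dir), Path(dst_dir)
    return [
        bilevel_tiff_command(page, dst_dir / (page.stem + ".tif"))
        for page in sorted(src_dir.glob(pattern))
    ]


def run_all(commands):
    """Run each command in turn, stopping at the first failure."""
    for command in commands:
        subprocess.run(command, check=True)
```

Calling `run_all(batch_commands("pages", "tiffs"))` would process a whole folder, where both folder names are placeholders. This only covers the thresholding step; manual clean-up of artifacts still has to happen in an image editor.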

Re: PDF size too large

Posted: Wed Apr 17, 2019 5:20 am
by harryr
SCANNING AND PDF ON A MAC

What a great discussion! For software on a Mac, I recommend GraphicConverter. It's cheap and does the basics we need well and in straightforward ways. The author also responds readily and helpfully to questions and requests.

It supports CCITT compression and converts easily between formats. If you already have scans in greyscale, don't trash them. Use a batch operation to convert them all together to black & white.

Finally, have all the images in one folder. The software can create the PDF from that.

Re: PDF size too large

Posted: Sat Apr 27, 2019 12:15 am
by harryr
CAVEAT - SAVING A SCAN ON A MAC

Oh, and I neglected to mention something crucial. Some years ago I noticed that if I had a 1-bit scan (black and white), edited it and then saved it on a Mac, the file became much bigger. What perplexed me most was that even if I just scaled it down (actually made it smaller, to fit on A4) I still had this problem.

I got some advice from the author of GraphicConverter. (In it, he refers to Cocoa, an "under-the-hood" part of macOS.)

"GraphicConverter imports the 1-bit image as an 8-bit grayscale image due to Cocoa restrictions. The scale-down does create levels of grays.
So, the solution:
Open the image.
Scale it.
Select Picture/Black & White/Threshold -> OK
Save.
So, you will get a 1-bit image as the result."
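The effect described in that advice can be reproduced on a toy image. The sketch below is not how GraphicConverter or Cocoa work internally; it assumes a simple scale-down that averages each 2x2 block of pixels, purely to show why scaling a black-and-white image produces grays and why a threshold afterwards restores two levels. All function names are hypothetical.

```python
def downscale_2x(image):
    """Halve an image (rows of 0..255 values) by averaging 2x2 blocks."""
    out = []
    for y in range(0, len(image) - 1, 2):
        row = []
        for x in range(0, len(image[y]) - 1, 2):
            total = (image[y][x] + image[y][x + 1]
                     + image[y + 1][x] + image[y + 1][x + 1])
            row.append(total // 4)
        out.append(row)
    return out


def threshold(image, cutoff=128):
    """Map every value to pure black (0) or pure white (255)."""
    return [[255 if value >= cutoff else 0 for value in row] for row in image]


def gray_levels(image):
    """Sorted list of the distinct values present in the image."""
    return sorted({value for row in image for value in row})


# A 4x4 black-and-white image with a diagonal edge.
page = [
    [0, 0, 0, 255],
    [0, 0, 255, 255],
    [0, 255, 255, 255],
    [255, 255, 255, 255],
]

scaled = downscale_2x(page)
print(gray_levels(page))               # only 0 and 255
print(gray_levels(scaled))             # an intermediate gray appears
print(gray_levels(threshold(scaled)))  # back to 0 and 255
```

Blocks that straddle the edge average to a value between black and white, so the scaled image can no longer be stored as 1-bit until it is thresholded again.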

Some rather important information! Moreover, this repetitive task is easy to automate. Basically, I ignore the issue initially and create a folder of edited, larger scan files. Then I use a batch task to convert them en masse to 1-bit. It has become a ritual as the final step each time, before making a PDF.

The other day I remembered all this and asked him again:

"1) Is this still the case… and
"2) does it affect only GC or other apps like GIMP too?"

He replied:
"1) Yes, still use the same procedure.
"2) I don’t know the GIMP internals. Preview uses 8-bit gray or more, too."

(Preview is an app by Apple for reading and writing PDFs. It's part of the macOS installation.)