PDFsharp & MigraDoc Foundation
http://forum.pdfsharp.de/

How to compress the image in the pdf to reduce the size
http://forum.pdfsharp.de/viewtopic.php?f=2&t=3521

Author:  ukanoldai [ Thu Jan 05, 2017 4:24 pm ]
Post subject:  How to compress the image in the pdf to reduce the size

Hello,

I'm converting .tiff files into PDF. A document can have, for example, 1300 pages, and I combine them into a single PDF.

Everything works and the job takes 10 minutes, but I would like to reduce the size of the result: the 1300 .tiff files total 140 MB, while the final PDF is 240 MB.

I use version 1.50.4000.0.

I have tried all the options below, but there is no change.

Code:
s_document.Options.UseFlateDecoderForJpegImages = PdfUseFlateDecoderForJpegImages.Automatic;
s_document.Options.FlateEncodeMode = PdfFlateEncodeMode.BestCompression;
s_document.Options.EnableCcittCompressionForBilevelImages = true;
s_document.Options.CompressContentStreams = true;
s_document.Options.NoCompression = false;


I have also tried compressing each .tiff to JPEG first and then sending the stream to the PDF, but the final size is even bigger and it consumes an enormous amount of RAM.

Code:
            ImageCodecInfo codecInfo = ImageCodecInfo.GetImageEncoders()
                    .Where(r => r.CodecName.ToUpperInvariant().Contains("JPEG"))
                    .Select(r => r).FirstOrDefault();

            var encoder = System.Drawing.Imaging.Encoder.Quality;
            var parameters = new EncoderParameters(1);
            var parameter = new EncoderParameter(encoder, 50L);
            parameters.Param[0] = parameter;

            foreach (var file in filePaths)
            {
                PdfPage page = s_document.AddPage();
                XGraphics gfx = XGraphics.FromPdfPage(page);

                System.Drawing.Image imageSys = System.Drawing.Image.FromFile(file);
                MemoryStream streamJPG = new MemoryStream();
                imageSys.Save(streamJPG, codecInfo, parameters);
                XImage image = XImage.FromStream(streamJPG);
               
                page.Width = image.PointWidth;
                page.Height = image.PointHeight;
                gfx.DrawImage(image, 0, 0);
                image.Dispose();
            }
            s_document.Save(@"c:\DEV\docNoTiff.pdf");


Do you have any ideas about how I could reduce the size?

I have also tried the DevExpress conversion: with no compression the size is 245 MB and the conversion takes 20 minutes.
With JPEG compression set to high quality, the size is 160 MB and it takes 35 minutes, with nearly no visible loss of quality.
I have millions of documents to convert, so time is important.

Kind regards
Geoffrey

Author:  TH-Soft [ Fri Jan 06, 2017 6:48 pm ]
Post subject:  Re: How to compress the image in the pdf to reduce the size

Hi!
ukanoldai wrote:
I have tried all the options below, but there is no change.
I don't believe that.
"PdfFlateEncodeMode.BestCompression" should make a difference, but only in the small single-digit percent range.

PDFsharp stores TIFF images using lossless compression. Do not expect miracles.
If you scale down the TIFF images (say to 80% or 75% of the original size), you should see a big difference in the file size, but with a loss of quality.

Do you use NuGet packages?
If you use the PDFsharp source code, make sure to make all tests with a Release build.
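
The downscaling suggested above could be sketched roughly as follows. This is only an illustration, assuming System.Drawing (GDI+) is available and that "80% or 75% of the original size" refers to pixel dimensions; the class and method names are hypothetical and not part of PDFsharp. Note that the result is a 32-bit colour bitmap, so a bilevel scan handled this way may no longer qualify for CCITT compression.

```csharp
using System;
using System.Drawing;
using System.Drawing.Drawing2D;

static class TiffDownscaler
{
    // Hypothetical helper: returns a copy of 'source' scaled by 'factor' (e.g. 0.75).
    public static Bitmap Downscale(Image source, double factor)
    {
        if (factor <= 0 || factor > 1)
            throw new ArgumentOutOfRangeException(nameof(factor));

        int width = Math.Max(1, (int)Math.Round(source.Width * factor));
        int height = Math.Max(1, (int)Math.Round(source.Height * factor));

        var result = new Bitmap(width, height);

        // Scale the DPI by the same factor so the physical page size is unchanged.
        result.SetResolution((float)(source.HorizontalResolution * factor),
                             (float)(source.VerticalResolution * factor));

        using (Graphics g = Graphics.FromImage(result))
        {
            g.InterpolationMode = InterpolationMode.HighQualityBicubic;
            g.DrawImage(source, 0, 0, width, height);
        }
        return result;
    }
}
```

The returned bitmap can be saved to a MemoryStream and handed to XImage.FromStream, as in the original post. Dispose both the source image and the returned bitmap after each page to keep memory use flat.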

Author:  Gerben Vos [ Mon Jan 09, 2017 2:14 pm ]
Post subject:  Re: How to compress the image in the pdf to reduce the size

If, as it looks like, PdfSharp converts TIFFs that use JPEG compression to lossless compression, you're lucky that the files only increase from 140MB to 240MB; I would have expected more.

What you could do is to convert the TIFFs to JPEG files and then add those to the PDF, because PdfSharp will keep the image data in JPEG format. However, I know of no tool that converts TIFF to JPEG while avoiding generation loss.

However, something much easier that you can do is to use the tiff2pdf tool from libtiff ( http://libtiff.maptools.org/ ), which can do a lossless conversion directly from TIFF to PDF. You will need to first put the TIFFs together into one huge multi-page TIFF using tiffcp, also from libtiff. If necessary, you could then use PdfSharp to edit the PDF for any additional changes you need.

Author:  phirewind [ Wed May 10, 2017 2:41 pm ]
Post subject:  Re: How to compress the image in the pdf to reduce the size

Or use the free ImageProcessor NuGet package to pre-process JPGs like this:
Code:
using System.IO;
using ImageProcessor;
using ImageProcessor.Imaging.Formats;

private static void CompressImage(string filename)
{
    // Read the file and re-encode it as a JPEG at quality 50 (no resizing is done).
    byte[] photoBytes = File.ReadAllBytes(filename);
    ISupportedImageFormat format = new JpegFormat { Quality = 50 };

    // Write the result next to the source as "new_<name>"; prefixing the whole
    // string would give an invalid path when filename contains a directory.
    string directory = Path.GetDirectoryName(filename) ?? string.Empty;
    string target = Path.Combine(directory, "new_" + Path.GetFileName(filename));

    using (MemoryStream inStream = new MemoryStream(photoBytes))
    using (ImageFactory imageFactory = new ImageFactory())
        imageFactory.Load(inStream).Format(format).Save(target);
}

Author:  Gerben Vos [ Wed May 10, 2017 2:46 pm ]
Post subject:  Re: How to compress the image in the pdf to reduce the size

phirewind wrote:
Or use the free ImageProcessor NuGet package to pre-process JPGs like this:
That is possible, but note that you will lose image quality because you are uncompressing and re-compressing with JPEG compression.

Author:  phirewind [ Wed May 10, 2017 3:02 pm ]
Post subject:  Re: How to compress the image in the pdf to reduce the size

Yes, and that factor is best weighed against your document content. For artwork or resolution-sensitive images it would not be an application-compatible solution; however, I am working with scanned paper documents, and even at 50% quality, the artifacts introduced are negligible for this purpose.

Author:  Gerben Vos [ Wed May 10, 2017 3:10 pm ]
Post subject:  Re: How to compress the image in the pdf to reduce the size

phirewind wrote:
Yes, and that factor is best weighed against your document content. For artwork or resolution-sensitive images it would not be an application-compatible solution; however, I am working with scanned paper documents, and even at 50% quality, the artifacts introduced are negligible for this purpose.
Yes, but note that you should not do that with TIFFs that are already JPEG-compressed, and so already have some artifacts. Recompressing will make them worse. But now that I read it again, it looks like the original poster's original solution involved TIFFs with another compression. Those would be okay to compress to JPEG, with the caveats you write.

Author:  FeiShengWu [ Mon Apr 20, 2020 3:18 am ]
Post subject:  Re: How to compress the image in the pdf to reduce the size

I used the code below to compress a PDF file:

Code:
            foreach (PdfPage page in document.Pages)
            {
                PdfDictionary resources = page.Elements.GetDictionary("/Resources");
                if (resources != null)
                {
                    PdfDictionary xObjects = resources.Elements.GetDictionary("/XObject");
                    if (xObjects != null)
                    {
                        ICollection<PdfItem> items = xObjects.Elements.Values;
                        foreach (PdfItem item in items)
                        {
                            if (item is PdfReference reference)
                            {
                                if (reference.Value is PdfDictionary xObject && xObject.Elements.GetString("/Subtype") == "/Image")
                                {
                                    byte[] stream = xObject.Stream.Value;
                                    int width = xObject.Elements.GetInteger(PdfImage.Keys.Width);
                                    int height = xObject.Elements.GetInteger(PdfImage.Keys.Height);

                                    using (MemoryStream inStream = new MemoryStream(stream))
                                    {
                                        using (MemoryStream outStream = new MemoryStream())
                                        {
                                            using (ImageFactory imageFactory = new ImageFactory())
                                            {
                                                imageFactory.Load(inStream).Format(new JpegFormat { Quality = 50 }).Resize(new System.Drawing.Size(width, height)).Resolution(96, 96).Save(outStream);
                                            }

                                            xObject.Stream.Value = outStream.ToArray();
                                        }
                                    }
                                }
                            }
                        }
                    }
                }
            }


You need to add a reference to ImageProcessor and add these usings:

Code:
using System.Collections.Generic;
using System.IO;
using PdfSharp.Pdf;
using PdfSharp.Pdf.Advanced;
using ImageProcessor;
using ImageProcessor.Imaging.Formats;

Author:  Thomas Hoevel [ Mon Apr 20, 2020 8:02 am ]
Post subject:  Re: How to compress the image in the pdf to reduce the size

Hi!
FeiShengWu wrote:
I used the code below to compress a PDF file

Thanks for the feedback.

I didn't try your code, but I guess it only works for images that use the DCTDecode filter alone. Extra code is required to also support DCT images that are additionally flate encoded, and to skip non-DCT images.
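
A minimal sketch of the "skip non-DCT images" part, assuming the /Filter entry is either a single name or an array of names. The class and method names are hypothetical; decoding the flate layer of doubly encoded images is not covered here.

```csharp
using PdfSharp.Pdf;
using PdfSharp.Pdf.Advanced;

static class ImageFilterCheck
{
    // Returns true only when /DCTDecode is the single filter on the image stream,
    // i.e. the raw stream bytes are a plain JPEG that an image library can load.
    public static bool IsPlainJpeg(PdfDictionary xObject)
    {
        PdfItem filter = xObject.Elements["/Filter"];

        // /Filter may be stored as an indirect object; resolve it first.
        if (filter is PdfReference reference)
            filter = reference.Value;

        // Single filter given as a name, e.g. /DCTDecode.
        if (filter is PdfName name)
            return name.Value == "/DCTDecode";

        // Filter chain given as an array, e.g. [/FlateDecode /DCTDecode].
        if (filter is PdfArray array)
            return array.Elements.Count == 1 && array.Elements.GetName(0) == "/DCTDecode";

        // No filter (raw samples) or anything unexpected: leave the image alone.
        return false;
    }
}
```

In the loop from the previous post, calling ImageFilterCheck.IsPlainJpeg(xObject) before reading xObject.Stream.Value would leave flate-only, CCITT and doubly encoded images untouched instead of passing their bytes to the JPEG loader.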

All times are UTC