PDFsharp & MigraDoc Foundation

PDFsharp - A .NET library for processing PDF & MigraDoc Foundation - Creating documents on the fly

All times are UTC


 Post subject: Split By Size
Posted: Thu Dec 16, 2010 12:46 pm

Joined: Thu Dec 16, 2010 12:39 pm
Posts: 1
Hi,

We merge twenty or so PDF files into one larger file before we email it out.

However, we might need to split that file if it is too big for the email recipients to receive.

Say they can only receive attachments (PDFs, in our case) of up to 4 MB. How could we split the large file into parts of at most 4 MB each?

Thanks for the help.
Wilson


 Post subject: Re: Split By Size
Posted: Thu Dec 16, 2010 1:26 pm
PDFsharp Guru

Joined: Mon Oct 16, 2006 8:16 am
Posts: 3100
Location: Cologne, Germany
Hi!

A better approach (IMHO): stop merging before the file size exceeds 4 MB.

I think a merged PDF file won't be bigger than the combined size of the individual PDF files it consists of.

_________________
Regards
Thomas Hoevel
PDFsharp Team


 Post subject: Re: Split By Size
Posted: Fri Dec 17, 2010 7:04 am

Joined: Fri Dec 17, 2010 6:52 am
Posts: 2
I read the post and tried it with the code below:
Imports PdfSharp.Pdf
Imports PdfSharp.Pdf.IO

Dim outputDocument As New PdfDocument()
' Build the full path from the current row and open the file for import.
Dim inputPath As String = dtDocu.Rows(j).Item("FilePath").ToString().Trim() & dtDocu.Rows(j).Item("FilesName").ToString().Trim()
Dim inputDocument As PdfDocument = PdfReader.Open(inputPath, PdfDocumentOpenMode.Import)
' Iterate pages
Dim count As Integer = inputDocument.PageCount
For k As Integer = 0 To count - 1
    ' Get the page from the external document...
    Dim page As PdfPage = inputDocument.Pages(k)
    ' ...and add it to the output document.
    outputDocument.AddPage(page)
Next
I tried reading outputDocument.FileSize (the output document has several pages), but it is zero. How do I get the expected file size of outputDocument?
Attachments: screen1.png (30.07 KiB), screen.png (71.91 KiB)
 Post subject: Re: Split By Size
Posted: Sun Dec 19, 2010 9:56 pm

Joined: Wed Nov 11, 2009 9:40 am
Posts: 17
@wilson

I think you should stop spamming :) (IMHO your PDFs seem too big to email)

_________________
Regards,
Remigijus Pankevičius


 Post subject: Re: Split By Size
Posted: Fri Dec 24, 2010 9:11 am
PDFsharp Expert

Joined: Wed Dec 09, 2009 8:59 am
Posts: 343
jackylui wrote:
How do I get outputDocument expected file size?

I don't know.
I would sum up the sizes of the input files and stop before this sum exceeds the allowed value.

The resulting combined file (using a Release build) shouldn't be bigger than the sum of the sizes of the input files.
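
The size-budget idea above (sum the input file sizes and start a new output file before the sum passes the limit) could be sketched roughly as follows. This is only an illustration and contains no PDFsharp calls: `GroupBySize` and its optional `getSize` parameter are hypothetical names, sizes default to `FileInfo.Length`, and it relies on the assumption stated above that a merged file is no larger than the sum of its inputs.

```vbnet
Imports System
Imports System.Collections.Generic
Imports System.IO

Module SizeGrouping
    ' Hypothetical helper: splits a list of input files into consecutive groups
    ' whose summed sizes stay within maxBytes. getSize can be injected for
    ' testing; by default the size on disk is used.
    Function GroupBySize(filePaths As IEnumerable(Of String),
                         maxBytes As Long,
                         Optional getSize As Func(Of String, Long) = Nothing) As List(Of List(Of String))
        If getSize Is Nothing Then
            getSize = Function(p) New FileInfo(p).Length
        End If

        Dim groups As New List(Of List(Of String))()
        Dim current As New List(Of String)()
        Dim currentBytes As Long = 0

        For Each filePath As String In filePaths
            Dim size As Long = getSize(filePath)
            ' Close the current group before it would exceed the limit.
            If current.Count > 0 AndAlso currentBytes + size > maxBytes Then
                groups.Add(current)
                current = New List(Of String)()
                currentBytes = 0
            End If
            ' A single file larger than maxBytes still gets a group of its own.
            current.Add(filePath)
            currentBytes += size
        Next

        If current.Count > 0 Then groups.Add(current)
        Return groups
    End Function
End Module
```

Each resulting group would then be merged into one output document and saved. The estimate is conservative only as far as the assumption about merged sizes holds, so leaving some headroom below the real limit seems prudent.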

_________________
Öhmesh Volta ("() => true")
PDFsharp Team Holiday Substitute


 Post subject: Re: Split By Size
Posted: Tue Jan 11, 2011 8:02 pm

Joined: Mon Jan 10, 2011 8:37 pm
Posts: 3
jackylui wrote:
I tried to get the outputdocument.filesize (the outputdocument has several page), but it is zero. How do I get outputDocument expected file size?


It appears that the FileSize property is only set when calling PdfReader.Open(), which takes the value from the stream's Length property.

One hack I've come up with is to update the FileSize property in the Save method of PdfDocument.

I sure wish the FileSize property were consistently updated.

We have a similar problem we're trying to solve: take one large PDF and split it into manageable files that can be emailed. For example, we want to take a 50 MB PDF and split it into ten 5 MB files.

Another hack I've thought about is to compute the average page size (file size divided by page count) and split after a fixed number of pages, but this is flawed: if some of the pages are really large, the resulting file sizes could vary greatly.

Is anybody versed enough in the internals of PDFsharp to provide some ideas on how to modify the code base so that the FileSize property is updated when pages or objects are added or removed?


Thanks,
Mike DePouw


 Post subject: Re: Split By Size
Posted: Tue Jan 11, 2011 8:37 pm

Joined: Mon Jan 10, 2011 8:37 pm
Posts: 3
Another hack I've come up with is to call PdfDocument.Save and pass it a MemoryStream, then check the Length of the MemoryStream to determine the current size. This is very inefficient, but it works.

// pseudo code
foreach page in pdfToSplit.Pages
    add page to the current pdfDoc
    write pdfDoc to a MemoryStream
    check the Length of the MemoryStream
    if larger than desired
        remove the page that was just added
        write pdfDoc to disk
        create a new pdfDoc and add the page to it
after the loop, write the last pdfDoc to disk
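
For reference, the measure-by-saving idea above might look roughly like this in VB.NET. It is a sketch under assumptions, not a tested recipe: `SerializePages` and `SplitBySize` are hypothetical names, and it deliberately swaps in a variant of the pseudocode that builds a fresh trial document for every measurement instead of re-saving one growing document, on the assumption that saving the same PdfDocument repeatedly may not be allowed in every PDFsharp version. It also assumes a page from a document opened with PdfDocumentOpenMode.Import can be added to more than one output document. That makes it even less efficient than the original hack.

```vbnet
Imports System.IO
Imports PdfSharp.Pdf
Imports PdfSharp.Pdf.IO

Module PdfSizeSplitter
    ' Builds a trial document from pages first..lastPage (inclusive) of source
    ' and returns its serialized bytes.
    Function SerializePages(source As PdfDocument, first As Integer, lastPage As Integer) As Byte()
        Dim trial As New PdfDocument()
        For i As Integer = first To lastPage
            trial.AddPage(source.Pages(i))
        Next
        Using buffer As New MemoryStream()
            trial.Save(buffer, False) ' False: do not close the stream after saving
            Return buffer.ToArray()
        End Using
    End Function

    ' Hypothetical entry point: writes outputPrefix & "1.pdf", "2.pdf", ...
    Sub SplitBySize(sourcePath As String, outputPrefix As String, maxBytes As Long)
        Dim source As PdfDocument = PdfReader.Open(sourcePath, PdfDocumentOpenMode.Import)
        Dim first As Integer = 0
        Dim partNumber As Integer = 1

        While first < source.PageCount
            Dim fitting As Byte() = Nothing
            Dim nextPage As Integer = first
            While nextPage < source.PageCount
                Dim candidate As Byte() = SerializePages(source, first, nextPage)
                ' Stop growing once the limit is exceeded. A single oversized
                ' page is still written on its own, so the loop always advances.
                If fitting IsNot Nothing AndAlso candidate.LongLength > maxBytes Then Exit While
                fitting = candidate
                nextPage += 1
            End While
            ' The bytes that were measured are the bytes that get written.
            File.WriteAllBytes(outputPrefix & partNumber.ToString() & ".pdf", fitting)
            partNumber += 1
            first = nextPage
        End While
    End Sub
End Module
```

Because every candidate is serialized from scratch, the cost grows roughly quadratically with the number of pages per part; for large inputs a coarser step (adding several pages between measurements) might be worth trying.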


 Post subject: Re: Split By Size
Posted: Wed Jan 12, 2011 9:35 am
PDFsharp Guru

Joined: Mon Oct 16, 2006 8:16 am
Posts: 3100
Location: Cologne, Germany
spottedmahn wrote:
For example we want to take a 50MB pdf and split it into 10 5MB files.

Images, fonts, and other resources can be used on several pages.
When splitting files, you might get several copies of those resources in the split files.

So a 50 MB file will more likely give 11 or 12 files of 5 MB each.
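
The arithmetic behind such an estimate can be sketched as below. The overhead percentage is purely an assumption for illustration (the thread gives no figure for how much duplicated images and fonts add), and `EstimatePartCount` is a hypothetical helper, not part of PDFsharp.

```vbnet
Imports System

Module SplitEstimate
    ' Hypothetical estimate: number of parts when splitting totalBytes into
    ' chunks of at most maxBytes, assuming duplicated resources inflate the
    ' total by overheadPercent. Integer arithmetic avoids rounding surprises.
    Function EstimatePartCount(totalBytes As Long, maxBytes As Long, overheadPercent As Integer) As Long
        Dim inflated As Long = totalBytes * (100 + overheadPercent)
        Dim scaledLimit As Long = maxBytes * 100
        Return (inflated + scaledLimit - 1) \ scaledLimit ' ceiling division
    End Function
End Module
```

With a 50 MB source and a 5 MB limit, an assumed overhead of 0% gives 10 parts, 10% gives 11, and 20% gives 12. The real overhead depends on how many resources the pages share, so it can only be found by actually saving the parts.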

_________________
Regards
Thomas Hoevel
PDFsharp Team


 Post subject: Re: Split By Size
Posted: Wed Jan 12, 2011 5:45 pm

Joined: Mon Jan 10, 2011 8:37 pm
Posts: 3
Hi Thomas,

Thanks for the reply; that's interesting to know.

Do you have any input on how to determine the file size of a PdfDocument object? As it stands right now, I'm calling the Save method with a MemoryStream and then checking the Length of the MemoryStream object. Is there a better approach than that?

Thanks,
Mike D.

