higher quality and less bandwidth by conversion to vector formats

kuribas
Posts: 2
Joined: Wed May 17, 2017 3:13 pm

Postby kuribas » Wed May 17, 2017 3:31 pm

Hi,

I want to propose a way to get higher-quality output and use less bandwidth by converting scanned pages to vector output. Bitmap images take up a lot of space and contain a lot of redundancy. By converting the outlines to a vector image and grouping similar elements together, the output quality can be improved and bandwidth reduced by a large factor. I am not suggesting OCR, because OCR does extra semantic analysis and is fragile. Semantic analysis would only be an optional extra; by default the system would work without actually understanding the content. I am willing to implement such a system, and I wonder whether funding would be available for something like this.
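To make the "grouping similar elements" idea concrete, here is a minimal sketch in Python. It is only an illustration under stated assumptions, not the system proposed in this post: the page is assumed to be a small bilevel bitmap given as nested lists of 0/1, shapes are matched only when they are pixel-for-pixel identical, and the names `connected_components` and `group_shapes` are hypothetical. A real scan would need tolerant matching and an outline-tracing step to obtain actual vector paths.

```python
from collections import deque

def connected_components(bitmap):
    """Return ink blobs as sorted lists of (row, col), using 4-connectivity."""
    rows, cols = len(bitmap), len(bitmap[0])
    seen = [[False] * cols for _ in range(rows)]
    components = []
    for r in range(rows):
        for c in range(cols):
            if bitmap[r][c] and not seen[r][c]:
                seen[r][c] = True
                queue = deque([(r, c)])
                pixels = []
                while queue:
                    y, x = queue.popleft()
                    pixels.append((y, x))
                    for ny, nx in ((y - 1, x), (y + 1, x), (y, x - 1), (y, x + 1)):
                        if (0 <= ny < rows and 0 <= nx < cols
                                and bitmap[ny][nx] and not seen[ny][nx]):
                            seen[ny][nx] = True
                            queue.append((ny, nx))
                components.append(sorted(pixels))
    return components

def group_shapes(bitmap):
    """Store each distinct shape once and record where every copy sits."""
    shapes = {}      # normalized shape -> shape id
    placements = []  # (shape id, top row, left column)
    for pixels in connected_components(bitmap):
        top = min(y for y, _ in pixels)
        left = min(x for _, x in pixels)
        # Shift the blob to the origin so identical shapes compare equal.
        shape = frozenset((y - top, x - left) for y, x in pixels)
        shape_id = shapes.setdefault(shape, len(shapes))
        placements.append((shape_id, top, left))
    return shapes, placements
```

On a page where the same symbol occurs many times, `shapes` stays small while `placements` grows by only three numbers per occurrence, which is where the saving would come from.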

Kristof Bastiaensen

Sallen112
active poster
Posts: 346
Joined: Wed Jan 12, 2011 12:52 pm

Postby Sallen112 » Sat May 20, 2017 3:29 am

This sounds interesting. I'll let our leader, Feldmahler, know.

imslp
Site Admin
Posts: 1608
Joined: Thu Jan 01, 1970 12:00 am

Postby imslp » Sat May 20, 2017 5:15 am

Hi Kristof,

This sounds quite interesting and yes, funding is available, but first I'll need to know a bit more about your background and how the conversion works on a more technical level. My e-mail is eguo@imslp.org, and it may also be helpful to have a Skype call at some point. Let me know.

Thanks,
Edward

coulonnus
active poster
Posts: 1110
Joined: Thu Jul 12, 2007 8:53 am
Location: Nice, France

Postby coulonnus » Sat May 20, 2017 8:37 am

Even without moving to a vector format, information theory tells us that a bitmap of a clean figure compresses to a much smaller file than a bitmap of a dirty one. A typical typeset PDF page is about 15 kB, and a decent scanned page is about 100 kB. When I convert a typeset PDF to TIFF, change the page layout and convert the images back to PDF, the result is not much bigger than the original PDF. And a Henle scan is smaller than a ca. 1800 scan. :-)
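The clean-versus-dirty effect can be reproduced with a small experiment. This is a rough sketch under assumptions: the page is synthetic (thin horizontal lines, one byte per pixel), the 2% speckle rate is arbitrary, and `zlib` stands in for the bilevel codecs normally used inside scanned PDFs, so the absolute sizes mean little. `make_page` is a hypothetical helper.

```python
import random
import zlib

def make_page(width=400, height=200, noise=0.0, seed=0):
    """Build a bilevel page of thin horizontal lines, optionally speckled."""
    rng = random.Random(seed)
    page = bytearray(width * height)
    for y in range(height):
        for x in range(width):
            ink = 1 if y % 20 == 10 else 0  # one thin line every 20 rows
            if rng.random() < noise:
                ink ^= 1  # flip the pixel to simulate a stain or speck
            page[y * width + x] = ink
    return bytes(page)

clean = make_page(noise=0.0)
dirty = make_page(noise=0.02)

# Same pixel count, but the speckled page carries far more information.
clean_size = len(zlib.compress(clean, 9))
dirty_size = len(zlib.compress(dirty, 9))
print("clean:", clean_size, "bytes; dirty:", dirty_size, "bytes")
```

Both pages have exactly the same dimensions; only the random specks differ, and they alone account for the larger compressed size.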

So all PDFs made from scans would be much smaller if we had an application that recognizes a staff line and replaces it with a clean one. The same goes for note stems, beams, etc. Other symbols and text indications could come later (OCR).
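As a very rough sketch of the staff-line step, one common starting point is a horizontal projection: a row that is mostly ink is probably part of a staff line. The code below assumes a deskewed bilevel bitmap given as nested lists of 0/1 and staff lines that span nearly the full width; the function names and the `min_fill` threshold are hypothetical. Real scans have skewed, curved and multi-pixel-thick lines, so this only shows the principle.

```python
def find_staff_rows(bitmap, min_fill=0.8):
    """Return the indices of rows whose ink coverage is at least min_fill."""
    width = len(bitmap[0])
    return [r for r, row in enumerate(bitmap) if sum(row) >= min_fill * width]

def redraw_staff_rows(bitmap, min_fill=0.8):
    """Return a copy in which every detected staff row is a solid line."""
    staff = set(find_staff_rows(bitmap, min_fill))
    return [
        [1] * len(row) if r in staff else list(row)
        for r, row in enumerate(bitmap)
    ]
```

Filling the gaps in a broken line makes the row perfectly regular, which is what lets it compress well afterwards.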

I have already done 0.001% of the job :-) with an application that deletes stains smaller than the dot of a lowercase i in an indication like vivace. But don't expect a size reduction of more than about 5% so far. See an example here: http://imslp.org/wiki/Piano_Sonata_in_F-sharp_minor,_Op.2_No.2_(Fodor,_Carel_Anton)
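For readers wondering what such a stain filter might look like, here is a minimal sketch. It is an assumption about the general technique (dropping small connected blobs), not the actual application mentioned above. The page is a bilevel bitmap given as nested lists of 0/1, and `min_size` is a hypothetical pixel-count threshold that would be chosen just below the size of the dot of an i at the scan resolution.

```python
from collections import deque

def despeckle(bitmap, min_size):
    """Return a copy with every ink blob smaller than min_size pixels erased."""
    rows, cols = len(bitmap), len(bitmap[0])
    out = [list(row) for row in bitmap]
    seen = [[False] * cols for _ in range(rows)]
    for r in range(rows):
        for c in range(cols):
            if bitmap[r][c] and not seen[r][c]:
                # Flood-fill one blob, treating diagonal neighbours as connected.
                seen[r][c] = True
                queue = deque([(r, c)])
                blob = []
                while queue:
                    y, x = queue.popleft()
                    blob.append((y, x))
                    for dy in (-1, 0, 1):
                        for dx in (-1, 0, 1):
                            ny, nx = y + dy, x + dx
                            if (0 <= ny < rows and 0 <= nx < cols
                                    and bitmap[ny][nx] and not seen[ny][nx]):
                                seen[ny][nx] = True
                                queue.append((ny, nx))
                if len(blob) < min_size:
                    for y, x in blob:
                        out[y][x] = 0  # erase the speck
    return out
```

The threshold is the delicate part: set it too high and staccato dots or the dots of an i disappear along with the dirt.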

coulonnus
active poster
Posts: 1110
Joined: Thu Jul 12, 2007 8:53 am
Location: Nice, France

Postby coulonnus » Mon May 22, 2017 6:17 am

Also see https://en.wikipedia.org/wiki/SmartScore (it converts images to MIDI and to MusicXML). I think the best advice for saving bandwidth is: retypeset it! :-)

daphnis
Copyright Reviewer
Posts: 1609
Joined: Thu May 17, 2007 7:15 pm

Postby daphnis » Tue Jun 06, 2017 2:17 pm

I'm unclear on what the OP is proposing here: an implementation of an existing process, methodology, and format; a new one; or both?

Choralia
Site Admin
Posts: 671
Joined: Fri Aug 28, 2009 9:08 pm

Postby Choralia » Wed Jun 07, 2017 10:12 pm

imslp wrote:
Hi Kristof,

This sounds quite interesting and yes, funding is available, but first I'll need to know a bit more about your background and how the conversion works on a more technical level. My e-mail is eguo@imslp.org, and it may also be helpful to have a Skype call at some point. Let me know.

Thanks,
Edward

I hope this idea is progressing behind the scenes. Besides better quality and reduced bandwidth (as well as reduced storage space), conversion to a vector format may work as a pre-processing layer for optical music recognition programs, making it easier to transform scanned scores into files compatible with music editing software. Quite interesting, IMO.

Max

imslp
Site Admin
Posts: 1608
Joined: Thu Jan 01, 1970 12:00 am

Postby imslp » Thu Jun 08, 2017 12:20 am

Yep, this is progressing; I will announce it when the time comes.

coulonnus
active poster
Posts: 1110
Joined: Thu Jul 12, 2007 8:53 am
Location: Nice, France

Postby coulonnus » Thu Jun 08, 2017 5:29 am


