higher quality and less bandwidth by conversion to vector formats

Advice and Help

Moderator: kcleung

kuribas
Posts: 2
Joined: Wed May 17, 2017 3:13 pm

higher quality and less bandwidth by conversion to vector formats

Post by kuribas »

Hi,

I would like to propose a way to improve output quality and reduce bandwidth by converting scanned pages to vector output. Bitmap images take up a lot of space and contain a lot of redundancy. By converting the outlines to vector shapes and grouping similar elements together, the output quality can be improved and bandwidth reduced by a large factor. I am not suggesting OCR, because OCR performs additional semantic analysis and is fragile. Semantic analysis would only be an optional extra; by default the system would work without actually understanding the content. I am willing to implement such a system, and I wonder whether funding would be available for something like this.
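
To make the "grouping similar elements" idea concrete, here is a minimal Python sketch. It is not the proposed system: it assumes a tiny black-and-white page given as strings, treats each connected blob as a symbol, and only groups blobs that match pixel for pixel. A real implementation would trace outlines into curves and match shapes approximately. The names `components` and `group_shapes` are made up for this illustration.

```python
from collections import defaultdict

def components(bitmap):
    """Yield 4-connected components of '#' pixels as lists of (row, col)."""
    rows, cols = len(bitmap), len(bitmap[0])
    seen = set()
    for r in range(rows):
        for c in range(cols):
            if bitmap[r][c] != "#" or (r, c) in seen:
                continue
            stack, comp = [(r, c)], []
            seen.add((r, c))
            while stack:
                y, x = stack.pop()
                comp.append((y, x))
                for ny, nx in ((y + 1, x), (y - 1, x), (y, x + 1), (y, x - 1)):
                    inside = 0 <= ny < rows and 0 <= nx < cols
                    if inside and bitmap[ny][nx] == "#" and (ny, nx) not in seen:
                        seen.add((ny, nx))
                        stack.append((ny, nx))
            yield comp

def group_shapes(bitmap):
    """Map each distinct shape to the top-left positions where it occurs."""
    groups = defaultdict(list)
    for comp in components(bitmap):
        top = min(y for y, _ in comp)
        left = min(x for _, x in comp)
        # Store the shape relative to its own top-left corner so that
        # identical symbols at different positions compare equal.
        shape = frozenset((y - top, x - left) for y, x in comp)
        groups[shape].append((top, left))
    return dict(groups)

page = [
    "##..##..#.",
    "##..##..#.",
    "........#.",
    "##........",
    "##........",
]
groups = group_shapes(page)
# 4 symbols on the page, but only 2 distinct shapes need to be stored.
print(len(groups), "distinct shapes")
```

Each shape is stored once and every further occurrence costs only a position, which is where the saving would come from on a page full of repeated note heads, stems and clefs.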

Kristof Bastiaensen
Sallen112
active poster
Posts: 867
Joined: Wed Jan 12, 2011 12:52 pm

Re: higher quality and less bandwidth by conversion to vector formats

Post by Sallen112 »

This sounds interesting. I'll let our leader, Feldmahler, know.
imslp
Site Admin
Posts: 1639
Joined: Thu Jan 01, 1970 12:00 am

Re: higher quality and less bandwidth by conversion to vector formats

Post by imslp »

Hi Kristof,

This sounds quite interesting and yes, funding is available, but first I'll need to know a bit more about your background and how the conversion works on a more technical level. My e-mail is eguo@imslp.org, and it may also be helpful to have a Skype call at some point. Let me know.

Thanks,
Edward
coulonnus
active poster
Posts: 1530
Joined: Thu Jul 12, 2007 8:53 am
Location: Nice, France

Re: higher quality and less bandwidth by conversion to vector formats

Post by coulonnus »

Even without moving to a vector format, information theory implies that bitmap images compress to a much smaller size when they show clean figures than when they show dirty ones. A typical typeset PDF page is about 15 kB and a decent scanned page is about 100 kB. When I convert a typeset PDF to TIFF, change the page layout and convert the images back to PDF, the result is not much bigger than the original PDF. And a Henle scan is smaller than a scan of a ca. 1800 edition. :-)
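
A quick way to see the clean-versus-dirty effect is the following Python sketch. It is only an illustration under stated assumptions: the "page" is a synthetic strip with five staff lines, the dirt is random pixel flips at a made-up 2% rate, and zlib stands in for the bilevel codecs that scanned PDFs normally use.

```python
import random
import zlib

WIDTH, HEIGHT = 800, 200  # arbitrary demo size, in pixels

def staff_bitmap():
    """One byte per pixel (0 = white, 1 = black) with five staff lines."""
    rows = []
    for y in range(HEIGHT):
        on_line = y % 40 == 20  # lines at y = 20, 60, 100, 140, 180
        rows.append(bytes([1 if on_line else 0]) * WIDTH)
    return b"".join(rows)

def add_noise(pixels, rate, seed=0):
    """Flip each pixel with probability `rate` to imitate scanner dirt."""
    rng = random.Random(seed)
    return bytes(p ^ 1 if rng.random() < rate else p for p in pixels)

clean = staff_bitmap()
dirty = add_noise(clean, rate=0.02)

clean_size = len(zlib.compress(clean, 9))
dirty_size = len(zlib.compress(dirty, 9))
print("clean:", clean_size, "bytes; dirty:", dirty_size, "bytes")
```

Both inputs have the same number of pixels, yet the noisy one compresses to a much larger size because every random speck has to be encoded.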

So all PDFs made from scans would be much smaller if we had an application that recognizes a staff line and replaces it with a clean one. The same goes for note stems, beams, etc. Other symbols and text indications could come later (OCR).
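
As a rough sketch of the staff-line step, assuming a perfectly horizontal scan stored as rows of 0/1 pixels: count the black pixels in each row and treat rows that are almost entirely black as staff lines, then redraw them solid. The function names and the 80% threshold are hypothetical, and a real scan would first need deskewing and would have lines more than one pixel thick.

```python
def find_staff_rows(bitmap, min_fill=0.8):
    """Return indices of rows whose share of black pixels is at least min_fill."""
    width = len(bitmap[0])
    return [y for y, row in enumerate(bitmap) if sum(row) / width >= min_fill]

def redraw_staff_rows(bitmap, rows):
    """Return a copy of the bitmap with the given rows drawn as solid lines."""
    cleaned = [list(row) for row in bitmap]
    for y in rows:
        cleaned[y] = [1] * len(cleaned[y])
    return cleaned

scan = [
    [0, 0, 0, 0, 0, 0, 0, 0, 0, 0],
    [1, 1, 1, 0, 1, 1, 1, 1, 1, 1],  # staff line with a one-pixel gap
    [0, 0, 0, 0, 1, 0, 0, 0, 0, 0],  # part of a stem, not a staff line
    [1, 1, 1, 1, 1, 1, 0, 1, 1, 1],  # staff line with a one-pixel gap
    [0, 0, 0, 0, 0, 0, 0, 0, 0, 0],
]
staff_rows = find_staff_rows(scan)
cleaned = redraw_staff_rows(scan, staff_rows)
print(staff_rows)
```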

I have already done 0.001% of the job :-) with an application that deletes stains smaller than the dot of a lowercase i in an indication like vivace. But don't expect a size reduction of more than about 5% so far. See an example here: http://imslp.org/wiki/Piano_Sonata_in_F ... rel_Anton)
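
For readers wondering what such a stain filter can look like, here is a minimal Python sketch. It is not the application mentioned above, just one common approach: find connected blobs of black pixels and erase those below a size threshold. The function name and the threshold value are assumptions; in practice the threshold would be set just under the size of the dot of an i at the scan resolution.

```python
def remove_small_specks(bitmap, min_pixels):
    """Return a copy of a 0/1 bitmap with blobs smaller than min_pixels erased."""
    rows, cols = len(bitmap), len(bitmap[0])
    cleaned = [list(row) for row in bitmap]
    seen = [[False] * cols for _ in range(rows)]
    for r in range(rows):
        for c in range(cols):
            if not bitmap[r][c] or seen[r][c]:
                continue
            # Flood fill over 8-connected neighbours to collect one blob.
            stack, blob = [(r, c)], []
            seen[r][c] = True
            while stack:
                y, x = stack.pop()
                blob.append((y, x))
                for dy in (-1, 0, 1):
                    for dx in (-1, 0, 1):
                        ny, nx = y + dy, x + dx
                        inside = 0 <= ny < rows and 0 <= nx < cols
                        if inside and bitmap[ny][nx] and not seen[ny][nx]:
                            seen[ny][nx] = True
                            stack.append((ny, nx))
            if len(blob) < min_pixels:
                for y, x in blob:
                    cleaned[y][x] = 0
    return cleaned

scan = [
    [1, 0, 0, 0, 0, 0],  # isolated speck
    [0, 0, 0, 1, 1, 0],
    [0, 0, 0, 1, 1, 0],  # 2x2 blob that should survive
    [0, 0, 0, 0, 0, 0],
    [0, 1, 0, 0, 0, 0],  # isolated speck
]
cleaned = remove_small_specks(scan, min_pixels=3)
for row in cleaned:
    print("".join("#" if p else "." for p in row))
```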
coulonnus
active poster
Posts: 1530
Joined: Thu Jul 12, 2007 8:53 am
Location: Nice, France

Re: higher quality and less bandwidth by conversion to vector formats

Post by coulonnus »

Also read about SmartScore (https://en.wikipedia.org/wiki/SmartScore), which converts images to MIDI and to MusicXML. I think the best bandwidth advice is: retypeset it! :-)
daphnis
Copyright Reviewer
Posts: 1633
Joined: Thu May 17, 2007 7:15 pm

Re: higher quality and less bandwidth by conversion to vector formats

Post by daphnis »

I'm unclear on what the OP is proposing here: an implementation of an existing process, methodology, and format; a new one; or both?
Choralia
Site Admin
Posts: 762
Joined: Fri Aug 28, 2009 9:08 pm

Re: higher quality and less bandwidth by conversion to vector formats

Post by Choralia »

imslp wrote: Hi Kristof,

This sounds quite interesting and yes, funding is available, but first I'll need to know a bit more about your background and how the conversion works on a more technical level. My e-mail is eguo@imslp.org, and it may also be helpful to have a Skype call at some point. Let me know.

Thanks,
Edward

I hope this idea is progressing behind the scenes. In addition to better quality and reduced bandwidth (as well as reduced storage space), conversion to a vector format could serve as a pre-processing layer for optical music recognition programs, making it easier to turn scanned scores into files compatible with music editing software. Quite interesting, IMO.

Max
imslp
Site Admin
Posts: 1639
Joined: Thu Jan 01, 1970 12:00 am

Re: higher quality and less bandwidth by conversion to vector formats

Post by imslp »

Yep, this is progressing. I will announce it when the time comes.