FTP Server

Messages from and Discussions about IMSLP

Moderator: kcleung

Post by imslp » Mon Dec 08, 2008 4:07 pm

Hi all!

The FTP server is up, with two accounts:

1) An account for general uploads, for people who have large collections, but no time to submit them to IMSLP themselves.
2) An account for logo-infested files, so that other people can remove the logos and submit them.

If you want access to either one, simply e-mail me and I'll send you the user/pass. Note that at least at the beginning I'll restrict the FTP usage to seasoned IMSLP contributors, or people who have file collections. If you have some specific reason for wanting access, please also note that in the e-mail.

If there are any IMSLP contributors who are willing to manage this project, please reply here.
imslp
Site Admin
 
Posts: 1472
Joined: Thu Jan 01, 1970 12:00 am

Post by kcleung » Tue Dec 09, 2008 1:22 am

Thanks Feldmahler! This would let us tap into CDROM score resources and, in most cases, save us from scanning scores.

So would the next step be to set up a CDROM-request forum for scores which are known to be available in one of the PD CDROM series but are not yet on IMSLP?

Then for OM, it would be nice if we catalogue the copyright status and progress status of the works and have a system where people register before they rip the CDs and strip the works (separately).

For determining the copyright status of each work, we would need people who are more experienced than me on copyright stuff...
kcleung
Copyright Reviewer
 
Posts: 124
Joined: Fri Sep 12, 2008 9:38 pm

Post by Yagan Kiely » Tue Dec 09, 2008 1:57 am

kcleung wrote: So would the next step be to set up a CDROM-request forum for scores which are known to be available in one of the PD CDROM series but are not yet on IMSLP?

I don't believe this is all that necessary; it isn't like we are getting overrun by requests.

kcleung wrote: Then for OM, it would be nice if we catalogue the copyright status and progress status of the works and have a system where people register before they rip the CDs and strip the works (separately).

The project will involve only a select few people, so a place where people register probably isn't necessary. In terms of copyright, I'm not sure how to work this out yet.
Yagan Kiely
Site Admin
 
Posts: 1139
Joined: Sun Jan 14, 2007 8:16 am
Location: Perth, Australia

Post by tilmaen » Tue Dec 09, 2008 12:42 pm

Maybe we can get horndude77 to use his software to automatically remove the logos from the logo-infested files:
http://github.com/horndude77/image-scripts/tree/master
It'd be awesome to see the Orchestra Musicians CD-ROM library on IMSLP!
tilmaen
forum adept
 
Posts: 78
Joined: Fri Nov 21, 2008 2:10 pm

Post by Carolus » Thu Dec 11, 2008 8:38 am

Seems to be working quite nicely. I've already added some Beethoven scores for clean-up. We'll have a nice pile of things there before long.
Carolus
Site Admin
 
Posts: 1774
Joined: Sun Dec 10, 2006 11:18 pm
Location: St. Louis, MO, USA

Post by ras1 » Thu Dec 11, 2008 8:14 pm

And I've thrown in Violin Volume 7 of the orchestral parts.
ras1
active poster
 
Posts: 164
Joined: Thu Jul 26, 2007 8:28 pm

Post by Generoso » Fri Dec 12, 2008 9:03 am

I have just uploaded all 9 volumes of the Orchestra Musicians Cello parts!
Generoso
active poster
 
Posts: 251
Joined: Mon Mar 12, 2007 1:49 pm

Post by ras1 » Fri Dec 12, 2008 1:22 pm

Amazing! Next week I might have some time to start cleaning things up.
ras1
active poster
 
Posts: 164
Joined: Thu Jul 26, 2007 8:28 pm

Post by horndude77 » Sun Dec 14, 2008 5:03 pm

I created a clean folder underneath the Tchaikovsky cello section. The program I have got rid of most of the logos. There are just a few front pages where it did not work. Perhaps this can be improved. Take a look.
horndude77
active poster
 
Posts: 271
Joined: Sun Apr 23, 2006 5:08 am
Location: Phoenix, AZ

Post by ras1 » Sun Dec 14, 2008 8:29 pm

That's great! The only issue I have is that the labels at the top of each page, which weren't removed, were also added by CDSM. Is there an easy way to change the program to take those out too?
ras1
active poster
 
Posts: 164
Joined: Thu Jul 26, 2007 8:28 pm

Post by Leonard Vertighel » Sun Dec 14, 2008 11:32 pm

I see no reason for removing anything but the trademarks. Nothing else violates any laws, and removing it wouldn't hide where those scans come from anyway, since it's trivial to prove that they are pixel-wise identical.

Note however that it's not only front pages where logos were missed. (I've been toying with a script myself, and I've been running into the same kind of problem.)
Leonard Vertighel
Groundskeeper
 
Posts: 553
Joined: Fri Feb 16, 2007 8:55 am

Post by Carolus » Mon Dec 15, 2008 7:25 am

While there is technically no need to remove the added page numbers and titles, I'm generally not very fond of how crude they are, and I replace them in scores I've processed (see the vocal score for Puccini's Edgar for an example).

I do recommend removing all metatags, bookmarks and any other such embedded added info from the files. Giving them less than a leg to stand on is always prudent.
Carolus
Site Admin
 
Posts: 1774
Joined: Sun Dec 10, 2006 11:18 pm
Location: St. Louis, MO, USA

Post by Leonard Vertighel » Mon Dec 15, 2008 9:45 am

If we are going to remove the titles as well, then I'm afraid it's manual labor all the way. There is an added title at the top of every single page in the OM scores. (The page numbers on the other hand seem to be the original ones.) Removal based on coordinate position does not work (and even if it did, it would just leave us with no title at all), which is why horndude was working with a pattern matching algorithm for the logos. But since obviously the title is different for each file, this method is not an option. The only theoretical solution would be OCR, but in practice I don't believe this to be feasible.
Leonard Vertighel
Groundskeeper
 
Posts: 553
Joined: Fri Feb 16, 2007 8:55 am
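
For readers unfamiliar with the pattern-matching idea mentioned in the post above, here is a minimal sketch of it. This is not horndude77's actual code. It assumes a bilevel scan stored as a list of rows (1 = black ink, 0 = white paper) and a logo bitmap known in advance; the names `find_matches` and `erase_matches` are made up for illustration. It also shows why the same trick cannot handle the added titles: the pattern has to be known beforehand, and the title differs in every file.

```python
def find_matches(page, template, max_mismatches=0):
    """Return (x, y) top-left corners where the template occurs in the page.

    Both arguments are lists of rows of 0/1 pixels. A position counts as a
    hit if at most `max_mismatches` pixels differ (hypothetical tolerance
    knob for slightly noisy scans).
    """
    ph, pw = len(page), len(page[0])
    th, tw = len(template), len(template[0])
    hits = []
    for y in range(ph - th + 1):
        for x in range(pw - tw + 1):
            mismatches = 0
            for dy in range(th):
                for dx in range(tw):
                    if page[y + dy][x + dx] != template[dy][dx]:
                        mismatches += 1
                        if mismatches > max_mismatches:
                            break
                if mismatches > max_mismatches:
                    break
            if mismatches <= max_mismatches:
                hits.append((x, y))
    return hits


def erase_matches(page, template, hits):
    """Return a copy of the page with the template's black pixels whitened."""
    cleaned = [row[:] for row in page]
    th, tw = len(template), len(template[0])
    for x, y in hits:
        for dy in range(th):
            for dx in range(tw):
                if template[dy][dx]:
                    cleaned[y + dy][x + dx] = 0
    return cleaned


# Toy 6x5 "page" with one logo at column 3, row 1 and one stray ink pixel.
logo = [
    [1, 0],
    [1, 1],
]
page = [
    [0, 0, 0, 0, 0, 0],
    [0, 0, 0, 1, 0, 0],
    [0, 0, 0, 1, 1, 0],
    [1, 0, 0, 0, 0, 0],
    [0, 0, 0, 0, 0, 0],
]

hits = find_matches(page, logo)
cleaned = erase_matches(page, logo, hits)
print(hits)  # [(3, 1)]
```

Because the search scans the whole page, the logo is found wherever it was stamped, which a fixed-coordinate crop cannot do. On real scans an exhaustive pixel loop like this would be slow, so an image library's template-matching routine would be the practical choice.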

Post by Lyle Neff » Mon Dec 15, 2008 12:06 pm

Carolus wrote:[...] I do recommend removing all metatags, bookmarks and any other such embedded added info from the files. Giving them less than a leg to stand on is always prudent.

That reminds me: The other day I looked at a file (I think it was the first of the PDFs for Tchaikovsky's "The Seasons") and found that the uploader had embedded a yellow pop-up message on the first page (and apparently at the head of each piece) stating that it is from so-and-so's library and warning the reader to observe the copyright rules of his/her own country.

Since IMSLP already contains a warning like that, and since the source should be given in the "scanner" and "uploaded by" fields, that kind of thing should be removed as well, I should think -- although in this case it was apparently the uploader him/herself who had added the annoying messages to the PDF.
"A libretto, a libretto, my kingdom for a libretto!" -- Cesar Cui (letter to Stasov, Feb. 20, 1877)
Lyle Neff
active poster
 
Posts: 648
Joined: Wed Mar 14, 2007 3:21 pm
Location: Delaware, USA

Post by ras1 » Mon Dec 15, 2008 5:10 pm

In those cleaned Tchaikovsky files, are the titles on the first pages left over from the OMCDL labels, or did the program add them?
ras1
active poster
 
Posts: 164
Joined: Thu Jul 26, 2007 8:28 pm
