Renaming and relocating folders based on CSV files
## Deliverables
I need a desktop application. You may write it in the programming language of your choice, but it has to work on ALL versions of Windows. The description is long because I want to be as clear as possible. Please don't equate the length of the description with the amount of programming.
Features 1 and 2 must run in the order in which I describe them (first Feature 1, then Feature 2). For the remaining features, you decide when they run. The software is based on FOUR kinds of CSV files. Their names are:
Folders-WithLinks
Folders-WithoutLinks
Files-WithLinks
Files-WithoutLinks
To the winning bidder: in case I forget later, please remind me and ask me where those files are located on the hard drive.
I will describe the first two files (Folders). The same logic applies to the other two files (Files). The file ''Folders-WithLinks'' contains foldernames with links in the following format/syntax (one folder per line):
foldername,''link''
In the file ''Folders-WithLinks'' a link is always given next to each foldername. In the file ''Folders-WithoutLinks'', however, a link might or might not be given. The software must always check both files. It should NEVER, without any exception, touch the foldernames where a link is NOT given. The entire software is meant to work only with foldernames where links are given. Please reread the previous sentence to make sure you understand it. As I said, the same logic applies to the files ''Files-WithLinks'' and ''Files-WithoutLinks''.
The software will never work with all four files in the same run. In one run it works with either the first and second CSV files OR the third and fourth CSV files.
Each folder has its own link, so it is really important which exact link is NEXT TO the foldername. This means you always look to the right of the foldername, on the SAME line of the CSV file. There could be just a few links, hundreds, thousands or tens of thousands. It should not matter; the software has to work with any amount. As an extreme case, in ''Folders-WithoutLinks'' and ''Files-WithoutLinks'' there could be only ONE link among thousands of lines. The software will then work with this one link and with the foldername (file ''Folders-WithoutLinks'') or filename (file ''Files-WithoutLinks'') that belongs to it.
It is very important that the software doesn't mix up those links. They are totally different. It would be a disaster to take the link from the next line for the folder/file on the current line. So:
Foldername,''link1''
Therefore ''link1'' (the link on the same line) MUST be used for the folder ''Foldername'', and NOT any other one. In all four CSV files there will always be only one link per folder/file.
Links can contain some ''weird'' symbols (not available on the keyboard) and non-English letters as well. The software should still work without any problem.
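A minimal sketch of how one of the four CSV files could be read so that a link is only ever paired with the name on its own line. It assumes the files are UTF-8 text and that the link column uses standard CSV double quotes (the ''link'' notation above is taken to stand for that). The function name rows_with_links and the example.com URLs are hypothetical.

```python
import csv
import io

def rows_with_links(lines):
    """Return (name, link) pairs, skipping every row that has no link."""
    pairs = []
    for row in csv.reader(lines):
        if len(row) < 2:
            continue  # no link column at all: this entry must not be touched
        name, link = row[0].strip(), row[1].strip()
        if name and link:
            pairs.append((name, link))  # link comes from the SAME row as name
    return pairs

# Hypothetical sample data: a comma inside a quoted link, two rows without
# a link, and a name with a non-English letter.
sample = (
    'folder_a,"http://example.com/a,b"\n'
    'folder_b,\n'
    'folder_c\n'
    'földer_d,"http://example.com/d"\n'
)
print(rows_with_links(io.StringIO(sample)))
```

For a real file, the same function could be fed with open(path, newline="", encoding="utf-8"); the encoding is an assumption to confirm against the sample CSV files.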
I will discuss later with the winning bidder where those folders/files are located on the hard drive. There is no rule (lower or upper limit) about how many files the folders contain.
The software will need to connect to each link and do some renaming and relocating (NOT copying but only relocating!) of folders from one location to another. When working with files, the software will need to create folders, move some files inside and relocate (NOT copy) everything from one location to another. For details, see feature 2.
To all potential bidders: With your bid (with a definite amount, either at the beginning of your bid or after you ask me some questions) you automatically AGREE to the following three points:
- I won't respond to bidders whose first post talks only about themselves or their company. At least one question must be asked, and of course you are more than welcome to ask more. If someone doesn't ask me anything in the first message, I will have to ignore the bid and the message. The question must be related to the project.
- You are very welcome to request samples of the CSV files. There is no further obligation after looking at them; it is 100% your decision.
- Some of the files/folders can have rude and insulting names. It is my duty to mention this. So with your bid you guarantee that you won't be offended by such words. In our communication, if such files/folders come up, I won't censor them and you don't need to either.
Here are the descriptions of features:
FEATURE 1:
This feature 1 is related strictly only to folders and only to the ones that do NOT have links given. So strictly only to the file Folders-WithoutLinks.csv. The feature has nothing to do with folders that have the links. The feature has nothing to do with files and it doesn't matter whether they have links or not.
Condition:
Links are provided based on several sources. But for this project description and for this condition, the following id3v2 tags are relevant:
artist name
album title
publisher
discogs_catalog
catalog #
catalogue_number
Some (or all) of those id3v2 tags are custom ones and can be accessed/viewed/edited only via the free software called ''foobar2000''. You can download this software at:
[login to view URL]
The condition has two subconditions, and BOTH (!) subconditions must be met for the condition to be met:
First subcondition of Condition: The following three id3v2 tags MUST ALL, I repeat ALL, be given (so those three id3v2 tags should NOT be empty):
artist name
album title
publisher
Second subcondition of Condition: At least ONE (so one or more) of the following id3v2 tags MUST be given (so it/they should NOT be empty):
discogs_catalog
catalog #
catalogue_number
I repeat that the condition is met only if both subconditions are met. If the condition is NOT (!) met, then feature 1 should delete the folder permanently (without moving it to the recycle bin). If the condition is met, then feature 1 should NOT do anything.
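The condition can be sketched as a small check over a dictionary of tags. This is only an illustration: how the tags are read from the files is left open, the names has_value and condition_met are hypothetical, and treating a tag that contains only spaces as empty is an assumption.

```python
REQUIRED_TAGS = ("artist name", "album title", "publisher")
CATALOG_TAGS = ("discogs_catalog", "catalog #", "catalogue_number")

def has_value(tags, key):
    """A tag counts as given only if it exists and is not blank (assumption)."""
    value = tags.get(key)
    return value is not None and str(value).strip() != ""

def condition_met(tags):
    """True only if BOTH subconditions hold."""
    all_required = all(has_value(tags, key) for key in REQUIRED_TAGS)
    any_catalog = any(has_value(tags, key) for key in CATALOG_TAGS)
    return all_required and any_catalog

# Example tag values are made up for illustration.
complete = {"artist name": "Some Name", "album title": "Example",
            "publisher": "Example", "catalog #": "abc34"}
missing_publisher = {"artist name": "Some Name", "album title": "Example",
                     "discogs_catalog": "abc034"}
print(condition_met(complete), condition_met(missing_publisher))
```

If the check returns False, a call such as shutil.rmtree(folder) would remove the folder without going through the recycle bin; that call is destructive, so it is deliberately left out of the sketch.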
FEATURE 2:
I will start the description of this feature with the paths. I don't want to mention the real paths yet (I will do so later), so for now I will mention just the ''temporary'' paths:
Folders (with files inside) are located here: C:\XXXXX
Files are located here: C:\XXXXX\YYYYY
Folders that are going to be renamed (how? see the rest of the description) and relocated will be transferred from C:\XXXXX to C:\XXXXX\ZZZZZ
Again, I will give the real path names of XXXXX, YYYYY, ZZZZZ to my accepted bidder. In all four CSV files with lists of links, there could be links (URLs) from 6 different websites:
[login to view URL]
[login to view URL]
[login to view URL]
[login to view URL]
[login to view URL]
[login to view URL]
Each of those websites (whichever link occurs in the list for a specific foldername/filename) is a source of data for renaming folders. How they should be renamed and where exactly in C:\XXXXX\ZZZZZ they should be relocated (NOT copied!) depends on what the page behind the link says. Alongside those 6 possible sources, there is also a 7th source. This 7th source is relevant ONLY for folders where a link is NOT given. The 7th source has nothing to do with folders/files where a link is given.
The 7th source is the following id3v2 tags:
artist name
album title
publisher
discogs_catalog
catalog #
catalogue_number
Feature 2 can work in four different environments:
1. Environment: foldernames, links given (6 possible sources: 6 possible websites)
2. Environment: foldernames, links not given (only 1 possible source: the 7th, mentioned above)
3. Environment: filenames, links given (5 possible sources: 5 possible websites; all links from the Triplevision website must be ignored, but only when working with filenames, so only in this third environment)
4. Environment: filenames, links not given (no source available)
For each environment, I will explain separately what feature 2 needs to do. For the first environment, I will also walk through one randomly chosen folder per source.
1. Environment: foldernames, links given:
First possible source:
foldername is: deniro_-_state_of_mind-(hk039)-promo-vinyl-1999-tdj
next to this foldername, the list of links gives the following link (on the same line, of course):
[login to view URL]
This folder is (like every other one mentioned in the list) located in the folder: C:\XXXXX
The syntax according to which the foldername ''deniro_-_state_of_mind-(hk039)-promo-vinyl-1999-tdj'' gets renamed is this:
[CATID] Artist Name - Album Title
If you look at the page behind this particular link (please open the link and look at the page carefully), you will see that:
CATID: HK 039
Artist Name: Deniro
Album Title: State Of Mind
Everything must exactly match what the page behind the link next to the foldername says. So with this syntax the folder should be renamed to:
[HK 039] Deniro - State Of Mind
The file(s) inside this folder don't need to be renamed.
Along with renaming the folder, another folder must be created. I repeat, it is not renamed but created (the software needs to create it). The name of this new folder should exactly match the label name. So in this particular case, the name of the new folder will be:
Hook Recordings
On the page you can see a hyperlink, but please don't rely on it too much. The webmasters can remove it at any time, in which case ''Hook Recordings'' will remain without a hyperlink.
So after renaming the folder, creating the new one and naming it, the fourth step for this folder needs to be done: relocating.
As I said, the folder that has/had the name ''deniro_-_state_of_mind-(hk039)-promo-vinyl-1999-tdj'' must be relocated from C:\XXXXX into C:\XXXXX\ZZZZZ at exactly the same time as it is renamed. The newly created (and named) folder (''Hook Recordings'') must also be placed in the same path C:\XXXXX\ZZZZZ. At the same time, one more step needs to be done: the renamed folder (''[HK 039] Deniro - State Of Mind'') must be moved into the newly created (and named) folder (''Hook Recordings''). So the final result (path) is the following:
C:\XXXXX\ZZZZZ\Hook Recordings\[HK 039] Deniro - State Of Mind
I need to repeat that XXXXX, ZZZZZ (and later YYYYY) are just temporary names and not the real path. I will tell you each of those three names later.
All of the files located in a particular folder before renaming/relocating must always end up in exactly the same (but now renamed and relocated) folder. In one run the software will be working with hundreds of folders and therefore thousands of files, and it is really important that the files (which don't need to be renamed) end up in the same folder in the final result.
Note that the software might, in the same run (session), work with another folder that also belongs to ''Hook Recordings''. In this case another folder ''Hook Recordings'' should NOT be created again. It couldn't be anyway, because Windows doesn't allow two folders with exactly the same name in the same location. In this case just relocate (I repeat: not copy but relocate) this other folder into the existing, recently created folder ''Hook Recordings''.
Case sensitivity is important. It all depends on what is shown on the link's page.
No traces of anything should EVER remain in C:\XXXXX. That could happen if you accidentally wrote the code for copying instead of relocating.
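The rename, create and relocate steps can be sketched with a single move, assuming the new folder name and the label name have already been taken from the link's page. The function name relocate_folder is hypothetical, and the demo runs in a temporary directory instead of the real paths.

```python
import os
import shutil
import tempfile

def relocate_folder(source_dir, target_root, label, new_name):
    """Move source_dir to target_root/label/new_name; nothing is copied."""
    label_dir = os.path.join(target_root, label)
    os.makedirs(label_dir, exist_ok=True)  # reuse the label folder if it exists
    destination = os.path.join(label_dir, new_name)
    if os.path.exists(destination):
        # Assumption: never merge into or overwrite an existing release folder.
        raise FileExistsError(destination)
    shutil.move(source_dir, destination)  # renames and relocates in one step
    return destination

# Demo in a throwaway directory standing in for C:\XXXXX.
with tempfile.TemporaryDirectory() as root:
    src = os.path.join(root, "deniro_-_state_of_mind-(hk039)-promo-vinyl-1999-tdj")
    os.makedirs(src)
    open(os.path.join(src, "track.mp3"), "w").close()
    final = relocate_folder(src, os.path.join(root, "ZZZZZ"),
                            "Hook Recordings", "[HK 039] Deniro - State Of Mind")
    print(os.path.relpath(final, root))
    print(os.path.exists(src))  # the old folder is gone
```

Because the whole folder is moved, the files inside travel with it, and exist_ok=True covers the case where a second folder of the same label is processed in the same run.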
The same logic applies to the other 5 possible sources of the same (first) environment. So I will shorten the description of those 5 sources considerably, starting with the next paragraph.
Second possible source:
foldername: lego_planet_-_indigo__insert__toxic-(clel030)-web-2008-euphoric
link: [login to view URL]
foldername will be renamed into: [CLEL 030] Lego Planet - Toxic
new folder: Club Elite Holland
final result (path): C:\XXXXX\ZZZZZ\Club Elite Holland\[CLEL 030] Lego Planet - Toxic
Third possible source:
foldername: mat_zo-moonset-(rd026)-web-2008-ukhx
link: [login to view URL]
foldername will be renamed into: [RD026] Mat Zo - Moonset
new folder: RealDeep
final result (path): C:\XXXXX\ZZZZZ\RealDeep\[RD026] Mat Zo - Moonset
Fourth possible source:
foldername: Regis and ruskin - blueprint son
link (much longer link but hopefully it won't confuse you):
[login to view URL]~details,u~1699772,p1~vinyl/xe/[login to view URL]
foldername will be renamed into: [BP030.1] Ben Klock - Post-Traumatic Son (Regis & Ruskin Rmxs)
new folder: Blueprint
final result (path): C:\XXXXX\ZZZZZ\Blueprint\[BP030.1] Ben Klock - Post-Traumatic Son (Regis & Ruskin Rmxs)
Note *****: I typed the symbol ''*****'' only because I am going to refer to this note once more later in the description, so you will know where in the project's description to look. For the vinyl-distribution website: in this particular case there is only one pair of brackets on the page: ''Ben Klock (Blueprint)''. The new folder is named with whatever is between the brackets, so ''Blueprint''. It could happen that the same location on the page has more than one pair of brackets, such as ''Ben Klock (Blueprint) (Xz)''. In this case, name the new folder with whatever is between the LAST pair of brackets. In this situation the new folder would be ''Xz''.
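The ''last brackets'' rule from this note can be sketched with a regular expression. The function name label_from_last_brackets is hypothetical, and the sketch assumes the text from that location of the page has already been extracted and that the brackets are not nested.

```python
import re

def label_from_last_brackets(text):
    """Return the content of the LAST (...) pair, or None if there is none."""
    groups = re.findall(r"\(([^()]*)\)", text)
    return groups[-1].strip() if groups else None

print(label_from_last_brackets("Ben Klock (Blueprint)"))       # Blueprint
print(label_from_last_brackets("Ben Klock (Blueprint) (Xz)"))  # Xz
```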
Fifth possible source:
foldername: VA-Run_Rabbit_Run-WEB-2010-MTC
link: http://www.somixx.com/en/eps/profile/Run+Rabbit+Run
foldername will be renamed into: [Devils011] Jason Little, Weichentechnikk, Dj Hammond - Run Rabbit Run
new folder: The Devils Rejects
final result (path): C:\XXXXX\ZZZZZ\The Devils Rejects\[Devils011] Jason Little, Weichentechnikk, Dj Hammond - Run Rabbit Run
Sixth possible source:
foldername: HT_RWRX-Boom_Blast_BW_Once_Upon_A_Time_In_The_West-(VLS)-(HTRWRX_002)-2010-VAG
link: http://www.triplevision.nl/release/HTRWRX+002/
foldername will be renamed into: [HTRWRX 002] Unknown - Boom Blast - Once Upon A Time In The West
new folder: HT RWRX
final result (path): C:\XXXXX\ZZZZZ\HT RWRX\[HTRWRX 002] Unknown - Boom Blast - Once Upon A Time In The West
2. Environment: foldernames, links not given
Obviously there are no links next to the foldernames here and therefore no pages to gather data from. But the id3v2 tags of the folder (that is, of the file(s) inside) exist. So they need to be used. As already mentioned, the important id3v2 tags are:
artist name
album title
publisher
discogs_catalog
catalog #
catalogue_number
Because of the condition in feature 1, all data is given. If it weren't, the folder would be gone (feature 1 would permanently delete it without moving it to the recycle bin). I said at least one of the following id3v2 tags must have data (information inside):
discogs_catalog
catalog #
catalogue_number
But it can happen that two or even all three have information. If this happens, discogs_catalog should have the highest priority as the source of information. So check this tag first. If this tag doesn't have information, either of the other two can be chosen at random. But NEVER, I repeat NEVER, use more than one tag out of those three.
So the folder, when relocating it from C:\XXXXX into C:\XXXXX\ZZZZZ, should be renamed using the following syntax:
[discogs_catalog OR catalog # OR catalogue_number] Artist Name - Album Title
new created folder: publisher
And then the final result should be in the path:
C:\XXXXX\ZZZZZ\publisher\renamed folder
Example:
artist name: some Name
album title: example of title
publisher: example
discogs_catalog: abc034
catalog #: abc34
Note: I purposely ignored catalogue_number id3v2 tag. The reason for this is because I said that AT LEAST ONE out of those three id3v2 tags (discogs_catalog, catalog #, catalogue_number) must be given (= shouldn't be empty).
So:
foldername: [text] some_name-example_title-tbc
foldername will be renamed into: [abc034] Some Name - Example Of Title
new folder: Example
final result (path): C:\XXXXX\ZZZZZ\Example\[abc034] Some Name - Example Of Title
Note: As I mentioned, the reason why ''abc034'' is used instead of ''abc34'' is that, among those three id3v2 tags (discogs_catalog, catalog #, catalogue_number), discogs_catalog has the highest priority, provided of course that it isn't empty. A tag that is empty and a tag that doesn't exist mean the same thing.
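A minimal sketch of the tag priority, using the example values above. The function name pick_catalog_id is hypothetical. Since either of the two remaining tags may be chosen, the sketch simply takes ''catalog #'' before ''catalogue_number''; that fixed order is an assumption, not a requirement.

```python
CATALOG_PRIORITY = ("discogs_catalog", "catalog #", "catalogue_number")

def pick_catalog_id(tags):
    """Return the value of exactly one catalog tag, discogs_catalog first."""
    for key in CATALOG_PRIORITY:
        value = (tags.get(key) or "").strip()  # missing and empty are the same
        if value:
            return value
    return None

example_tags = {
    "artist name": "some Name",
    "album title": "example of title",
    "publisher": "example",
    "discogs_catalog": "abc034",
    "catalog #": "abc34",
}
print(pick_catalog_id(example_tags))  # abc034
```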
3. Environment: filenames, links given:
The files (the ones whose filenames appear in the file with the list of potential links) are located in the folder C:\XXXXX\YYYYY. In this case the story is a bit different from the first and second environments. The first step is that the software should check the number of tracks on the page of the given link. To show what should be counted as tracks (each adding +1 to the count), I will use the same randomly chosen links as in the description of the first environment.
Link: [login to view URL]
Number of tracks (see number of lines below ''Tracklist''): 2
Link: [login to view URL]
Number of tracks (see the number of lines below ''Title''): 3
Link: [login to view URL]
Number of tracks (see the number of lines below ''Tracks In This Release''): 2
Link: [login to view URL]~details,u~1699772,p1~vinyl/xe/[login to view URL]
Number of tracks (see number of lines below ''Tracks auf diesem Album''): 3
Link: http://www.somixx.com/en/eps/profile/Run+Rabbit+Run
Number of tracks (see number of lines below ''Track Artist(s) Genre(s) Price Rate''): 4
Links from Triplevision must be in this third environment IGNORED! With ''IGNORED'' I meant there shouldn't be any deleting involved or anything else (such as relocating) but software should just skip them and do nothing. I repeat this is relevant only for third environment (only for filenames, never foldernames) and strictly only for Triplevision website.
Once the software gets the number of tracks on the specific page (IF a file has a link, there is one link per file), it has to check, in the entire file with the list of filenames and links, whether exactly the same link is repeated as many times as there are tracks on the link's page. For example the link:
[login to view URL]
must be repeated exactly 2 times. Why? Because there are 2 tracks on the page. As I mentioned, the ENTIRE file ([login to view URL]) with filenames must be checked.
If the link occurs a different number of times, whether more or fewer (for example only once, three times, four times and so on), then the software shouldn't do anything with this particular file. Absolutely no deleting, no relocating, no creating of files, no checking, nothing. It just skips it and goes to the next file and link.
But if the link is repeated the right number of times (in this particular case 2 times), then it has to go through 3 steps:
1. In the C:\XXXXX\ZZZZZ it creates the folder with syntax which is the same as in first environment:
[CATID] Artist Name - Album Title
But the logic is a bit different from the first environment. There (first environment) the folder was REnamed, but here it is newly created and named. In the first step (previous paragraph) I mentioned just the syntax but not the actual name. So in the case of the link:
[login to view URL]
New folder has to be created and named:
[RD026] Mat Zo - Moonset
Note: be careful that you don't replace ''Artist Name'' with ''Album Title''. Artist Name comes first then space, then ''-'', then another space and then Album Title.
2. Another folder needs to be created. The name of the folder is ''RealDeep'' (please look at the link's page and see what it says next to ''Label:''). This is also the same as in the first environment.
Now in the same second step, the folder created in the first step, which is
[RD026] Mat Zo - Moonset
and which must be located in C:\XXXXX\ZZZZZ
must be moved inside folder created in second step which is ''RealDeep''.
Note *****. I have already explained the note with the symbol ''*****'', which is relevant for the vinyl-distribution website. Keep it in mind in this step as well.
3. As already mentioned, the files (mp3s) are located in C:\XXXXX\YYYYY. By now the software has already recognized (otherwise it wouldn't have reached step 1) that there are exactly two files that have exactly the same link ([login to view URL]) in the file with the list of filenames (mp3s) and links. Those two files must be relocated (= moved and NOT copied) from C:\XXXXX\YYYYY into:
C:\XXXXX\ZZZZZ\RealDeep\[RD026] Mat Zo - Moonset
And the work for this particular case is done. So the software can go on to the next file with a link in the list.
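The counting check that gates the three steps above can be sketched as follows. It assumes the (filename, link) pairs have already been read from the whole CSV file and that the number of tracks has already been taken from the link's page; fetching and parsing the page is not shown. The function name files_ready_for_link and the example.com URLs are hypothetical.

```python
from collections import Counter

def files_ready_for_link(pairs, link, track_count):
    """Return the filenames for link only if it occurs exactly track_count times."""
    counts = Counter(pair_link for _, pair_link in pairs)
    if counts[link] != track_count:
        return []  # too few or too many: skip, touch nothing
    return [name for name, pair_link in pairs if pair_link == link]

pairs = [
    ("track_one.mp3", "http://example.com/release/1"),
    ("track_two.mp3", "http://example.com/release/1"),
    ("lonely.mp3", "http://example.com/release/2"),
]
print(files_ready_for_link(pairs, "http://example.com/release/1", 2))
print(files_ready_for_link(pairs, "http://example.com/release/2", 3))
```

Only when the returned list is non-empty would the folder be created and the listed files moved into it.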
4. Environment: filenames, links not given:
In this situation the software shouldn't do anything. No relocating, no checking, no deleting, absolutely nothing. The reason is that files will never have all the needed id3v2 tags for that specific environment. So again, if a filename (NOT foldername!) in the list doesn't have a link, do nothing. Just skip it and go to the next filename.
This ends the description of the four environments.
FEATURE 3:
As said in feature 2, a lot of REnaming and naming must be done, but Windows doesn't allow certain symbols in a foldername. Feature 3 should take care of this. It is supposed to convert characters that are NOT allowed into allowed ones when REnaming and naming. So:
If occurs symbol ''/'' then convert it to ''-''.
If occurs symbol ''\'' then convert it to ''-''.
If occurs symbol ''?'' then remove it. Don't add the space. For example ''Question?'' should become ''Question''.
If occurs symbol ''*'' then remove it. Don't add the space. For example ''Somethi*ng'' should become ''Something''.
If occurs symbol ''?'' then convert it to ''e'' (case sensitivity of symbol doesn't matter).
If occurs symbol ''?'' then convert it to ''o'' (case sensitivity of symbol doesn't matter).
If occurs symbol ''?'' (see where this ''1'' is being located) then convert it to ''1''. If occurs symbol ''?'' then convert it to ''2'' and so on.
If occurs symbol ''?'' then ignore it. Don't add the space.
If occurs symbol ''?'' then convert it to ''i'' (case sensitivity of symbol doesn't matter).
If occurs symbol ''β'' then convert it to ''B''.
If occurs symbol ''?'', even if Windows accept it to be located in foldername, convert it to ''a'' (case sensitivity of symbol doesn't matter).
If occurs symbol ''?'' then convert it to ''o'' (case sensitivity of symbol doesn't matter).
If occurs symbol ''A'' then convert it to ''A''.
If occurs symbol ''?'' then remove it. Don't add the space.
If occurs symbol ''?'' then convert it to ''n'' (case sensitivity of symbol doesn't matter).
If occurs symbol ''?'' then remove it. Don't add the space.
The rest of not allowed symbols (nonenglish alphabet) shall be discussed later.
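A minimal sketch of how such a conversion table could look. It covers only the four rules for ''/'', ''\'', ''?'' and ''*''; the other conversions would be added to the same table once the exact symbols are agreed. The function name sanitize_name is hypothetical.

```python
# Characters mapped to None are removed without adding a space.
CONVERSIONS = str.maketrans({
    "/": "-",
    "\\": "-",
    "?": None,
    "*": None,
})

def sanitize_name(name):
    """Apply the character conversions to a folder name."""
    return name.translate(CONVERSIONS)

print(sanitize_name("Question?"))    # Question
print(sanitize_name("Somethi*ng"))   # Something
print(sanitize_name("A/B\\C"))       # A-B-C
```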
FEATURE 4:
If either REnamed or named folder (in feature 2) ENDS with one of the following words (words are NOT case sensitive so ''ep'' is the same as ''eP''):
ep
e.p
e.p.
vinyl
then remove/delete/ignore this word because the (re)named folder shouldn't END (only END) with any of those. BUT VERY IMPORTANT:
This feature 4 is relevant ONLY for the folders that are getting renamed or named (if third environment in feature 2) according to the syntax:
[CATID] Artist Name - Album Title
Those four words should NOT be removed/deleted/ignored (on endings) on any other folders (foldernames).
I know that Windows already removes the dot if a foldername ends with it, but I still mentioned this example. Note that removing the word might leave an additional space. Remove/delete/ignore this additional space as well.
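The ending rule can be sketched with a case-insensitive regular expression. The function name strip_trailing_word is hypothetical. The sketch assumes the word must be preceded by a space, so that a title such as ''Deep'' does not lose its last two letters, and it removes at most one trailing word.

```python
import re

# Matches " ep", " e.p", " e.p." or " vinyl" at the very end, any letter case.
TRAILING_WORD = re.compile(r"\s+(?:e\.p\.?|ep|vinyl)\s*$", re.IGNORECASE)

def strip_trailing_word(folder_name):
    """Remove one trailing ep / e.p / e.p. / vinyl and the space before it."""
    return TRAILING_WORD.sub("", folder_name).rstrip()

print(strip_trailing_word("[HK 039] Deniro - State Of Mind eP"))
print(strip_trailing_word("[HK 039] Deniro - State Of Mind E.P."))
print(strip_trailing_word("[HK 039] Deniro - Deep"))
```

The function would be applied only to names built with the [CATID] Artist Name - Album Title syntax, never to the label folders.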
FEATURE 5:
In feature 2 you saw that the first letters of words (except the ones from the discogs_catalog, catalog #, catalogue_number id3v2 tags) get capitalized (''abc'' becomes ''Abc'') if they are not already capitalized either on the page or in the id3v2 tags. Feature 5 should make sure that the first letters of the following words never get capitalized (words are NOT case sensitive):
vs
vs.
van
den
dem
der
das
die
versu
versus
feat
feat.
ft
ft.
featuring
featuring.
featuring'
featurin
featurin'
featuri
featuri'
present
present'
present.
pres
pres.
pres'
presents
presents'
presents.
presenting
presenting'
presenting.
presentin
presentin.
presenti'
meet
meet.
meets
meets.
So whatever is between ''['' and '']'' (= CATID in the folder syntax) shouldn't get capitalized unless it is already written that way on the website's page.
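A minimal sketch of the capitalization with the exception list above. It would be applied to the Artist Name and Album Title parts only, not to the CATID. The function name capitalize_words is hypothetical. Leaving a listed word exactly as it was found (instead of forcing it to lowercase) is an assumption, as is splitting words on single spaces.

```python
# Words whose first letter is never capitalized (compared in lowercase).
NEVER_CAPITALIZE = {
    "vs", "vs.", "van", "den", "dem", "der", "das", "die", "versu", "versus",
    "feat", "feat.", "ft", "ft.", "featuring", "featuring.", "featuring'",
    "featurin", "featurin'", "featuri", "featuri'",
    "present", "present'", "present.", "pres", "pres.", "pres'",
    "presents", "presents'", "presents.",
    "presenting", "presenting'", "presenting.",
    "presentin", "presentin.", "presenti'",
    "meet", "meet.", "meets", "meets.",
}

def capitalize_words(text):
    """Capitalize the first letter of each word except the listed ones."""
    result = []
    for word in text.split(" "):
        if word and word.lower() not in NEVER_CAPITALIZE:
            word = word[0].upper() + word[1:]  # rest of the word is untouched
        result.append(word)
    return " ".join(result)

print(capitalize_words("some Name"))
print(capitalize_words("example of title"))
print(capitalize_words("artist one feat. artist two"))
```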
FEATURE 6:
This feature relates only to the first and third environments of feature 2 and only to the discogs website, so only to this particular source out of the 7 possible sources.
Warning: I exceeded the limit of description length. Please download attached rar for remaining. Confirm to me that you did this.