Questions on, limitations of and some insights on web sources


#1

I am about to complete a larger web source script, but there are a couple of things I do not know or have no solution for. So I thought it might be a good idea to ask others for help and to document what I have learned in the last few days. I will also add some information below on undocumented or probably deprecated keywords. Please add your corrections, additional documentation or your own findings through replies.

Questions:

  • Is there a "supported" way to get information from the ParserScriptIndex block to the ParserScriptAlbum block? I tried to set output buffers in the ParserScriptIndex block with OutputTo "XYZ" Say "BlaBla", but this does not work: the buffers are lost and do not exist during ParserScriptAlbum execution.
  • Is there a way to access existing tag fields of the selected file on which the web source is executed? The only place with read access to these fields seems to be the [SearchBy] line, is that correct?
  • Is there a way to automatically (download and) embed a cover? I have seen other scripts using an output buffer named coverurl. Does mp3tag embed the image or the URL? What happens if the URL uses the file:/// protocol?
  • Is there a way to "clear" or delete an existing tag? I tried to use set "tagname", which according to the online help should reset the contents of tagname. But the empty buffer is not propagated out of the web source script. So, for example, if my file has the DISCNUMBER tag set and my web source indicates that it should not have a DISCNUMBER, I cannot erase or delete the tag from the web source, right? Instead, I need to take a two-step approach: set a temp tag whose presence or contents I check from an action invoked after running the web source. The action then deletes the tag in question (and the temp tag too, of course). Is there a better solution?
  • Language-specific tags: how to write more than one? There are at least two language-specific tags in ID3: USLT (unsynchronised lyrics) and COMM (comment). The spec states that multiple USLT tags may exist, but only one per combination of language and content descriptor. The same goes for COMM. So it is, in theory, possible for an mp3 file to contain one comment marked as English (eng) and one marked as German (deu). This is important because some devices only display comments (or lyrics) if they are in a specific language. For example, iTunes only displays English lyrics and comments, and Windows XP only displays comments that match the system language. mp3tag web sources do _not_ allow specifying the language. Instead, mp3tag takes the language value it writes to the tag from its own language files (mp3tag/lang/*.lng). The language is specified in _M_STR_ID3V2LANG. So it seems that there is no way to write both a German and an English comment from a web source, right? Furthermore, things behave differently in the tag panel. COMM is mostly the same (the language is taken from the lng file and cannot be changed once mp3tag is running). USLT tags, however, get their initial language value from the language file, *but* if one opens the extended tag dialog, the language is not only displayed, it can even be changed. I would really appreciate a consistent way to create multiple COMM and USLT tags with different languages from within an action or a web source.

Some things probably not commonly known:

  • debugwriteinput This undocumented command, which is used in some other web sources, saves the raw response received for the Index and the Album requests to a file. Use it like the "debug" command in the ParserScriptIndex and the ParserScriptAlbum block. The single mandatory parameter specifies the file name. Example: debugwriteinput "C:\test\juno_co_uk_A.htm"
  • %_preview%/CurrentPreview In addition to the %_url% a web source script might specify that it will provide a distinct URL for previews (used when invoking the preview button). In the [IndexFormat] line, the pattern %_preview% determines where the web source will provide the preview url to mp3tag. Access to the actual preview url from within the ParserScriptIndex and the ParserScriptAlbum block is possible through the output buffer "CurrentPreview".
  • More undocumented websource keywords? There seem to be some additional commands but I do not have enough time for more experiments. Maybe someone can share insights on sayas, incoutput, decoutput, extern, onerror and savedata? Here is what I found out until now:
    • Savedata "filename" seems to be an alias of debugwriteinput "filename".
    • decoutput takes the name of the active output buffer, removes the last character and creates a new output buffer with the shortened name.
    • incoutput "suffix" takes the name of the current output buffer, appends the single mandatory parameter to the name and creates a new output buffer with the combined name.
    • sayas "arg1" "arg2" takes two arguments. Not sure what it is expected to do.
    I have no specific ideas regarding extern and onerror.

BTW: I really do like mp3tag.
-u302320

#2

Good topic!

In the order of your questions:

  1. As far as I know, no.
  2. As far as I know, no.
  3. Use outputto "coverurl" followed by say "http://forums.mp3tag.de/style_images/13/logo4.gif". The output has to be the URL of a picture. The image is saved, not the URL. It is embedded in the tag and/or saved to the folder. The user can control this with the Utils button at the bottom left of the "Adjust tag information" window of web sources.
  4. say " " is a trick I use in one of my scripts. It does not delete a tag field but overwrites it with a single space. Not the best solution, but a quick workaround.
  5. I guess it's not possible. Not only with web sources but also with Mp3tag itself.

#3

Thank you for explaining the CoverUrl behaviour. I will come back on this in a separate feature request posting.

I want to add that I actually know two ways of transferring information from ParserScriptIndex to ParserScriptAlbum, but both are so clumsy that I didn't want to list them in the original post.
First, you can add your information at the end of the Album URL. You can, for example, append a dummy parameter &mydata=somestring. Restricted access to this information is possible with SayOutput "CurrentUrl" from within the ParserScriptAlbum block. Did I call it clumsy?
The second way is similar, but uses the preview URL. Add %_preview% to the IndexFormat and provide the data at the position indicated by the %_preview% pattern. Access is possible through SayOutput "CurrentPreview". Of course you lose the preview feature in turn, but you can at least get one block of data from the first to the second block. Additional benefit: you don't have to worry about entity encoding and similar stuff, since you get the data provided as %_preview% back unmodified.
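
To illustrate the first workaround outside of mp3tag, here is a rough Python sketch of the idea: data is appended to the album URL as a dummy query parameter and read back from that URL later. The helper names add_payload and read_payload, the parameter name mydata and the example.com URL are only assumptions for illustration. Inside a web source you would have to cut the value out of CurrentUrl with the usual script commands, and the encoding step shown here is exactly the part you need to worry about with this approach.

```python
from urllib.parse import parse_qs, urlencode, urlsplit


def add_payload(album_url, payload, key="mydata"):
    """Append a dummy query parameter that carries data for the album step."""
    separator = "&" if urlsplit(album_url).query else "?"
    return album_url + separator + urlencode({key: payload})


def read_payload(current_url, key="mydata"):
    """Recover the payload from the URL, as the album step would see it."""
    values = parse_qs(urlsplit(current_url).query).get(key, [])
    return values[0] if values else None


# Hypothetical album URL; the payload is URL-encoded on the way in
# and decoded again on the way out.
url = add_payload("http://example.com/release?id=42", "disc 2/remix")
print(url)
print(read_payload(url))
```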

Regarding your " " trick: if my goal were to suppress the old information, that would be the way to go. But I want to delete the tag itself, so I suppose I need the additional remove action. I will address this as well in a feature request.

-u302320


#4

Today, I discovered the lower left table of the Adjust tag dialog. Not sure why it took so long, but maybe it's because I mostly use single file tagging with web sources that do not support this nice feature. Shame on me! How could I miss that?

So, as far as I understand, it is possible to return lists of track-specific information from the ParserScriptAlbum block. I've seen the following columns with distinct values for each track: track number (through output buffer 'tracks'), track length (via '_length'), artist name (artist) and of course title (title). From reading the scripts, I guess that the disc number could also contain track-specific data and that both discnumber/totaldiscs and just discnumber are supported.

I tried to find further information on how exactly this works and what can be done with this feature in the program's help, but had no success. I remember having read a thread here in the forum regarding tracks and problems with the track/numtracks format or something like that, but I did not fully understand the problem at the time I read the posts.

So, some questions again:

  • Which columns may contain track-specific data? Does it work with any column/field as long as it's formatted as value1|value2|...valueN? If not, which fields are supported? What about coverurl?
  • Does _length contain just user information or will its values end up in TLEN? Are there additional built-in _fields like _length?
  • What restrictions should I take into account before changing everything to support this .... what is it called, btw?

#5

In case you haven't come up with that yourself already, I would suggest setting each output to a predefined value (e.g. $delete$) and then using a 'Replace with regular expression' action to check _TAG for its presence (set it to replace with nothing).
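
As a rough model of that clean-up step, here is a small Python sketch. It is not an mp3tag action, only an illustration of the matching logic; the function name strip_placeholders is made up, and the sketch assumes that a field left with an empty value ends up removed.

```python
import re

# Matches a value that consists of nothing but the placeholder.
PLACEHOLDER = re.compile(r"^\$delete\$$")


def strip_placeholders(fields):
    """Drop every field whose value is exactly the $delete$ placeholder."""
    cleaned = {}
    for name, value in fields.items():
        value = PLACEHOLDER.sub("", value)
        if value:  # assumption: an empty value means the field is removed
            cleaned[name] = value
    return cleaned


fields = {"TITLE": "Glade", "DISCNUMBER": "$delete$"}
print(strip_placeholders(fields))
```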

I'm glad you're trying to clarify all this. I wish I knew the answers, too. Let us know if you discover anything.


#6

I experimented a little bit more. This is what I have found out so far, based on mp3tag 2.48d:

(At least) two variables (aka output buffers) with a special meaning exist.
tracks
This output buffer may hold multiple title values. The end of every value must be marked with | (vertical bar). If tracks contains at least two values the track view in the lower left area of the adjust tag information dialog will be populated with a track number (index starting with 1) and the title value taken from the tracks output buffer.
_length
This output buffer may contain the individual track lengths. Hours, minutes and seconds can be specified as hh:mm:ss; hours and minutes are optional. Numbers greater than 6000 are interpreted as milliseconds. If the web source only provides minutes, append ":00" to prevent mp3tag from treating minutes as seconds. If used, at least one _length value must be non-empty. The length is purely informative; it is not stored in, for example, %length%.
coverurl
Multiple values in the coverurl output buffer are not supported. If multiple values are present, coverurl will be ignored completely and no covers will be stored.
other fields
mp3tag seems to support multiple values in any other field too. However, the track list will only be enabled if a) multiple values in fields other than tracks are provided and b) multiple files are selected. If tracks contains multiple values, the track list is enabled even if only a single file is selected. Be warned: I have had mixed results with multi-valued buffers (other than tracks, discnumber, title and _length). I have even seen trashed data in the track list, or the Adjust tag information dialog not showing up at all!
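
The _length rules described above can be written down as a short Python sketch. This only models the behaviour I observed, it is not mp3tag's actual code, and the function name length_to_seconds is made up for illustration.

```python
def length_to_seconds(value):
    """Interpret one _length value following the rules observed above.

    Returns None for an empty value.
    """
    value = value.strip()
    if not value:
        return None
    if ":" not in value:
        number = int(value)
        # Observed: plain numbers above 6000 are taken as milliseconds,
        # anything else as seconds.
        return number // 1000 if number > 6000 else number
    seconds = 0
    for part in value.split(":"):  # [hh:][mm:]ss
        seconds = seconds * 60 + int(part)
    return seconds


print(length_to_seconds("3:25"))    # minutes and seconds
print(length_to_seconds("205000"))  # milliseconds
print(length_to_seconds("4"))       # treated as seconds, not minutes
print(length_to_seconds("4:00"))    # ":00" appended to mean four minutes
```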

In case the tracks view is enabled (see above for conditions) some additional things may happen automatically:

  1. The user can adjust the mapping between files (selected in the file view) and tracks (provided by the web source) with the "Move up" and "Move down" buttons. The default mapping is one to one, so the first file is associated with the first track, the second file with the second track and so on. If there are fewer files than tracks, dummy lines are inserted into the file list. If there are more files than tracks, the surplus files will remain untagged.
  2. mp3tag will automatically populate the %track% field from the track index column. The total number of tracks is not stored in the %track% field.
  3. mp3tag will automatically populate the %title% field with values taken from the tracks output buffer. The values of the title output buffer will be stored as distinct %subtitle% values. A non-empty subtitle output buffer will be ignored by mp3tag in this case. If the title output buffer is single-valued, it will be ignored in favor of the tracks value and the subtitle buffer will work as usual.
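
The default mapping from point 1 can be modelled like this in Python. It is only a sketch of the observed behaviour, with a made-up function name (map_files_to_tracks); the manual reordering with "Move up" and "Move down" is not covered.

```python
from itertools import zip_longest


def map_files_to_tracks(files, titles):
    """Pair files and tracks one to one, as in the default mapping."""
    rows = []
    pairs = zip_longest(files, titles)
    for index, (file_name, title) in enumerate(pairs, start=1):
        if title is None:
            # More files than tracks: the surplus file stays untagged.
            rows.append((file_name, None, None))
        else:
            # file_name is None when a dummy line fills in for a missing file.
            rows.append((file_name, index, title))
    return rows


for row in map_files_to_tracks(["a.mp3", "b.mp3"], ["Spiritual", "Glade", "Glade"]):
    print(row)
```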

As I'm only a plain user the above information may be accurate or not. I tried to be as specific as possible.


#7

I also struggled with processing track information a few months ago, but could not find in-depth documentation or any forum comments back then. I'm glad this topic was raised, I think u302320 wrote an excellent (simple yet clear) post, and I hope we will still get some more replies (automating tagging becomes more important as a collection grows).

Based on several tests, I would conclude that almost any tag can be populated with 'looping' track information, except maybe the track number (generated automatically). The tracks loop is started by outputting the first 'TITLE' value to 'TRACKS' and then adding a pipe '|' as delimiter. Before outputting the next track title, I make sure to add output to every other tag, again followed by a pipe delimiter. The values of the last track get a trailing pipe delimiter as well.

This means that :

  1. Each field thus gets as many pipes as the album has tracks.
  2. The script should always add the same set of 'track-tags', even if the input stream does not contain any. Once I am in the do-loop for tracks, I first add the value for tag TITLE, and then it seems I have to add every other value (i.e. the other tags for that track) before looping to the next track. It was impossible to change previously output tracks. A bit funky, but it could have to do with how the web sources engine has been written.

I hope I can clarify with a simplified example. Imagine the album has 3 tracks; track 1 & 2 simply have a title and artist, but the 3rd track is a remix by another artist and this should be stored in tag 'MIXARTIST'. Then the input stream could look like :

1SpiritualAllaby 2GladeAllaby 3GladeAllabyTimewaveRemix

Then the script would look like :
findline ""
moveline 1
do

outputto "TRACKS"
findinline "<title>"
sayuntil "</title>"
say "|"
outputto "ARTIST"
findinline "<artist>"
sayuntil "</artist>"
say "|"
# We use a temporary variable to ensure MIXARTIST gets proper empty values
# ... and whatever value it had, it's temporary ...
set "TEMP_ARTIST" ""
movechar 9
if "<extraartist>"
    # We put the extra artist in the temporary field, not knowing yet if it is the mixartist
    outputto "TEMP_ARTIST"
    sayuntil "<role>"
    movechar 6
    ifnot "Remix"
        # The extra artist is not the remixer, the temp variable is cleared
        set "TEMP_ARTIST" ""
    endif
else
    movechar -9
endif
# The temp variable will only contain a value if an extra artist was a 'remixer'
outputto "MIXARTIST"
sayoutput "TEMP_ARTIST"
say "|"
moveline 1

while ""

In DEBUG output you can nicely see the constructed values for each tag :

Total output:
output["TRACKS"]= "Spiritual|Glade|Glade|"
output["ARTIST"]= "Allaby|Allaby|Allaby|"
output["MIXARTIST"]= "||Timewave|"

The websources tag screen will have 4 columns for tracks :
Tracks / Title / Artist / Mixartist
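
To make the pipe convention concrete, here is a small Python sketch that turns such buffers into one row per track. It is only a model of what I see in the dialog, not how mp3tag does it internally; the names split_buffer and buffers_to_rows are made up. It also shows why every buffer needs exactly as many pipes as there are tracks.

```python
def split_buffer(value):
    """Split a pipe-terminated buffer into its per-track values."""
    parts = value.split("|")
    # Every value ends with "|", so the last split element is empty padding.
    return parts[:-1] if parts[-1] == "" else parts


def buffers_to_rows(buffers):
    """Build one dict per track from the pipe-delimited output buffers."""
    columns = {name: split_buffer(value) for name, value in buffers.items()}
    track_count = len(columns["TRACKS"])
    for name, values in columns.items():
        if len(values) != track_count:
            raise ValueError(name + " does not have one value per track")
    return [
        {name: values[i] for name, values in columns.items()}
        for i in range(track_count)
    ]


rows = buffers_to_rows({
    "TRACKS": "Spiritual|Glade|Glade|",
    "ARTIST": "Allaby|Allaby|Allaby|",
    "MIXARTIST": "||Timewave|",
})
for row in rows:
    print(row)
```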

I experimented with constructing every tag in its own do-loop, but could not get that to work. It is like the values are committed when outputting anything to TRACKS after a pipe delimiter. I also got the impression that any modifications done with the statement REPLACE were undone and the original input stream was restored.
And I would assume that once the TRACKS loop is started, all 'album-tags' should have been added, although I did not test explicitly on that.

The above behaviour, together with the limitation that nested do-loops are impossible, sometimes makes it hard to have your script react correctly to all the differences that can occur within the input stream. I simply accept that some manual finalizing is needed sometimes...

I attached my script to this reply, so you can play with it. Keep in mind though that I did not write it for all Discogs entries; it is tailored to the kind of albums I tag. I also run an Action afterwards, but that is not relevant to the topic discussed here (however, one of the things it does is update the tag TOTALRECORDS with the number of selected tracks).
These are 2 discogs-ids that run smoothly : 121839, 119286
And this is one where I hit the limits : 38985

(If you are not familiar with Discogs, it is a website that stores album information; each album has a unique Discogs id, and the album can be accessed with a URL starting with 'http://www.discogs.com/release/' followed by the id. The script does not expect HTML though, but uses the Discogs API to retrieve XML; you can view that XML using the link on the web sources screen or by typing the URL http://www.discogs.com/release/38985?f=xml..._key=1e48c7f4e4 where 38985 is the Discogs id of the album).

P.S. I also experimented to get multiple coverurls using different delimiters, nothing worked.

Discogs_by_ID.src (7.43 KB)


#8

You don't have to start with TITLE/TRACKS in the loop. E.g. my discogs scripts write DISCNUMBER, TRACK TEMP and ARTIST before it comes to TITLE/TRACKS.

That's right. What is once in the output can't be changed with commands like "regexpreplace" or "replace".

All modifications of the input text are undone when you use any command which changes the line. Your example script in your post has "moveline 1" at the end of the track loop, so replacements done before the loop are undone after the first track. There are two solutions:

  1. do the replacements/modifications within the loop, so that they are done for every track again.
  2. use a joinuntil command which gets the whole tracklist into one line before you use your tracklist loop.

But your attached discogs api script doesn't change the line within the track loop, since the xml files have the whole tracklist in one line. You can make replacements right before the loop (but after findline ""). These replacements will be valid for the whole loop.


#9

Pone, thank you for your comments.
I assume the FINDLINE will cause problems only if it really results in moving to another line. That is very helpful, and could explain my failure trying to fill the tags each with its own loop... interesting. Will have another go at it, as it would make it much easier to catch all the additional track values.