CSV export fails on value containing semicolon

The export fails to quote the property value, so the semicolon is output as a false separator.

Workaround: recode.

There is no world wide standard for the so called CSV file format, but there are some recommendations.
http://de.wikipedia.org/wiki/CSV_(Dateiformat)
http://en.wikipedia.org/wiki/Comma-separated_values
http://tools.ietf.org/html/rfc4180

DD.20091003.1001.CEST

There is no world wide standard for the so called CSV file format

Sorry Detlev but that's irrelevent. This bug breaks the file for /any/ app that imports CSV format.

Yes, the Mp3tag user should always know and make sure what is right or wrong when using Mp3tag templates.

I recommend the nice 'CSVed' tool for working with csv files.
http://csved.sjfrancke.nl/

DD.20091003.1132.CEST

the Mp3tag user should always know and make sure what is
right or wrong when using Mp3tag templates.

Well I suggest it would be easier for the user if there was not this thing wrong with Mp3tag for him to have to know.

I recommend the nice 'CSVed' tool for working with csv files.
http://csved.sjfrancke.nl/

Thanks.

I'll happily incorporate any fix you can provide that covers any types of separators that can be interpreted by Excel in an erroneous way.

Please also note, that the behaviour of Excel differs depending on your current system locale.

I'll happily incorporate any fix you can provide that covers
any types of separators that can be interpreted by Excel in an erroneous way.

There is only one type of separator used in this file - that's the semicolon I am reporting fails. And the fault affects all importing programs, not just Excel.

CSV = means "Comma Separated Values"

but Microsoft (Excel) uses the settings from the Regional Settings - which depends on the country + language or manual change.

There is a RFC also but completely a shit (I had a contact before they have defined it but ... today this standard doesn't help of course).

If you using "America" the separator is a comma (",") - if you are using "Germany" the separator is the "semicolon" (";"). There may be other separators too of course (a vertical bar "|" or the tabulator).

e.g.
a,b,123, a value
a;b;123, a value

The delimiter character is a double quote ("). It must be used if a control character, the separator or the double quote character itself is inside the values. It is not required for spaces - but highly recommended. It depends on the application which reads the data if it auto-removes the spaces (with or without quotes).

e.g.
a,"b,",123, a value
or better
a;"b;";123," a value"

A double quote character will be "quoted" by itself

e.g.
the values (enclosed in <>):
<He said "How are you?"> <Here is a ; separator.>
will be (by using ; as separator
pueblo funky;"He said ""How are you?""";"Here is a ; separator."

Sometimes erroneous a separator will be called delimiter - but that is not the same. A separator "separates" the values. A delimiter encloses the value at the beginning and at the end.

Also all trailing separators could be removed before output.

e.g.
a,"b, ",123, a value,,,,
=>
a,"b, ",123, a value

Empty values - nothing between separators or nothing inside between double quotes - should be interpreted as NULL values (as for databases).

It is also important that CSV does NOT define data types!

When you write a date in the format "YYYY-MM-DD" - MS Excel detects this date format but converts it into the regional date format - which might be "DD.MM.YYYY" for Germany. When you write back the CSV file you destroy the file because "DD.MM.YYYY" will be written!

There is also a problem with the decimal point. MS Excel does not detect the decimal point (a comma "," or a point "."). There might be a auto-detection but it doesn't work if you using some values with or without a comma.

These should be all options how to write/read a CSV file. Do not take the regional settings for reading/writing - it the depends on the destination application how to read the file. That's not always Excel!

And note:

MS Excel however always writes the character set "iso-8859-1" and can only read this ...

EXCEPT:

The best is you are using the so called - "Unicode Text" under MS EXCEL.

This is:
Character encoding: UTF-16LE-BOM (Little Endian with BOM header ("FF FE"))
Separator character: Tabulator
Quoting character: Double Quote
File extension: .csv

Normally when you click "Save As" under MS EXCEL the file extension ".txt" will be used - but that is shit because when you double click on this file the editor (e.g. notepad) opens the file (and all looks destroyed).

The file extension ".csv" is the best. MS EXCEL has an auto-detection and opens the file correctly.

The values above is the only one solution + combination which works in MS EXCEL (- tested with 2003).

It's also important that the date format is the ISO date format: YYYY-MM-DD.

For the decimal character the decimal point "." should be used - but has to be tested - don't know it yet exactly what happens.

And the big advantage is also that ANY Unicode character can be used.

This works worldwide and is the best solution for Florian - no own options for a "csv" are required.

I hope this helps how too ...

ROBERT

PS: Don't forget ID3 for WAV :wink:

PSS: There is also a RFC for TSV (tab separated values but this specification does not allow the tab inside the values - and .tsv is nothing used anywhere).

Re. Export to CSV:
Is anyone going to fix the issue of texts containing the semicolon?
I would suggest surrounding values with the ; with a quotation mark ("), which IIRC will make the csv open correctly in Excel :

a;"b;";123,a value

Can you just change the template to surround any placeholders by double quotation marks then?

Example: "%title%";"%artist%";"%album%" instead of %title%;%artist%;%album%

I believe that can fail if any value contains a double quote.

It does not completely fail:
It keeps the columns in one piece (as opposed to the plain semicolon that leads to an extra column)
But the double quotes move somehow:
original Title: The Return "of" Babel
exported data: The Return of" Babel"

The best way to export to csv files would still be to insert $char(9) as field separator.
`%title%$char(9)%artist%$char(9)%album%
exported data now: The Return "of" Babel

It does not completely fail

It can completely fail to preserve the value.

Ah, this reminds me of the unsyncedlyrics field - this also has to be taken care of if it contains linebreaks which will probably be interpreted as "end of record".
So while the user takes care of this field, he could also make sure that the chosen separator is actually a unique one.

Using $char(9), as @ohrenkino suggested, really seems the way to go there. It represents the TAB character and is very unlikely to be used in any tag field.

Any objections to using this as the separator in the default CSV export configuration?

I would suggest to issue 2 csv-exports:
one for Excel with $char(9) as separator
and another one with the semicolon - because:
if you use the result of the $char(9) csv-export, it will be tricky to re-import that data back into MP3tag with the function Convert>Text file-Tag as you cannot enter the tab character in the input box for the field pattern.

I would prefer a solution that worked for all characters. I don't want an in-value separator character messing up the import.

I just imported data generated by the csv-export with this as exported fields:
%title%;%artist%;%album%;%track%;%year%;%_length_seconds%;%_file_size%;%_file_mod_date%;%_folderpath%;%_filename_ext%;
and the data in TITLE
Out the Blue "Xilent Remix"

and the exact same data was imported into Excel 2016.
I had to use the import assistant, though.
So could it be that this is mainly a problem of the target application?