Json_select without a key

So I’m trying to grab separate values using json_foreach in an array without an initial key, which works as documented, but I’m wondering how to select each value when they don’t have a key. For example:

["af-ZA","ar-AE","ar-BH"..."zu-ZA"]

I do have a solution using RegexpReplace to slot in a key per value:

[{"value":"af-ZA"},{"value":"ar-AE"},{"value":"ar-BH"}...{"value":"zu-ZA"}]
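For anyone following along, here is a minimal sketch of that wrapping idea outside Mp3tag, written in Python rather than as a RegexpReplace line. The function name wrap_values_with_regex and the "value" key are only illustrative choices, and it assumes a flat array that contains nothing but strings (objects or numbers in the array would need a different pattern).

```python
import json
import re


def wrap_values_with_regex(raw: str, key: str = "value") -> str:
    """Wrap every quoted string of a flat JSON array in {"key": ...}."""
    # One double-quoted JSON string, allowing backslash-escaped characters.
    string_token = r'"(?:[^"\\]|\\.)*"'
    # A function is used as the replacement so backslashes in the
    # matched text are copied through untouched.
    return re.sub(string_token, lambda m: '{"%s":%s}' % (key, m.group(0)), raw)


raw = '["af-ZA","ar-AE","ar-BH","zu-ZA"]'
wrapped = wrap_values_with_regex(raw)
print(wrapped)

# The result is still valid JSON, now with a key to select on.
print([item["value"] for item in json.loads(wrapped)])
```

The same substitution should translate to a single regular expression replace in a script, but the exact escaping there is something to test rather than take from this sketch.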

A similar method could be achieved with json_select_array but still gets me wondering if I’m missing something. In terms of grabbing the value using only json_foreach, no luck with json_select or any Say commands.

Am I missing something or is there potential for using json_select without referring to a key? This might also benefit json_select_object for accessing any unkeyed objects in an array.

Question: Was this example ported from real-world data, or did you create the array yourself?

From the definition of the JSON format, objects are meant to be represented as key-value pairs; so that an array with only values is not supposed to exist (and be considered valid JSON data).

As for your solution, it's a perfectly fine workaround.

Apologies, I should have specified it’s an array generated using this. I’m using ellipses just to minimise an example list, this one being 143 values.

Summary
["af-ZA","ar-AE","ar-BH","ar-EG","ar-IQ","ar-JO","ar-LY","ar-MA","ar-QA","ar-SA","ar-TD","ar-YE","be-BY","bg-BG","bn-BD","bn-IN","br-FR","ca-AD","ca-ES","ch-GU","cs-CZ","cy-GB","da-DK","de-AT","de-CH","de-DE","el-CY","el-GR","en-AG","en-AU","en-BB","en-BZ","en-CA","en-CM","en-GB","en-GG","en-GH","en-GI","en-GY","en-IE","en-JM","en-KE","en-LC","en-MW","en-NZ","en-PG","en-TC","en-US","en-ZM","en-ZW","eo-EO","es-AR","es-CL","es-DO","es-EC","es-ES","es-GQ","es-GT","es-HN","es-MX","es-NI","es-PA","es-PE","es-PY","es-SV","es-UY","et-EE","eu-ES","fa-IR","fi-FI","fr-BF","fr-CA","fr-CD","fr-CI","fr-FR","fr-GF","fr-GP","fr-MC","fr-ML","fr-MU","fr-PF","ga-IE","gd-GB","gl-ES","he-IL","hi-IN","hr-HR","hu-HU","id-ID","it-IT","it-VA","ja-JP","ka-GE","kk-KZ","kn-IN","ko-KR","ku-TR","ky-KG","lt-LT","lv-LV","ml-IN","mr-IN","ms-MY","ms-SG","nb-NO","ne-NP","nl-BE","nl-NL","no-NO","oc-FR","pa-IN","pl-PL","pt-AO","pt-BR","pt-MZ","pt-PT","ro-MD","ro-RO","ru-RU","si-LK","sk-SK","sl-SI","so-SO","sq-AL","sq-XK","sr-ME","sr-RS","sv-SE","sw-TZ","ta-IN","te-IN","th-TH","tl-PH","tr-TR","uk-UA","ur-PK","uz-UZ","vi-VN","zh-CN","zh-HK","zh-SG","zh-TW","zu-ZA"]

Another abbreviated example I can give is an API query to TMDB to get the episodes from season 1 of a TV show. Most of the crucial parts are keyed no problem, except for episodes which are objects in an array like:

"episodes":[{"air_date":"2011-04-17","episode_number":1 ... "name":"Winter Is Coming"},{"air_date":"2011-04-24","episode_number":2 ... "name":"The Kingsroad"}]

So sadly, TMDB does deal with invalid JSON. My solution to get a specific episode without having to json_foreach through all the objects was saving the website response to an “Input” output buffer, using another RegexpReplace to find the “episodes” object in the raw JSON, then regex again to find the episode that way. A command for that purpose would need a bit more thought to get an object via a value instead of a key.

Fair enough if there isn’t an official way, just considerations for making JSON easier to handle. I’m okay with the workarounds and hopefully this helps others if they come across a similar issue :grin:

And I too must apologize, as I was also mistaken.

I understood your use of ellipsis to shorten a long array. Going to the link you provided, it shows the data with the same formatting as in your 1st post. And that data is JSON formatted.

I was not aware of this until now, so we both have learned something new from this topic.
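If anyone wants to double-check this outside Mp3tag, a quick sketch with Python's standard json module shows that a parser accepts both shapes: an array of plain values at the root, and unnamed objects inside a named array. The sample strings are cut down from the examples in this topic.

```python
import json

# A root array of plain string values, no keys anywhere.
root_array = json.loads('["af-ZA","ar-AE","ar-BH"]')

# Unnamed objects sitting inside a named array.
nested = json.loads('{"episodes":[{"episode_number":1},{"episode_number":2}]}')

print(type(root_array).__name__, root_array)
print(type(nested["episodes"][0]).__name__, nested["episodes"][0])
```

Both strings parse without error, so neither layout is invalid JSON; only the members of an object need names.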


From your original question, the unnamed array can be called using json_foreach "", but only if it's a root array. When dealing with arrays nested in objects, they must be named, or Mp3tag will throw an error.

So for your case of a root array populated with only array literals, the (unnamed) array *can* be selected, but currently there doesn't seem to be a way to select those array literals using JSON commands.

Your idea of using RegexpReplace to turn the array literals into objects (and then call the object literals from the corresponding key) was spot on. :+1:


Edit: the entire entry on JSON on W3schools makes a very interesting read on this topic, especially the JSON syntax page.

Thank you @rboss for sticking with me :smile:

Granted, I got my experience with JSON through iOS Shortcuts (basically unusable now, thanks Apple :face_with_symbols_on_mouth: ) but thanks for clarifying further that arrays are still valid without keys :grinning_face:

If given the same ability to read unnamed/unkeyed targets, could json_select_array solve this? As long as a website response is consistently structured for every query, e.g. “episodes” always returns episodes in ascending numerical order?

…I realise I’m now pitching several new things at once but it could lead to a unified solution as opposed to changing the behaviour of several commands. :woozy_face:

Oh no. I have some reading to do. :face_with_bags_under_eyes:

I had to run a few tests to be sure, but the ability to read unnamed root arrays seems to be limited to json_foreach.

Given a sample named JSON array of literals and a few test commands:

Use "{"array":["a","b","c","d"]}"
json "ON" "current"
json_select_array "array" -1 ","
json_select_array "array" 1
json_foreach "array"
json_foreach_end

I get (respectively):

a,b,c,d     ##_ json_select_array "array" -1 ","
a           ##_  json_select_array "array" 1
4           ##_ json_foreach "array"   ---> (number of literals in array)

The same (array name removed) commands on the unnamed variant:

Use "["a","b","c","d"]"
json "ON" "current"
json_select_array "" -1 ","
json_select_array "" 1
json_foreach ""
json_foreach_end

now returns

<empty>     ##_ json_select_array "" -1 ","
<empty>     ##_ json_select_array "" 1
4           ##_ json_foreach ""   ---> (also number of literals in array)

So it would seem that

only json_foreach "" has support for unnamed root arrays, unlike json_select_array.

@Florian can I bother you again to chime in an opinion on this, please? :sweat_smile:

I've just released Mp3tag v3.33-beta.5, where the function json_select_array now supports JSON arrays as unnamed root element.

Had a read up on those W3schools links and I’ve got a far better understanding now of what’s valid/invalid JSON. The initial suggestions of json_select "" or json_select_object "" are flawed as this would suggest working with invalid JSON. The only way to use an unnamed array is indeed if it’s the first “root” value in a response.

Anyway, I ran some tests with json_select_array "" which might help others trying to understand how to access arrays and their contents depending on different data types. Everything works except for singular numeric values:

Use "["1",2,{"test":"3"},{"test":4},{"test":"5","key":"value"},{"object":{"test":"6"}},{"test":["7","text"]},{"test1":["8","a"],"test2":["b","c"]}]"
json "on" "current"

# Any #
json_select_array "" -1 "/"
    SayRest
SayNewline

# String #
json_select_array "" 1
    SayRest
SayNewline

# Number #
json_select_array "" 2
    SayRest
    SayNextNumber
SayNewline

# Object with Key:String (needs unselected) #
json_select_array "" 3
    json_select "test"
    SayRest
json_unselect_object
SayNewline

# Object with Key:Number #
json_select_array "" 4
    json_select "test"
    SayRest
json_unselect_object
SayNewline

# Object with Multiple Values #
json_select_array "" 5
    json_select "test"
    SayRest
    json_select "key"
    SayRest
json_unselect_object
SayNewline

# Object in Object #
json_select_array "" 6
    json_select_object "object"
        json_select "test"
        SayRest
    json_unselect_object
json_unselect_object
SayNewline

# Array #
json_select_array "" 7
    json_select_array "test" -1
    SayRest
json_unselect_object
SayNewline

# Multiple Arrays #
json_select_array "" 8
    json_select_array "test1" -1
    SayRest
    json_select_array "test2" -1
    SayRest
json_unselect_object
SayNewline

["Output"]= 1///////
1

3
4
5value
6
7text
8abc

I noticed a funky thing in debug files: when an object is selected, it reveals its internal index (0-based). Not a problem, just an observation:

Command        : json_select_array
Parameter 2    : >3<

JSON objects   : >[].2<
JSON loops     : ><

json_select_array is better suited for selecting via an index rather than the initially described issue of getting singular values from a json_foreach loop. That being said, json_select_array doesn’t support %placeholder% numerical parameters to retrieve these values, so if a loop-through is required, the solution is again just transforming an array of singular values into key:values using RegexpReplace or json_select_array with "},{"value":" as a delimiter.
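To show why that delimiter works, here is a rough Python sketch of the string handling only. It assumes the joined output is the values separated by the delimiter (as in the a,b,c,d test earlier), that a fixed prefix and suffix are added afterwards, and that no value contains a quote or backslash. The function name wrap_with_delimiter is hypothetical.

```python
import json


def wrap_with_delimiter(values, key="value"):
    """Rebuild a keyed array of objects from plain values via one join."""
    delimiter = '"},{"%s":"' % key
    # Stand-in for selecting the whole array with the delimiter above.
    joined = delimiter.join(values)
    # The first opening and last closing pieces are not produced by the
    # join, so they are added around it.
    return '[{"%s":"' % key + joined + '"}]'


keyed = wrap_with_delimiter(["af-ZA", "ar-AE", "zu-ZA"])
print(keyed)
print(json.loads(keyed)[1]["value"])
```

The join supplies everything between the values, which is why only the very first opening and the very last closing piece need adding by hand.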

In the meantime, another new feature's working well. :slightly_smiling_face:

On Mp3tag v3.33-beta.5 the results are now:

a,b,c,d     ##_ json_select_array "" -1 ","
a           ##_ json_select_array "" 1
4           ##_ json_foreach ""   ---> (also number of literals in array)

Thank You :love_you_gesture: very much @Florian.

Yes it does. And it's a godsend. :baby_angel:

I know this because I remember (vividly) the exact bug that eventually led to this addition. If you're interested you can read about it here

Before this we just had to mentally visualize (and believe) that the pointer would move where it was supposed to relative to its previous position during JSON data operations, which didn't always happen.

Nice one, it does help show the fact it’s in an unnamed array. I don’t see an issue having it 0-based as opposed to 1-based like json_foreach as it could help teach others the difference. :nerd_face:

I’m going to capitalise on this to bring attention to the fact it can be hard at times to see what sort of values you can actually extract from what’s being pointed at, meaning going back-and-forth between debugwriteinput files and your own script.

At the risk of yet another request (I’ve already tried pitching or alluding to too many things in this topic and I apologise :face_with_open_eyes_and_hand_over_mouth:), it would be great to have a command that reveals in raw JSON text what’s currently being pointed at: a single key:value, an object, a json_foreach's current item, whatever it is. It could be named along the lines of UseJSON or json_current; there’s more practicality in it being loaded into Line and position than another Say.

I’m thinking two-birds-one-stone as this would also directly solve the topic’s initial problem of grabbing a singular unkeyed value specifically from a loop, without messing around with any other commands or creating “Input” output buffers that can produce utterly disgusting debug files.

(Someone else can pitch something for variable numerical parameters another time, my head hurts. :sneezing_face:)

Would you mind posting an example or mock-up of how this suggestion would work? From the debug output we can already get information about where in the JSON data the pointer is:

Script-Line    : 108
Command        : json_select
Parameter 1    : >air_date<

JSON objects   : ><
JSON loops     : >episodes: 1/2<

Output         : >

Line and position:
2011-04-17

From this selection I can read that the pointer is:

  • on the 'episodes' array, which contains 2 objects;
  • the objects are unnamed
  • the pointer is on the first loop, and on the 'air_date' key
  • the corresponding value of that key is selected

While it does take some practice to interpret the debug file, it looks to me that what you're suggesting is already somewhat possible with the current build.

If you have experience with debugging with another coding language/tool and think that method would be easier/simpler, then would you mind sharing it?

From my personal debugging method, I use JSON Editor Online opened on another tab with my JSON data, and go back and forth between it, my script, and my current debug output to analyse what my script is doing. Over there arrays are displayed 0-based; but when going through a loop with json_foreach the first loop is #1, so in my opinion, the current process is correct.

Of course your debugging method will probably differ from mine, and that is perfectly fine.

Well, debug files are meant to show what goes on 'behind the scenes' during script execution, so IMO they're not supposed to look pretty..... to the extent that they merely reflect the good (or mainly bad) operation of their parent script.

TLDR: if scripting were easy, debug outputs wouldn't be necessary.............. :grin:

Ahhh, but what’s in the current episode object? The pointer knows but it's keeping it on the down-low... :shushing_face:

json_select does do this already for key:values, json_select_object as you rightly say describes it but doesn't actually output the contents until probed by other commands. Sometimes it's far easier to have full sections available as actual text for commands other than json-based ones to access. I'll even risk an assumption this could be made to help HTML parsing.

Examples are at the bottom but for more context, I’ll attach an example query response from my TMDB scripts:

AlbumInput.txt (129.4 KB)

This is everything necessary for "Game of Thrones" season 1. Part of the script’s speed is avoiding iterations where possible and using methods to select and process multiple batches of data at once using RegexpReplace, like getting all the actors at once in just a few commands. A simpler example is checking if an episode is available:

Use "%INPUT%"
RegexpReplace ".*?\"episode_number\":(%NUMBER%),.*" "$1"
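The same availability check can be sketched in Python for testing a pattern against a saved response before putting it in a script. This is not the script line above, only a comparable check: has_episode is a hypothetical helper, the sample is hand-made and cut down to three fields, and it assumes compact JSON with no space after the colon.

```python
import re


def has_episode(raw_json: str, episode_number: int) -> bool:
    """Return True if the raw response text contains that episode number."""
    # The trailing [,}] stops episode 1 from also matching 10, 11 and so on.
    pattern = r'"episode_number":%d[,}]' % episode_number
    return re.search(pattern, raw_json) is not None


# Cut-down sample; a real response carries many more fields per episode.
sample = (
    '{"episodes":['
    '{"air_date":"2011-04-17","episode_number":1,"name":"Winter Is Coming"},'
    '{"air_date":"2011-04-24","episode_number":2,"name":"The Kingsroad"}]}'
)

print(has_episode(sample, 2))
print(has_episode(sample, 3))
```

Searching for the number followed by a comma or closing brace avoids scanning or rewriting the whole response just to answer a yes/no question.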

From testing other movies/TV, the response data can also include newlines (\n), quotation marks (\") and vertical bars (|), which need to be replaced as RegexpReplace doesn't like these. That means having to json "on" "current" the transformed response inside an output buffer, as opposed to calling json "on" whenever a new section of data is required in text form.

Just applying Use to this “Input” buffer quickly ramps up a debug file's size, and there’ll be times when I need tons of data processed to find a fault within a debug file, foregoing the file size parameter and hammering CTRL+F with any clues to the cause. Even with careful planning to stop or reduce an impact, I’ve often accidentally had debug files nearing 1GB... which to clarify is the disgusting part. :nauseated_face:

The ability to reveal currently pointed-at sections as “raw” text through this suggested command would greatly reduce this. It could also save some extra prep-work transforming strings of data that would otherwise be readily usable for commands to use.

I admit I’ve still a way to go to become more professional with debugging strategies though I do use movies/TV that return smaller, controlled responses and try to use separate test scripts for each new module of code. I also use this and that to validate and proof-read (but your site does look better :face_with_tongue:) as it is vital to review the debug files no matter how nasty.

So with a progressing example of a potential command:

After using json "on", reveal as text:
...129.4 KB's worth of text.

After json_select_object "genres":
[{"id":10765,"name":"Sci-Fi & Fantasy"},{"id":18,"name":"Drama"},{"id":10759,"name":"Action & Adventure"}]

After json_foreach "genres", it'll be 1/3:
{"id":10765,"name":"Sci-Fi & Fantasy"}

After json_foreach_end, 2/3:
{"id":18,"name":"Drama"}

And as a final example, after json_foreach "" on ["one","two"] at 1/2:
one
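As a mock-up of the behaviour only, here is a small Python sketch that prints the raw text of whatever is currently selected. current_as_text is a made-up name standing in for the suggested command, nothing like it exists in Mp3tag as far as this topic shows, and the genres data is the sample from above.

```python
import json


def current_as_text(node) -> str:
    """Render the currently selected part of the response as compact text."""
    if isinstance(node, str):
        return node  # bare strings are shown without quotes, as in the mock-up
    return json.dumps(node, separators=(",", ":"), ensure_ascii=False)


response = json.loads(
    '{"genres":[{"id":10765,"name":"Sci-Fi & Fantasy"},'
    '{"id":18,"name":"Drama"},{"id":10759,"name":"Action & Adventure"}]}'
)
genres = response["genres"]

# After selecting "genres": the whole array as text.
print(current_as_text(genres))

# Inside a loop: the current item plus its 1-based position.
for position, genre in enumerate(genres, start=1):
    print(f"{position}/{len(genres)}", current_as_text(genre))

# A loop over an unnamed root array of plain values.
print(current_as_text(json.loads('["one","two"]')[0]))
```

The point is that the text handed back is always a complete, self-contained fragment, so non-JSON commands could work on it directly.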

Edit: Okay, one more example.
IndexInput.txt (36.8 KB)

Assuming from a smaller selection of the set:

{"certifications":{"DE":[{"certification":"12"}],"BR":[{"certification":"14"}]}}

...transformed into an array for json_foreach or json_select_array to be able to use:

{"certifications":[{"DE":[{"certification":"12"}]},{"BR":[{"certification":"14"}]}]}

I'd want to get each region (DE,BR) from each object in the array to then get its own contents but given this list could update as new regions are added/removed, I wouldn't know every possible value to json_select. I would need to read the region key from each object somehow. Currently, it's cross-referencing the loop of objects with a separate array of all regions lifted using regex...

OR after json_foreach_end, 1/2:
{"DE":[{"certification":"12"}]}

with a quick regex to lift the region key, use it to access its own contents, use it to name its own contents for outputs, sorted. Either that or again, ways for other json commands to read unknown keys and/or output such otherwise inaccessible keys.
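For comparison, this is roughly what "reading unknown keys" looks like in a general-purpose language, sketched in Python on the small certifications sample above. The helper name certifications_by_region is hypothetical; the only thing taken from the response is its shape.

```python
import json


def certifications_by_region(raw: str) -> dict:
    """Map every region key, whatever it is called, to its certifications."""
    data = json.loads(raw)
    result = {}
    # .items() yields each key together with its value, so the region
    # names never have to be known in advance.
    for region, entries in data["certifications"].items():
        result[region] = [entry["certification"] for entry in entries]
    return result


raw = '{"certifications":{"DE":[{"certification":"12"}],"BR":[{"certification":"14"}]}}'
regions = certifications_by_region(raw)
print(regions)
```

This works on the original object form, without first transforming it into an array, because the keys themselves can be enumerated.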

Basically, it could make debugging easier, expand more JSON functionality and give commands more accessibility to the response.