The problem here is that several tracks have identical or very similar lengths. I'm using a one-second tolerance when computing the sorting suggestion, and that is already enough for the result to differ from the correct order. I think this is also why I named the feature "Suggest ..." and not "Sort by ...".
I could remove the one-second tolerance, but this would then fail with other releases where there is a slight mismatch between the reported track length and the actual track length.
Another option would be to use a proper multi-pass sorting computation (exact matches first, then matches within the tolerance), but that seems like too much work for such a convenience feature.
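To illustrate the difference, here is a rough sketch, not the actual implementation. It assumes track lengths in whole seconds and a greedy "first unused file within tolerance" match; the function names `suggest_order` and `suggest_order_multipass` and the sample data are made up for the example.

```python
def suggest_order(reference, files, tolerance=1):
    """Single pass: each reference length takes the first unused file
    whose length is within `tolerance` seconds (hypothetical helper)."""
    unused = list(files)
    result = []
    for ref in reference:
        match = next((f for f in unused if abs(f[1] - ref) <= tolerance), None)
        if match is not None:
            unused.remove(match)
        result.append(match[0] if match else None)
    return result


def suggest_order_multipass(reference, files, tolerances=(0, 1)):
    """Multi-pass: assign exact matches first, then retry the remaining
    slots with a wider tolerance (hypothetical helper)."""
    unused = list(files)
    result = [None] * len(reference)
    for tol in tolerances:
        for i, ref in enumerate(reference):
            if result[i] is not None:
                continue  # slot already filled in a stricter pass
            match = next((f for f in unused if abs(f[1] - ref) <= tol), None)
            if match is not None:
                unused.remove(match)
                result[i] = match[0]
    return result


# Reference lengths in the correct order, and files in their current order.
reference = [201, 200]
files = [("X", 200), ("Y", 201)]

single_pass = suggest_order(reference, files)
multi_pass = suggest_order_multipass(reference, files)
```

With this data the single pass assigns X to the first slot because 200 is within one second of 201, while the multi-pass version picks Y first. Neither approach can tell apart tracks whose lengths are truly identical.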
I forgot to scroll over on the left pane for the last screenshot, but here it is sorted by length. There seems to be something wrong with the sorting function.