Excellent analysis, Alan. I'm impressed you took the time to break it down. I massaged it a bit:
MakeSortIndex pseudocode (the function expects to be passed a list of alphanumeric strings):
Convert each string to a list of ASCII codes (hereafter referred to as codes).
Group the codes in each list into sublists of digits and non-digits.
Find the length of the longest digit sublist and put the value in maxlen.
Pad each digit sublist with leading zeros so that its length equals the maxlen value.
*Option A: convert each sublist to a substring and strcat the substrings into one string.
Option B (should be faster: only 1 strcat call per list item, and append is faster than strcat): append each list of sublists to return one code list per original list item, then
convert each code list to a string.
Using the MakeSortIndex result, combine the new list with the original list.
Sort, using the elements of the new list.
Discard the new list elements.
Here's a version that uses Option B:
(defun CreateSortIndex ( lst / isdigit pad tolist main )
  ;;  This function (CreateSortIndex ...) is not optimized,
  ;;  nor generic. It is specific to this discussion and was
  ;;  penned quickly. It has errors, warts and pimples.
  ;;
  ;;  Proceed accordingly.

  ;;  T if code is the ASCII code of a digit, "0" (48) to "9" (57).
  (defun isdigit ( code )
    (< 47 code 58)
  )

  ;;  Prepend code to the codes list until it holds len items.
  (defun pad ( codes code len )
    (while (< (length codes) len)
      (setq codes
        (cons code codes)
      )
    )
    codes
  )

  ;;  Split string into sublists of codes, alternating runs of
  ;;  digits and non-digits, kept in their original order.
  (defun tolist ( string / codes result )
    (foreach code (reverse (vl-string->list string))
      (cond
        ( (null codes)
          (setq codes (list code))
        )
        ( (isdigit code)
          (if (isdigit (car codes))
            (setq codes (cons code codes))
            (setq
              result (cons codes result)
              codes  (list code)
            )
          )
        )
        ( t
          (if (isdigit (car codes))
            (setq
              result (cons codes result)
              codes  (list code)
            )
            (setq codes (cons code codes))
          )
        )
      )
    )
    (if codes
      (cons codes result)
      result
    )
  )

  (defun main ( lst / maxlen result )
    (setq maxlen 0)
    ;;  Group every string and record the longest digit run.
    (foreach lst (setq result (mapcar 'tolist lst))
      (foreach codes lst
        (if (isdigit (car codes))
          (setq maxlen
            (max maxlen
              (length codes)
            )
          )
        )
      )
    )
    ;;  Zero-pad each digit run to maxlen, then flatten each
    ;;  grouped list back into a single string.
    (mapcar
      '(lambda ( lst )
         (vl-list->string
           (apply 'append
             (mapcar
               '(lambda ( codes )
                  (if (isdigit (car codes))
                    (pad codes 48 maxlen)
                    codes
                  )
                )
               lst
             )
           )
         )
       )
      result
    )
  )

  (main lst)
)
Only 2% faster.
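For the last three pseudocode steps (combine, sort, discard), here is a minimal sketch. SortByIndex is a hypothetical name, not something from this thread. Rather than pairing each key with its original string, it leans on vl-sort-i, which returns the sorted order as a list of indices, so the keys never have to be attached to the originals or stripped off afterwards. It assumes keys holds one sort key per item of lst, in the same order.

```lisp
;; Hypothetical helper, not from the thread.
;; lst  : the original strings
;; keys : one sort key per string, in the same order,
;;        e.g. the result of (CreateSortIndex lst)
(defun SortByIndex ( lst keys )
  (mapcar
    '(lambda ( i ) (nth i lst))   ; pick the originals in sorted order
    (vl-sort-i keys '<)           ; indices of keys in ascending order
  )
)
```

Usage would be along the lines of (SortByIndex lst (CreateSortIndex lst)). One reason to prefer vl-sort-i over vl-sort here: vl-sort is documented as possibly dropping duplicate elements, while vl-sort-i keeps one index per item.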
PS:
One of the inefficiencies is marked with the asterisk (*) above. "But," you say, "that's how it works." True, but with more analysis it could also take position into account. However, my guess was that the overhead of doing the position analysis would outweigh the benefits of position-context padding.
But if one could write a fast and furious position analyser, well ...
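To make "position-context padding" concrete, here is a rough, untimed sketch. It assumes the strings have already been grouped into code sublists (the shape tolist produces above), and every name in it (IsDigitCode, MergeWidths, GroupWidth, PadLeft, PositionPad) is made up for illustration. Instead of one global maxlen, it keeps one width per group position and pads each digit group only to the widest digit group found at that same position.

```lisp
;; All names below are hypothetical, for illustration only.

;; T if code is the ASCII code of a digit.
(defun IsDigitCode ( code )
  (< 47 code 58)
)

;; Element-wise max of two width lists; the longer tail is kept.
(defun MergeWidths ( a b )
  (cond
    ( (null a) b )
    ( (null b) a )
    ( t (cons (max (car a) (car b)) (MergeWidths (cdr a) (cdr b))) )
  )
)

;; Width a group asks for: its length if it is a digit group, else 0.
(defun GroupWidth ( codes )
  (if (IsDigitCode (car codes)) (length codes) 0)
)

;; Prepend code until codes holds at least len items.
(defun PadLeft ( codes code len )
  (while (< (length codes) len)
    (setq codes (cons code codes))
  )
  codes
)

;; grouped: one list of code sublists per original string.
;; Returns one padded string per item.
(defun PositionPad ( grouped / widths )
  ;; Pass 1: widest digit group at each position.
  (foreach row grouped
    (setq widths (MergeWidths widths (mapcar 'GroupWidth row)))
  )
  ;; Pass 2: pad each digit group to the width of its own position.
  (mapcar
    '(lambda ( row )
       (vl-list->string
         (apply 'append
           (mapcar
             '(lambda ( codes width )
                (if (IsDigitCode (car codes))
                  (PadLeft codes 48 width)
                  codes
                )
              )
             row
             widths
           )
         )
       )
     )
    grouped
  )
)
```

For "a1b200" and "a10b3", grouped as (((97) (49) (98) (50 48 48)) ((97) (49 48) (98) (51))), the per-position widths come out as (0 2 0 3), so the keys would be "a01b200" and "a10b003" rather than the "a001b200" and "a010b003" that global padding gives. Whether the extra pass pays for itself is exactly the open question; I have not timed it.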
Thanks for letting me play in your thread, Alan. It was fun.