Author Topic: Duplicates in sorted lists  (Read 2271 times)

0 Members and 1 Guest are viewing this topic.

irneb

  • Water Moccasin
  • Posts: 1794
  • ACad R9-2016, Revit Arch 6-2016
Duplicates in sorted lists
« on: July 10, 2012, 05:18:38 AM »
The vl-sort removes duplicates if each item in the list is an atom. Thus you use the vl-sort-i instead. But what happens if your comparison function is only comparing to a portion of each item. If vl-sort(-i) uses the normal QSort algorithm, then such items' positions would be undefined amongst each other. But:
Code - Auto/Visual Lisp: [Select]
  1. _$ (setq lst '(1.0))
  2. (1.0)
  3. _$ (repeat 10 (setq lst (cons (+ (car lst) 0.1) lst)))
  4. (2.0 1.9 1.8 1.7 1.6 1.5 1.4 1.3 1.2 1.1 1.0)
  5. _$ (mapcar '(lambda (n) (nth n lst)) (vl-sort-i lst '(lambda (a b) (< (fix a) (fix b)))))
  6. (1.0 1.1 1.2 1.3 1.4 1.5 1.6 1.7 1.8 1.9 2.0)
  7. _$ (mapcar '(lambda (n) (nth n lst)) (vl-sort-i lst '(lambda (a b) (<= (fix a) (fix b)))))
  8. (1.9 1.8 1.7 1.6 1.5 1.4 1.3 1.2 1.1 1.0 2.0)
Notice the <= keeps the original order for the "equal" items (i.e. where the integer portions are equal), while the < comparison causes reversal. Even if I randomize the list beforehand:
Code - Auto/Visual Lisp: [Select]
  1. _$ (setq lst '(1.0 1.9 1.7 2.0 1.8 1.1 1.4 1.5 1.3 1.6))
  2. (1.0 1.9 1.7 2.0 1.8 1.1 1.4 1.5 1.3 1.6)
  3. _$ (mapcar '(lambda (n) (nth n lst)) (vl-sort-i lst '(lambda (a b) (< (fix a) (fix b)))))
  4. (1.6 1.3 1.5 1.4 1.1 1.8 1.7 1.9 1.0 2.0)
  5. _$ (mapcar '(lambda (n) (nth n lst)) (vl-sort-i lst '(lambda (a b) (<= (fix a) (fix b)))))
  6. (1.0 1.9 1.7 1.8 1.1 1.4 1.5 1.3 1.6 2.0)
So what this tells me is the vl-sort(-i) functions don't use a qsort algorithm. Does anyone know what it does use then?
Common sense - the curse in disguise. Because if you have it, you have to live with those that don't.

ribarm

  • Gator
  • Posts: 3248
  • Marko Ribar, architect
Re: Duplicates in sorted lists
« Reply #1 on: July 10, 2012, 06:32:07 AM »
I don't know what algorithm it's using, but with examples you shown it is doing exactly what it should :

o FIRST CASE : (operand '<) comparison between two same integers (fix a) and (fix b) are done the way '< operand tells - integers are sorted like they are integers, so the whole list is reversed (sign <, or sign >, or sign /=) with exception of two different integers 1 /= 2, in withch case sorting is in order by operand '<...

o SECOND CASE : (= sign in '<= operand) tells that while comparing two same integers (fix a) and (fix b) original order of sorting should be preserved, but if integers are different 1 /= 2, then < sign of '<= operand is doing it's function in sorting...

M.R.
« Last Edit: July 10, 2012, 06:45:06 AM by ribarm »
Marko Ribar, d.i.a. (graduated engineer of architecture)

:)

M.R. on Youtube

irneb

  • Water Moccasin
  • Posts: 1794
  • ACad R9-2016, Revit Arch 6-2016
Re: Duplicates in sorted lists
« Reply #2 on: July 10, 2012, 07:13:56 AM »
That is not "why" it's doing this. It's got nothing to do with whether the integer is smaller than the same number as real. Consider the following:
Code: [Select]
_$ (< (fix 1.0) (fix 1.9))
nil
_$ (< (fix 1.9) (fix 1.0))
nil
_$ (= (fix 1.9) (fix 1.0))
T
Especially in this case you can see there's nothing about a floating point being compared at all. The above becomes a situation of
Code: [Select]
_$ (< 1 1)
nil
_$ (< 1 1)
nil
_$ (= 1 1)
T
The reason the < and <= causes differences is because of the sorting. Pseudo code:
Code: [Select]
QuickSort (List, Comparison)
If length<=1
  Return list
  Else
    Set Pivot to Middle Item of List
    Step through each remaining item in list
      If Comparison of item to pivot returns True
        Append item to LessList
        Else Append item to MoreList
    Return Concatenate (QuickSort (LessList), Pivot, QuickSort(MoreList))
In lisp, the middle item is slow to get to so we use the 1st item as pivot instead:
Code - Auto/Visual Lisp: [Select]
  1. (defun QuickSort (lst compare / less more pivot)
  2.   (if (<= (length lst) 1)
  3.     lst
  4.     (progn (setq pivot (car lst))
  5.       (foreach item (cdr lst)
  6.         (if (apply compare (list item pivot))
  7.           (setq less (cons item less))
  8.           (setq more (cons item more))))
  9.       (append (QuickSort (reverse less) compare) (list pivot) (QuickSort (reverse more) compare)))))
And I think I've just answered my own question. If vl-sort uses a QSort as if working on linked lists, then it seems to work correctly. The QSort only has issues with equal comparisons when working on a fixed array size - the linked list version duplicates the data by splitting it into 2 new lists on each recursion - this is actually known to keep "stability" - which is what I'm on about.

So most probably the vl-sort(-i) function use a non-space optimized QSort algorithm, i.e. not-inplace QSort on an array, but a linked-list version. Either that or it's using something like MergeSort.
Common sense - the curse in disguise. Because if you have it, you have to live with those that don't.

ribarm

  • Gator
  • Posts: 3248
  • Marko Ribar, architect
Re: Duplicates in sorted lists
« Reply #3 on: July 10, 2012, 07:57:01 AM »
Probalby you're right ab qsort... I don't know much ab various algorithms, but my conclusion is that when vl-sort-i sorts equal elements it sorts by reverse order of list if operands are <, >, /=... If operands are =, <=, >= then it sorts by preserved order of elements in original list...

So answer on provious thread ab :
Code - Auto/Visual Lisp: [Select]
  1. (defun f (l n) (nth (nth n (vl-sort-i l '<)) l))
  2.  
is that vl-sort-i will look for the smallest element in reversed list of original so :
Code: [Select]
(f (setq lst '(1 1.0 1 1.0 2 3)) 0) => 1.0
and that is equal as if :
Code - Auto/Visual Lisp: [Select]
  1. (defun ff (l n) (nth (nth n (vl-sort-i l '<=)) l))
  2.  
now vl-sort-i will look for the smallest element in preserved order of elements of list so :
Code: [Select]
(ff (reverse lst) 0) => 1.0

I didn't check but I suppose this is true :
Code: [Select]
(equal (f lst n) (ff (reverse lst) n)) => T

M.R.

[EDIT : Yes, it's true, I've checked it...]
« Last Edit: July 10, 2012, 08:05:18 AM by ribarm »
Marko Ribar, d.i.a. (graduated engineer of architecture)

:)

M.R. on Youtube

irneb

  • Water Moccasin
  • Posts: 1794
  • ACad R9-2016, Revit Arch 6-2016
Re: Duplicates in sorted lists
« Reply #4 on: July 10, 2012, 08:11:31 AM »
Exactly:
Code: [Select]
_$ (setq lst '(1 1.0 1 1.0 2 3))
(1 1.0 1 1.0 2 3)
_$ (mapcar '(lambda (n) (nth n lst)) (vl-sort-i lst '<=))
(1 1.0 1 1.0 2 3)
_$ (mapcar '(lambda (n) (nth n lst)) (vl-sort-i lst '<))
(1.0 1 1.0 1 2 3)
Notice that the 1 and 1.0 are placed in the orders they were to start off with (either reversed with < or not with <=). If you were correct in stating that 1 < 1.0 then both those results should have been the same:
Code: [Select]
(1 1 1.0 1.0 2 3)
For some readup on sorting algorithms: http://en.wikipedia.org/wiki/Sorting_algorithm and http://www.sorting-algorithms.com/


Common sense - the curse in disguise. Because if you have it, you have to live with those that don't.