Topic: {Resolved} How to find the duplicate items in the list

xiaxiang

• Guest
Re: {Resolved}How to find the duplicate items in the list
« Reply #30 on: January 20, 2015, 08:23:50 PM »
IMHO, to properly close this thread we need to distinguish three types of "ListDupesFuzz" functions:
Code:
`=> fuzz 0.05
Remove Uniques: (1.0 1.1 1.19 1.2 1.21 1.22 1.23 1.25 1.3 3.0 3.0 3.0)
              =>(1.2              1.21 1.22 1.23)`

Sorry, but I don't understand the "Remove Uniques" function.
Why does it return (1.2 1.21 1.22 1.23)?
In fact, I still cannot describe the duplicate-item situation exactly.

Marc'Antonio Alessi

• Swamp Rat
• Posts: 1210
• Marco
Re: {Resolved}How to find the duplicate items in the list
« Reply #31 on: January 21, 2015, 03:49:50 AM »
Quote from: xiaxiang
Sorry, but I don't understand the "Remove Uniques" function.
Why does it return (1.2 1.21 1.22 1.23)?
In fact, I still cannot describe the duplicate-item situation exactly.

My apologies, the exact result is:
Code:
`Remove Uniques: (1.0 1.1 1.19 1.2 1.21 1.22 1.23 1.25 1.3 3.0 3.0 3.0)
              =>(             1.2 1.21 1.22 1.23              3.0 3.0)`
You can see what I meant in my previous post: http://www.theswamp.org/index.php?topic=48646.msg537421#msg537421

Code:
`(LM:UniqueFuzz ' (1.0 1.1 1.19 1.2 1.21 1.22 1.23 1.25 1.3 3.0 3.0 3.0) 0.05)
                 (1.0 1.1 1.19  ^   ^    ^     ^  1.25 1.3 3.0  ^   ^ )
Remove Uniques:=>(             1.2 1.21 1.22 1.23              3.0 3.0)`
I am not sure I can explain this well: in the case above, the item 1.19 is unique in the output list of LM:UniqueFuzz, but only because it is the first of the near-equal items that the function processes:
Code:
`(LM:UniqueFuzz '(1.0 1.1 1.19 1.2 1.21 1.22 1.23 1.25 1.3 3.0 3.0 3.0) 0.05)
              =>(1.0 1.1 1.19  ^   ^    ^     ^  1.25 1.3 3.0  ^   ^ )
(LM:UniqueFuzz '(1.0 1.1 1.2 1.21 1.22 1.23 1.25 1.3 3.0 1.19 3.0 3.0) 0.05)
              =>(1.0 1.1 1.2  ^    ^    ^   1.25 1.3 3.0  ^    ^   ^ )`
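
The order dependence can be reproduced with a small first-come filter. This is only a sketch: sketch:unique-fuzz is a hypothetical name, and the code is an assumption about how such a filter behaves, not the LM:UniqueFuzz source.

```lisp
;; Hypothetical helper, written for illustration only. It is an assumption
;; about how a first-come "unique with fuzz" filter behaves and is not
;; claimed to be the actual LM:UniqueFuzz source.
(defun sketch:unique-fuzz (lst fuzz / out)
  (foreach item lst
    ;; Keep ITEM only if nothing already kept lies within FUZZ of it.
    (if (not (vl-some (function (lambda (kept) (equal item kept fuzz))) out))
      (setq out (cons item out))
    )
  )
  (reverse out)
)

;; 1.19 comes before 1.2, so 1.19 is kept and 1.2 is dropped.
(sketch:unique-fuzz '(1.19 1.2 3.0) 0.05) ; => (1.19 3.0)
;; 1.2 comes before 1.19, so now 1.2 is kept and 1.19 is dropped.
(sketch:unique-fuzz '(1.2 3.0 1.19) 0.05) ; => (1.2 3.0)
```

Whichever member of a near-equal pair is met first survives, so the "unique" list changes with the input order even though the items are the same.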

irneb

• Water Moccasin
• Posts: 1794
• ACad R9-2016, Revit Arch 6-2016
Re: {Resolved}How to find the duplicate items in the list
« Reply #32 on: January 21, 2015, 06:07:18 AM »
Here's another idea
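Code - Auto/Visual Lisp:
`(defun group-duplicates (L fuzz / aL grp sem)
  ;; Bin each item by (fix (/ item fuzz)): aL = ((bin item ...) ...)
  (foreach item L
    (if (setq grp (assoc (setq sem (fix (/ item fuzz))) aL))
      (setq aL (subst (cons sem (cons item (cdr grp))) grp aL))
      (setq aL (cons (list sem item) aL))))
  ;; Sort the bins in ascending order, then reuse L as the accumulator
  (setq aL (vl-sort aL (function (lambda (a b) (<= (car a) (car b)))))
        L nil)
  ;; Merge each bin into the previous group when the bin numbers are adjacent
  (while aL
    (setq L (cond
              ((and L (= (1+ (caar L)) (caar aL)))
               (cons (cons (caar aL) (append (cdar aL) (cdar L))) (cdr L)))
              (T (cons (car aL) L)))
          aL (cdr aL)))
  ;; Assumed wrapper (its opening is missing from the post as shown): drop the
  ;; bin keys, restore ascending order, and discard single-item groups
  (reverse
    (mapcar (function (lambda (a) (reverse (cdr a))))
            (vl-remove-if (function (lambda (a) (<= (length a) 2))) L))))`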

Basically it uses the same principle I used in another duplicate finder: http://www.revitforum.org/third-party-add-ins-api-r-d/19248-find-duplicate-items.html

That one found "fuzz" duplicate items in Revit through the Dynamo visual programming language, extended with IronPython to make use of hash-table equality matching. In this example I'm using Lisp's built-in association list instead of a hash table. It's actually pretty sad that AutoLisp doesn't have hash tables (they make for huge speed increases in some situations).
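
The association-list binning step can be looked at on its own. A minimal sketch, assuming the same (fix (/ item fuzz)) key as the defun above; bin-items is a hypothetical name that does not appear in the thread.

```lisp
;; Hypothetical helper for illustration; bin-items is not from the thread.
;; Returns an association list ((bin item ...) ...) keyed by the integer
;; part of item/fuzz. Each assoc lookup walks the list, which is the cost
;; a hash table would avoid.
(defun bin-items (lst fuzz / al grp key)
  (foreach item lst
    (setq key (fix (/ item fuzz)))
    (if (setq grp (assoc key al))
      ;; Bin already exists: push the item onto it.
      (setq al (subst (cons key (cons item (cdr grp))) grp al))
      ;; First item for this bin: start a new entry.
      (setq al (cons (list key item) al))
    )
  )
  al
)

;; A fuzz of 0.25 is exactly representable in binary, which keeps this
;; example away from rounding edge cases.
(bin-items '(0.0 0.26 0.3 1.0) 0.25)
;; => ((4 1.0) (1 0.3 0.26) (0 0.0))
```

New bins are consed onto the front and items are pushed onto their bin, so both the bins and the items within a bin come out in reverse order of arrival.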
Common sense - the curse in disguise. Because if you have it, you have to live with those that don't.

Marc'Antonio Alessi

• Swamp Rat
• Posts: 1210
• Marco
Re: {Resolved}How to find the duplicate items in the list
« Reply #33 on: January 21, 2015, 08:36:21 AM »
Quote from: irneb
Here's another idea
...

Is it right?
Code:
`(setq fuzz 0.05   alist '(1.0 1.1 1.19 1.2 1.21 1.22 1.23 1.25 1.3 3.0 3.0 3.0))
=> (1.0 1.1 1.19 1.2 1.21 1.22 1.23 1.25 1.3 3.0 3.0 3.0)
(group-duplicates alist fuzz)
=> ((1.1 1.19 1.2 1.21 1.22 1.23 1.25 1.3) (3.0 3.0 3.0))
(equal 1.1 1.0  0.05) => nil
(equal 1.1 1.19 0.05) => nil
(equal 1.3 1.25 0.05) => nil
(equal 1.3 3.0  0.05) => nil`

irneb

• Water Moccasin
• Posts: 1794
• ACad R9-2016, Revit Arch 6-2016
Re: {Resolved}How to find the duplicate items in the list
« Reply #34 on: January 21, 2015, 09:01:10 AM »
Quote from: Marc'Antonio Alessi
Is it right?
Code:
`(group-duplicates alist fuzz)
=> ((1.1 1.19 1.2 1.21 1.22 1.23 1.25 1.3) (3.0 3.0 3.0))`

Taking these one at a time, as I understand them:

Code:
`(equal 1.1 1.19 0.05) => nil`
Here I do see that I made a mistake. I will have to look at how to get around it without adding a huge extra iteration just for that edge case.

Code:
`(equal 1.3 1.25 0.05) => nil`
Floating-point inaccuracies. I divide by the fuzz, which is why my defun treats those two as equal. It's just an implementation issue.
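
A short sketch of how bin numbers and equal can disagree. bin-of is a hypothetical helper that only restates the (fix (/ item fuzz)) expression from the defun; the values are chosen so that rounding plays no part.

```lisp
;; Hypothetical helper for illustration; bin-of is not from the thread.
(defun bin-of (item fuzz)
  (fix (/ item fuzz)))

;; 0.5 and 0.99 land in adjacent bins (2 and 3) ...
(list (bin-of 0.5 0.25) (bin-of 0.99 0.25)) ; => (2 3)
;; ... yet they are much further apart than the fuzz.
(equal 0.5 0.99 0.25)                       ; => nil
```

Any rule that merges adjacent bins will therefore group some pairs that equal rejects. Borderline pairs such as 1.25 and 1.3 with a fuzz of 0.05 additionally depend on how the division rounds, so they are best checked at the console rather than predicted by hand.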

Code:
`(equal 1.1 1.0  0.05) => nil
(equal 1.3 3.0  0.05) => nil`
I don't understand. Isn't that exactly what my defun did?

Marc'Antonio Alessi

• Swamp Rat
• Posts: 1210
• Marco
Re: {Resolved}How to find the duplicate items in the list
« Reply #35 on: January 21, 2015, 09:17:07 AM »
Quote from: irneb
Code:
`(equal 1.1 1.0  0.05) => nil
(equal 1.3 3.0  0.05) => nil`
I don't understand. Isn't that exactly what my defun did?

Yes, I just wanted to say that:
Code:
`for 1.1 >>> 1.0  is the nearest lower  (equal 1.1 1.0  0.05) => nil
for 1.1 >>> 1.19 is the nearest higher (equal 1.1 1.19 0.05) => nil
for 1.3 >>> 1.25 is the nearest lower  (equal 1.3 1.25 0.05) => nil
for 1.3 >>> 3.0  is the nearest higher (equal 1.3 3.0  0.05) => nil`
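
Put as a test on a sorted list: an item only counts as a duplicate if its nearest lower or nearest higher neighbour is within the fuzz. A minimal sketch, where near-neighbor-p is a hypothetical helper that is not part of the thread's code:

```lisp
;; Hypothetical helper for illustration; near-neighbor-p is not from the thread.
;; LOWER and UPPER are the items just below and just above ITEM in the
;; sorted list; either may be nil at the ends of the list.
(defun near-neighbor-p (lower item upper fuzz)
  (or (and lower (equal item lower fuzz))
      (and upper (equal item upper fuzz))))

;; 1.1 has no neighbour within 0.05, so it should not be in any group.
(near-neighbor-p 1.0 1.1 1.19 0.05)  ; => nil
;; 1.2 is within 0.05 of both neighbours.
(near-neighbor-p 1.19 1.2 1.21 0.05) ; => T
```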

irneb

• Water Moccasin
• Posts: 1794
• ACad R9-2016, Revit Arch 6-2016
Re: {Resolved}How to find the duplicate items in the list
« Reply #36 on: January 21, 2015, 09:20:18 AM »
So here goes my 2nd attempt at the association list method:
Code - Auto/Visual Lisp:
`(defun group-duplicates1 (L fuzz / aL grp sem mx mn)
  ;; Bin each item by (fix (/ item fuzz)): aL = ((bin item ...) ...)
  (foreach item L
    (if (setq grp (assoc (setq sem (fix (/ item fuzz))) aL))
      (setq aL (subst (cons sem (cons item (cdr grp))) grp aL))
      (setq aL (cons (list sem item) aL))))
  ;; Sort the bins in descending order and seed L with the highest bin
  (setq aL (vl-sort aL (function (lambda (a b) (>= (car a) (car b)))))
        L (list (car aL))
        aL (cdr aL))
  (while aL
    ;; mx: largest item in the current group (car L)
    ;; mn: smallest item in the next bin (car aL)
    (setq mx (cond ((= (length (car L)) 2) (cadar L)) (T (apply 'max (cdar L))))
          mn (cond ((= (length (car aL)) 2) (cadar aL)) (T (apply 'min (cdar aL))))
          ;; Merge the next bin into the current group when mx and mn are within fuzz
          L (cond
              ((equal mx mn fuzz)
               (cons (cons (caar aL) (append (cdar aL) (cdar L))) (cdr L)))
              (T (cons (car aL) L)))
          aL (cdr aL)))
  ;; Drop the bin keys and discard single-item groups
  (mapcar (function (lambda (a) (reverse (cdr a))))
          (vl-remove-if (function (lambda (a) (<= (length a) 2))) L)))`

Seems to work:
Code:
`_$ (group-duplicates1 '(1.0 1.1 1.19 1.2 1.21 1.22 1.23 1.25 1.3 3.0 3.0 3.0) 0.05)
((1.19 1.2) (1.25 1.21 1.22 1.23) (3.0 3.0 3.0))`
It even sorts out the cases where the division flipped the floating-point error compared with what equal did.

irneb

• Water Moccasin
• Posts: 1794
• ACad R9-2016, Revit Arch 6-2016
Re: {Resolved}How to find the duplicate items in the list
« Reply #37 on: January 21, 2015, 09:28:56 AM »
Quote from: irneb
Seems to work:
Code:
`_$ (group-duplicates1 '(1.0 1.1 1.19 1.2 1.21 1.22 1.23 1.25 1.3 3.0 3.0 3.0) 0.05)
((1.19 1.2) (1.25 1.21 1.22 1.23) (3.0 3.0 3.0))`

Hang on ... that's even worse, isn't it!

Back to the drawing board

irneb

• Water Moccasin
• Posts: 1794
• ACad R9-2016, Revit Arch 6-2016
Re: {Resolved}How to find the duplicate items in the list
« Reply #38 on: January 21, 2015, 09:38:59 AM »
OK, 3rd attempt ... this time it does seem to produce the correct result. I had gone wrong because of an optimization I tried: sorting in reverse order to omit one of the reverse calls.

Code - Auto/Visual Lisp:
`(defun group-duplicates2 (L fuzz / aL grp sem mx mn)
  ;; Bin each item by (fix (/ item fuzz)): aL = ((bin item ...) ...)
  (foreach item L
    (if (setq grp (assoc (setq sem (fix (/ item fuzz))) aL))
      (setq aL (subst (cons sem (cons item (cdr grp))) grp aL))
      (setq aL (cons (list sem item) aL))))
  ;; Sort the bins in descending order and seed L with the highest bin
  (setq aL (vl-sort aL (function (lambda (a b) (>= (car a) (car b)))))
        L (list (car aL))
        aL (cdr aL))
  (while aL
    ;; mx: largest item in the next (lower) bin (car aL)
    ;; mn: smallest item in the current group (car L)
    (setq mx (apply 'max (cdar aL))
          mn (apply 'min (cdar L))
          ;; Merge the next bin into the current group when mx and mn are within fuzz
          L (cond
              ((equal mx mn fuzz)
               (cons (cons (caar aL) (append (cdar L) (cdar aL))) (cdr L)))
              (T (cons (car aL) L)))
          aL (cdr aL)))
  ;; Drop the bin keys and discard single-item groups
  (mapcar (function (lambda (a) (reverse (cdr a))))
          (vl-remove-if (function (lambda (a) (<= (length a) 2))) L)))`

This looks much more like it:
Code:
`_$ (group-duplicates2 '(1.0 1.1 1.19 1.2 1.21 1.22 1.23 1.25 1.3 3.0 3.0 3.0) 0.05)
((1.19 1.2 1.21 1.22 1.23 1.25) (3.0 3.0 3.0))`
« Last Edit: January 21, 2015, 09:45:32 AM by irneb »

Marc'Antonio Alessi

• Swamp Rat
• Posts: 1210
• Marco
Re: {Resolved}How to find the duplicate items in the list
« Reply #39 on: January 21, 2015, 10:43:39 AM »
Another simple version (perhaps too simple), not well tested...
Code:
`(defun ALE_List_ShowDupesFuzz (In_Lst FuzFac / For001 For002 OutLst)
  ;; Walk the items in ascending order; For001 is the previous item,
  ;; For002 the one before it.
  (foreach ForElm (mapcar '(lambda (x) (nth x In_Lst)) (vl-sort-i In_Lst '<))
    (and
      ;; Keep the previous item if it is within FuzFac of either neighbour
      (or (equal ForElm For001 FuzFac) (and For001 (equal For001 For002 FuzFac)))
      (setq OutLst (cons For001 OutLst))
    )
    (setq For002 For001 For001 ForElm)
  )
  ;; The last item only has a lower neighbour to compare against
  (if (equal For001 For002 FuzFac) (reverse (cons For001 OutLst)) (reverse OutLst))
)`
Code:
`(setq fuzz 0.05   alist '(1.0 1.1 1.19 1.2 1.21 1.22 1.23 1.25 1.3 3.0 3.0 3.0))
(ALE_List_ShowDupesFuzz alist fuzz)
=>(1.19 1.2 1.21 1.22 1.23 1.25   3.0 3.0 3.0)`

Marc'Antonio Alessi

• Swamp Rat
• Posts: 1210
• Marco
Re: {Resolved}How to find the duplicate items in the list
« Reply #40 on: January 22, 2015, 12:21:13 PM »
The same concept also gives the reverse version:
Code:
`; Function: ALE_List_RemoveAllDupesFuzz - Version 1.01 - 2015/01/21
;
(defun ALE_List_RemoveAllDupesFuzz (In_Lst FuzFac / For001 For002 OutLst )
  ;; Walk the items in ascending order; For001 is the previous item,
  ;; For002 the one before it.
  (foreach ForElm (mapcar '(lambda (x) (nth x In_Lst)) (vl-sort-i In_Lst '<))
    (and
      For001
      ;; Keep the previous item only if neither neighbour is within FuzFac
      (or
        (equal ForElm For001 FuzFac) (equal For001 For002 FuzFac)
        (setq OutLst (cons For001 OutLst))
      )
    )
    (setq For002 For001 For001 ForElm)
  )
  ;; The last item only has a lower neighbour to compare against
  (if (equal For001 For002 FuzFac) (reverse OutLst) (reverse (cons For001 OutLst)))
)`
Code:
`(setq fuzz 0.05   alist '(1.0 1.1 1.19 1.2 1.21 1.22 1.23 1.25 1.3 3.0 3.0 3.0))
                     =>  (1.0 1.1                              1.3)`
« Last Edit: January 22, 2015, 12:27:44 PM by Marc'Antonio Alessi »