Author Topic: On Reading a CSV File  (Read 6094 times)

0 Members and 1 Guest are viewing this topic.

Lee Mac

  • Seagull
  • Posts: 12914
  • London, England
On Reading a CSV File
« on: January 14, 2012, 07:58:57 AM »
Previously I have used the following relatively simple code to parse an Excel CSV File into a list of lists - the function simply reads each line of the file and splits the contents of each line by the comma's found in the string.

Code - Auto/Visual Lisp: [Select]
  1. ;; ReadCSV  -  Lee Mac
  2. ;; Parses a CSV file into a list of lists, each sublist is a row of CSV cell values.
  3.  
  4. (defun LM:ReadCSV ( filename / _csv->lst file line lst )
  5.  
  6.     (defun _csv->lst ( str / pos )
  7.         (if (setq pos (vl-string-position 44 str))
  8.             (cons (substr str 1 pos) (_csv->lst (substr str (+ pos 2))))
  9.             (list str)
  10.         )
  11.     )
  12.  
  13.     (if (setq file (open filename "r"))
  14.         (progn
  15.             (while (setq line (read-line file))
  16.                 (setq lst (cons (_csv->lst line) lst))
  17.             )
  18.             (close file)
  19.         )
  20.     )
  21.     (reverse lst)
  22. )

However, consider a CSV file with the following contents:



The above function will incorrectly return a list of:

Code - Auto/Visual Lisp: [Select]
  1. (("abc" "\"a" "b" "c\"" "\"\"\"abc\"\"\"" "\"\"\"a" "b" "c\"\"\""))

So, I now propose the following function to parse the contents of a CSV file:

Code - Auto/Visual Lisp: [Select]
  1. ;; ReadCSV  -  Lee Mac
  2. ;; Parses a CSV file into a list of lists, each sublist is a row of CSV cell values.
  3.  
  4. (defun LM:ReadCSV ( filename / _replacequotes _csv->lst file line lst )
  5.    
  6.     (defun _replacequotes ( str / pos )
  7.         (setq pos 0)
  8.         (while (setq pos (vl-string-search  "\"\"" str pos))
  9.             (setq str (vl-string-subst "\"" "\"\"" str pos)
  10.                   pos (1+ pos)
  11.             )
  12.         )
  13.         str
  14.     )
  15.  
  16.     (defun _csv->lst ( str pos / s )
  17.         (cond
  18.             (   (null (setq pos (vl-string-position 44 str pos)))
  19.                 (if (wcmatch str "\"*\"")
  20.                     (list (_replacequotes (substr str 2 (- (strlen str) 2))))
  21.                     (list str)
  22.                 )
  23.             )
  24.             (   (wcmatch (setq s (substr str 1 pos)) "\"*\"")
  25.                 (cons
  26.                     (_replacequotes (substr str 2 (- pos 2)))
  27.                     (_csv->lst (substr str (+ pos 2)) 0)
  28.                 )
  29.             )
  30.             (   (wcmatch s "\"*[~\"]")
  31.                 (_csv->lst str (+ pos 2))
  32.             )
  33.             (   (cons s (_csv->lst (substr str (+ pos 2)) 0)))
  34.         )
  35.     )
  36.  
  37.     (if (setq file (open filename "r"))
  38.         (progn
  39.             (while (setq line (read-line file))
  40.                 (setq lst (cons (_csv->lst line 0) lst))
  41.             )
  42.             (close file)
  43.         )
  44.     )
  45.     (reverse lst)
  46. )
  47.  

Upon testing with the same example CSV file shown above:

Code - Auto/Visual Lisp: [Select]
  1. (("abc" "a,b,c" "\"abc\"" "\"a,b,c\""))

I welcome your thoughts | ideas | suggestions  :-)


ElpanovEvgeniy

  • Water Moccasin
  • Posts: 1569
  • Moscow (Russia)
Re: On Reading a CSV File
« Reply #1 on: January 14, 2012, 08:05:43 AM »
Hi Lee Mac! :)
You have to work only with small files?

Lee Mac

  • Seagull
  • Posts: 12914
  • London, England
Re: On Reading a CSV File
« Reply #2 on: January 14, 2012, 08:12:07 AM »
Hi Lee Mac! :)
You have to work only with small files?

Sorry Evgeniy, I'm not sure that I understand?

ElpanovEvgeniy

  • Water Moccasin
  • Posts: 1569
  • Moscow (Russia)
Re: On Reading a CSV File
« Reply #3 on: January 14, 2012, 08:20:40 AM »
I do not use read-line, for large files - very slow.

Lee Mac

  • Seagull
  • Posts: 12914
  • London, England
Re: On Reading a CSV File
« Reply #4 on: January 14, 2012, 09:10:19 AM »
I do not use read-line, for large files - very slow.

I see - I haven't noticed any significant performance issues, but maybe I just haven't used it with large enough files... I assume you use the Read method of the FSO? Or an ADODB.Stream object?

ElpanovEvgeniy

  • Water Moccasin
  • Posts: 1569
  • Moscow (Russia)
Re: On Reading a CSV File
« Reply #5 on: January 14, 2012, 09:14:06 AM »
for the texts I use fso.
file 1mb speed difference would be a lot of time!

Lee Mac

  • Seagull
  • Posts: 12914
  • London, England
Re: On Reading a CSV File
« Reply #6 on: January 14, 2012, 07:16:51 PM »
for the texts I use fso.
file 1mb speed difference would be a lot of time!

Thank you for the recommendation; for simplicity and ease of use, I shall remain with the read-line method, however, I shall keep your suggestion in mind should I encounter the need to process a huge text file.

Kerry

  • Mesozoic relic
  • Seagull
  • Posts: 11654
  • class keyThumper<T>:ILazy<T>
Re: On Reading a CSV File
« Reply #7 on: January 14, 2012, 08:41:00 PM »

For what it's worth, I've used FSO very sucessfully for years.

These functions are generic and I use them to read write a variety text files.

Code - Auto/Visual Lisp: [Select]
  1. (defun kdub:textfile-to-list
  2.        (filename / fileobject filefound filesystem openfileas returnvalue)
  3.   (if (setq filefound (findfile filename))
  4.     (progn (setq filesystem (vlax-create-object "Scripting.FileSystemObject")
  5.                  fileobject (vlax-invoke filesystem "GetFile" filefound)
  6.                  openfileas (vlax-invoke fileobject "OpenAsTextStream" 1 0)
  7.            )
  8.            (while (= (vlax-get openfileas "AtEndOfStream") 0)
  9.              (setq
  10.                returnvalue (cons (vlax-invoke openfileas "ReadLine") returnvalue)
  11.              )
  12.            )
  13.            (vlax-invoke openfileas "Close")
  14.            (vlax-release-object openfileas)
  15.            (vlax-release-object fileobject)
  16.            (vlax-release-object filesystem)
  17.            (reverse returnvalue)
  18.     )
  19.     nil
  20.   )
  21. )
Code - Auto/Visual Lisp: [Select]
  1. (defun kdub:list-to-textfile (filename      datalist      /
  2.                               fileobject    filefound     filesystem
  3.                               openfileas    returnvalue
  4.                              )
  5.   (if (setq filefound (findfile filename))
  6.     (progn (setq filesystem (vlax-create-object "Scripting.FileSystemObject")
  7.                  fileobject (vlax-invoke filesystem "GetFile" filefound)
  8.                  openfileas (vlax-invoke fileobject "OpenAsTextStream" 2 0)
  9.            )
  10.            (foreach line datalist (vlax-invoke openfileas "WriteLine" line))
  11.            (vlax-invoke openfileas "Close")
  12.            (vlax-release-object openfileas)
  13.            (vlax-release-object fileobject)
  14.            (vlax-release-object filesystem)
  15.            t
  16.     )
  17.     nil
  18.   )
  19. )
  20.  
kdub, kdub_nz in other timelines.
Perfection is not optional.
Everything will work just as you expect it to, unless your expectations are incorrect.
Discipline: None at all.

ElpanovEvgeniy

  • Water Moccasin
  • Posts: 1569
  • Moscow (Russia)
Re: On Reading a CSV File
« Reply #8 on: January 15, 2012, 01:46:47 AM »

For what it's worth, I've used FSO very sucessfully for years.

These functions are generic and I use them to read write a variety text files.

Code - Auto/Visual Lisp: [Select]
  1. (defun kdub:textfile-to-list
  2.        (filename / fileobject filefound filesystem openfileas returnvalue)
  3.   (if (setq filefound (findfile filename))
  4.     (progn (setq filesystem (vlax-create-object "Scripting.FileSystemObject")
  5.                  fileobject (vlax-invoke filesystem "GetFile" filefound)
  6.                  openfileas (vlax-invoke fileobject "OpenAsTextStream" 1 0)
  7.            )
  8.            (while (= (vlax-get openfileas "AtEndOfStream") 0)
  9.              (setq
  10.                returnvalue (cons (vlax-invoke openfileas "ReadLine") returnvalue)
  11.              )
  12.            )
  13.            (vlax-invoke openfileas "Close")
  14.            (vlax-release-object openfileas)
  15.            (vlax-release-object fileobject)
  16.            (vlax-release-object filesystem)
  17.            (reverse returnvalue)
  18.     )
  19.     nil
  20.   )
  21. )
Code - Auto/Visual Lisp: [Select]
  1. (defun kdub:list-to-textfile (filename      datalist      /
  2.                               fileobject    filefound     filesystem
  3.                               openfileas    returnvalue
  4.                              )
  5.   (if (setq filefound (findfile filename))
  6.     (progn (setq filesystem (vlax-create-object "Scripting.FileSystemObject")
  7.                  fileobject (vlax-invoke filesystem "GetFile" filefound)
  8.                  openfileas (vlax-invoke fileobject "OpenAsTextStream" 2 0)
  9.            )
  10.            (foreach line datalist (vlax-invoke openfileas "WriteLine" line))
  11.            (vlax-invoke openfileas "Close")
  12.            (vlax-release-object openfileas)
  13.            (vlax-release-object fileobject)
  14.            (vlax-release-object filesystem)
  15.            t
  16.     )
  17.     nil
  18.   )
  19. )
  20.  

In your example, again using line-reading...
I use a function like:
ReadStream & WriteStream MP