Author Topic: Parse CSV string including quoted fields with commas?  (Read 6633 times)

0 Members and 1 Guest are viewing this topic.

mkweaver

  • Bull Frog
  • Posts: 352
Parse CSV string including quoted fields with commas?
« on: October 22, 2007, 08:14:22 AM »
I have searched here, and Augi for a string parser that will handle quoted delimiters.  I found routines by Peter Jamtgaard, and John Uhden, but nothing that will handle fields with quoted delimiters thus:

(csvstringtolist "Field 1, Field 2, Field 3, \"Field 4, This is\" still field 4, Field5" ",")

Everything I've found, including my own standby routine, returns 6 fields, rather than 5.

Here's my standby routine:
Code: [Select]
;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;
;;;  Routine: Split ;;;
;;;  Purpose: Parse a delimited string ;;;
;;;  Arguments: First - a delimited string ;;;
;;; Second - the delimiting character ;;;
;;;  Returns: A list of strings, consecutive delimiters produce blank fields ;;;
;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;
(defun split ( ;parse a string into a list
      str ;string to parse
      delim ;delimiter
      / ;end of formal arguments
      rtlist i
      ) ;end of local variable list
  (while (setq i (vl-string-search delim str))
    (setq
      rtlist (cons (substr str 1 i) rtlist)
      str    (substr str (+ 2 i))
      )
    )
  (if (= str "")
    (setq rtlist (cons "" rtlist))
    (setq rtlist (cons str rtlist))
    )
  (reverse rtlist)
  )

CAB

  • Global Moderator
  • Seagull
  • Posts: 10401
Re: Parse CSV string including quoted fields with commas?
« Reply #1 on: October 22, 2007, 08:54:19 AM »
Oops, mine doesn't work eather.
Hummmm looking into it. 8-)
« Last Edit: October 22, 2007, 09:40:02 AM by CAB »
I've reached the age where the happy hour is a nap. (°Ώ°)
Windows 10 core i7 4790k 4Ghz 32GB GTX 970
Please support this web site.

CAB

  • Global Moderator
  • Seagull
  • Posts: 10401
Re: Parse CSV string including quoted fields with commas?
« Reply #2 on: October 22, 2007, 10:27:52 AM »
OK, see if this will work.
Code: [Select]
  ;;  CAB 10.22.2007
  ;;  ignore quoted sections of the string
  (defun sparsers (str delim / ok c new$ plst)
    (setq ok t
          new$ ""
    )
    (while (/= str "")
      (setq c   (substr str 1 1)
            str (substr str 2))
      (cond
        ((= c "\"")
         (setq new$ (strcat new$ c)
               ok (null ok)))
        ((and ok (= c delim))
         (setq plst   (cons new$ plst)
               new$ ""))
        ((setq new$ (strcat new$ c)))
      )
    )
    (reverse (cons new$ plst))
  )

Code: [Select]
(defun c:test ()
  (sparsers "Field 1, Field 2, Field 3, \"Field 4, This is\" still field 4, Field5" ",")
)
« Last Edit: October 22, 2007, 10:35:01 AM by CAB »
I've reached the age where the happy hour is a nap. (°Ώ°)
Windows 10 core i7 4790k 4Ghz 32GB GTX 970
Please support this web site.

MP

  • Seagull
  • Posts: 17750
  • Have thousands of dwgs to process? Contact me.
Re: Parse CSV string including quoted fields with commas?
« Reply #3 on: October 22, 2007, 02:44:35 PM »
Thinking outside defun ...

I made a csv data file named data.csv, located at C:\Databases. It containes this data:

Code: [Select]
Field 1, Field 2, Field 3, "Field 4, This is still field 4", Field5
Field 1, Field 2, Field 3, "Field 4, This is still field 4", Field5
Field 1, Field 2, Field 3, "Field 4, This is still field 4", Field5

Then I (quick and dirtily) translated some simple VB code from MSDN to LISP:

Code: [Select]
(defun c:Test

    (   /
        adOpenStatic
        adLockOptimistic
        adCmdText
        conn
        rset
        path
        csvfile
        fields
        count
        index
    )

    ;;  Quick and dirty translation from MSDN:
    ;;
    ;;  http://msdn2.microsoft.com/en-us/library/ms974559.aspx

    ;;  VB CODE
    ;;  ==========================
    ;;  Const adOpenStatic = 3
    ;;  Const adLockOptimistic = 3
    ;;  Const adCmdText = &H0001

    (setq
        adOpenStatic     3
        adLockOptimistic 3
        adCmdText        1
    )   
   
    ;;  VB CODE
    ;;  ==========================
    ;;  Set objConnection = CreateObject("ADODB.Connection")
    ;;  Set objRecordSet = CreateObject("ADODB.Recordset")
    ;;  strPathtoTextFile = "C:\Databases\"

    (setq
        conn    (vlax-create-object "ADODB.Connection")
        rset    (vlax-create-object "ADODB.Recordset")
        path    "C:\\Databases\\"
        csvfile "data.csv"
    )
   
    ;;  VB CODE
    ;;  ==========================
    ;;  objConnection.Open _
    ;;      "Provider=Microsoft.Jet.OLEDB.4.0;" & _
    ;;      "Data Source=" & strPathtoTextFile & ";" & _
    ;;      "Extended Properties=""text;HDR=YES;FMT=Delimited"""   
   
    (vlax-invoke
        conn
       'Open
        (strcat
            "Provider=Microsoft.Jet.OLEDB.4.0;"
            "Data Source="
            path
            ";"
            ;;  note the following "NO" as I used a data
            ;;  file that did not sport a field header
            "Extended Properties=\"text;HDR=NO;FMT=Delimited\""
        )
    )

    ;;  VB CODE
    ;;  ==========================
    ;;  objRecordset.Open _
    ;;      "SELECT * FROM PhoneList.csv", _
    ;;      objConnection, _
    ;;      adOpenStatic, _
    ;;      adLockOptimistic, _
    ;;      adCmdText

    (vlax-invoke
        rset
       'Open       
        (strcat   
            "SELECT * FROM " ;; I used a variable rather than
            csvfile          ;; hard coding a file name   
        )
        conn
        adOpenStatic           
        adLockOptimistic
        adCmdText
    )
   
    ;;  VB CODE
    ;;  ==========================
    ;;  Do Until objRecordset.EOF
    ;;      Wscript.Echo "Name: " & objRecordset.Fields.Item("Name")
    ;;      Wscript.Echo "Department: " & _
    ;;          objRecordset.Fields.Item("Department")
    ;;      Wscript.Echo "Extension: " & objRecordset.Fields.Item("Extension")   
    ;;      objRecordset.MoveNext
    ;;  Loop   
   
    ;;  Note, rather than translate the code above I opted
    ;;  to print out all the fields of every record AND
    ;;  to bookend each field with braces {} so one could
    ;;  discern the parsing.
   
    (setq count
        (vlax-get
            (setq fields
                (vlax-get-property
                    rset
                   'Fields
                )
            )           
           'Count
        )
    )   
   
    (while (/= :vlax-true (vlax-get-property rset 'EOF))
        (setq index -1)
        (while (< (setq index (1+ index)) count)
            (princ
                (strcat
                    "{"
                    (vl-prin1-to-string
                        (vlax-get
                            (vlax-get-property
                                fields
                               'Item
                                index
                            )
                           'Value
                        )   
                    )                       
                    "},"
                )
            )
        )
        (princ "\n")
        (vlax-invoke rset 'MoveNext)
    )
   
    ;;  clean up isle 4
   
    (vlax-invoke rset 'Close)
    (vlax-invoke conn 'Close)
   
    (vlax-release-object rset)
    (vlax-release-object conn)

    (princ)         

)

Output:

Code: [Select]
{"Field 1"},{"Field 2"},{"Field 3"},{"Field 4, This is still field 4"},{"Field5"},
{"Field 1"},{"Field 2"},{"Field 3"},{"Field 4, This is still field 4"},{"Field5"},
{"Field 1"},{"Field 2"},{"Field 3"},{"Field 4, This is still field 4"},{"Field5"},

Intended to be a quick hint of how to translate code. Does not include error checking or even robust programming. Sorry, don't have the time.

/Later.
Engineering Technologist • CAD Automation Practitioner
Automation ▸ Design ▸ Drafting ▸ Document Control ▸ Client
cadanalyst@gmail.com • http://cadanalyst.slack.com • http://linkedin.com/in/cadanalyst

mkweaver

  • Bull Frog
  • Posts: 352
Re: Parse CSV string including quoted fields with commas?
« Reply #4 on: October 22, 2007, 03:31:50 PM »
Way cool code.  The first lisp routine does everything I thought I needed, but the second opens up a whole new world.  I knew a txt file could be opened as a database, but I wouldn't have known where to start.

I'll bookmark these.

Thanks,
Mike

MP

  • Seagull
  • Posts: 17750
  • Have thousands of dwgs to process? Contact me.
Re: Parse CSV string including quoted fields with commas?
« Reply #5 on: October 22, 2007, 05:53:18 PM »
You're most welcome Mike; have fun.
Engineering Technologist • CAD Automation Practitioner
Automation ▸ Design ▸ Drafting ▸ Document Control ▸ Client
cadanalyst@gmail.com • http://cadanalyst.slack.com • http://linkedin.com/in/cadanalyst

It's Alive!

  • Retired
  • Needs a day job
  • Posts: 8740
  • AKA Daniel
Re: Parse CSV string including quoted fields with commas?
« Reply #6 on: October 24, 2007, 11:32:29 AM »
Excellent MP!

I hope that everyone is aware that this technique is great for querying all kinds of files .xls , .txt, .mdb, .csv etc.
I use the heck out of this little routine I wrote in C# for lisp see
http://www.theswamp.org/index.php?topic=12077.msg186641#msg186641
An example would be

Code: [Select]
;c:\\test.csv   
(ADO:Tools "C:\\;Extended Properties='text;HDR=Yes;FMT=Delimited';" "SELECT * FROM test.csv")

Returns

Code: [Select]
(("Field 1" "Field 2" "Field 3" "Field 4, This is still field 4" "Field5")
 ("Field 1" "Field 2" "Field 3" "Field 4, This is still field 4" "Field5")
 ("Field 1" "Field 2" "Field 3" "Field 4, This is still field 4" "Field5"))

Aslo Jon Fleming’s adolisp works fantastic for this.
http://acad.fleming-group.com/index.html

Dan

MP

  • Seagull
  • Posts: 17750
  • Have thousands of dwgs to process? Contact me.
Re: Parse CSV string including quoted fields with commas?
« Reply #7 on: October 25, 2007, 12:25:24 PM »
Thanks for the nod Daniel. I will have to look at your most excellent (as usual) c# implementation.

:)
Engineering Technologist • CAD Automation Practitioner
Automation ▸ Design ▸ Drafting ▸ Document Control ▸ Client
cadanalyst@gmail.com • http://cadanalyst.slack.com • http://linkedin.com/in/cadanalyst

MP

  • Seagull
  • Posts: 17750
  • Have thousands of dwgs to process? Contact me.
Re: Parse CSV string including quoted fields with commas?
« Reply #8 on: December 01, 2014, 04:40:32 PM »
Updated for Windows 7 (merely changed the provider, see related).

Based on nominal testing works fine under Windows 7 (Home Premium and Pro):

Code: [Select]
(defun c:Test

    (   /
        adOpenStatic
        adLockOptimistic
        adCmdText
        conn
        rset
        path
        csvfile
        fields
        count
        index
    )

    ;;  Quick and dirty translation from MSDN:
    ;;
    ;;  http://msdn2.microsoft.com/en-us/library/ms974559.aspx

    ;;  VB CODE
    ;;  ==========================
    ;;  Const adOpenStatic = 3
    ;;  Const adLockOptimistic = 3
    ;;  Const adCmdText = &H0001

    (setq
        adOpenStatic     3
        adLockOptimistic 3
        adCmdText        1
    )   
   
    ;;  VB CODE
    ;;  ==========================
    ;;  Set objConnection = CreateObject("ADODB.Connection")
    ;;  Set objRecordSet = CreateObject("ADODB.Recordset")
    ;;  strPathtoTextFile = "C:\Databases\"

    (setq
        conn    (vlax-create-object "ADODB.Connection")
        rset    (vlax-create-object "ADODB.Recordset")
        path    "C:\\docs\\"
        csvfile "test.csv"
    )
   
    ;;  VB CODE
    ;;  ==========================
    ;;  objConnection.Open _
    ;;      "Provider=Microsoft.Jet.OLEDB.4.0;" & _
    ;;      "Data Source=" & strPathtoTextFile & ";" & _
    ;;      "Extended Properties=""text;HDR=YES;FMT=Delimited"""   
   
    (vlax-invoke
        conn
       'Open
        (strcat
            ;;  "Provider=Microsoft.Jet.OLEDB.4.0;" ;; ! Windows 7
            "Provider=Microsoft.ACE.OLEDB.12.0;" ;; Windows 7
            "Data Source="
            path
            ";"
            ;;  note the following "NO" as I used a data
            ;;  file that did not sport a field header
            "Extended Properties=\"text;HDR=NO;FMT=Delimited\""
        )
    )

    ;;  VB CODE
    ;;  ==========================
    ;;  objRecordset.Open _
    ;;      "SELECT * FROM PhoneList.csv", _
    ;;      objConnection, _
    ;;      adOpenStatic, _
    ;;      adLockOptimistic, _
    ;;      adCmdText

    (vlax-invoke
        rset
       'Open       
        (strcat   
            "SELECT * FROM " ;; I used a variable rather than
            csvfile          ;; hard coding a file name   
        )
        conn
        adOpenStatic           
        adLockOptimistic
        adCmdText
    )
   
    ;;  VB CODE
    ;;  ==========================
    ;;  Do Until objRecordset.EOF
    ;;      Wscript.Echo "Name: " & objRecordset.Fields.Item("Name")
    ;;      Wscript.Echo "Department: " & _
    ;;          objRecordset.Fields.Item("Department")
    ;;      Wscript.Echo "Extension: " & objRecordset.Fields.Item("Extension")   
    ;;      objRecordset.MoveNext
    ;;  Loop   
   
    ;;  Note, rather than translate the code above I opted
    ;;  to print out all the fields of every record AND
    ;;  to bookend each field with braces {} so one could
    ;;  discern the parsing.
   
    (setq count
        (vlax-get
            (setq fields
                (vlax-get-property
                    rset
                   'Fields
                )
            )           
           'Count
        )
    )   
   
    (while (/= :vlax-true (vlax-get-property rset 'EOF))
        (setq index -1)
        (while (< (setq index (1+ index)) count)
            (princ
                (strcat
                    "{"
                    (vl-prin1-to-string
                        (vlax-get
                            (vlax-get-property
                                fields
                               'Item
                                index
                            )
                           'Value
                        )   
                    )                       
                    "},"
                )
            )
        )
        (princ "\n")
        (vlax-invoke rset 'MoveNext)
    )
   
    ;;  clean up isle 4
   
    (vlax-invoke rset 'Close)
    (vlax-invoke conn 'Close)
   
    (vlax-release-object rset)
    (vlax-release-object conn)

    (princ)         

)
« Last Edit: December 01, 2014, 05:02:20 PM by MP »
Engineering Technologist • CAD Automation Practitioner
Automation ▸ Design ▸ Drafting ▸ Document Control ▸ Client
cadanalyst@gmail.com • http://cadanalyst.slack.com • http://linkedin.com/in/cadanalyst

Marc'Antonio Alessi

  • Swamp Rat
  • Posts: 1453
  • Marco
Re: Parse CSV string including quoted fields with commas?
« Reply #9 on: December 03, 2014, 05:20:35 AM »
I have searched here, and Augi for a string parser that will handle quoted delimiters.  I found routines by Peter Jamtgaard, and John Uhden, but nothing that will handle fields with quoted delimiters thus:
...
http://www.theswamp.org/index.php?topic=44260.msg496182#msg496182
http://lee-mac.com/readcsv.html

Ciao.