
Autolisp XML parsing with Columbia's code


jgett002:
Hello all,

I am using Columbia's XML parser code, found here: https://www.theswamp.org/index.php?topic=525.30
It seems to be perfect for what I'm trying to accomplish, but I have two small problems. Below is an example of what my XML format looks like, except that the actual files will be much longer and have more layers.

<?xml version="1.0" encoding="UTF-8"?>
<bookstore>

  <book>
    <title lang="en">Everyday Italian</title>
    <author>Giada De Laurentiis</author>
    <year>2005</year>
    <!-- Note -->
    <price>30.00</price>
  </book>

  <book>
    <title lang="en">Harry Potter</title>
    <author>J K. Rowling</author>
    <!-- Note -->
    <year>2005</year>
    <price>29.99</price>
  </book>

</bookstore>

Problem 1: Since the two child nodes "book" have the same name, I can't find a way to differentiate the two when using Columbia's get-child or get-child-value. The code always gives back the first child node because it searches by name. I was thinking about using get-child-list and using the positions to assign VLA objects to child nodes of the same name, but the number of "book" nodes will change with each XML file, and I don't know if their positions will always be constant. I don't want that to make things messy. I was also thinking about renaming each "book" node to "book1", "book2", ... for as long as a child node named "book" remains. I think the put-value function does that. Or, if there's something very simple I'm missing, I would greatly appreciate someone filling me in.

Problem 2: As far as I know, Columbia's code cannot handle notes (XML comments) in XML. When I use get-child-value on the first node "book" I can extract data from title, author, and year, but not price. When I use get-child-value on the second node "book" (after changing it to a unique name) I can extract data from title and author, but not year and price. It returns the error "unknown name: TAGNAME". When I remove the notes, I can get all values with no problem. Should I somehow remove all notes from the files beforehand? Can I write a couple of lines of code into Columbia's file to skip over the notes? Or again, is there something simple I am missing here?

Please inform me on what the simplest solutions are to my two problems. Also, I am very new to computer programming in general so I would appreciate it if everyone spoke to me as if I were a small child.

Thanks

steve.carson:
Problem #1 - I've been playing around with Columbia's code and DST files (thanks to MP's recent contributions found here: https://www.theswamp.org/index.php?topic=52362.0). Using the xml-get-childlist is the way I'd do it, but instead of using positions, I would iterate over the list and make a new list only containing the xml objects whose nodename is "book". Something like this (untested):

--- Code: ---(defun get-xml-books (XmlO / return)
    ;; Walk the children in reverse so that cons rebuilds them in document order.
    (foreach i (reverse (xml-get-childlist XmlO))
        ;; Keep only nodes that expose a nodeName equal to "book".
        (if (and (vlax-property-available-p i 'nodeName)
                 (= (vlax-get-property i 'nodeName) "book")
            )
            (setq return (cons i return))
        )
    )
    return
)
--- End code ---
I don't know enough about XML to know if all children will always have a nodeName property, so I put the check in there. This should return a list of the xml objects for each book.

Problem #2 - To get the value of specific data I've had more luck using the xml-get-child-byattribute function. In your case something like this may work:

--- Code: ---(defun get-book-property (bookXmlO propName / )

    (xml-get-value
        (xml-get-child-byattribute bookXmlO nil "propname" propName)
    )

)
--- End code ---

Keep in mind that everything I know about XML I've learned from messing around with sheet set stuff, so it may not apply to your situation, but I hope it at least nudges you in the right direction.


Steve

dgorsman:
Have a look through the MSXML DOM implementation.  It sounds like the code being used is manually parsing content, or perhaps stepping through child nodes.  A better way of getting things would be to execute selections using XPATH, which selects elements by name and so skips over comment nodes.  Provided the selection returns a node (via selectSingleNode) or a node list (via selectNodes), the results can be stepped through quickly.

Edit: after a longer peek at the posted code I only see stepping through children without any use of XPATH.  That would clean up a *lot* of searching.
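
As a rough illustration of the XPATH approach, here is a minimal, untested sketch that does not use Columbia's code at all. The function name xpath-texts is made up for this example, and it assumes MSXML 6.0 is installed (ProgID "MSXML2.DOMDocument.6.0"), where XPath is the default selection language. It loads a file, runs one query with selectNodes, and collects the text of each matching node:

--- Code: ---(vl-load-com)

;; Hypothetical helper (not part of Columbia's code).
;; file  - full path to the XML file
;; xpath - XPath query string, e.g. "/bookstore/book/price"
;; Returns a list of the text of every matching node, or nil.
(defun xpath-texts (file xpath / doc nodes node result)
    ;; Assumption: MSXML 6.0 is available on this machine.
    (if (setq doc (vlax-create-object "MSXML2.DOMDocument.6.0"))
        (progn
            ;; Load synchronously so the document is ready when load returns.
            (vlax-put-property doc 'async :vlax-false)
            (if (= (vlax-invoke-method doc 'load file) :vlax-true)
                (progn
                    (setq nodes (vlax-invoke-method doc 'selectNodes xpath))
                    ;; The query names elements, so comment nodes never match.
                    (vlax-for node nodes
                        (setq result (cons (vlax-get-property node 'text) result))
                    )
                )
            )
            (vlax-release-object doc)
        )
    )
    (reverse result)
)
--- End code ---
With the bookstore sample from the first post saved to a file (the path here is only a placeholder), (xpath-texts "C:\\temp\\books.xml" "/bookstore/book/price") should collect one price per book regardless of the notes, and a positional query such as "/bookstore/book[2]/year" targets a single book. XPath positions start at 1.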

VovKa:
jgett002, if you plan to read lots of data from XML, I'd suggest reading the whole XML file into one list and then working with it the LISP way.
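
One possible reading of this suggestion, as an untested sketch: convert the whole document into a nested list once, then query it with ordinary list functions such as assoc. The names node->list and xml-file->list are invented for this example, and it again assumes MSXML 6.0 rather than Columbia's code. Only element nodes (nodeType 1) are kept, so comments and attributes are dropped:

--- Code: ---(vl-load-com)

;; Hypothetical helper: converts one element node into a list.
;; An element with child elements becomes (name child child ...);
;; a leaf element becomes the dotted pair (name . "text").
(defun node->list (node / kids)
    (vlax-for child (vlax-get-property node 'childNodes)
        (if (= (vlax-get-property child 'nodeType) 1) ; 1 = element node
            (setq kids (cons (node->list child) kids))
        )
    )
    (if kids
        (cons (vlax-get-property node 'nodeName) (reverse kids))
        (cons (vlax-get-property node 'nodeName)
              (vlax-get-property node 'text)
        )
    )
)

;; Hypothetical helper: loads a file and converts its root element.
;; Assumption: MSXML 6.0 is available on this machine.
(defun xml-file->list (file / doc root result)
    (if (setq doc (vlax-create-object "MSXML2.DOMDocument.6.0"))
        (progn
            (vlax-put-property doc 'async :vlax-false)
            (if (and (= (vlax-invoke-method doc 'load file) :vlax-true)
                     (setq root (vlax-get-property doc 'documentElement))
                )
                (setq result (node->list root))
            )
            (vlax-release-object doc)
        )
    )
    result
)
--- End code ---
For the bookstore sample, the result would have the shape ("bookstore" ("book" ("title" . "...") ...) ("book" ...)). The cdr of that list is the list of books, so (foreach book (cdr result) ...) visits every book however many there are, and (cdr (assoc "price" (cdr book))) looks up one value inside a book.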

jgett002:
Steve,

Thanks for the input. If I continue with Columbia's code, I think your solution to Problem #1 would work, though I guess I should look into other XML parsing methods based on what the other replies are telling me. However, I don't understand how your solution to Problem #2 will work, since my child nodes like "year" and "price" don't have attributes.

dgorsman,

Do you know of any references on XPath that could help with my specific application? As I said in my original post, I'm really just a novice at programming. When I try to research these things it's so hard to understand what anybody is talking about, since I don't know a lot of the basics. I feel like I need to google ten different programming terms that appear in the explanation of the topic I was originally researching. I went back and forth deciding if I should really spend some time learning about the MSXML DOM, ActiveX, and the vlax- functions, but eventually decided that I didn't want to open up that can of worms since 1) Columbia's code is really close to working and 2) after I assign variables to the XML data, the rest of my program is mostly just drawing (and the user interface, which I am working through fine). I figured I would save all this research for when I have a better understanding of coding in general.

Anyway, if you can help me with either an explanation or a reference I can read, I will definitely look into it. If you think it would be too difficult for a beginner, maybe I will just have very inefficient code for my first program. FYI, please read my reply to VovKa below and let me know if you think I should absolutely use XPath instead of stepping through.

VovKa,

My XML documents will be about 2,500 lines long and I am extracting information from around 100 nodes or so. What do you suggest?


Thanks everybody
