Author Topic: Extract data from HTML-tables

Peter2

  • Swamp Rat
  • Posts: 653
Extract data from HTML-tables
« on: July 25, 2017, 04:19:53 AM »
I may have to read data from HTML tables. :embarrassed:

Any advice on solutions / workarounds / code snippets is appreciated  :-)
Peter

AutoCAD Map 3D 2023 German (so some technical terms will be badly retranslated to English)
BricsCAD V23

MickD

  • King Gator
  • Posts: 3636
  • (x-in)->[process]->(y-out) ... simples!
Re: Extract data from HTML-tables
« Reply #1 on: July 25, 2017, 06:17:41 AM »
HTML can be read as XML if it's done properly (i.e. it is well formed: all tags are closed and correctly nested), so if you can use a COM XML library, or there's one for AutoLISP, just use that to 'scrape' the data out. Hopefully the table will have an 'id' or 'class' attribute to make selecting the table nodes a bit easier.
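A minimal sketch of that idea, written in Python rather than AutoLISP purely for illustration (the same approach should carry over to a COM XML library). The sample markup, the table id `parts`, and the helper name `read_table` are all made up for this example; it only works when the page really is well-formed.

```python
import xml.etree.ElementTree as ET

# Hypothetical sample: well-formed markup with an 'id' on the table.
SAMPLE = """<html><body>
<table id="parts">
  <tr><th>Name</th><th>Qty</th></tr>
  <tr><td>Bolt</td><td>12</td></tr>
  <tr><td>Nut</td><td>30</td></tr>
</table>
</body></html>"""


def read_table(markup, table_id):
    """Return the rows of the table with the given id as lists of cell text."""
    # fromstring raises xml.etree.ElementTree.ParseError on malformed markup.
    root = ET.fromstring(markup)
    table = root.find(f".//table[@id='{table_id}']")
    if table is None:
        return []
    rows = []
    for tr in table.iter("tr"):
        # Keep only header and data cells; join nested text (e.g. <b>..</b>).
        cells = ["".join(cell.itertext()).strip()
                 for cell in tr if cell.tag in ("th", "td")]
        rows.append(cells)
    return rows


rows = read_table(SAMPLE, "parts")
```

Here `rows` ends up as one list per table row, header row first. If the table has no id or class, you would have to pick it by position instead, which is more fragile.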
hth
"Programming is really just the mundane aspect of expressing a solution to a problem."
- John Carmack

"Short cuts make long delays," argued Pippin.
- J.R.R. Tolkien

Peter2

  • Swamp Rat
  • Posts: 653
Re: Extract data from HTML-tables
« Reply #2 on: July 25, 2017, 07:51:52 AM »
...a COM xml lib or there's one for AutoLisp just use that ...
Can you (or anybody else) recommend specific software?
Peter

Lee Mac

  • Seagull
  • Posts: 12914
  • London, England
Re: Extract data from HTML-tables
« Reply #3 on: July 25, 2017, 07:59:47 AM »

MickD

  • King Gator
  • Posts: 3636
  • (x-in)->[process]->(y-out) ... simples!
Re: Extract data from HTML-tables
« Reply #4 on: July 25, 2017, 06:46:55 PM »
Once you get something going you might want to use XPath to zero in on your elements. Sing out if you need a hand with the XPath query; I'll just need a sample to work with.
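To give a rough idea of what such a query can look like, here is a small sketch in Python (used only for illustration; it is not the AutoLISP/COM code the thread is about). The sample table, the class name `report`, and the column layout are assumptions. Python's `ElementTree` supports only a subset of XPath, but enough for this.

```python
import xml.etree.ElementTree as ET

# Hypothetical sample with two tables; only the 'report' one is wanted.
SAMPLE = """<html><body>
<table class="report">
  <tr><th>Layer</th><th>Length</th></tr>
  <tr><td>Walls</td><td>12.5</td></tr>
  <tr><td>Doors</td><td>3.2</td></tr>
</table>
<table class="other"><tr><td>x</td><td>y</td></tr></table>
</body></html>"""

root = ET.fromstring(SAMPLE)

# Second <td> of every row in the table whose class is exactly 'report'.
# The header row uses <th>, so it is skipped automatically.
query = ".//table[@class='report']/tr/td[2]"
lengths = [td.text for td in root.findall(query)]
```

Note that the path assumes the `tr` elements sit directly under `table`; if the page wraps them in `tbody`, the query needs that extra step (`/tbody/tr/...`).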

dgorsman

  • Water Moccasin
  • Posts: 2437
Re: Extract data from HTML-tables
« Reply #5 on: July 26, 2017, 10:13:56 AM »
When I generate HTML reports, I usually provide the option to write XML as well (fairly simple, as the former is derived from the latter).  That way if the raw data is required it's more easily accessed.

But if you aren't generating the HTML yourself, pray that it's well formed.  You can use MSXML6 in LISP to read it as XML.
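If the page turns out not to be well formed, an XML parser will simply refuse it. One possible fallback, sketched here in Python for illustration only (not MSXML6 or LISP), is a forgiving event-based HTML parser. The class name `TableRows`, the helper `sloppy_rows`, and the sample markup are invented for this example, and nested tables are not handled.

```python
from html.parser import HTMLParser


class TableRows(HTMLParser):
    """Collect cell text row by row, tolerating unclosed <tr>/<td> tags."""

    def __init__(self):
        super().__init__()
        self.rows = []
        self._in_cell = False

    def handle_starttag(self, tag, attrs):
        if tag == "tr":
            # A new <tr> implicitly ends the previous row and cell.
            self.rows.append([])
            self._in_cell = False
        elif tag in ("td", "th"):
            if not self.rows:  # cell without any <tr> before it
                self.rows.append([])
            self.rows[-1].append("")
            self._in_cell = True

    def handle_endtag(self, tag):
        if tag in ("td", "th", "tr", "table"):
            self._in_cell = False

    def handle_data(self, data):
        if self._in_cell:
            self.rows[-1][-1] += data


def sloppy_rows(markup):
    """Return table rows as lists of stripped cell strings."""
    parser = TableRows()
    parser.feed(markup)
    parser.close()
    return [[cell.strip() for cell in row] for row in parser.rows]


# Hypothetical input: no closing </td> or </tr>, plus a bare <br>.
SLOPPY = "<table><tr><td>Bolt<td>12<tr><td>Nut<td>30<br></table>"
rows = sloppy_rows(SLOPPY)
```

This trades the convenience of XPath for tolerance: you get the cell text out, but selecting one table among several would need extra bookkeeping on the `attrs` passed to `handle_starttag`.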
If you are going to fly by the seat of your pants, expect friction burns.

try {GreatPower;}
   catch (notResponsible)
      {NextTime(PlanAhead);}
   finally
      {MasterBasics;}

Peter2

  • Swamp Rat
  • Posts: 653
Re: Extract data from HTML-tables
« Reply #6 on: July 27, 2017, 03:10:55 AM »
Thanks to all.
At the moment it seems that I can avoid the HTML extraction, but I will keep (or at least try to keep) this information in mind for next time.
Peter
