Grabbing Data from the Net

Posted: Thu Apr 22, 2010 2:09 pm
by datadon
Looking for comments on the best way to do this.
If I have a database that lists companies with limited location info, such as:

AECOM Technology Corp., Los Angeles, Calif.

If I paste that into Google, it comes up with a fairly standard page that gives a map, address, and phone number.

What I want to do is extract the address info off the web page and put it in the database. I think it is possible. Anybody got a good idea or example of doing something like this that I could start off with?

Actually, it may be harder than I thought: looking at the page source, it does not contain the information; it is hidden. Any ideas?


Posted: Thu Apr 22, 2010 3:15 pm
by Neosoft Support
You can use NeoBook's InternetGet action to request information from Google. For example:

Code:
InternetGet ",+Los+Angeles,+Calif." "[Results]" ""

Google's response will be stored in the variable [Results], which you can parse to retrieve the data you're looking for.
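The query text in the request above has its spaces replaced with "+" so that it can travel inside a URL. As a rough illustration only (this is Python, not NeoBook script), the same encoding could be sketched as below. The function names and the `base_url` parameter are made up for this example, and the search address itself is left for the caller to supply.

```python
from urllib.parse import quote_plus


def build_query(company_line: str) -> str:
    """Encode a 'Name, City, State' line for use in a search URL.

    Spaces become '+'. Commas are left as-is (safe=",") so the result
    looks like the query text in the InternetGet example.
    """
    return quote_plus(company_line, safe=",")


def build_url(base_url: str, company_line: str) -> str:
    """Append the encoded query to a caller-supplied base URL.

    base_url is a hypothetical placeholder; it is not taken from the thread.
    """
    return base_url + build_query(company_line)


if __name__ == "__main__":
    print(build_query("AECOM Technology Corp., Los Angeles, Calif."))
```

Running the same function over each record in the company list would give one request string per company.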

Posted: Fri Apr 23, 2010 3:11 am
by datadon
That works. Parsing can be a bit of a chore, but it is certainly possible.
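To give an idea of what the parsing chore might look like, here is a minimal sketch in Python (not NeoBook script). It assumes the page text has already been fetched into a string. The helper names, the sample markup, and the US-style phone pattern are all assumptions made for illustration; a real results page will need its own markers.

```python
import html
import re

# Matches any HTML tag so it can be replaced with a space.
TAG_RE = re.compile(r"<[^>]+>")
# Assumed US-style phone layout, e.g. (555) 555-0100 or 555-555-0100.
PHONE_RE = re.compile(r"\(?\b\d{3}\)?[\s.-]\d{3}[\s.-]\d{4}\b")


def strip_tags(page: str) -> str:
    """Drop tags, decode entities, and collapse runs of whitespace."""
    text = html.unescape(TAG_RE.sub(" ", page))
    return re.sub(r"\s+", " ", text).strip()


def between(text: str, start: str, end: str):
    """Return the text found between two marker strings, or None."""
    i = text.find(start)
    if i == -1:
        return None
    i += len(start)
    j = text.find(end, i)
    return text[i:j] if j != -1 else None


def find_phone(page: str):
    """Return the first phone-like string in the page, or None."""
    match = PHONE_RE.search(strip_tags(page))
    return match.group(0) if match else None


if __name__ == "__main__":
    # Made-up sample markup, not real search output.
    sample = "<div><b>Example Co.</b><br>123 Example St<br>(555) 555-0100</div>"
    print(find_phone(sample))
    print(between(sample, "<br>", "<br>"))
```

The marker-based `between` helper is the more fragile of the two, since it breaks whenever the page layout changes, while the phone pattern only depends on how the number itself is written.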

The reason for my question is that a lot of the time you can get a really generic list of companies, but with no address or phone information; they want to sell you that part of it. This way the info can be extracted without the cost. A bit devious, but nonetheless legal.

Another method, not as automatic but still fairly fast if the list is not huge, is to simply create a pub with a browser and feed the location into the URL; bingo, it shows up on the Google page. At least it does in my browser. If you have a text entry box open that shows the current record of the location you are looking for, you can very quickly cut and paste the info from the browser into the box. Bingo again, you have the data. Then hit the Next button on the database and do the next record.

Or, using Gaev's nice screen capture tool, you can get the image and paste it into the database picture blob field, or do whatever you want with it.

Again, with NeoBook, lots of options...