Groovy screen scraping example


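#!/usr/bin/env groovy
// Depends on the TagSoup library, which must be on the classpath:
// http://ccil.org/~cowan/XML/tagsoup/
// (On Groovy 4+, also add: import groovy.xml.XmlSlurper)

// TagSoup lets XmlSlurper read real-world HTML that is not well-formed XML
def slurper = new XmlSlurper(new org.ccil.cowan.tagsoup.Parser())

def url = new URL("http://fcd.mcw.edu/?module=faculty&func=view&id=1674")

url.withReader { reader ->
    def html = slurper.parse(reader)

    // We should now have a parsed document; walk down to the publication list items
    def value = html.body.div.div.div[2].ul.li

    value.list().each { f ->
        // take(81) keeps the first 81 characters like [0..80], but does not throw on shorter strings
        println "\nPub : " << f.toString().take(81) << "..."
    }
}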

[gkowalski]$ ./screenScrape.groovy
Pub : Role of cannabinoids and endocannabinoids in cerebral ischemia. (Hillard CJ) Cur…
Pub : Regional alterations in the endocannabinoid system in an animal model of depressi…
Pub : Mediation of Cannabidiol Anti-inflammation in the Retina by Equilibrative Nucleos…
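
The path html.body.div.div.div[2].ul.li depends on the exact nesting of the page, so it stops matching as soon as the layout changes. Here is a minimal sketch of how that path maps onto markup, next to a depth-first lookup as a looser alternative. The sample HTML is a made-up stand-in, not the real faculty page, and it is deliberately well-formed so that plain XmlSlurper can parse it without TagSoup.

```groovy
// Groovy 3+ location of XmlSlurper; on older versions drop this import.
import groovy.xml.XmlSlurper

// Hypothetical, well-formed stand-in for the scraped page; the real markup may differ.
def sample = '''
<html>
  <body>
    <div>
      <div>
        <div>header</div>
        <div>sidebar</div>
        <div>
          <ul>
            <li>First publication title</li>
            <li>Second publication title</li>
          </ul>
        </div>
      </div>
    </div>
  </body>
</html>
'''.trim()

def html = new XmlSlurper().parseText(sample)

// Positional path, as in the script: list items under the third innermost div.
def byPath = html.body.div.div.div[2].ul.li.list()*.text()

// Depth-first alternative: every <li> anywhere in the document.
def byName = html.'**'.findAll { it.name() == 'li' }*.text()

assert byPath == byName
byPath.each { println "Pub : ${it.take(81)}" }
```

The depth-first version survives extra wrapper divs, but it also picks up list items from menus and sidebars, so on a real page it would likely need an additional filter.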
