html - Nokogiri screen scrape CSS selector issue


I'm trying to get a CSS selector working in a Rake task.

    require "open-uri"

    namespace :task do
      task test: :environment do
        ticketmaster_url = "http://www.ticketmaster.co.uk/derren-brown-miracle-glasgow-04-07-2016/event/370050789149169e?artistid=1408737&majorcatid=10002&minorcatid=53&tpab=-1"
        doc = Nokogiri::HTML(open(ticketmaster_url))
        # psec-p label
        doc.css("#psec-p").each do |price|
          puts price.at_css("#psec-p")
          byebug
        end
      end
    end

However, this is what I'm getting back:

    #<Nokogiri::XML::Element:0x3fd226469e60 name="fieldset" attributes=[#<Nokogiri::XML::Attr:0x3fd2281c953c name="class" value="group-price widget-group">, #<Nokogiri::XML::Attr:0x3fd2281c9528 name="id" value="psec-p">] children=[#<Nokogiri::XML::Text:0x3fd2281c8d44 "\n            ">, #<Nokogiri::XML::Element:0x3fd2281c8c7c name="legend" attributes=[#<Nokogiri::XML::Attr:0x3fd2281c8c18 name="id" value="psec-p-legend">] children=[#<Nokogiri::XML::Text:0x3fd2281c8614 "price:">]>, #<Nokogiri::XML::Text:0x3fd2281c8448 "\n          ">]>

I'm guessing I have selected the wrong element, as I have chosen psec-p.

Could anyone let me know where I'm going wrong?

I've been following Railscast 190.

The prices on http://www.ticketmaster.co.uk are applied to the HTML dynamically, via JavaScript. This is partially done to hinder scraping efforts. You cannot use Nokogiri to scrape this type of content from the domain, because Nokogiri processes raw HTML/XML and does not execute JavaScript in the process. Other tools exist for this, but they require an entirely different approach.

For learning purposes, you should choose a less dynamic site. For instance, http://www.wallacesuk.com has a nice, parseable site. You can learn basic web scraping techniques on a site that presents its information inline in the page, such as this one.

Scraping http://ticketmaster.co.uk will require advanced scraping techniques, beyond what Railscast 190 is demonstrating.

