Hi
I'm relatively new to Scrapy. How can I use Scrapy to parse XML files from a
local file system?
I have a relatively modest alteration of the base scaffold. Here is my items.py:
import scrapy

class ScrapexmlItem(scrapy.Item):
    # define the fields for your item here like:
    # name = scrapy.Field()
    meeting = scrapy.Field()
    number = scrapy.Field()
    name = scrapy.Field()
and example.py:
# -*- coding: utf-8 -*-
import scrapy
from scrapy.spiders import CrawlSpider, Rule
from scrapy.linkextractors import LinkExtractor

class ScrapexmlItem(CrawlSpider):
    name = 'ScrapexmlItem'

    def __init__(self, filename=None):
        if filename:
            with open(filename, 'r') as f:
                self.start_urls = f.readlines()

    def parse(self, response):
        pass
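One thing I suspect is part of my problem: Scrapy fetches start_urls through its downloader, so (if I understand correctly) a bare local path will not work, but a file:// URL should. What I think I need is something like this conversion (helper name is my own invention):

```python
import os

def to_file_url(path):
    # Scrapy's downloader expects a full URL, not a bare filesystem path,
    # so prefix the absolute path with the file:// scheme.
    return 'file://' + os.path.abspath(path)
```

so that the spider could do `self.start_urls = [to_file_url(filename)]` instead of readlines().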
Then from the root directory I try to run the spider with the command below,
which fails with a KeyError:

scrapy crawl MySpider -a filename=2015219RHIL0.xml
I have based my example.py on this SO
post http://stackoverflow.com/a/17307762/461887 but I am not sure I am
really approaching it in the correct way. I am hoping just to open the file and
then use Scrapy's XPath selectors to put the data I want into a pipeline.
Is there a more idiomatic way to approach this in Scrapy?
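To make concrete what I'm after, here is the kind of extraction I want to end up with. The sample XML is made up to mirror my item fields, and I'm using stdlib ElementTree just to illustrate; the real version would use response.xpath():

```python
import xml.etree.ElementTree as ET

# Made-up sample mirroring the item fields (meeting, number, name)
sample = '''<meetings>
  <meeting id="1">
    <number>5</number>
    <name>RHIL</name>
  </meeting>
</meetings>'''

root = ET.fromstring(sample)
for m in root.iter('meeting'):
    item = {
        'meeting': m.get('id'),
        'number': m.findtext('number'),
        'name': m.findtext('name'),
    }
    print(item)
```

Each item dict here stands in for a populated ScrapexmlItem that would go to the pipeline.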
Cheers
Sayth
--
You received this message because you are subscribed to the Google Groups
"scrapy-users" group.
To unsubscribe from this group and stop receiving emails from it, send an email
to [email protected].
To post to this group, send email to [email protected].
Visit this group at https://groups.google.com/group/scrapy-users.
For more options, visit https://groups.google.com/d/optout.