Python - Scrapy: convert an HTML string to an HtmlResponse object
I have a raw HTML string that I want to convert to a Scrapy HTML response object, so that I can use the css and xpath selectors as I would on Scrapy's response. How can I do it?
First of all, if this is for debugging or testing purposes, you can use the Scrapy shell:
    $ cat index.html
    <div id="test"> test text </div>
    $ scrapy shell index.html
    >>> response.xpath('//div[@id="test"]/text()').extract()[0].strip()
    u'test text'

There are different objects available in the shell during the session, such as response and request.
Or, you can instantiate the HtmlResponse class and provide the HTML string in body:
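    >>> from scrapy.http import HtmlResponse
    >>> # on Python 3, pass encoding because body is a str; the result then prints without the u prefix
    >>> response = HtmlResponse(url="my HTML string", body='<div id="test">test text</div>', encoding='utf-8')
    >>> response.xpath('//div[@id="test"]/text()').extract()[0].strip()
    u'test text'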