Channel: Python/Feedparser: reading RSS feed fails - Stack Overflow

Python/Feedparser: reading RSS feed fails


I'm using feedparser to fetch RSS feed data. For most RSS feeds that works perfectly fine. However, I have now stumbled upon a website where fetching RSS feeds fails (example feed). The returned result does not contain the expected keys, and the values are fragments of HTML.

I tried simply reading the feed URL with urllib2.Request(url). This fails with an "HTTP Error 405: Not Allowed" error. If I add a custom header like

headers = {'Content-type': 'text/xml', 'User-Agent': 'Mozilla/5.0 (X11; Linux i586; rv:31.0) Gecko/20100101 Firefox/31.0'}
request = urllib2.Request(url, headers=headers)

I don't get the 405 error anymore, but the returned content is an HTML document with some HEAD tags and an essentially empty BODY. In the browser everything looks fine, and the same is true when I look at "View Page Source". feedparser.parse also allows setting agent and request_headers, and I tried various agents. I'm still not able to correctly read the XML, let alone get the parsed feed from feedparser.
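
To make the symptom easier to reproduce, here is a minimal sketch of the kind of check described above. It assumes Python 3, so it uses urllib.request in place of urllib2. The helper names build_request, fetch_body and looks_like_feed are hypothetical, and the list of accepted root elements is an assumption about what counts as a feed.

```python
import urllib.request
import xml.etree.ElementTree as ET

# Headers copied from the question; any other values would be assumptions.
HEADERS = {
    "Content-type": "text/xml",
    "User-Agent": "Mozilla/5.0 (X11; Linux i586; rv:31.0) Gecko/20100101 Firefox/31.0",
}

# Root element names treated as "a feed" here (RSS, Atom, RDF/RSS 1.0).
FEED_ROOTS = {"rss", "feed", "RDF"}


def build_request(url, headers=HEADERS):
    """Create a Request object that actually carries the custom headers."""
    return urllib.request.Request(url, headers=dict(headers))


def fetch_body(url, timeout=10):
    """Fetch the raw response body as bytes (performs a network call)."""
    with urllib.request.urlopen(build_request(url), timeout=timeout) as response:
        return response.read()


def looks_like_feed(body):
    """Return True if body is well-formed XML with an RSS/Atom/RDF root."""
    try:
        root = ET.fromstring(body)
    except ET.ParseError:
        # Typical for real-world HTML, which is rarely well-formed XML.
        return False
    # Drop a namespace prefix such as '{http://www.w3.org/2005/Atom}feed'.
    tag = root.tag.rsplit("}", 1)[-1]
    return tag in FEED_ROOTS
```

Calling looks_like_feed(fetch_body(url)) on the failing URL should return False if the server answers with an HTML page rather than the feed XML, which would match the behaviour described above.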

What am I missing here?

