PodCast

Worked on a little Python module to automate the handling of RSS feeds with enclosure tags (podcasts, broadcatching, or whatever buzzword stands for that practice). The main interest is actually to integrate a podcast feature into ApnaOpus so that the client automatically fetches new torrents announced in the RSS feed of the BitTorrent tracker. Nevertheless, the module has already proved useful in combination with wget, just to fetch the parsed links.

PodCast basically takes an OPML file as input and polls the listed feeds according to the updateInterval attribute. It issues a conditional GET so that a feed is only downloaded when it has been modified, even though it turned out that most generated RSS feeds don't supply the necessary headers. PodCast makes extensive use of libxml2 and its Python bindings.
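
To make the conditional GET idea concrete, here is a minimal sketch using only the standard library of a current Python 3 (not the Python 2.4 the module targets, and not PodCast's actual code). The function names build_conditional_request and fetch_if_changed are made up for illustration. The validators come from the ETag and Last-Modified headers of the previous response; if the server never sent them, there is nothing to send back and the feed is simply downloaded again.

```python
import urllib.error
import urllib.request


def build_conditional_request(url, etag=None, last_modified=None):
    """Build a GET request that lets the server answer 304 if nothing changed."""
    request = urllib.request.Request(url)
    if etag:
        # Validator copied from the ETag header of the previous response.
        request.add_header("If-None-Match", etag)
    if last_modified:
        # Date string copied from the previous Last-Modified header.
        request.add_header("If-Modified-Since", last_modified)
    return request


def fetch_if_changed(url, etag=None, last_modified=None):
    """Return (body, etag, last_modified); body is None on 304 Not Modified."""
    request = build_conditional_request(url, etag, last_modified)
    try:
        with urllib.request.urlopen(request) as response:
            return (response.read(),
                    response.headers.get("ETag"),
                    response.headers.get("Last-Modified"))
    except urllib.error.HTTPError as err:
        if err.code == 304:
            # Feed unchanged since the last poll; keep the old validators.
            return None, etag, last_modified
        raise
```

A poller would store the returned validators per feed and pass them in again on the next round.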

Module usage example:

from PodCast import PodCast

# options must include the opmlfile 
# and can include a list of desired mediatypes 
opts = {'opmlfile':'/path/to/feeds.opml', 
        'filtertypes':['audio/mpeg']}

# instantiating and polling
pc = PodCast.PodCast()
pc.poll(opts)

# get the enclosures
# serializeEnclosures() returns a list of dictionaries
# containing serialized enclosure tags
enclosures = pc.serializeEnclosures(withitems=False)

There is also a poll_mainloop() method which polls periodically in a loop.
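
For readers without the module at hand, the following sketch shows roughly what a list of serialized enclosure tags could look like. It is an assumption about the shape of the data, not PodCast's implementation: it uses xml.etree.ElementTree from a current Python 3 instead of libxml2, and the function name collect_enclosures and the sample feed are invented for the example.

```python
import xml.etree.ElementTree as ET

# Invented sample feed with one audio and one video enclosure.
SAMPLE_RSS = """<?xml version="1.0"?>
<rss version="2.0">
  <channel>
    <title>Example feed</title>
    <item>
      <title>Episode 1</title>
      <enclosure url="http://example.org/ep1.mp3" length="1234" type="audio/mpeg"/>
    </item>
    <item>
      <title>Episode 1 (video)</title>
      <enclosure url="http://example.org/ep1.mov" length="5678" type="video/quicktime"/>
    </item>
  </channel>
</rss>"""


def collect_enclosures(rss_text, filtertypes=None):
    """Return one dict of attributes per enclosure tag, optionally filtered by type."""
    root = ET.fromstring(rss_text)
    enclosures = []
    for enclosure in root.iter("enclosure"):
        attrs = dict(enclosure.attrib)
        if filtertypes and attrs.get("type") not in filtertypes:
            # Skip media types the caller did not ask for.
            continue
        enclosures.append(attrs)
    return enclosures


audio_only = collect_enclosures(SAMPLE_RSS, filtertypes=["audio/mpeg"])
```

Each dictionary carries the url, length and type attributes of one enclosure, which is all a downloader needs.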

Simple retrieval with the podcast utility and wget

# save the enclosure URLs found in the feeds from 
# feeds.opml to enclosures.txt 
# (only those of type audio/mpeg)
:~/podcast/$ ./podcast -i feeds.opml -o enclosures.txt \
 -f audio/mpeg

# use wget to retrieve the files
wget -i enclosures.txt -P downloads/ -nc
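
The wget call reads the URLs from enclosures.txt (-i), stores the files under downloads/ (-P) and skips files that already exist (-nc). The same bookkeeping can be sketched in Python 3 as below. This only roughly approximates what wget does, and the names target_path and pending_downloads are hypothetical.

```python
import os
from urllib.parse import urlsplit


def target_path(url, dest_dir):
    """Map a URL to a file path inside dest_dir, similar to wget -P."""
    name = os.path.basename(urlsplit(url).path) or "index.html"
    return os.path.join(dest_dir, name)


def pending_downloads(urls, dest_dir, exists=os.path.exists):
    """Return (url, path) pairs whose target file is missing, similar to -nc."""
    pending = []
    for url in urls:
        path = target_path(url, dest_dir)
        if not exists(path):
            pending.append((url, path))
    return pending
```

The exists parameter is there so the skip logic can be tried out without touching the file system.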


PodCast is tested with Python 2.4.
You can check out the module and the utility from:
http://svn.var.cc/svn/podcast/ (user: anonymous, no password)


libxml2-python and memory

It seems one is well advised to follow a few memory management tips when using libxml2-python.

1) free unused XMLDocument objects

domdoc.freeDoc()

2) free unused XPathContexts

xpctxt.xpathFreeContext()
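
One way to make sure both calls are never forgotten is a small context manager. The helper below is a sketch for a current Python 3 (Python 2.4 has no with statement), and the name freeing is invented. It is generic and only calls the named method on exit. The libxml2 usage is shown as a comment and assumes a file called feed.xml.

```python
from contextlib import contextmanager


@contextmanager
def freeing(resource, free_method):
    """Yield resource, then call its named free method, even if an error occurs."""
    try:
        yield resource
    finally:
        getattr(resource, free_method)()


# Hypothetical usage with libxml2-python (not executed here):
#
#   import libxml2
#   with freeing(libxml2.parseFile("feed.xml"), "freeDoc") as doc:
#       with freeing(doc.xpathNewContext(), "xpathFreeContext") as ctxt:
#           # Copy what is needed into plain Python strings before freeing.
#           links = [n.content for n in ctxt.xpathEval("//enclosure/@url")]
```

The XPath context is freed first and the document afterwards, because the inner with block exits before the outer one.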

Then there are the internal memory debugging methods that can be used:

import libxml2
#turn on memory debugging 
libxml2.debugMemory(1) 

if __name__ == '__main__':
    
    doWeirdThingsHere()
    
    # debug
    libxml2.cleanupParser()
    leak = libxml2.debugMemory(1)
    print("Memory leak %d bytes" % leak)
    if leak > 0:
        libxml2.dumpMemory()


Another useful method is memoryUsed() to observe libxml2's memory usage while the program is running.

Links:
http://www.dwerg.net/2004/articles/libxml
http://www.dwerg.net/2004/articles/morelibxml
http://xmlsoft.org/python.html


my subveRSSed

SubveRSSed is a Python script that generates an RSS feed from changes in a Subversion repository. Subscribing to such a feed is a handy alternative to receiving commit mails. The subverss script was initially written by Philippe Normand. It basically parses the output of svnlook and creates a sequential feed (the last 20 commits) and a full RSS 2.0 feed of a given repository. Moreover, it generates a neat HTML page.
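
As a rough illustration of the svnlook-to-RSS step (not Philippe's actual code), the sketch below turns the output of svnlook info into one RSS item. It relies on svnlook info printing author, datestamp, log message size and log message in that order. The function names and the sample commit are invented, and the code targets a current Python 3.

```python
import xml.etree.ElementTree as ET
from datetime import datetime
from email.utils import format_datetime

# Invented sample in the layout printed by `svnlook info`.
SAMPLE_INFO = (
    "sally\n"
    "2003-02-22 17:44:49 -0600 (Sat, 22 Feb 2003)\n"
    "16\n"
    "Fix feed parser.\n"
)


def parse_svnlook_info(text):
    """Split `svnlook info` output into author, date and log message."""
    lines = text.split("\n")
    author, date_line = lines[0], lines[1]
    # Drop the human-readable part in parentheses before parsing.
    stamp = date_line.split(" (")[0]
    date = datetime.strptime(stamp, "%Y-%m-%d %H:%M:%S %z")
    # lines[2] is the log message size; the message follows it.
    message = "\n".join(lines[3:]).strip()
    return {"author": author, "date": date, "message": message}


def commit_to_item(revision, info):
    """Build an RSS 2.0 <item> element for one commit."""
    item = ET.Element("item")
    ET.SubElement(item, "title").text = "r%d by %s" % (revision, info["author"])
    ET.SubElement(item, "description").text = info["message"]
    # RSS 2.0 expects RFC 822 style dates in pubDate.
    ET.SubElement(item, "pubDate").text = format_datetime(info["date"])
    return item


item = commit_to_item(40, parse_svnlook_info(SAMPLE_INFO))
```

A full feed would wrap such items in a channel element and, for the sequential variant, keep only the most recent ones.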

I added a new option for appending the diffs to the RSS feed (-d) and another to specify a range of revisions to check. The second may be useful to create only an excerpt of all commits. The modified subverss script can be checked out from http://svn.var.cc/svn/misc/mySubveRSSed (user: anonymous, no password). There's also a diff for patching the original script (it applies to revision 45 of the original).

Update:
Philippe added the above improvements to the main subveRSSed tree. Get it from here; this will be the more up-to-date version anyway ;). Thanks Philippe!
