Friday, July 29, 2011

Configuring hpodder To Handle Compressed Podcast Feeds

My Nokia 2730 classic dumbphone is surprisingly smart. Unbeknownst to me at the time of purchase, it has a 2GB MicroSD card and can play MP3 files. I did eventually stumble upon this feature, and it wasn't long before I started tuning in to podcasts.

I use hpodder (launched from a cron job) to follow podcast feeds and download episodes to my computer, and a semi-scripted procedure to move these files to my phone's memory card over a USB connection.

Lately, hpodder started complaining:
*** 12: Failure parsing feed: Lexical error in  file http://escapepod.org/feed/  at line 1 col 1:
  unrecognised token: ^_\213^H^@
So I tried to download the feed manually:
$ curl http://escapepod.org/feed/
and the terminal filled up with gibberish to the point that I had to blindly type reset in order to fix it.

That was weird: after all, the feed is nothing more than an XML file - a text file, which should be perfectly readable with the naked eye.

I saved the feed with
$ curl -o feed.bin http://escapepod.org/feed/
and then determined its type:
$ file -b -i feed.bin
application/x-gzip; charset=binary
$ zcat feed.bin | file -b -i - 
application/xml; charset=utf-8
I.e. a GZIP compressed XML file.
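
In hindsight, the token hpodder complained about, ^_\213^H^@, already gives it away: it is the terminal's rendering of the bytes 0x1f 0x8b 0x08 0x00, i.e. the two-byte gzip magic number followed by the "deflate" method byte and an empty flags byte. Here is a minimal sketch of the same check without file(1); the helper name is_gzip is my own invention, not part of any tool mentioned here:

```shell
# is_gzip FILE: succeed if FILE starts with the gzip magic bytes 0x1f 0x8b.
# (Hypothetical helper name, for illustration only.)
is_gzip() {
    # Dump the first two bytes as hex and strip all whitespace, e.g. "1f8b".
    magic=$(head -c 2 "$1" | od -An -tx1 | tr -d '[:space:]')
    [ "$magic" = "1f8b" ]
}

# Usage:
#   is_gzip feed.bin && echo "feed.bin is gzip-compressed"
```

This only looks at the container format; it says nothing about what is inside, which is why the zcat step above is still needed.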

So hpodder choked on a compressed feed.

I consulted the manual and found out that hpodder delegates downloads to cURL. I skimmed through the cURL manpage, found out about the --compressed command line option, tried the download again - and got the uncompressed XML contents. Huzzah!

Apparently, the server that serves that particular feed is misconfigured to always compress its replies, even when not specifically asked to. The --compressed command line option tells cURL to ask the server for compressed replies, which cURL then decompresses.

I tried downloading other feeds with --compressed added, and it worked fine. So either this option is supported by all the other servers on my list of feeds, or cURL does nothing when the reply is not compressed. I dunno.
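
One way to tell the two cases apart is to look at the response headers. As far as I understand it, cURL only decompresses a reply that carries a Content-Encoding header, and passes anything else through untouched. The sketch below pulls that header out of a header dump; content_encoding is a made-up helper name, and the curl invocations in the comments need network access:

```shell
# content_encoding: read HTTP response headers on stdin and print the value
# of every Content-Encoding header (lower-cased); prints nothing if absent.
# (Hypothetical helper name, for illustration only.)
content_encoding() {
    # HTTP header lines end in CRLF, so drop the carriage returns first.
    tr -d '\r' |
        awk -F': *' 'tolower($1) == "content-encoding" { print tolower($2) }'
}

# Usage: -D - dumps the headers to stdout, -o /dev/null discards the body.
#   curl -s -D - -o /dev/null http://escapepod.org/feed/ | content_encoding
#   curl -s --compressed -D - -o /dev/null http://escapepod.org/feed/ | content_encoding
```

If the first command prints gzip, the server compresses without being asked; if only the second one does, the server is behaving normally.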

All that I needed now was a way to convince hpodder to launch cURL the same way.

Turns out that hpodder is a rather nice piece of software (and rather well documented too). The hpodder manual pointed me to ~/.hpodder/curlrc:
$ echo compressed >> ~/.hpodder/curlrc
and now hpodder works like a charm again (and probably faster than before, because it always asks for compressed replies).
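
Since the echo above appends unconditionally, running it twice leaves two identical lines in the file. That is probably harmless, but if the step ever ends up in a setup script, a guarded append is tidier. A minimal sketch, with ensure_line being a name I made up:

```shell
# ensure_line FILE LINE: append LINE to FILE unless an identical line exists.
# (Hypothetical helper name, for illustration only.)
ensure_line() {
    # grep: -q quiet, -x match the whole line, -F fixed string (no regex).
    # A missing FILE makes grep fail, so the line is appended and FILE created.
    grep -qxF -- "$2" "$1" 2>/dev/null || printf '%s\n' "$2" >> "$1"
}

# Usage:
#   ensure_line ~/.hpodder/curlrc compressed
```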

Friday, July 22, 2011

Cloning a GitHub GIT Repository on Ubuntu 8.04 LTS

At work, we're still running Ubuntu 8.04 LTS on most PCs with Linux. Most of the time the age of the operating system isn't a problem - but sometimes it can be a pain. Case in point: cloning a GIT repository hosted on GitHub. This used to work just fine, until they switched to HTTPS:
$ git clone https://github.com/user/repo.git
Initialized empty Git repository in /current/directory/repo/.git/
Cannot get remote repository information.
Perhaps git-update-server-info needs to be run there?
When this first happened, I shrugged it off as a problem with the remote end, and just downloaded a source tarball from https://github.com/user/repo/tarball/master. But a few weeks later I got this error again with a different repository, and got annoyed. I tried the same command at home (Debian/testing, GIT version 1.7.5.4):
$ git clone https://github.com/user/repo.git
Cloning into repo...
remote: Counting objects: 81, done.
remote: Compressing objects: 100% (72/72), done.
remote: Total 81 (delta 34), reused 55 (delta 8)
Unpacking objects: 100% (81/81), done.
So, this is a problem with either GIT at work, or the Net connection. I downloaded the GIT source tarball and installed it locally in my account (at ~/local):
$ wget -c http://kernel.org/pub/software/scm/git/git-1.7.6.tar.bz2
$ tar xvjf git-1.7.6.tar.bz2
$ cd git-1.7.6
$ ls
$ ./configure --prefix=$HOME/local
$ make
$ make install
and since we're using tcsh at work (don't ask), I also had to type rehash in order to convince the shell to use the newly installed GIT.

Here's what I got this time:
$ git clone https://github.com/user/repo.git
Cloning into repo...
error: SSL certificate problem, verify that the CA cert is OK. Details:
error:14090086:SSL routines:SSL3_GET_SERVER_CERTIFICATE:certificate verify failed while accessing https://github.com/user/repo.git/info/refs

fatal: HTTP request failed
... which is useful: Google directed me to a question on Stack Overflow. Most of the answers there deal with installing CA certificates, but the following trick, which simply turns off certificate verification, works nicely with git version 1.5.4.3 on Ubuntu 8.04.4 LTS:
$ env GIT_SSL_NO_VERIFY=true git clone https://github.com/user/repo.git
Cloning into repo...
remote: Counting objects: 81, done.
remote: Compressing objects: 100% (72/72), done.
remote: Total 81 (delta 34), reused 55 (delta 8)
Unpacking objects: 100% (81/81), done.
(no need for env in bash).
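
The nice thing about going through env is that the variable exists only for that one command, so verification stays on for everything else in the session. On a machine with a Bourne-style shell, the same idea can be wrapped in a small function. This is just a sketch, and no_verify is a name I picked for it:

```shell
# no_verify CMD [ARGS...]: run CMD with GIT_SSL_NO_VERIFY=true set for that
# single invocation only; the calling shell's environment is left untouched.
# (Hypothetical helper name, for illustration only.)
no_verify() {
    env GIT_SSL_NO_VERIFY=true "$@"
}

# Usage:
#   no_verify git clone https://github.com/user/repo.git
```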

And while we're at it, here's another trick that might be handy in the future:
$ env GIT_CURL_VERBOSE=1 git clone https://github.com/user/repo.git
Cloning into repo...
* Couldn't find host github.com in the .netrc file, using defaults
* About to connect() to github.com port 443 (#0)
*   Trying 207.97.227.239... * Connected to github.com (207.97.227.239) port 443 (#0)
* successfully set certificate verify locations:
*   CAfile: /etc/ssl/certs/ca-certificates.crt
  CApath: none
* SSL certificate problem, verify that the CA cert is OK. Details:
error:14090086:SSL routines:SSL3_GET_SERVER_CERTIFICATE:certificate verify failed
* Expire cleared
* Connection #0 to host github.com left intact
* Couldn't find host github.com in the .netrc file, using defaults
* Connection #0 seems to be dead!
* Closing connection #0
* About to connect() to github.com port 443 (#0)
*   Trying 207.97.227.239... * Connected to github.com (207.97.227.239) port 443 (#0)
* successfully set certificate verify locations:
*   CAfile: /etc/ssl/certs/ca-certificates.crt
  CApath: none
* SSL certificate problem, verify that the CA cert is OK. Details:
error:14090086:SSL routines:SSL3_GET_SERVER_CERTIFICATE:certificate verify failed
* Expire cleared
* Connection #0 to host github.com left intact
error: SSL certificate problem, verify that the CA cert is OK. Details:
error:14090086:SSL routines:SSL3_GET_SERVER_CERTIFICATE:certificate verify failed while accessing https://github.com/user/repo.git/info/refs

fatal: HTTP request failed
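
The interesting bit in all that noise is the CAfile line: it names the certificate bundle that was consulted, which is the file to look at if you'd rather fix the certificates than skip verification. Here is a rough sketch for fishing it out of the log; ca_files is a made-up helper name, and I'm assuming the verbose output goes to stderr, hence the 2>&1 in the usage line:

```shell
# ca_files: read a GIT_CURL_VERBOSE log on stdin and print each distinct
# CAfile path mentioned in it. (Hypothetical helper name, illustration only.)
ca_files() {
    # Keep only lines containing "CAfile:" and strip everything up to the path.
    sed -n 's/^.*CAfile: *//p' | sort -u
}

# Usage:
#   env GIT_CURL_VERBOSE=1 git clone https://github.com/user/repo.git 2>&1 | ca_files
```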