Offline Wikipedia reader

From Openmoko


Offline Wikipedia reader is one of the applications that run on Openmoko phones. For a list of all applications, see Applications.

Offline Wikipedia reader

Read the entirety of Wikipedia offline


Homepage: http://users.softlab.ece.ntua.gr/~ttsiod/buildWikipediaOffline.html
Package: no package yet
Tested on: -


The Offline Wikipedia reader is a set of scripts and programmes that display Wikipedia pages without an internet connection.

The English Wikipedia (text only) is around 6GB including indices, so the entire content fits on one 8GB card. The German Wikipedia is approximately a quarter of that size, so it fits on a correspondingly smaller card; the same applies to other languages. Because the files are so large, you may want to remove the microSD card from the Freerunner and use a card reader to transfer them; copying them over USB will take several hours.
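The sizing above can be checked with some quick arithmetic. The sketch below is illustrative only: the function names and the 0.5 MB/s USB throughput are assumptions, not measurements taken on a Freerunner, so substitute a figure measured on your own device.

```python
def fits_on_card(content_gb, card_gb):
    """Return True if content of the given size fits on the card (nominal sizes)."""
    return content_gb <= card_gb


def transfer_hours(size_gb, throughput_mb_per_s):
    """Estimate the copy time in hours, using decimal units (1 GB = 1000 MB)."""
    seconds = size_gb * 1000 / throughput_mb_per_s
    return seconds / 3600


english_gb = 6.0             # from the text: English Wikipedia including indices
german_gb = english_gb / 4   # from the text: roughly a quarter of the English size
assumed_usb_mb_per_s = 0.5   # ASSUMPTION: placeholder throughput, not a measured value

print(fits_on_card(english_gb, 8))   # nominal check against an 8GB card
print(german_gb)                     # approximate German size in GB
print(round(transfer_hours(english_gb, assumed_usb_mb_per_s), 1))
```

At the assumed rate the English content would take a little over three hours to copy, which is in line with the "several hours" mentioned above. Note that the usable capacity of a card is somewhat below its nominal size, so the check is only a rough guide.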

The software provides a custom lightweight web server that runs locally and uses PHP to present the pages, which can then be viewed in any web browser.
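To make the idea of a local web server concrete, here is a minimal stand-in written in Python using the standard library. It is not the project's own server (which is custom and uses PHP); the `ARTICLES` dictionary, the `ArticleHandler` class and the `start_server` helper are hypothetical names, and the article text is a placeholder.

```python
import threading
from http.server import BaseHTTPRequestHandler, HTTPServer
from urllib.parse import unquote

# Hypothetical in-memory store; the real software reads pages from the dump.
ARTICLES = {"Example": "Placeholder article text."}


class ArticleHandler(BaseHTTPRequestHandler):
    def do_GET(self):
        # Treat the request path as the article title, e.g. /Example
        title = unquote(self.path.lstrip("/"))
        text = ARTICLES.get(title)
        status = 200 if text is not None else 404
        body = (text if text is not None else "Article not found").encode("utf-8")
        self.send_response(status)
        self.send_header("Content-Type", "text/plain; charset=utf-8")
        self.send_header("Content-Length", str(len(body)))
        self.end_headers()
        self.wfile.write(body)

    def log_message(self, format, *args):
        pass  # suppress per-request logging


def start_server(port=0):
    """Serve on localhost only, in a background thread; port 0 picks a free port."""
    server = HTTPServer(("127.0.0.1", port), ArticleHandler)
    threading.Thread(target=server.serve_forever, daemon=True).start()
    return server
```

Calling `start_server(8000)` would make the placeholder article reachable from a browser on the same device at http://127.0.0.1:8000/Example. Binding to 127.0.0.1 keeps the server private to the device, which matches the "running locally" description above.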

Instructions can be found here: http://users.softlab.ece.ntua.gr/~ttsiod/buildWikipediaOffline.html

All Wikipedia pages are contained in the Wikipedia page dump [1]. The file needed is called 'pages-articles.xml.bz2'; the most recent one is 4.1GB.
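Because the dump is a single large bz2-compressed XML file, it is best read as a stream rather than unpacked into memory. The following sketch shows one way to list article titles with Python's standard library. The `iter_titles` function is a hypothetical helper and is not part of the software described here.

```python
import bz2
import xml.etree.ElementTree as ET


def iter_titles(dump_path):
    """Yield article titles from a bz2-compressed MediaWiki XML dump, one at a time."""
    with bz2.open(dump_path, "rb") as stream:
        for _event, elem in ET.iterparse(stream, events=("end",)):
            # Dump elements carry an XML namespace; compare the local name only.
            local_name = elem.tag.rsplit("}", 1)[-1]
            if local_name == "title":
                yield elem.text
            elif local_name == "page":
                elem.clear()  # drop the finished page's text to limit memory use


# Example use (assumes the dump is in the current directory):
# for title in iter_titles("pages-articles.xml.bz2"):
#     print(title)
```

Decompression happens on the fly, so the full uncompressed XML never has to exist on disk.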

Before the pages can be displayed, an indexer needs to be run. This should be done on a desktop or laptop, where it takes approximately one hour (measured on a laptop with a dual-core 1.1GHz CPU and 1.5GB RAM); on the Freerunner it would take many hours.
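An index matters because, without one, finding an article would mean scanning the whole dump. A minimal sketch of a title index follows, assuming a sorted list of titles searched with binary search. The `TitleIndex` class and the meaning of `location` (for example a chunk number or byte offset) are hypothetical and do not describe the project's actual indexer.

```python
import bisect


class TitleIndex:
    """Sorted title index; `location` is an opaque, hypothetical pointer into the data."""

    def __init__(self, entries):
        # entries: iterable of (title, location) pairs
        pairs = sorted((title.lower(), location) for title, location in entries)
        self._keys = [key for key, _ in pairs]
        self._locations = [location for _, location in pairs]

    def lookup(self, title):
        """Return the location for an exact (case-insensitive) title, or None."""
        key = title.lower()
        i = bisect.bisect_left(self._keys, key)
        if i < len(self._keys) and self._keys[i] == key:
            return self._locations[i]
        return None

    def prefix_search(self, prefix, limit=10):
        """Return up to `limit` indexed titles (lowercased) starting with `prefix`."""
        key = prefix.lower()
        start = bisect.bisect_left(self._keys, key)
        matches = []
        for candidate in self._keys[start:start + limit]:
            if not candidate.startswith(key):
                break
            matches.append(candidate)
        return matches
```

Building the sorted list is the expensive step, which is why it is better done once on a faster machine; each lookup afterwards needs only a handful of comparisons, which suits a device as constrained as the Freerunner.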

The dependencies (PHP, Perl, Python) are then installed on the Freerunner, and the index and database are copied across, along with the included web server.
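Once the dependencies are installed, a quick check can confirm that they are actually available on the device. The sketch below uses `shutil.which` from the Python standard library; the executable names in `required` are assumptions and may differ depending on how the packages are built for the Freerunner.

```python
import shutil


def missing_tools(tools):
    """Return the tools from `tools` that are not found on the PATH."""
    return [tool for tool in tools if shutil.which(tool) is None]


# ASSUMPTION: executable names; they may differ on the device (e.g. "python3").
required = ["php", "perl", "python"]
missing = missing_tools(required)
if missing:
    print("Missing dependencies: " + ", ".join(missing))
else:
    print("All dependencies found.")
```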


Development status

At present, a single tar.bz is downloaded from the site above, the pages-articles dump is downloaded and copied to the correct location, and the indexing process is run.

In the future the software will be released as an ipk. A deb/rpm for desktop distributions is also planned, to allow automatic download of the most recent Wikipedia dump, with a diff utility so that it can be updated and re-indexed.
