There are tools for downloading the entire Wikipedia database (over 8 GB without pictures), but I would like to download only the physics and maths pages so that I can view them offline. Wikipedia pages are organized in a tree structure of categories, so it should be possible.
Please see Wikipedia:
To download a subset of the database in XML format, such as a specific category or a list of articles se[...]
and a link to some other page from which, with a lot of effort, I could surely get what I want. I am just wondering if anyone has already done it and can save me/us the effort with a few "how to" guidelines.
I know Wikipedia may not seem like a very orthodox source to many of you, but it is usually an excellent introduction to areas outside your main line of research or study. I miss it very often while commuting.
Answer
To get a list of all physics and math articles on Wikipedia, you could run CatScan on Category:WikiProject Physics articles and Category:WikiProject Mathematics articles, like this.
Note that a lot of the pages returned by CatScan will be article talk pages, since that's where the WikiProject templates normally go. Depending on the output format you choose, you may need to postprocess the list to remove the Talk: prefix from page titles.
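The postprocessing step could look something like the sketch below. It assumes the CatScan output has already been reduced to one page title per line; the function name `strip_talk_prefix` is made up for illustration and is not part of any tool.

```python
def strip_talk_prefix(titles):
    """Turn talk page titles into article titles.

    Assumes `titles` is an iterable of plain page titles, one per entry.
    Blank entries are skipped and duplicates are dropped, keeping the
    order of first appearance.
    """
    prefix = "Talk:"
    seen = set()
    articles = []
    for raw in titles:
        title = raw.strip()
        if not title:
            continue
        # Only strip the prefix at the very start of the title.
        if title.startswith(prefix):
            title = title[len(prefix):]
        if title not in seen:
            seen.add(title)
            articles.append(title)
    return articles
```

With a list saved to a file, `strip_talk_prefix(open("catscan.txt", encoding="utf-8"))` would give the cleaned titles (the file name is only an example).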
You can then use Special:Export to download the actual articles.
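If you would rather script the export than paste titles into the form, something along these lines may work. This is only a sketch: the form field names (`pages`, `curonly`, `action`) are my understanding of what the Special:Export form submits and should be checked against the MediaWiki documentation, and the helper names, User-Agent string, and output path are invented for the example.

```python
import urllib.parse
import urllib.request

# English Wikipedia's export page; adjust for other language editions.
EXPORT_URL = "https://en.wikipedia.org/wiki/Special:Export"


def build_export_form(titles):
    """Encode a list of page titles as POST data for Special:Export.

    Assumed field names: `pages` (newline-separated titles), `curonly`
    (current revision only) and `action=submit`.
    """
    fields = {
        "pages": "\n".join(titles),
        "curonly": "1",
        "action": "submit",
    }
    return urllib.parse.urlencode(fields).encode("utf-8")


def export_pages(titles, out_path, user_agent="offline-reader-sketch/0.1"):
    """POST the titles to Special:Export and save the returned XML."""
    request = urllib.request.Request(
        EXPORT_URL,
        data=build_export_form(titles),
        headers={"User-Agent": user_agent},
    )
    with urllib.request.urlopen(request) as response, open(out_path, "wb") as out:
        out.write(response.read())
```

Calling `export_pages(["Quantum mechanics", "Group (mathematics)"], "physics_maths.xml")` would then write the XML dump to disk. For a list as long as the full physics and maths categories, it is probably kinder to the servers to send the titles in smaller batches with a pause between requests.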