How to export a blog as a document
19 May 2011
Anyone got smart ideas here? I want to convert my whole blog into a formattable document, including comments, with a view to doing a book format (The Very Best of Evolving Thoughts). I want to be able to edit it and put it through InDesign, and I want to do the whole thing in one go. I’ve tried importing WordPress’ XML files, but nothing works. I’ve tried finding an export plugin, but nothing exists (though many have asked for it on the forums). I can export individual posts to InDesign, but that means I can’t convert the resulting markup to Word or Pages. Any ideas, oh wonderful crowd of readers? Anyone want to write a WXR to RTF filter?
Build a screen scraper? http://nokogiri.org/ Here’s a crude sketch in Ruby:

    require 'nokogiri'
    require 'open-uri'

    # Fetch and parse one post. URI.open is used because a bare open()
    # no longer fetches URLs on Ruby 3.0 and later.
    url = "https://evolvingthoughts.net/2011/05/how-to-export-a-blog-as-a-document/"
    doc = Nokogiri::HTML(URI.open(url))

    # The theme marks up each post with these classes:
    #   class entry-title
    #   class entry-meta
    #   class entry-content
    title = doc.at_css("h1.entry-title").text
    puts title

    meta = doc.at_css(".entry-meta").text
    puts meta

    content = doc.at_css(".entry-content").text
    puts content

More sophisticated parsing, extraction and persistent storage would be necessary.
I just needed a historian to do the research for me! Thanks, Chris – I’ll report back on how well it works.
I’m with Chris – I’ve used Anthologize for this purpose & it’s not bad. It doesn’t give you perfectly clean copy, but it certainly gives you enough to work with.
Perhaps somebody like Ed Yong, who published from his blog, would know better? Can you do any LaTeX wizardry on your WordPress files?
I can write a grep file in a number of environments, but I really hoped someone else would do that for me. I used to do that for a living, and it’s really, really boring.
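For anyone who does want to try the filter route, here is a rough sketch in Ruby of what a WXR to RTF filter might look like. It is not a tested converter, only an outline under some assumptions: that the export is a WordPress WXR file (RSS with `content:encoded` and `wp:comment` elements), that REXML is available, and that crude tag stripping is good enough for a first pass (HTML entities inside posts are left alone). The function names `rtf_escape`, `strip_html`, `all_text` and `wxr_to_rtf` are made up for this example.

```ruby
require 'rexml/document'

# Escape text for RTF. Backslashes and braces get a leading backslash,
# newlines become paragraph breaks, and non-ASCII characters are written
# as \uN? escapes, where N is a signed 16-bit UTF-16 code unit.
def rtf_escape(text)
  text.encode('UTF-16BE').unpack('n*').map do |unit|
    case unit
    when 10 then "\\par\n"
    when 92, 123, 125 then "\\#{unit.chr}"
    when 0..127 then unit.chr
    else "\\u#{unit > 32_767 ? unit - 65_536 : unit}?"
    end
  end.join
end

# Crude tag stripper (an assumption: good enough for a first draft).
def strip_html(html)
  html.gsub(/<[^>]+>/, '').strip
end

# Text of a named child element, including CDATA sections; '' if absent.
def all_text(element, name)
  child = element.elements[name]
  child ? child.texts.map(&:value).join : ''
end

# Turn a WXR export (as a string) into one RTF document: each post title
# in bold, then the post body, then its comments with the author in italics.
def wxr_to_rtf(xml)
  doc = REXML::Document.new(xml)
  body = +''
  doc.elements.each('rss/channel/item') do |item|
    body << "{\\b #{rtf_escape(all_text(item, 'title'))}}\\par\n"
    body << rtf_escape(strip_html(all_text(item, 'content:encoded'))) << "\\par\n"
    item.elements.each('wp:comment') do |comment|
      author = rtf_escape(all_text(comment, 'wp:comment_author'))
      text = rtf_escape(strip_html(all_text(comment, 'wp:comment_content')))
      body << "{\\i #{author}:} #{text}\\par\n"
    end
  end
  "{\\rtf1\\ansi\\deff0{\\fonttbl{\\f0 Times New Roman;}}\n#{body}}"
end
```

Something like `File.write('blog.rtf', wxr_to_rtf(File.read('export.xml')))` would then produce a file that Word or Pages should open, where both file names are placeholders. Real exports will need more care with entities, images and embedded markup.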
When I had to export my whole blog, I was able to set the RSS feed to display all entries. That’s a cinch to import. But WordPress may not have such a setting.
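If a full feed like that can be had, importing it could look roughly like the Ruby sketch below. It assumes a plain RSS 2.0 feed in which every item has a `title`, a `pubDate` and a `description`, and it puts the entries in oldest-first order for a book. `Entry`, `feed_entries` and `feed_to_text` are names invented for this example, and comments are not included because an ordinary post feed does not carry them.

```ruby
require 'rexml/document'
require 'time'

Entry = Struct.new(:title, :date, :body)

# Parse an RSS 2.0 feed (as a string) into entries sorted oldest first.
# Assumes every item has title, pubDate and description elements.
def feed_entries(xml)
  doc = REXML::Document.new(xml)
  entries = doc.get_elements('rss/channel/item').map do |item|
    Entry.new(
      item.elements['title'].text.to_s,
      Time.rfc2822(item.elements['pubDate'].text),
      item.elements['description'].texts.map(&:value).join
    )
  end
  entries.sort_by(&:date)
end

# Render the entries as one plain-text document: title, date, then the
# body with HTML tags crudely stripped.
def feed_to_text(xml)
  feed_entries(xml).map do |entry|
    date = entry.date.getutc.strftime('%-d %b %Y')
    body = entry.body.gsub(/<[^>]+>/, '').strip
    "#{entry.title}\n#{date}\n\n#{body}\n"
  end.join("\n")
end
```

The resulting text can be pasted or placed into a word processor for editing. Anything beyond tag stripping, such as keeping emphasis or links, would need a proper HTML parser.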