A tiny German-English parallel corpus extracted from the German Wikipedia, where quotations are sometimes given both in the original language and in translation. It contains 6,802 parallel sentences. The corpus can be useful for testing or tuning statistical machine translation systems.
Download
Download the gzipped tar archive containing two parallel files in Moses format (UTF-8, one sentence per line, aligned by line number): zitate-dewiki-20141024.tgz.
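Because the two files are aligned line by line, sentence pairs can be read by walking both files in step. The following is a minimal Python sketch, not part of the corpus distribution. The function name read_parallel and the file names in the usage comment are hypothetical, since the names of the files inside the archive are not stated here.

```python
from itertools import zip_longest


def read_parallel(src_path, tgt_path, encoding="utf-8"):
    """Yield (source, target) sentence pairs from two line-aligned files.

    Assumes Moses-style plain text: one sentence per line, where line i of
    the source file is the translation of line i of the target file.
    """
    with open(src_path, encoding=encoding) as src, \
         open(tgt_path, encoding=encoding) as tgt:
        # zip_longest pads the shorter file with None, so a length
        # mismatch is detected instead of being silently truncated.
        for line_no, (s, t) in enumerate(zip_longest(src, tgt), start=1):
            if s is None or t is None:
                raise ValueError(
                    f"files differ in length at line {line_no}")
            yield s.rstrip("\n"), t.rstrip("\n")


# Usage (file names are placeholders; substitute the two files
# found in the unpacked archive):
#
#   for de, en in read_parallel("corpus.de", "corpus.en"):
#       print(de, "|||", en)
```

Raising an error on a length mismatch is a design choice for this sketch: a misaligned pair of files would otherwise produce wrong sentence pairs without any warning.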
License
The Wikipedia Parallel Quotations Corpus is derived from Wikipedia and is therefore made available under the same license as Wikipedia: the Creative Commons Attribution-ShareAlike license.