Wikipedia Parallel Quotations Corpora
A tiny German-English parallel corpus extracted from the German Wikipedia, where quotations are sometimes given in both the original language and in translation. It contains 6,802 parallel sentences. The corpus can be useful for testing or tuning statistical machine translation systems.
Download the gzipped tar archive containing two parallel files in Moses format (UTF-8, one sentence per line, with line N of one file aligned to line N of the other): zitate-dewiki-20141024.tgz.
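Because the two files are line-aligned, reading the corpus amounts to pairing lines by position. The following is a minimal Python sketch of that idea, not an official loader. The function names `pair_lines` and `load_archive` are made up for illustration, and the member names inside the archive are not stated here, so they are left as parameters to be filled in after inspecting the archive (for example with `tar -tzf zitate-dewiki-20141024.tgz`).

```python
import io
import tarfile
from typing import Iterable, List, Tuple


def pair_lines(src: Iterable[str], tgt: Iterable[str]) -> List[Tuple[str, str]]:
    """Pair two line-aligned sequences into (source, target) tuples."""
    src_lines = [line.rstrip("\n") for line in src]
    tgt_lines = [line.rstrip("\n") for line in tgt]
    # Moses-style parallel files must have the same number of lines.
    if len(src_lines) != len(tgt_lines):
        raise ValueError(
            f"line count mismatch: {len(src_lines)} vs {len(tgt_lines)}"
        )
    return list(zip(src_lines, tgt_lines))


def load_archive(
    archive_path: str, de_member: str, en_member: str
) -> List[Tuple[str, str]]:
    """Read two UTF-8 members from a gzipped tar archive and pair their lines.

    de_member and en_member are the file names inside the archive; they are
    assumptions to be supplied by the caller, not names taken from this page.
    """
    with tarfile.open(archive_path, "r:gz") as tar:
        de_raw = tar.extractfile(de_member)
        en_raw = tar.extractfile(en_member)
        if de_raw is None or en_raw is None:
            raise ValueError("archive members must be regular files")
        de_text = io.TextIOWrapper(de_raw, encoding="utf-8")
        en_text = io.TextIOWrapper(en_raw, encoding="utf-8")
        return pair_lines(de_text, en_text)


# Usage (member names below are placeholders, not the real ones):
# pairs = load_archive("zitate-dewiki-20141024.tgz", "<german file>", "<english file>")
# print(len(pairs))
```

Raising an error on a line-count mismatch is a deliberate choice in this sketch: a silently truncated `zip` would hide a misaligned or corrupted download.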