Content
This data package contains 2 million English collocations with example sentences.
Description
Extensive database of 2 million English collocations for 43 thousand English words.
Collocations have been extracted from a dependency-parsed corpus with more than 2 billion words. Main text sources were:
- 20,000 books from Project Gutenberg
- full text of the English Wikipedia
- British National Corpus
Each collocation includes the following information:
- Collocation (e.g. heavy smoker)
- Significance
- 3 English examples
- Basis word
- Syntactic relation
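As a rough illustration of the fields listed above, a single record could be modelled like this in Python. The class name, field names and types are assumptions made for this sketch and are not the official schema; the significance value and the example sentence are invented placeholders, and treating "smoker" as the basis word of "heavy smoker" is also only a guess.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class Collocation:
    """One collocation record (hypothetical field names, not the official schema)."""
    collocation: str      # the word combination, e.g. "heavy smoker"
    significance: float   # association score; the actual scale is not documented here
    examples: List[str]   # up to 3 English example sentences
    basis_word: str       # the basis word of the collocation
    relation: str         # syntactic relation, e.g. "adjective-noun"


# Placeholder values only; none of this is taken from the real data package.
record = Collocation(
    collocation="heavy smoker",
    significance=0.0,
    examples=["He was a heavy smoker for many years."],
    basis_word="smoker",
    relation="adjective-noun",
)

print(record.collocation, "/", record.relation)
```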
Syntactic Relations
Syntactic relation | example |
subject-verb | police arrest |
verb-object | start conversation |
verb-direct object-indirect object | lend drilling machine neighbour |
verb-prepositional object | wobble onto floor |
verb-direct object-prepositional object | drive nail into wall |
verb-subclause verb | let move |
verb-subclause verb with "to" | force to resign |
verb-adverb | work hard |
adjective-noun | white paper |
adjective-preposition | conversant with |
adverb-adjective | really practical |
noun with genitive attribute | man’s friend |
noun compound | mega prize |
noun with prepositional phrase | cloud of smoke |
Testing
Variant 1:
You can query all 4 million collocations in this online demo tool: http://linguatools.de/kollokationen-en/
Variant 2:
You can test the free API.
Format
If you license the English Collocations data package, you will receive it as a downloadable file in XML, CSV, sqlite3, or another format of your choice.
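To give an idea of how the CSV variant might be used, the following Python sketch loads a few rows and selects the most significant collocations for one syntactic relation. The column names, the comma delimiter and the sample values (including the significance scores) are assumptions for illustration only; the delivered file may be laid out differently.

```python
import csv
import io

# Hypothetical CSV layout with invented significance scores.
SAMPLE_CSV = """collocation,relation,basis_word,significance
heavy smoker,adjective-noun,smoker,12.5
work hard,verb-adverb,work,9.0
cloud of smoke,noun with prepositional phrase,cloud,7.25
"""


def load_collocations(stream):
    """Read CSV rows into dicts and convert the significance column to float."""
    rows = []
    for row in csv.DictReader(stream):
        row["significance"] = float(row["significance"])
        rows.append(row)
    return rows


def top_by_relation(rows, relation, limit=10):
    """Return up to `limit` rows of one relation, highest significance first."""
    matching = [r for r in rows if r["relation"] == relation]
    return sorted(matching, key=lambda r: r["significance"], reverse=True)[:limit]


rows = load_collocations(io.StringIO(SAMPLE_CSV))
best = top_by_relation(rows, "adjective-noun")
for r in best:
    print(r["collocation"], r["significance"])
```

With a real file, `io.StringIO(SAMPLE_CSV)` would be replaced by a file opened with `open(path, newline="", encoding="utf-8")`; the encoding is likewise an assumption.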
Licensing conditions
Only for commercial use. Please contact Peter Kolb (peter.kolb@linguatools.org) for more information.
English collocations as API
We provide an API for the English collocations.
Description and testing of the API: https://linguatools.org/language-apis/linguatools-collocation-api/
Other language APIs by linguatools: https://linguatools.org/language-apis/
An overview of all collocation databases: https://linguatools.org/online-projects/collocation-database/