Each of the following free n-gram files contains (approximately) the 1,000,000 most frequent n-grams
from the one-billion-word Corpus of
Contemporary American English (COCA).
In order to download these files, you will first need to input your name and email. Thanks.
Case sensitive means that, e.g., Bush
and bush are separate entries. The n-grams with parts of speech allow
you to find (for example) all of the tens of thousands of
NOUN + NOUN sequences, or to run any other search that refers to the part of speech of
the word. For help with the part of speech tags,