|
The n-grams are available in a number of different formats:
| Level | Data | Size | Samples | Price |
| 1 | Most frequent 2, 3, and 4-grams | 1 million entries each | See | Free |
| 2 | All 2, 3, 4-grams that occur at least 3 times. Available ±case sensitive, ±part of speech (more info) | 6.2 million, 11.9 million, and 8.3 million n-grams, respectively | See | GRAD $55, ACAD $95, COM $195 |
| 3 | All 2, 3, and 4-grams, including those that occur just 1-2 times | More than 155 million rows (for the 3-grams). The format allows users to specify word, PoS, and lemma. | See | GRAD $95, ACAD $195, COM $395 |
License: GRAD = graduate student, ACAD = other academic, COM = commercial
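To make the difference between the levels concrete, here is a minimal Python sketch. It is not the site's actual processing pipeline, and it says nothing about the format of the purchased files. It counts 2-grams in a toy token list, then separates the full set (including n-grams that occur just 1-2 times, as in Level 3) from those that occur at least 3 times (the Level 2 cutoff). The function names `count_ngrams` and `at_least` and the toy sentence are assumptions made for illustration.

```python
from collections import Counter


def count_ngrams(tokens, n):
    """Count every contiguous n-token sequence in a list of tokens."""
    return Counter(tuple(tokens[i:i + n]) for i in range(len(tokens) - n + 1))


def at_least(counts, min_freq):
    """Keep only the n-grams whose frequency is at least min_freq."""
    return {gram: freq for gram, freq in counts.items() if freq >= min_freq}


# Toy data (hypothetical); the real n-grams are drawn from a large corpus.
tokens = (
    "the cat sat on the mat and the cat sat on the rug and the cat ran"
).split()

bigrams = count_ngrams(tokens, 2)   # every 2-gram, like Level 3
frequent = at_least(bigrams, 3)     # only those seen 3+ times, like Level 2

print(len(bigrams), "distinct 2-grams in total")
print(frequent)
```

On this toy input only one 2-gram, `("the", "cat")`, reaches the cutoff of 3, which shows why a set that includes low-frequency n-grams can be far larger than one filtered by a frequency threshold.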
To purchase the files (Levels 2 and 3):
1. Download and fill out the appropriate non-disclosure agreement (NDA) by clicking on one of the links in the blue sections above, and then send it back to us as an email attachment. For both GRAD and ACAD licenses, the NDA must be sent back from a university email account. For GRAD, you must also provide proof of status via a university web page (on the NDA).
2. Once we receive the NDA, we'll send you a request for payment from PayPal.
3. As soon as we receive confirmation of the payment, we'll send you the link to download the data.
Thanks for your interest.
|