This page contains very short samples from COCA for the three different levels
of n-grams. In all cases, the samples on this page are limited to about 1,000
lines so that they load quickly. Longer samples, more samples, and more
information can be found in the "more information" links below.
LEVEL 1: 1,000,000 n-grams each for 2-, 3-, 4-, and 5-grams. Samples here are
for 3-grams only (n-grams containing the word "like"). More information and
samples...
LEVEL 2: All 2-, 3-, 4-, and 5-grams occurring at least three times. Samples
here are just for words starting with [U], and are case sensitive, with part
of speech. More information and samples...
LEVEL 3: All 2-, 3-, and 4-grams, even those that occur just once (hundreds of
millions of rows of data). These n-gram sets are meant to be used in a
database, where Sets 1 and 2 are joined together (via SQL commands) to produce
something like Set 3. More information and samples...
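As a rough illustration of the database join described for Level 3, here is a minimal sketch using Python's built-in sqlite3 module. The page does not say what Sets 1 and 2 contain, so the schema here is purely an assumption: one hypothetical table (`ngrams3`) stores 3-grams as integer word IDs with a frequency, and another (`lexicon`) maps those IDs to word forms. The table names, column names, and the few rows of data are invented for the example and are not taken from the actual files.

```python
import sqlite3

# In-memory database so the sketch is self-contained.
conn = sqlite3.connect(":memory:")

# ASSUMPTION: hypothetical "lexicon" table mapping word IDs to word forms.
conn.execute("CREATE TABLE lexicon (word_id INTEGER PRIMARY KEY, word TEXT)")
# ASSUMPTION: hypothetical "ngrams3" table storing each 3-gram as three word IDs.
conn.execute(
    "CREATE TABLE ngrams3 (freq INTEGER, w1 INTEGER, w2 INTEGER, w3 INTEGER)"
)

# Invented toy rows, not real corpus data.
conn.executemany(
    "INSERT INTO lexicon VALUES (?, ?)",
    [(1, "would"), (2, "like"), (3, "to"), (4, "looks"), (5, "a")],
)
conn.executemany(
    "INSERT INTO ngrams3 VALUES (?, ?, ?, ?)",
    [(120, 1, 2, 3), (45, 4, 2, 5)],
)


def expand_trigrams(connection):
    """Join the ID-based 3-grams to the lexicon, once per word position."""
    query = """
        SELECT n.freq, a.word, b.word, c.word
        FROM ngrams3 AS n
        JOIN lexicon AS a ON a.word_id = n.w1
        JOIN lexicon AS b ON b.word_id = n.w2
        JOIN lexicon AS c ON c.word_id = n.w3
        ORDER BY n.freq DESC
    """
    return connection.execute(query).fetchall()


rows = expand_trigrams(conn)
for freq, first, second, third in rows:
    print(freq, first, second, third)
```

The lexicon is joined once per word position, so each row of the result reads as a frequency followed by the words of the n-gram. With hundreds of millions of rows, a real database would likely also need indexes on the ID columns.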
COHA: From the Corpus of Historical American English. These samples are case
sensitive and they also include part of speech. More information and
samples...
Spanish and Portuguese. For the Corpus del Español and the Corpus do
Português. These samples are not case sensitive, but they do include part of
speech. More information and samples...