Downloads
Translations of Simple English Wikipedia Articles into Typed Lambda Calculus
The text files below are annotated with cued-association sentence processing (CASP) markup, including associations for anaphoric inheritance (-n and -m tags) and quantifier scope (-s, -t, and -u tags).
Files:
(v0.3) syntactic annotations for 6-sentence beginnings of Simple English Wikipedia articles corresponding to the 128 most common words used in a 2014 dump of Simple English Wikipedia that are also titles of articles.
File: wikisemC1.cg_.txt (1.01 MB), Wikisem tranche C1 categorial grammar

(v0.3) semantic annotations for 6-sentence beginnings of Simple English Wikipedia articles corresponding to the 128 most common words used in a 2014 dump of Simple English Wikipedia that are also titles of articles.
File: wikisemC1.logic__0.txt (6.01 MB), Wikisem tranche C1 logic

(v0.3) syntactic annotations for 6-sentence beginnings of Simple English Wikipedia articles corresponding to the next 128 most common words used in a 2014 dump of Simple English Wikipedia that are also titles of articles.
File: wikisemC2.cg_.txt (988.24 KB), Wikisem tranche C2 categorial grammar

(v0.3) semantic annotations for 6-sentence beginnings of Simple English Wikipedia articles corresponding to the next 128 most common words used in a 2014 dump of Simple English Wikipedia that are also titles of articles.
File: wikisemC2.logic_.txt (5.34 MB), Wikisem tranche C2 logic

(v0.3) syntactic annotations for 6-sentence beginnings of Simple English Wikipedia articles corresponding to the next 128 most common words used in a 2014 dump of Simple English Wikipedia that are also titles of articles.
File: wikisemC3.cg_.txt (974.73 KB), Wikisem tranche C3 categorial grammar

(v0.3) semantic annotations for 6-sentence beginnings of Simple English Wikipedia articles corresponding to the next 128 most common words used in a 2014 dump of Simple English Wikipedia that are also titles of articles.
File: wikisemC3.logic_.txt (4.68 MB), Wikisem tranche C3 logic

(v0.3) syntactic annotations for 6-sentence beginnings of Simple English Wikipedia articles corresponding to the next 128 most common words used in a 2014 dump of Simple English Wikipedia that are also titles of articles.
File: wikisemC4.cg_.txt (978.59 KB), Wikisem tranche C4 categorial grammar

(v0.3) semantic annotations for 6-sentence beginnings of Simple English Wikipedia articles corresponding to the next 128 most common words used in a 2014 dump of Simple English Wikipedia that are also titles of articles.
File: wikisemC4.logic_.txt (4.42 MB), Wikisem tranche C4 logic

(v0.3) syntactic annotations for 6-sentence beginnings of Simple English Wikipedia articles corresponding to the next 128 most common words used in a 2014 dump of Simple English Wikipedia that are also titles of articles.
File: wikisemC5.cg_.txt (944.14 KB), Wikisem tranche C5 categorial grammar

(v0.3) semantic annotations for 6-sentence beginnings of Simple English Wikipedia articles corresponding to the next 128 most common words used in a 2014 dump of Simple English Wikipedia that are also titles of articles.
File: wikisemC5.logic_.txt (4.28 MB), Wikisem tranche C5 logic

(v0.3) syntactic annotations for 6-sentence beginnings of Simple English Wikipedia articles corresponding to the next 128 most common words used in a 2014 dump of Simple English Wikipedia that are also titles of articles.
File: wikisemC6.cg_.txt (969.04 KB), Wikisem tranche C6 categorial grammar

(v0.3) semantic annotations for 6-sentence beginnings of Simple English Wikipedia articles corresponding to the next 128 most common words used in a 2014 dump of Simple English Wikipedia that are also titles of articles.
File: wikisemC6.logic_.txt (3.98 MB), Wikisem tranche C6 logic

(v0.3) syntactic annotations for 6-sentence beginnings of Simple English Wikipedia articles corresponding to the next 128 most common words used in a 2014 dump of Simple English Wikipedia that are also titles of articles.
File: wikisemC7.cg_.txt (991.19 KB), Wikisem tranche C7 categorial grammar

(v0.3) semantic annotations for 6-sentence beginnings of Simple English Wikipedia articles corresponding to the next 128 most common words used in a 2014 dump of Simple English Wikipedia that are also titles of articles.
File: wikisemC7.logic_.txt (4.2 MB), Wikisem tranche C7 logic

The version 0.2 annotation files below must be manually translated into large lambda calculus text files (over 100 MB each) using the modelblocks software package.
----------
After installing modelblocks, go to the modelblocks-release directory and create the workspace directory:
make
Then, from the modelblocks-release/workspace directory:
curl -O https://linguistics.osu.edu/sites/default/files/2021-06/wikisemc2.casp_.toktrees_0.txt
mv wikisemc2.casp_.toktrees{_0.txt,}
make wikisemc2.casp_.discexprs
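The mv command above relies on brace expansion, which bash and zsh support but a plain POSIX sh does not. For such a shell, the same rename can be sketched with suffix-stripping parameter expansion. The variable names src and dst are illustrative only and are not part of modelblocks:

```shell
# Portable equivalent of: mv wikisemc2.casp_.toktrees{_0.txt,}
src="wikisemc2.casp_.toktrees_0.txt"   # file name as saved by curl -O
dst="${src%_0.txt}"                    # strip the trailing "_0.txt" suffix
echo "$dst"

# Rename only if the download is present, so the snippet is safe to re-run.
if [ -f "$src" ]; then
    mv "$src" "$dst"
fi
```

The make target expects the file without the _0.txt suffix, which is why the rename has to happen before running make.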
If you have trouble running modelblocks, you can build the files manually:
cat wikisemc2.casp_.toktrees | perl ../resource-linetrees/scripts/editabletrees2linetrees.pl > wikisemc2.casp_.senttrees
cat wikisemc2.casp_.senttrees | sed 's/\^g//g' | python2 ../resource-gcg/scripts/senttrees2discgraphs.py -e > wikisemc2.casp_.discgraphs
if [ ! -d ../../modelblocks-release/config ]; then mkdir ../config; fi
echo '-DNDEBUG -O3' > ../config/user-cflags.txt
if [ ! -d bin ]; then mkdir bin; fi
g++ -I../resource-rvtl -Wall `cat ../config/user-cflags.txt` -g -lm ../resource-linetrees/src/indent.cpp -o bin/indent
cat wikisemc2.casp_.discgraphs | python2 ../resource-gcg/scripts/discgraphs2discexprs.py | bin/indent > wikisemc2.casp_.discexprs
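Each manual step above redirects its output into a file, so a step that fails can still leave an empty file behind. The following is a minimal sanity check, assuming the three output file names used above. The helper name check_nonempty is hypothetical and is not part of modelblocks:

```shell
# Report whether a pipeline output exists and is non-empty.
check_nonempty() {
    if [ -s "$1" ]; then
        echo "ok: $1"
    else
        echo "missing or empty: $1" >&2
        return 1
    fi
}

# Check the outputs in the order the manual steps produce them,
# stopping at the first one that is missing or empty.
for f in wikisemc2.casp_.senttrees wikisemc2.casp_.discgraphs wikisemc2.casp_.discexprs; do
    check_nonempty "$f" || break
done
```

The first file reported as missing or empty points to the step that needs to be re-run.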
Files:
(v0.2) semantic annotations for 6-sentence beginnings of Simple English Wikipedia articles corresponding to the 128 most common words used in a 2014 dump of Simple English Wikipedia that are also titles of articles.
File: Wikisem tranche C1 CASP annotations (956.16 KB)

(v0.2) semantic annotations for 6-sentence beginnings of Simple English Wikipedia articles corresponding to the second 128 most common words used in a 2014 dump of Simple English Wikipedia that are also titles of articles.
File: Wikisem tranche C2 CASP annotations (911.34 KB)

(v0.2) semantic annotations for 3-sentence beginnings of the first 279 articles in a 2014 dump of Simple English Wikipedia which are not redundant with tranches C1 or C2.
File