Polish Corpus of Wrocław University of Technology (KPWr)
Wrocław University of Technology, Language Technology Group G4.19
2010-2012. Some rights reserved

This work is licensed under a Creative Commons Attribution 3.0 Unported Licence
The corresponding licence agreement can be found at
http://creativecommons.org/licenses/by/3.0/legalcode

The corpus is manually annotated on the following layers:
* shallow syntax: syntactic chunking and selected inter-chunk syntactic 
relations,
* named entities and selected semantic relations between them,
* anaphora (limited to the identity-of-reference type),
* word senses (for selected lexemes).

The corpus is stored in CCL (*.xml) and REL (*.rel.xml) files. For the format
specification, please consult the following site:
http://nlp.pwr.wroc.pl/redmine/projects/corpus2/wiki/CCL_format
or the ccl-format.pdf file included in the package.
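
As a rough illustration of reading a CCL file, the Python sketch below lists
each token together with its annotation channels. It is not part of the
official tooling. It assumes that tokens are <tok> elements holding an <orth>
child and zero or more <ann chan="..."> children; please verify these names
against the format specification before relying on them. The function name
iter_tokens and the channel name used in the usage note are hypothetical.

```python
import xml.etree.ElementTree as ET


def iter_tokens(root):
    """Yield (orth, channels) for every <tok> element under root.

    Assumption: each token has an <orth> child with the surface form and
    <ann chan="NAME">VALUE</ann> children for the annotation channels.
    """
    for tok in root.iter("tok"):
        orth = tok.findtext("orth", default="")
        channels = {
            ann.get("chan"): (ann.text or "").strip()
            for ann in tok.findall("ann")
        }
        yield orth, channels
```

For a file on disk, ET.parse(path).getroot() gives the root element to pass
in. Filtering the result for tokens whose value in a channel such as
"nam_loc" is not "0" would then give the tokens annotated on that channel,
under the same assumptions.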

Note: annotation progress differs across the layers.
Not all documents have been annotated on all layers.

Each of the following files lists the documents annotated on the corresponding
annotation layer:
* index_chunks.txt     --- syntactic chunking,
* index_chunks_rel.txt --- inter-chunk syntactic relations,
* index_names.txt      --- named entities,
* index_names_rel.txt  --- semantic relations between named entities,
* index_anaphora.txt   --- anaphora,
* index_wsd.txt        --- word senses.
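
Because not every document is annotated on every layer, it can be useful to
select only the documents that appear in several index files at once. The
Python sketch below does this by intersecting the indexes. It assumes that
each index file is UTF-8 text with one document name per line, which this
README does not state explicitly. The layer keys and the function names
read_index and documents_on_layers are hypothetical.

```python
from pathlib import Path

# Layer key (hypothetical) -> index file name, as listed in this README.
INDEX_FILES = {
    "chunks": "index_chunks.txt",
    "chunks_rel": "index_chunks_rel.txt",
    "names": "index_names.txt",
    "names_rel": "index_names_rel.txt",
    "anaphora": "index_anaphora.txt",
    "wsd": "index_wsd.txt",
}


def read_index(path):
    """Return the set of non-empty lines of an index file.

    Assumption: one document name per line, UTF-8 encoded.
    """
    with open(path, encoding="utf-8") as handle:
        return {line.strip() for line in handle if line.strip()}


def documents_on_layers(corpus_dir, layers):
    """Return the documents listed in the index of every requested layer."""
    sets = [read_index(Path(corpus_dir) / INDEX_FILES[name]) for name in layers]
    return set.intersection(*sets) if sets else set()
```

For example, documents_on_layers("kpwr", ["names", "names_rel"]) would return
the documents annotated with both named entities and the relations between
them, where "kpwr" stands for the directory holding the unpacked corpus.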

@inproceedings{kpwr,
  author = "Broda, Bartosz and Marcińczuk, Michał and Maziarz, Marek
            and Radziszewski, Adam and Wardyński, Adam",
  address = "Istanbul, Turkey",
  booktitle = "Proceedings of LREC'12",
  editor = "Nicoletta Calzolari and Khalid Choukri and Thierry Declerck
            and Mehmet Uğur Doğan and Bente Maegaard and Joseph Mariani
            and Jan Odijk and Stelios Piperidis",
  publisher = "ELRA",
  title = "{KPW}r: {T}owards a {F}ree {C}orpus of {P}olish",
  year = "2012",
}

Web page: http://nlp.pwr.wroc.pl/kpwr

Acknowledgements: work financed by NCBiR NrU.: SP/I/1/77065/10,
project SyNaT (web page: http://www.synat.pl).
Project leader at Wrocław University of Technology: Maciej Piasecki.

Contributors
* Concept and co-ordination:
  Bartosz Broda, Michał Marcińczuk, Marek Maziarz, Adam Radziszewski,
  Adam Wardyński
* Annotators:
  Agnieszka Dziob, Justyna Ławniczak, Marek Maziarz, Joanna Nowak, 
  Marcin Oleksy, Jan Wieczorek
* Software developers:
  Adam Pawlaczek, Jan Kocoń, Marcin Ptak
