# introduction
- readme for file:	LEX#3_V_50LUs_4ling_PAR_TERM_sum.csv
- encoding:	UTF-8
- note:	Multi-word lemmas from lexicon V (140 multi-word combinations), validated by a team of 4 linguists (AD, EH, EK, EM) who had previously spent half a year introducing multi-word lexical units into plWN under the supervision of AD. The linguists annotated the list independently. The 50 multi-word lemmas were drawn from lexicon V by simple random sampling. LEX#3_V_50LUs_4ling_PAR_TERM_sum.csv is an extension of the LEX#2_V_50LUs_4ling_PAR_TERM.csv annotation.

# description
- lemma-S:	50 multi-word lemmas taken from V
- PARAFRAZA:	PARAFRAZA = paraphrase; records whether a given lexical unit is paraphrasable or not (decided with the test from https://docs.google.com/document/d/160g-skuaI2edevsUN98ufsFsdFMQhJQ72kK28zK12OA/edit#heading=h.4zpe7vopmdzd, point A.2)
- TERMIN:	TERMIN = term; records whether a given lexical unit is a specialist term or not (decided with the test from https://docs.google.com/document/d/160g-skuaI2edevsUN98ufsFsdFMQhJQ72kK28zK12OA/edit#heading=h.4zpe7vopmdzd, point A.1)
- N-kipi:	N-kipi = AB + BA = number of occurrences of a given multi-word combination in the IPIC PAS corpus
- type:	structural type of a given multi-word combination (NA = noun + adjective in postposition, agreeing in number, gender and case; PP = nominal phrase with a prepositional phrase; NG = nominal phrase with a modifier in the genitive; C = nominal phrase with a conjunction; AN = adjective in preposition + noun, agreeing in number, gender and case; AAA = 3-word combination agreeing in number, gender and case)
- AB:	frequency of the order AB in the IPIC PAS corpus (without agreement)
- BA:	frequency of the reversed order BA in the IPIC PAS corpus (without agreement)
- ACB:	frequency of the order AB with a third word between the two words (separability) in the IPIC PAS corpus (without agreement)
- BCA:	frequency of the order BA with a third word between the two words (separability with reversed order) in the IPIC PAS corpus (without agreement)
- SN:	measure of separability: SN = (ACB+BCA+1)/(N-kipi+1)
- ABBA:	measure of fixed order: ABBA = (AB+1)/(BA+1)
- ACB/N-bu: measure of separability: ACB/N-bu = (ACB+1)/(N-kipi+1)
- AB/N-bu: measure of fixed order: AB/N-bu = (AB+1)/(N-kipi+1)
- -bu:	stands for "bez uzgodnienia" (without agreement), i.e. we do not check whether the two words agree in number, gender and case
- -kipi:	stands for "KIPI" = Korpus Instytutu Podstaw Informatyki PAN = IPIC PAS
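
The four derived measures follow directly from the raw counts AB, BA, ACB and BCA. A minimal sketch of the formulas listed above; the function name `measures` and the example counts are hypothetical and do not come from the CSV file:

```python
def measures(ab: int, ba: int, acb: int, bca: int) -> dict:
    """Compute the derived columns from raw corpus counts (hypothetical helper)."""
    n_kipi = ab + ba  # N-kipi = AB + BA
    return {
        "N-kipi": n_kipi,
        "SN": (acb + bca + 1) / (n_kipi + 1),   # separability, both orders
        "ABBA": (ab + 1) / (ba + 1),            # fixed order, AB vs. BA
        "ACB/N-bu": (acb + 1) / (n_kipi + 1),   # separability, AB order only
        "AB/N-bu": (ab + 1) / (n_kipi + 1),     # fixed order, share of AB
    }


# Made-up counts, for illustration only.
example = measures(ab=9, ba=0, acb=1, bca=0)
print(example)
```

The +1 in every numerator and denominator keeps the ratios defined when a count is zero.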

- class:	a binary classification (JLW vs. nie-JLW) produced by the following decision tree:

TERMIN
| TAK --> JLW
| NIE --> PARAFRAZA
		| TAK --> nie-JLW
		| NIE --> JLW

TAK = yes, NIE = no, JLW = multi-word lexical unit, nie-JLW = not a multi-word lexical unit
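
The tree above can be read as a two-step rule. A minimal sketch, assuming the TERMIN and PARAFRAZA cells hold the strings "TAK" and "NIE" (the actual cell encoding in the CSV is an assumption); the function name `classify` is hypothetical:

```python
def classify(termin: str, parafraza: str) -> str:
    """Basic decision tree: TERMIN first, then PARAFRAZA."""
    if termin == "TAK":
        # A specialist term is always a multi-word lexical unit.
        return "JLW"
    # Not a term: paraphrasable combinations are rejected.
    return "nie-JLW" if parafraza == "TAK" else "JLW"


print(classify("NIE", "TAK"))
```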

- suffix:	a suffix (-AD, -EH, -EK, -EM) marks the annotations of a given linguist
- suffix:	a suffix -i marks the extension of the above decision tree with syntactic information:

TERMIN
| TAK --> JLW
| NIE --> PARAFRAZA
		| NIE --> JLW
		| TAK --> STRUCTURAL TYPE noun + adjective in postposition?
				| NIE --> nie-JLW
				| TAK --> ACB/N-bu [separability] <= 0.001458?
						| TAK --> JLW
						| NIE --> AB/N-bu [fixed order 1] <= 0.966887?
								| TAK --> nie-JLW
								| NIE --> ABBA [fixed order 2] <= 54.923077?
										| TAK --> JLW
										| NIE --> nie-JLW
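
The extended tree only changes the outcome for paraphrasable non-terms of structural type NA. A minimal sketch under the same assumption that cells hold "TAK"/"NIE" strings; the function name `classify_i` and its parameter names are hypothetical, while the thresholds are copied from the tree above:

```python
def classify_i(termin: str, parafraza: str, structural_type: str,
               acb_n_bu: float, ab_n_bu: float, abba: float) -> str:
    """Extended decision tree (suffix -i) with syntactic information."""
    if termin == "TAK":
        return "JLW"
    if parafraza == "NIE":
        return "JLW"
    # Paraphrasable non-term: only noun + postposed adjective gets a second chance.
    if structural_type != "NA":
        return "nie-JLW"
    if acb_n_bu <= 0.001458:      # (almost) never separated
        return "JLW"
    if ab_n_bu <= 0.966887:       # order not fixed enough
        return "nie-JLW"
    return "JLW" if abba <= 54.923077 else "nie-JLW"


# Made-up measure values, for illustration only.
print(classify_i("NIE", "TAK", "NA", acb_n_bu=0.2, ab_n_bu=1.0, abba=10.0))
```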

- sum:	sum of the lexicality annotations of multi-word combinations from V, made independently by 14 linguists ("-1" = not a multi-word lexical unit, "0" = don't know, "1" = a multi-word lexical unit)

- class14:	IF (sum > 0) {==> JLW} ELSE {==> nie-JLW}
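
The class14 rule amounts to a majority-style vote over the -1/0/1 annotations. A minimal sketch; the function name `class14` mirrors the column name, and the vote lists are made up for illustration:

```python
def class14(votes: list) -> str:
    """Sum the -1/0/1 votes and apply the class14 rule (sum > 0 means JLW)."""
    total = sum(votes)
    return "JLW" if total > 0 else "nie-JLW"


# 8 votes for, 2 undecided, 4 against: the sum is 4.
print(class14([1] * 8 + [0] * 2 + [-1] * 4))
```

Note that a tie (sum equal to 0) falls into nie-JLW under this rule.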
