SrgTop
Montserrat Marimon developed the Spanish Resource Grammar (see References).
Olga Zamaraeva is currently (2022-2023) developing it further.
The grammar is fairly large and can be run on corpora with some coverage. The associated treebank, TIBIDABO, is released along with the grammar as GitHub assets.
The grammar relies on the FreeLing morphophonological analyzer. See the associated wiki page.
- For now: Linux only, due to the FreeLing dependency, which is tricky to set up on macOS.
- Check out a copy of the SRG from GitHub: https://github.com/delph-in/srg. (Note: Do not use the older SVN copy, because it requires extra steps.)
- Install FreeLing, picking the correct deb package for your Ubuntu version. For example, for Ubuntu 20.04 (focal), pick freeling-4.2-focal-amd64.deb.
- To check that FreeLing is working, try typing something like
analyze -f es.cfg
and then type in a Spanish sentence, like El gato duerme. You should see something like:
el gato duerme.
el el DA0MS0 1
gato gato NCMS000 1
duerme dormir VMIP3S0 0.989241
. . Fp 1
- There is a script under util/analyze-wrappers that maps FreeLing tags to rules in the SRG using YY input mode. See util/srg-yy.sh for the intended usage.
- Make sure you have a recent version of ACE installed.
- Compile the grammar using ACE.
- Run the script with the sample sentence file, making sure the srg-yy.sh script has the correct path to the ACE-compiled grammar. Note: You may want to create your own copy of the script (e.g. my-srg-yy.sh); if you do, do not check that copy into the repository.
- Without the script, to try parsing one specific sample sentence directly (simply as an example), start ACE:
ace -g srg.dat -1Tf -y --yy-rules
and try the following input:
(42, 0, 1, <0:2>, 1, "mi" "mi", 0, "dp1css") (43, 1, 2, <4:8>, 1, "perro" "perro", 0, "ncms000") (44, 2, 3, <9:15>, 1, "dormir" "duerme", 0, "vmip3s0")
- To use the graphical interface for the output, install LUI and use the flag sequence -1Tlf instead of -1Tf in the ace command above. Right now, there is no way of using the LUI interface with the srg-yy.sh script.
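To make the relationship between the FreeLing output and the YY input shown above more concrete, here is a minimal Python sketch that turns FreeLing lines of the form "form lemma tag probability" into a token string shaped like the sample input. It is not the wrapper shipped in util/analyze-wrappers: the function name freeling_to_yy, its first_id parameter, and the way character offsets are computed (zero-based, end-exclusive, found by searching the sentence) are assumptions made for illustration, and the real script may differ on all of these points.

```python
def freeling_to_yy(sentence, analysis_lines, first_id=1):
    """Build a YY-mode token string from FreeLing 'form lemma tag prob' lines.

    Hypothetical helper: it only mirrors the shape of the sample input above
    and is not the SRG's own wrapper.
    """
    tokens = []
    lowered = sentence.lower()
    cursor = 0  # where to resume searching for the next form
    for index, line in enumerate(analysis_lines):
        form, lemma, tag, _prob = line.split()
        start = lowered.find(form.lower(), cursor)
        if start < 0:
            raise ValueError(f"form {form!r} not found in sentence")
        end = start + len(form)
        cursor = end
        # Fields follow the sample: id, start vertex, end vertex, <from:to>,
        # path, "lemma" "surface form", inflection position, lowercased tag.
        tokens.append(
            f'({first_id + index}, {index}, {index + 1}, <{start}:{end}>, 1, '
            f'"{lemma}" "{form}", 0, "{tag.lower()}")'
        )
    return " ".join(tokens)


if __name__ == "__main__":
    lines = [
        "mi mi DP1CSS 1",
        "perro perro NCMS000 1",
        "duerme dormir VMIP3S0 0.989241",
    ]
    print(freeling_to_yy("mi perro duerme", lines, first_id=42))
```

The sketch assumes one analysis per line and that every form appears literally in the sentence, so it raises an error on forms that FreeLing has merged or rewritten. The character offsets it prints follow the assumption stated above and need not match the ones the SRG wrapper emits.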
See https://github.com/delph-in/srg/blob/main/util/README
Find the up-to-date treebanks in the latest release (https://github.com/delph-in/srg/releases/)
Find the original treebanks, which can be used with the logon version of the LKB and the corresponding version of the grammar, in release 0.3.0: https://github.com/delph-in/srg/releases/tag/v.0.3.0
You can work with the SRG using LKB-FOS. A recent test binary can be downloaded from http://users.sussex.ac.uk/~johnca/lkb.linux_x86_64.2023.04.20 . In the grammar, edit the line in lkb/Globals.lsp that points to the freeling2lkb.py script so that it gives the actual path of the script on your machine. Also add the srg/util folder to your PYTHONPATH. Then you should be able to run the binary (don't forget to chmod u+x it) and load the SRG successfully.
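Before launching LKB-FOS, it can help to confirm that the two environment-related steps above have been done. The following is a rough Python sketch of such a check; the function name check_lkb_setup and the example paths are hypothetical, and it does not inspect lkb/Globals.lsp at all.

```python
import os
from pathlib import Path


def check_lkb_setup(srg_dir, lkb_binary, environ=None):
    """Return a list of setup problems (an empty list means none were found).

    Hypothetical helper: it checks only PYTHONPATH and the executable bit.
    """
    environ = os.environ if environ is None else environ
    problems = []

    # The srg/util folder should be one of the PYTHONPATH entries.
    util_dir = (Path(srg_dir) / "util").resolve()
    entries = [
        Path(p).resolve()
        for p in environ.get("PYTHONPATH", "").split(os.pathsep)
        if p
    ]
    if util_dir not in entries:
        problems.append(f"{util_dir} is not on PYTHONPATH")

    # The downloaded binary must exist and be executable (chmod u+x).
    binary = Path(lkb_binary)
    if not binary.is_file():
        problems.append(f"{binary} does not exist")
    elif not os.access(binary, os.X_OK):
        problems.append(f"{binary} is not executable; run chmod u+x on it")
    return problems


if __name__ == "__main__":
    # Both paths below are placeholders; replace them with your own.
    for problem in check_lkb_setup("srg", "lkb.linux_x86_64.2023.04.20"):
        print(problem)
```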