From d709ddcfc67f3426acf218394689e97226bb5349 Mon Sep 17 00:00:00 2001
From: FredTheNoob <43958385+FredTheNoob@users.noreply.github.com>
Date: Fri, 1 Dec 2023 10:33:35 +0100
Subject: [PATCH 1/4] dir watcher docs

---
 docs/directorywatcher.md | 60 ++++++++++++++++++++++++++++++++++++++++
 1 file changed, 60 insertions(+)
 create mode 100644 docs/directorywatcher.md

diff --git a/docs/directorywatcher.md b/docs/directorywatcher.md
new file mode 100644
index 0000000..ef1cc1f
--- /dev/null
+++ b/docs/directorywatcher.md
@@ -0,0 +1,60 @@
+# [Directory Watcher](https://github.com/Knox-AAU/PreProcessingLayer_EntityRecognitionAndLinking/blob/main/lib/DirectoryWatcher.py)
+The pipeline starts when a new file is placed in a watched folder by pipeline part A. The Directory Watcher is responsible for calling a callback function whenever a new file is created in the watched folder.
+
+## Features
+- [watchdog](https://pypi.org/project/watchdog/) for file events
+- Async callback support
+- [Threading](https://docs.python.org/3/library/threading.html)
+
+## Overview
+
+The `DirectoryWatcher` provides a simple way to monitor a specified directory for file creation events and execute asynchronous callbacks in response. It utilizes the [watchdog](https://pypi.org/project/watchdog/) library for filesystem monitoring and integrates with [asyncio](https://docs.python.org/3/library/asyncio.html) for handling asynchronous tasks. Furthermore, the `DirectoryWatcher` uses [threading](https://docs.python.org/3/library/threading.html).
+
+> **_NOTE:_** [Threading](https://docs.python.org/3/library/threading.html) is used so that watching the directory does not block the main thread.
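The overview above combines a background thread with an async callback. As a rough illustration of that idea only, the sketch below hands a "file created" path from an ordinary thread to an async callback running on its own event loop. `CallbackDispatcher` and its methods are hypothetical names, the watchdog part is left out entirely, and none of this is taken from the actual `DirectoryWatcher` source.

```python
import asyncio
import threading


class CallbackDispatcher:
    """Hypothetical helper: runs an async callback on a private event loop
    that lives in a background thread, so the caller is never blocked."""

    def __init__(self, async_callback):
        self.async_callback = async_callback
        self.loop = asyncio.new_event_loop()
        self.thread = threading.Thread(target=self.loop.run_forever, daemon=True)

    def start(self) -> threading.Thread:
        self.thread.start()
        return self.thread

    def on_created(self, file_path: str):
        # Safe to call from any thread, for example from a file event handler.
        return asyncio.run_coroutine_threadsafe(
            self.async_callback(file_path), self.loop
        )

    def stop(self):
        # Ask the loop to stop from outside its thread, then clean up.
        self.loop.call_soon_threadsafe(self.loop.stop)
        self.thread.join()
        self.loop.close()


seen = []


async def new_file_created(file_path: str):
    seen.append(file_path)


dispatcher = CallbackDispatcher(new_file_created)
dispatcher.start()
# Simulate a "file created" event and wait until the callback has run.
dispatcher.on_created("some/path/to/a/directory/article.txt").result(timeout=5)
dispatcher.stop()
print(seen)
```

In a real watcher, the event handler of the file-monitoring library would be the caller of `on_created`; here the call is made by hand to keep the sketch free of third-party dependencies.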
+
+
+## Example usage
+```python
+# Importing
+from lib.DirectoryWatcher import DirectoryWatcher
+
+dirPath = "some/path/to/a/directory"
+
+# Setup
+async def newFileCreated(file_path: str):
+    print("New file created in " + file_path)
+
+
+dirWatcher = DirectoryWatcher(
+    directory=dirPath, async_callback=newFileCreated
+)
+
+# A FastAPI event function running on startup (`app` is your FastAPI instance)
+@app.on_event("startup")
+async def startEvent():
+    dirWatcher.start_watching()
+
+# A FastAPI event function running on shutdown
+@app.on_event("shutdown")
+def shutdown_event():
+    dirWatcher.stop_watching()
+```
+
+> **_NOTE:_** The FastAPI event functions are not required in order to use the `DirectoryWatcher`.
+
+
+## Methods
+```python
+def __init__(self, directory, async_callback):
+```
+### Parameters:
+- **directory** (str): The path to the directory you want to watch, e.g. `some/path/to/a/directory`.
+- **async_callback** (function): An async callback function that is called when a new file is created in the **directory**. It should accept a single parameter: the path of the created file.
+
+```python
+def start_watching(self) -> threading.Thread:
+```
+
+```python
+def stop_watching(self):
+```

From 8f9f6987d98eab5ef3714b4ea04e2cf0b8adc242 Mon Sep 17 00:00:00 2001
From: FredTheNoob
Date: Fri, 1 Dec 2023 11:55:47 +0100
Subject: [PATCH 2/4] remove dir from filename

---
 components/GetSpacyData.py | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/components/GetSpacyData.py b/components/GetSpacyData.py
index 835c98b..26d8e02 100644
--- a/components/GetSpacyData.py
+++ b/components/GetSpacyData.py
@@ -87,7 +87,7 @@ def BuildJSONFromEntities(entities: List[EntityLinked], doc, fileName: str) -> J

     # Create the final JSON structure
     final_json = {
-        "fileName": fileName,
+        "fileName": fileName.split("/")[-1],
         "language": DetectLang(doc),
         "metadataId":"7467628c-ad77-4bd7-9810-5f3930796fb5",
         "sentences": sentences_json,

From dbc73c9bf7ba01ab68e48963985f786e9f4b2083 Mon Sep 17 00:00:00 2001
From: FredTheNoob <43958385+FredTheNoob@users.noreply.github.com>
Date: Fri, 8 Dec 2023 12:13:18 +0100
Subject: [PATCH 3/4] add database docs

---
 docs/database.md                 |  73 +++++++++++++++++++++++++++++++
 docs/img/database-visualized.png | Bin 0 -> 25694 bytes
 2 files changed, 73 insertions(+)
 create mode 100644 docs/database.md
 create mode 100644 docs/img/database-visualized.png

diff --git a/docs/database.md b/docs/database.md
new file mode 100644
index 0000000..99c32f8
--- /dev/null
+++ b/docs/database.md
@@ -0,0 +1,73 @@
+# [Database](https://github.com/Knox-AAU/PreProcessingLayer_EntityRecognitionAndLinking/blob/main/components/Db.py)
+The database is responsible for keeping track of sentences, entity mentions, and entity indices.
+
+## Features
+- CRUD (Create, Read, Update, Delete) Operations supported.
+- Uses [SQLite](https://www.sqlite.org/index.html).
+- Seeds the database with required tables if they do not exist.
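The last feature above (seeding the required tables only if they are missing) is commonly done in SQLite with `CREATE TABLE IF NOT EXISTS`. The sketch below shows that pattern with the standard-library `sqlite3` module. The table names and the `sid`, `eid`, `id`, `string`, `mention`, `name` and `fileName` columns are mentioned in this documentation, but the column types, the remaining columns and the `seed` function itself are assumptions, not the schema used in `Db.py`.

```python
import sqlite3

# Assumed schema for illustration; the real tables may have other columns.
SCHEMA = """
CREATE TABLE IF NOT EXISTS sentence (
    sid INTEGER PRIMARY KEY AUTOINCREMENT,
    fileName TEXT,
    string TEXT,
    startIndex INTEGER,
    endIndex INTEGER
);
CREATE TABLE IF NOT EXISTS entitymention (
    eid INTEGER PRIMARY KEY AUTOINCREMENT,
    sid INTEGER NOT NULL REFERENCES sentence(sid),
    mention TEXT
);
CREATE TABLE IF NOT EXISTS EntityIndex (
    id INTEGER PRIMARY KEY AUTOINCREMENT,
    name TEXT
);
"""


def seed(conn: sqlite3.Connection) -> None:
    """Create the tables if they are missing; existing tables are left alone."""
    conn.executescript(SCHEMA)


conn = sqlite3.connect(":memory:")
seed(conn)
seed(conn)  # running it twice is harmless because of IF NOT EXISTS

# List user tables, skipping SQLite's internal bookkeeping tables.
tables = sorted(
    row[0]
    for row in conn.execute(
        "SELECT name FROM sqlite_master "
        "WHERE type = 'table' AND name NOT LIKE 'sqlite_%'"
    )
)
print(tables)
```

Because the statements are idempotent, the seeding step can run on every startup without checking first whether the database file already exists.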
+
+## Overview
+The database contains the following tables:
+
+![](img/database-visualized.png)
+
+### sentence
+Contains each sentence from all input text. Has a unique `sid`.
+
+### entitymention
+Represents each entity mention from all input text. Has `sid` as a foreign key (a sentence must exist for the entity mention to exist).
+
+### EntityIndex
+Used by the [Entity Linker](https://github.com/Knox-AAU/PreProcessingLayer_EntityRecognitionAndLinking/blob/main/components/EntityLinker.py) to find potential matches for a given entity mention. See [Entity Linker Docs](https://github.com/Knox-AAU/PreProcessingLayer_EntityRecognitionAndLinking/blob/main/docs/entitylinker.md) for more information.
+
+## Methods
+```python
+async def InitializeIndexDB(dbPath):
+```
+### Parameters:
+- **dbPath** (str): The path where the database is (or will be) stored, e.g. `some/path/to/a/Database/directory`.
+
+```python
+async def Insert(dbPath, tableName, queryInformation):
+```
+### Parameters:
+- **dbPath** (str): The path where the database is (or will be) stored, e.g. `some/path/to/a/Database/directory`.
+- **tableName** (str): The name of the table you want to insert into; the available tables are listed in [Overview](#overview).
+- **queryInformation** (JSON): A JSON object containing the key-value pairs you want to insert, for example:
+```JSON
+{
+    "fileName": "article.txt",
+    "string": "A duck walked across the road",
+    "startindex": 20,
+    "endIndex": 29
+}
+```
+This would be a valid insert into the `sentence` table.
+
+> **_NOTE:_** The `sid` is autogenerated using `AUTOINCREMENT`.
+
+```python
+async def Read(dbPath, tableName, searchPred=""):
+```
+### Parameters:
+- **dbPath** (str): The path where the database is (or will be) stored, e.g. `some/path/to/a/Database/directory`.
+- **tableName** (str): The name of the table you want to read from; the available tables are listed in [Overview](#overview).
+- **searchPred** (str, optional): The search predicate to query the table with. For example, if `searchPred` = `Jones` and `tableName` = `entitymention`, the `entitymention` table is searched for `Jones`.
+
+```python
+async def Update(dbPath, tableName, indexID, updatedName):
+```
+### Parameters:
+- **dbPath** (str): The path where the database is (or will be) stored, e.g. `some/path/to/a/Database/directory`.
+- **tableName** (str): The name of the table you want to update; the available tables are listed in [Overview](#overview).
+- **indexID** (str): The `sid`, `eid` or `id` (EntityIndex) of the row to update.
+- **updatedName** (str): The new value for the `string`, `mention` or `name` column.
+
+
+```python
+async def Delete(dbPath, tableName, indexID):
+```
+### Parameters:
+- **dbPath** (str): The path where the database is (or will be) stored, e.g. `some/path/to/a/Database/directory`.
+- **tableName** (str): The name of the table you want to delete from; the available tables are listed in [Overview](#overview).
+- **indexID** (str): The `sid`, `eid` or `id` (EntityIndex) of the row to delete.
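To make the four operations above more concrete, here is a minimal synchronous sketch of insert, read, update and delete on the `sentence` table using the standard-library `sqlite3` module. It is not the code in `Db.py`: the real functions are `async` and take a `dbPath`, whereas the helper names, the `TABLES` mapping, the `LIKE`-based search and the column types below are assumptions made for illustration.

```python
import sqlite3

# Table -> (id column, text column), following the parameter descriptions above.
TABLES = {
    "sentence": ("sid", "string"),
    "entitymention": ("eid", "mention"),
    "EntityIndex": ("id", "name"),
}


def insert(conn, table_name, query_information):
    TABLES[table_name]  # raises KeyError for unknown tables
    # Column names come from the dict keys, so the keys must be trusted input.
    columns = ", ".join(query_information)
    marks = ", ".join("?" for _ in query_information)
    cur = conn.execute(
        f"INSERT INTO {table_name} ({columns}) VALUES ({marks})",
        tuple(query_information.values()),
    )
    return cur.lastrowid


def read(conn, table_name, search_pred=""):
    _, text_col = TABLES[table_name]
    # Assumption: the predicate is matched as a substring of the text column.
    return conn.execute(
        f"SELECT * FROM {table_name} WHERE {text_col} LIKE ?",
        (f"%{search_pred}%",),
    ).fetchall()


def update(conn, table_name, index_id, updated_name):
    id_col, text_col = TABLES[table_name]
    conn.execute(
        f"UPDATE {table_name} SET {text_col} = ? WHERE {id_col} = ?",
        (updated_name, index_id),
    )


def delete(conn, table_name, index_id):
    id_col, _ = TABLES[table_name]
    conn.execute(f"DELETE FROM {table_name} WHERE {id_col} = ?", (index_id,))


conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE sentence (sid INTEGER PRIMARY KEY AUTOINCREMENT, "
    "fileName TEXT, string TEXT, startIndex INTEGER, endIndex INTEGER)"
)

sid = insert(conn, "sentence", {
    "fileName": "article.txt",
    "string": "A duck walked across the road",
    "startIndex": 20,
    "endIndex": 29,
})
rows = read(conn, "sentence", "duck")
update(conn, "sentence", sid, "A goose walked across the road")
updated = read(conn, "sentence", "goose")
delete(conn, "sentence", sid)
remaining = read(conn, "sentence")
```

Values are always passed as `?` parameters; only table and column names are interpolated into the SQL text, which is why the table name is checked against a fixed mapping first.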
\ No newline at end of file
diff --git a/docs/img/database-visualized.png b/docs/img/database-visualized.png
new file mode 100644
index 0000000000000000000000000000000000000000..7f25a8018d44cfb636f846636d7a40ddba4c5693
GIT binary patch
literal 25694
[binary PNG data omitted]

Date: Fri, 8 Dec 2023 14:28:40 +0100
Subject: [PATCH 4/4] add "our part of the pipeline" - add our component diagram

update api.md
update readme
---
 README.md                                     | 20 +++++-
 docs/api.md                                   |  9 ++-
 docs/img/KNOX_component_diagram-B.drawio.svg  |  4 ++
 docs/our-part-of-the-pipeline.md              | 33 ++++++++++
 .../pipeline-input.md                         | 28 +++++++++
 .../pipeline-output.md                        | 63 +++++++++++++++++++
 6 files changed, 154 insertions(+), 3 deletions(-)
 create mode 100644 docs/img/KNOX_component_diagram-B.drawio.svg
 create mode 100644 docs/our-part-of-the-pipeline.md
 create mode 100644 docs/our-part-of-the-pipeline/pipeline-input.md
 create mode 100644 docs/our-part-of-the-pipeline/pipeline-output.md

diff --git a/README.md b/README.md
index 0fc45d7..328075b 100644
--- a/README.md
+++ b/README.md
@@ -1 +1,19 @@
-# ProcessingLayer_EntityRecognitionAndLinking
\ No newline at end of file
+# PreProcessingLayer_EntityRecognitionAndLinking
+Pipeline B's python implementation of Entity Recognition and Entity Linking
+
+## [Getting started](https://github.com/Knox-AAU/PreProcessingLayer_EntityRecognitionAndLinking/blob/main/docs/gettingstarted.md)
+
+## [This pipeline explained](https://github.com/Knox-AAU/PreProcessingLayer_EntityRecognitionAndLinking/blob/main/docs/our-part-of-the-pipeline.md)
+
+### Full documentation available at: [http://wiki.knox.aau.dk](http://wiki.knox.aau.dk/)
+### The 2023 report is available at: [https://www.overleaf.com/project/64feed8bda5b70b36afb6597](https://www.overleaf.com/project/64feed8bda5b70b36afb6597)
+
+### 2023 Authors
+```txt
+Alija Cerimagic
+Frederik Ødgaard Hammer
+Mathias Frihauge
+Nichlas Blak Rønberg
+Peter Bækgaard
+Åsmundur Alexander Kjærbæk Thorsen
+```
diff --git a/docs/api.md b/docs/api.md
index 59b9607..cae1e72 100644
--- a/docs/api.md
+++ b/docs/api.md
@@ -12,6 +12,7 @@ The `/entitymentions` endpoint is a **GET****GET****GET****GET** + + +
[Figure: docs/img/KNOX_component_diagram-B.drawio.svg, component diagram "Entity Extraction and Linking". Components shown: API (/entitymentions/all, /detectlanguage, /entitymentions); DirectoryWatcher (directory: str, async_callback: func; start_watching(), stop_watching(), on_created(event), run_once()); Database / DB (EntityIndex, Sentence, entitymention); Functionality (async modifyTxt(), async processInput()); entity_mentions.json]
\ No newline at end of file
diff --git a/docs/our-part-of-the-pipeline.md b/docs/our-part-of-the-pipeline.md
new file mode 100644
index 0000000..fcf5e94
--- /dev/null
+++ b/docs/our-part-of-the-pipeline.md
@@ -0,0 +1,33 @@
+# Our part of the pipeline
+### (also available at: [http://wiki.knox.aau.dk/en/entity-extraction](http://wiki.knox.aau.dk/en/entity-extraction))
+
+Our part of the pipeline is concerned with Entity Recognition and Entity Linking. This solution uses the [spaCy](https://spacy.io/) library for Entity Recognition and the [FuzzyWuzzy](https://pypi.org/project/fuzzywuzzy/) library for Entity Linking.
+
+> The following sections describe this pipeline *in order*, starting with a visual overview.
+
+## Overview
+![](img/KNOX_component_diagram-B.drawio.svg)
+
+## How to get started
+See the [Getting started](https://github.com/Knox-AAU/PreProcessingLayer_EntityRecognitionAndLinking/blob/main/docs/gettingstarted.md) guide.
+
+## The input that the solution takes
+See the [input](https://github.com/Knox-AAU/PreProcessingLayer_EntityRecognitionAndLinking/blob/main/docs/our-part-of-the-pipeline/pipeline-input.md) explanation.
+
+## Entity Recognition
+Check out the [Entity Recognition documentation](https://github.com/Knox-AAU/PreProcessingLayer_EntityRecognitionAndLinking/blob/main/docs/entityrecognition.md).
+
+## Entity Linking
+Check out the [Entity Linker documentation](https://github.com/Knox-AAU/PreProcessingLayer_EntityRecognitionAndLinking/blob/main/docs/entitylinker.md).
+
+## The output it produces
+See the [output](https://github.com/Knox-AAU/PreProcessingLayer_EntityRecognitionAndLinking/blob/main/docs/our-part-of-the-pipeline/pipeline-output.md) explanation.
+
+
+## Other components
+- The [DirectoryWatcher](https://github.com/Knox-AAU/PreProcessingLayer_EntityRecognitionAndLinking/blob/main/docs/directorywatcher.md)
+- The [Database](https://github.com/Knox-AAU/PreProcessingLayer_EntityRecognitionAndLinking/blob/main/docs/database.md)
+- The [APIs](https://github.com/Knox-AAU/PreProcessingLayer_EntityRecognitionAndLinking/blob/main/docs/api.md)
+- The [Language Detector](https://pypi.org/project/langdetect/)
+
+## Future work
diff --git a/docs/our-part-of-the-pipeline/pipeline-input.md b/docs/our-part-of-the-pipeline/pipeline-input.md
new file mode 100644
index 0000000..a0bed08
--- /dev/null
+++ b/docs/our-part-of-the-pipeline/pipeline-input.md
@@ -0,0 +1,28 @@
+# Pipeline input
+The pipeline starts when a new file (article) is detected in a watched directory by the [DirectoryWatcher](https://github.com/Knox-AAU/PreProcessingLayer_EntityRecognitionAndLinking/blob/main/lib/DirectoryWatcher.py). This new file is produced by **pipeline A**.
+
+## Example input data
+```txt
+Since the sudden exit of the controversial CEO Martin Kjær last week,
+both he and the executive board in Region North Jutland
+
+have been in hiding.
+```
+> some/article.txt
+
+## Preprocessing the input
+Before the Entity Recognizer can use the input, it must be preprocessed. This entails removing newlines and adding punctuation where needed.
+
+### Example preprocessed input data
+```txt
+Since the sudden exit of the controversial CEO Martin Kjær last week,
+both he and the executive board in Region North Jutland. have been in hiding.
+```
+
+-----------
+

+ Up next: +
+ Entity Recognition + +
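The preprocessing step described in this file (removing newlines and adding punctuation where needed) could be sketched roughly as below. The exact rules of the real preprocessor are not given here, so this sketch assumes one plausible reading: blank lines separate blocks, newlines inside a block become spaces, and a block that does not end in `.`, `!` or `?` gets a period. The function name `preprocess` is hypothetical.

```python
import re


def preprocess(text: str) -> str:
    """Join lines into one string and close unpunctuated blocks with a period."""
    # Blocks are separated by one or more blank lines (assumption).
    blocks = [block for block in re.split(r"\n\s*\n", text) if block.strip()]
    cleaned = []
    for block in blocks:
        line = " ".join(block.split())  # collapse newlines and repeated spaces
        if line[-1] not in ".!?":
            line += "."
        cleaned.append(line)
    return " ".join(cleaned)


raw = (
    "Since the sudden exit of the controversial CEO Martin Kjær last week,\n"
    "both he and the executive board in Region North Jutland\n"
    "\n"
    "have been in hiding.\n"
)
result = preprocess(raw)
print(result)
```

Under these assumptions the example input yields the same text as the preprocessed example in this file, including the period inserted after "Region North Jutland", except that it is emitted as a single line.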
\ No newline at end of file
diff --git a/docs/our-part-of-the-pipeline/pipeline-output.md b/docs/our-part-of-the-pipeline/pipeline-output.md
new file mode 100644
index 0000000..2de1c9c
--- /dev/null
+++ b/docs/our-part-of-the-pipeline/pipeline-output.md
@@ -0,0 +1,63 @@
+# Pipeline output
+The pipeline output is a [JSON](https://en.wikipedia.org/wiki/JSON) structure containing the entity mentions and links for a given article.
+
+## The [JSON](https://en.wikipedia.org/wiki/JSON) output
+```JSON
+    {
+        "fileName": STRING,
+        "language": STRING,
+        "metadataId": UUID (STRING),
+        "sentences": [
+            {
+                "sentence": STRING,
+                "sentenceStartIndex": INT,
+                "sentenceEndIndex": INT,
+                "entityMentions": [
+                    {
+                        "name": STRING,
+                        "type": STRING,
+                        "label": STRING,
+                        "startIndex": INT,
+                        "endIndex": INT,
+                        "iri": STRING?
+                    }
+                ]
+            }
+        ]
+    }
+```
+Here we see that a file (article) has a file name, a language (detected by the [Language Detector](https://pypi.org/project/langdetect/)), a metadataId (forwarded by **pipeline A**), and a list of sentences, each of which contains a list of entity mentions.
+> _**NOTE**_: The `iri` property can be `null`.
+
+## Example [JSON](https://en.wikipedia.org/wiki/JSON) output
+```JSON
+{
+    "language": "en",
+    "metadataId": "790261e8-b8ec-4801-9cbd-00263bcc666d",
+    "sentences": [
+        {
+            "sentence": "Barrack Obama was married to Michelle Obama two days ago.",
+            "sentenceStartIndex": 20,
+            "sentenceEndIndex": 62,
+            "entityMentions":
+            [
+                { "name": "Barrack Obama", "type": "Entity", "label": "PERSON", "startIndex": 0, "endIndex": 12, "iri": "knox-kb01.srv.aau.dk/Barack_Obama" },
+                { "name": "Michelle Obama", "type": "Entity", "label": "PERSON", "startIndex": 59, "endIndex": 73, "iri": "knox-kb01.srv.aau.dk/Michele_Obama" },
+                { "name": "two days ago", "type": "Literal", "label": "DATE", "startIndex": 74, "endIndex": 86, "iri": null }
+            ]
+        }
+    ]
+}
+```
+
+## Sending the [JSON](https://en.wikipedia.org/wiki/JSON) output to pipeline C
+Lastly, the [JSON](https://en.wikipedia.org/wiki/JSON) output is sent to **pipeline C** using a `POST` request. See [the code](https://github.com/Knox-AAU/PreProcessingLayer_EntityRecognitionAndLinking/blob/e442dc496002b788d30f996cdfc87d36f5bcaa35/main.py#L32) for implementation details.
+
+-----------
+
+ Go back to: +
+ + Entity Linker + +
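As a rough illustration of the final step described in this file (sending the JSON output to pipeline C with a `POST` request), the sketch below builds such a request with the standard-library `urllib`. The endpoint URL is purely hypothetical, the helper name `build_post_request` is invented for this example, and the actual implementation in `main.py` may use a different HTTP client and different headers.

```python
import json
import urllib.request

# Hypothetical endpoint; the real address of pipeline C is not given here.
PIPELINE_C_URL = "http://localhost:8000/hypothetical-endpoint"


def build_post_request(output: dict, url: str = PIPELINE_C_URL) -> urllib.request.Request:
    """Serialize the pipeline output and wrap it in a JSON POST request."""
    body = json.dumps(output, ensure_ascii=False).encode("utf-8")
    return urllib.request.Request(
        url,
        data=body,
        headers={"Content-Type": "application/json"},
        method="POST",
    )


output = {
    "fileName": "article.txt",
    "language": "en",
    "metadataId": "790261e8-b8ec-4801-9cbd-00263bcc666d",
    "sentences": [],
}
request = build_post_request(output)

# Sending is left out on purpose; with a reachable endpoint it would be:
# with urllib.request.urlopen(request) as response:
#     print(response.status)
```

Building the request separately from sending it keeps the serialization step easy to check without a running receiver.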
\ No newline at end of file