-
Notifications
You must be signed in to change notification settings - Fork 10
RESTful semantic tag extraction application for SMS, Twitter and Text
jgosier/SiLCC
Folders and files
Name | Name | Last commit message | Last commit date | |
---|---|---|---|---|
Repository files navigation
SiLCC =========== Synopsis =========== SiLCC aims to provide a learning NLP based relevancy filter/tagger/classifier that is applied to the various information sources that Swift aggregates (RSS, twitter, SMS). ============ Installation ============ To set up a silcc server instance: 1) Create a Python virtualenv 2) In the this distribution run: $ python setup.py develop This should install all needed libraries in your virtual environment. 3) You also need to install nltk. Download it from the following location: $ wget http://nltk.googlecode.com/files/nltk-2.0b8.zip $ unzip nltk-2.0b8.zip $ cd nltk-2.0b8.zip Now make sure you have activated the virtualenv created in step 1. If you are sure its activated, issue the following to install nltk into your virtualenv. $ python setup.py install 4) After installing nltk you need to download one of the Treebank Corpus for the part-of-speech tagger to work: $ python >>> import nltk >>> nltk.download() This will take you into an interactive download environment. The corpus you want to download is: maxent_treebank_pos_tagger After downloading it quit python. 5) Create the database. If using MySQL: $ mysql -u root -p Once in the mysql environment: >>> create database silcc default charset utf8; >>> grant all on silcc.* to silcc@localhost identified by 'password'; Make sure the .ini file contains a DB URI to the above database. (Please see instructions below on using SQLAlchemy migrate scripts to generate the db schema) 6) To start the server: $ paster serve --daemon prod.ini Look for any errors in paster.log ======== Placing your SiLCC Instance DB under version controll. ======== You may want to place your database under version control so that you can easily upgrade the schema, should it evolve over time. To do this you need to first create an empty database, and then place it under version control. 1) First install sqlalchemy-migrate: $ easy_install sqlalchemy-migrate 2) now place your database under version control with: $ python db_repository/manage.py version_control mysql://silcc:silcc@localhost:3306/silcc This will create the migrate_version table in your database and set the initial version to 0. After that you may run any schema upgrade scripts including the first one which will create the necessary tables. Most data in the SiLLC distribution can be recreated using scripts and data dumps provided. 3) To make upgrading your local instance easier its best to create a manage.py scripts as follows: $ migrate manage manage.py --repository=db_repository --url=mysql://silcc:silcc@localhost:3306/silcc Since the manage.py will be specific to your local instance it is not included in the SiLCC repo. 4) Now run a db update to generate the initial schema: $ python manage.py update 5) The Web Service API needs to have one or more valid keys in the apikey table. For testing purposes add a key with a valid_domains value of '*' (valid from all domains)
About
RESTful semantic tag extraction application for SMS, Twitter and Text
Resources
Stars
Watchers
Forks
Releases
No releases published
Packages 0
No packages published