This repository has been archived by the owner on Jan 10, 2023. It is now read-only.

Python 3 support for SLING API (#366)
ringgaard authored May 9, 2019
1 parent 727b108 commit 2591d18
Showing 41 changed files with 522 additions and 398 deletions.
7 changes: 4 additions & 3 deletions .travis.yml
@@ -3,7 +3,7 @@ language:
- cpp
- python
compiler: gcc
python: "2.7"
python: "3.5"

addons:
apt:
@@ -13,10 +13,11 @@ addons:
- wget
- pkg-config
- g++-4.8
- python3.5-dev

before_install:
- wget https://github.com/bazelbuild/bazel/releases/download/0.8.0/bazel_0.8.0-linux-x86_64.deb
- sudo dpkg -i bazel_0.8.0-linux-x86_64.deb
- wget https://github.com/bazelbuild/bazel/releases/download/0.13.0/bazel_0.13.0-linux-x86_64.deb
- sudo dpkg -i bazel_0.13.0-linux-x86_64.deb

script:
- tools/buildall.sh
37 changes: 20 additions & 17 deletions doc/guide/install.md
@@ -4,10 +4,10 @@

If you just want to try out the parser on a pre-trained model, you can install
the wheel with pip and download a pre-trained parser model. On a Linux machine
with Python 2.7 you can install a pre-built wheel:
with Python 3.5 you can install a pre-built wheel:

```
sudo pip install http://www.jbox.dk/sling/sling-2.0.0-cp27-none-linux_x86_64.whl
sudo pip3 install http://www.jbox.dk/sling/sling-2.0.0-cp35-none-linux_x86_64.whl
```
and download the pre-trained model:
```
@@ -38,28 +38,30 @@ git clone https://github.com/google/sling.git
cd sling
```

SLING uses [Bazel](https://bazel.build/) as the build system, so you need to
[install Bazel](https://docs.bazel.build/versions/master/install.html) in order
to build the SLING parser.

Next, run the `setup.sh` script to set up the SLING development environment
and build the code:
```shell
sudo apt-get install pkg-config zip g++ zlib1g-dev unzip python2.7 python2.7-dev
wget -P /tmp https://github.com/bazelbuild/bazel/releases/download/0.13.0/bazel-0.13.0-installer-linux-x86_64.sh
chmod +x /tmp/bazel-0.13.0-installer-linux-x86_64.sh
sudo /tmp/bazel-0.13.0-installer-linux-x86_64.sh
./setup.sh
```

The parser trainer uses Python v2.7 and PyTorch for training, so they need to be
installed.
This will perform the following steps:
* Install missing package dependencies, notably GCC and Python 3.
* Install [Bazel](https://bazel.build/) which is used as the build system for
SLING.
* Build SLING from source.
* Remove the Python 2.7 SLING pip package if it is installed.
* Set up a link to the SLING development environment for the SLING Python 3 API.

The parser trainer uses PyTorch for training, so it also needs to be installed:

```shell
sudo pip install http://download.pytorch.org/whl/cpu/torch-0.3.1-cp27-cp27mu-linux_x86_64.whl
sudo pip3 install http://download.pytorch.org/whl/cpu/torch-0.3.1-cp35-cp35mu-linux_x86_64.whl
```

## Building

Operating system: Linux<br>
Languages: C++, Python 2.7, assembler<br>
Languages: C++ (gcc or clang), Python 3.5+, assembler<br>
CPU: Intel x64 or compatible<br>
Build system: Bazel<br>

@@ -69,11 +71,12 @@ You can use the `buildall.sh` script to build all the source code:
tools/buildall.sh
```

You then need to link the sling Python module directly to the Python source
directory to use it in "developer mode":
If you haven't run the `setup.sh` script already, you then need to link the
sling Python module directly to the Python source directory to use it in
"developer mode":

```shell
sudo ln -s $(realpath python) /usr/lib/python2.7/dist-packages/sling
sudo ln -s $(realpath python) /usr/lib/python3/dist-packages/sling
```
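To check that Python 3 actually picks up the developer-mode link, a quick sanity check along these lines may help. This is a minimal sketch using only the standard library. `module_location` is a hypothetical helper, not part of SLING, and the printed path depends on your system.

```python
import importlib.util

def module_location(name):
  """Return the file a top-level module would be loaded from, or None if not found."""
  spec = importlib.util.find_spec(name)
  return None if spec is None else spec.origin

# With the link in place this should point into the linked `python` source
# directory; it prints None if Python 3 cannot find a `sling` module at all.
print(module_location("sling"))
```

Because `find_spec` only locates the module and does not execute it, this check works even before the native `pysling` extension has been built.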

**NOTE:**
12 changes: 6 additions & 6 deletions doc/guide/myelin.md
@@ -152,15 +152,15 @@ data = cell.instance()

# Set input.
xdata = data[x]
for i in xrange(64): xdata[0, i] = 5
for i in range(64): xdata[0, i] = 5

# Run computation for data instance.
data.compute()

# Print result.
ydata = data[y]
print "y", ydata
print "argmax", np.asarray(ydata).argmax()
print("y", ydata)
print("argmax", np.asarray(ydata).argmax())
```

The index operator on the cell object (e.g. `data[x]`) returns a _tensor_ object
@@ -217,15 +217,15 @@ data = cell.instance()

# Set input.
xdata = data[x]
for i in xrange(64): xdata[0, i] = 5
for i in range(64): xdata[0, i] = 5

# Run computation for data instance.
data.compute()

# Print result.
ydata = data[y]
print "y", ydata
print "argmax", np.asarray(ydata).argmax()
print("y", ydata)
print("argmax", np.asarray(ydata).argmax())
```

## Creating a flow file from a Tensorflow graph
57 changes: 26 additions & 31 deletions doc/guide/pyapi.md
@@ -3,14 +3,9 @@
A number of components in SLING can be accessed through the Python SLING API.
You can install the SLING Python wheel using pip:
```
sudo pip install http://www.jbox.dk/sling/sling-2.0.0-cp27-none-linux_x86_64.whl
```
or you can [clone the repo and build SLING from sources](install.md). You can
then link the `sling` Python module directly to the Python source directory to
use it in "developer mode":
```
sudo ln -s $(realpath python) /usr/lib/python2.7/dist-packages/sling
sudo pip3 install http://www.jbox.dk/sling/sling-2.0.0-cp35-none-linux_x86_64.whl
```
or you can [clone the repo and build SLING from sources](install.md).

# Table of contents

@@ -61,25 +56,25 @@ doc = store['document']
```
Role values for frames can be accessed as attributes:
```
print doc.name
print(doc.name)
```
or using indexing:
```
print doc['name']
print(doc['name'])
```
You can also use a frame value to access roles:
```
print doc[name]
print(doc[name])
```
You can test if a frame has a role:
```
if 'name' in doc: print "doc has 'name'"
if name in doc: print "doc has name"
if 'name' in doc: print("doc has 'name'")
if name in doc: print("doc has name")
```
You can iterate over all the named frames (i.e. frames with an `id:` slot)
in a store:
```
for f in store: print f.id
for f in store: print(f.id)
```
The `parse()` method can be used for adding new frames to the store:
```
@@ -121,24 +116,24 @@ f.extend([('foo', 10), ('bar': 20)])
All the slots in a frame can be iterated:
```
for name, value in f:
print "slot", name,"=", value
print("slot", name,"=", value)
```
or just the roles with a particular name:
```
for r in doc('role'):
print "doc role", r
print("doc role", r)
```
Frames can be encoded in text format with the `data()` method:
```
print f.data()
print(f.data())
```
and with indentation:
```
print f.data(pretty=True)
print(f.data(pretty=True))
```
or with binary encoding:
```
print len(f.data(binary=True))
print(len(f.data(binary=True)))
```
Arrays can be created with the `array()` method:
```
@@ -154,9 +149,9 @@ a[2] = 3
SLING arrays work much in the same way as Python lists except that they have
a fixed size:
```
print len(a)
print a[1]
for item in a: print item
print(len(a))
print(a[1])
for item in a: print(item)
```
Finally, a store can be saved to a file in textual encoding:
```
@@ -181,7 +176,7 @@ import sling
recin = sling.RecordReader("test.rec")
for key,value in recin:
print key, value
print(key, value)
recin.close()
```
The `RecordReader` class has the following methods:
@@ -236,7 +231,7 @@ writer.close()
# Look up each record in record database.
db = sling.RecordDatabase("/tmp/test.rec")
for i in range(N):
print db.lookup(str(i))
print(db.lookup(str(i)))
db.close()
```

@@ -310,7 +305,7 @@ for _,rec in corpus:
num_docs += 1
num_tokens += len(doc.tokens)
print "docs:", num_docs, "tokens:", num_tokens
print("docs:", num_docs, "tokens:", num_tokens)
```

Example: read text from a file and create a corpus of tokenized documents:
@@ -464,7 +459,7 @@ The `Corpus` class can be used for iterating over a corpus of documents stored i
record files:
```
for document in sling.Corpus("local/data/e/wiki/en/documents@10.rec"):
print document.text
print(document.text)
```
This will create a global store with the document schema symbols and create
a local store for each document. If you have a global store you can use this
@@ -474,7 +469,7 @@ kb = sling.Store()
corpus = sling.Corpus("local/data/e/wiki/en/documents@10.rec", commons=kb)
kb.freeze()
for document in corpus:
print document.text
print(document.text)
```
### LEX format

@@ -560,11 +555,11 @@ kb.freeze()
# Lookup entities with name 'Annette Stroyberg'.
for entity in names.lookup("Annette Stroyberg"):
print entity.id, entity.name
print(entity.id, entity.name)
# Query all entities named 'Funen' with frequency counts.
for m in names.query("Funen"):
print m.count(), m.id(), m.item().name, "(", m.item().description, ")"
print(m.count(), m.id(), m.item().name, "(", m.item().description, ")")
```

The `lookup()` and `query()` methods return the matches in decreasing
@@ -593,7 +588,7 @@ for Annette Stroyberg ([Q2534120](https://www.wikidata.org/wiki/Q2534120)):
```
entity = kb["Q2534120"]
dob = sling.Date(entity["P569"])
print dob.year, dob.month, dob.day
print(dob.year, dob.month, dob.day)
```

The `Date` class has the following properties and methods:
@@ -688,7 +683,7 @@ The `flags.define()` function takes the same arguments as the standard Python
method. You can then access the flags as variables in the flags module, e.g.:
```
if flags.verbose:
print "verbose output..."
print("verbose output...")
```
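Since `python/flags.py` builds on `argparse`, the flag definition behind this check can be pictured with plain `argparse`. The sketch below does not call `sling.flags`; the `--verbose` definition and the explicit argument list are assumptions made for illustration.

```python
import argparse

# Plain argparse stand-in for a flag definition; not the sling.flags API itself.
parser = argparse.ArgumentParser()
parser.add_argument("--verbose",
                    help="output extra information",
                    default=False,
                    action="store_true")

# Parse an explicit argument list so the example does not depend on sys.argv.
args = parser.parse_args(["--verbose"])

if args.verbose:
  print("verbose output...")
```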

The flags parser must be initialized in the main method of your Python program:
@@ -712,5 +707,5 @@ url = "https://www.wikidata.org/wiki/Special:EntityData/" + qid + ".json"
json = urllib2.urlopen(url).read()[len(qid) + 16:-2]
item = wikiconv.convert_wikidata(store, json)
print item.data(pretty=True)
print(item.data(pretty=True))
```
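Note that the context lines in this hunk still call `urllib2`, which does not exist in Python 3; its functionality lives in `urllib.request`. A hedged sketch of the same fetch under Python 3 might look like the following. `strip_entity_wrapper` and `fetch_item_json` are hypothetical helper names, the slicing mirrors the `[len(qid) + 16:-2]` expression above, and the payload is assumed to have the shape `{"entities":{"<qid>":<item>}}`. In Python 3, `read()` returns `bytes`, which slice the same way as strings.

```python
import urllib.request

def strip_entity_wrapper(payload, qid):
  """Drop the leading '{"entities":{"<qid>":' and the trailing '}}' from a payload."""
  return payload[len(qid) + 16:-2]

def fetch_item_json(qid):
  """Fetch the JSON for one Wikidata item (performs a network request)."""
  url = "https://www.wikidata.org/wiki/Special:EntityData/" + qid + ".json"
  with urllib.request.urlopen(url) as response:
    return strip_entity_wrapper(response.read(), qid)

# Offline check of the slicing on a synthetic payload.
sample = b'{"entities":{"Q42":{"id":"Q42"}}}'
print(strip_entity_wrapper(sample, "Q42"))
```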
2 changes: 1 addition & 1 deletion doc/guide/training.md
@@ -195,7 +195,7 @@ the same script to create the commons store behind the scenes. But we mention
this here in case one wishes to inspect the automatically created commons.

```shell
python sling/nlp/parser/tools/commons_from_corpora.py \
python3 sling/nlp/parser/tools/commons_from_corpora.py \
--input=<path to train.rec>,<path to dev.rec>,<any other rec files> \
--output=<path where commons will be written>
```
1 change: 1 addition & 0 deletions python/__init__.py
@@ -1,4 +1,5 @@
import sling.pysling as api

from sling.log import *
from sling.nlp.document import *
from sling.nlp.parser import *
2 changes: 1 addition & 1 deletion python/flags.py
@@ -15,7 +15,7 @@
"""Command-line flags"""

import argparse
import pysling as api
import sling.pysling as api

# Command line flag arguments.
arg = argparse.Namespace()
2 changes: 1 addition & 1 deletion python/log.py
@@ -16,7 +16,7 @@

import inspect
import os
import pysling as api
import sling.pysling as api

INFO = 0
WARNING = 1
4 changes: 2 additions & 2 deletions python/myelin/__init__.py
@@ -1,7 +1,7 @@
import sling.pysling as api

from builder import *
from flow import *
from .builder import *
from .flow import *

Compiler=api.Compiler
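
The switch from `from builder import *` to `from .builder import *` is needed because Python 3 dropped implicit relative imports inside packages. The self-contained sketch below builds a throwaway package in a temporary directory to show the explicit form resolving a sibling module. The package name `demo_pkg` and its contents are made up for illustration.

```python
import importlib
import sys
import tempfile
from pathlib import Path

def build_demo_package(root):
  """Create demo_pkg/ with a sibling module and an explicit relative import."""
  pkg = Path(root) / "demo_pkg"
  pkg.mkdir()
  (pkg / "flow.py").write_text("class Variable:\n  pass\n")
  # Explicit relative import: the leading dot means "from this package".
  (pkg / "__init__.py").write_text("from .flow import *\n")
  return pkg

with tempfile.TemporaryDirectory() as root:
  build_demo_package(root)
  sys.path.insert(0, root)
  importlib.invalidate_caches()
  try:
    demo_pkg = importlib.import_module("demo_pkg")
    has_variable = hasattr(demo_pkg, "Variable")
  finally:
    sys.path.remove(root)

print(has_variable)
```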

9 changes: 3 additions & 6 deletions python/myelin/builder.py
@@ -15,10 +15,7 @@

"""Myelin function builder and expression evaluator."""

import flow
from flow import Variable
from flow import Function
from flow import Flow
from .flow import set_builder_factory, Variable

DT_FLOAT32 = "float32"
DT_FLOAT64 = "float64"
@@ -160,7 +157,7 @@ def split(self, x, splits, axis=0, name=None):
shape = x.shape[:]
shape[axis] = x.shape[axis] / splits
results = []
for n in xrange(splits):
for n in range(splits):
o = self.var(op.name + ":" + str(n), x.type, shape)
op.add_output(o)
results.append(o)
@@ -379,5 +376,5 @@ def rank(self, x, name=None):
def builder_factory(flow, name):
return Builder(flow, name)

flow.builder_factory = builder_factory
set_builder_factory(builder_factory)
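
One Python 3 detail in the `split` hunk above deserves a second look: `x.shape[axis] / splits` uses `/`, which is true division in Python 3 and yields a float even when the sizes divide evenly. If integer dimensions are expected, floor division is the usual choice. A minimal sketch, with `split_shape` as a hypothetical standalone helper rather than code from `builder.py`:

```python
def split_shape(shape, splits, axis=0):
  """Return the shape of one piece when `shape` is split evenly along `axis`."""
  if shape[axis] % splits != 0:
    raise ValueError("dimension %d is not divisible by %d" % (shape[axis], splits))
  piece = shape[:]                      # copy the list, as the original code does
  piece[axis] = shape[axis] // splits   # floor division keeps the dimension an int
  return piece

print(split_shape([4, 6], 3, axis=1))   # [4, 2]
print(6 / 3, 6 // 3)                    # 2.0 2
```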
