This repository has been archived by the owner on Jan 10, 2023. It is now read-only.

Python 3 support for SLING API #366

Merged (4 commits) on May 9, 2019
7 changes: 4 additions & 3 deletions .travis.yml
Expand Up @@ -3,7 +3,7 @@ language:
- cpp
- python
compiler: gcc
python: "2.7"
python: "3.5"

addons:
apt:
Expand All @@ -13,10 +13,11 @@ addons:
- wget
- pkg-config
- g++-4.8
- python3.5-dev

before_install:
- wget https://github.com/bazelbuild/bazel/releases/download/0.8.0/bazel_0.8.0-linux-x86_64.deb
- sudo dpkg -i bazel_0.8.0-linux-x86_64.deb
- wget https://github.com/bazelbuild/bazel/releases/download/0.13.0/bazel_0.13.0-linux-x86_64.deb
- sudo dpkg -i bazel_0.13.0-linux-x86_64.deb

script:
- tools/buildall.sh
37 changes: 20 additions & 17 deletions doc/guide/install.md
Expand Up @@ -4,10 +4,10 @@

If you just want to try out the parser on a pre-trained model, you can install
the wheel with pip and download a pre-trained parser model. On a Linux machine
with Python 2.7 you can install a pre-built wheel:
with Python 3.5 you can install a pre-built wheel:

```
sudo pip install http://www.jbox.dk/sling/sling-2.0.0-cp27-none-linux_x86_64.whl
sudo pip3 install http://www.jbox.dk/sling/sling-2.0.0-cp35-none-linux_x86_64.whl
```
and download the pre-trained model:
```
Expand Down Expand Up @@ -38,28 +38,30 @@ git clone https://github.com/google/sling.git
cd sling
```

SLING uses [Bazel](https://bazel.build/) as the build system, so you need to
[install Bazel](https://docs.bazel.build/versions/master/install.html) in order
to build the SLING parser.

Next, run the `setup.sh` script to set up the SLING development environment
and build the code:
```shell
sudo apt-get install pkg-config zip g++ zlib1g-dev unzip python2.7 python2.7-dev
wget -P /tmp https://github.com/bazelbuild/bazel/releases/download/0.13.0/bazel-0.13.0-installer-linux-x86_64.sh
chmod +x /tmp/bazel-0.13.0-installer-linux-x86_64.sh
sudo /tmp/bazel-0.13.0-installer-linux-x86_64.sh
./setup.sh
```

The parser trainer uses Python v2.7 and PyTorch for training, so they need to be
installed.
This will perform the following steps:
* Install missing package dependencies, notably GCC and Python 3.
* Install [Bazel](https://bazel.build/) which is used as the build system for
SLING.
* Build SLING from source.
* Remove the Python 2.7 SLING pip package if it is installed.
* Set up a link to the SLING development environment for the SLING Python 3 API.

The parser trainer uses PyTorch for training, so it also needs to be installed:

```shell
sudo pip install http://download.pytorch.org/whl/cpu/torch-0.3.1-cp27-cp27mu-linux_x86_64.whl
sudo pip3 install http://download.pytorch.org/whl/cpu/torch-0.3.1-cp35-cp35mu-linux_x86_64.whl
```

## Building

Operating system: Linux<br>
Languages: C++, Python 2.7, assembler<br>
Languages: C++ (gcc or clang), Python 3.5+, assembler<br>
CPU: Intel x64 or compatible<br>
Build system: Bazel<br>

Expand All @@ -69,11 +71,12 @@ You can use the `buildall.sh` script to build all the source code:
tools/buildall.sh
```

You then need to link the sling Python module directly to the Python source
directory to use it in "developer mode":
If you haven't run the `setup.sh` script already, you then need to link the
sling Python module directly to the Python source directory to use it in
"developer mode":

```shell
sudo ln -s $(realpath python) /usr/lib/python2.7/dist-packages/sling
sudo ln -s $(realpath python) /usr/lib/python3/dist-packages/sling
```
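
After installing the wheel or creating the link, a quick check can confirm that the interpreter is new enough and that the `sling` package is visible to it. This is a minimal sketch using only the standard library; `check_environment` is a hypothetical helper, not part of SLING, and the 3.5 minimum is taken from the requirements listed above.

```python
import importlib.util
import sys

MIN_VERSION = (3, 5)  # minimum Python version stated in this guide


def check_environment(module_name="sling", min_version=MIN_VERSION):
    """Return (version_ok, module_found) for the running interpreter."""
    version_ok = sys.version_info[:2] >= min_version
    # find_spec() returns None when a top-level module cannot be located.
    module_found = importlib.util.find_spec(module_name) is not None
    return version_ok, module_found


version_ok, module_found = check_environment()
print("python >= 3.5:", version_ok, "| sling importable:", module_found)
```

If the second value is `False` after running `setup.sh` or creating the link by hand, the link most likely points into a `dist-packages` directory that this interpreter does not search.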

**NOTE:**
12 changes: 6 additions & 6 deletions doc/guide/myelin.md
Expand Up @@ -152,15 +152,15 @@ data = cell.instance()

# Set input.
xdata = data[x]
for i in xrange(64): xdata[0, i] = 5
for i in range(64): xdata[0, i] = 5

# Run computation for data instance.
data.compute()

# Print result.
ydata = data[y]
print "y", ydata
print "argmax", np.asarray(ydata).argmax()
print("y", ydata)
print("argmax", np.asarray(ydata).argmax())
```

The index operator on the cell object (e.g. `data[x]`) returns a _tensor_ object
Expand Down Expand Up @@ -217,15 +217,15 @@ data = cell.instance()

# Set input.
xdata = data[x]
for i in xrange(64): xdata[0, i] = 5
for i in range(64): xdata[0, i] = 5

# Run computation for data instance.
data.compute()

# Print result.
ydata = data[y]
print "y", ydata
print "argmax", np.asarray(ydata).argmax()
print("y", ydata)
print("argmax", np.asarray(ydata).argmax())
```
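
The two mechanical changes in these hunks, `xrange` to `range` and the `print` statement to the `print()` function, can be illustrated without Myelin. The following is a minimal standalone sketch; `fill_row` and `format_result` are hypothetical helpers, not part of the SLING API, and a plain list stands in for the tensor.

```python
import io


def fill_row(width, value):
    """Build a row of `width` copies of `value` using Python 3's range()."""
    row = [0] * width
    for i in range(width):  # range() in Python 3 is lazy, like xrange() in Python 2
        row[i] = value
    return row


def format_result(label, values):
    """Capture what print(label, values) writes, to show the output format."""
    buf = io.StringIO()
    print(label, values, file=buf)  # arguments are joined by a single space
    return buf.getvalue()


row = fill_row(4, 5)
line = format_result("y", row)
```

Under Python 2 without `from __future__ import print_function`, `print("y", ydata)` would print a tuple rather than two space-separated values, so the call form only behaves as intended once the code runs on Python 3.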

## Creating a flow file from a Tensorflow graph
57 changes: 26 additions & 31 deletions doc/guide/pyapi.md
Expand Up @@ -3,14 +3,9 @@
A number of components in SLING can be accessed through the Python SLING API.
You can install the SLING Python wheel using pip:
```
sudo pip install http://www.jbox.dk/sling/sling-2.0.0-cp27-none-linux_x86_64.whl
```
or you can [clone the repo and build SLING from sources](install.md). You can
then link the `sling` Python module directly to the Python source directory to
use it in "developer mode":
```
sudo ln -s $(realpath python) /usr/lib/python2.7/dist-packages/sling
sudo pip3 install http://www.jbox.dk/sling/sling-2.0.0-cp35-none-linux_x86_64.whl
```
or you can [clone the repo and build SLING from sources](install.md).

# Table of contents

Expand Down Expand Up @@ -61,25 +56,25 @@ doc = store['document']
```
Role values for frames can be accessed as attributes:
```
print doc.name
print(doc.name)
```
or using indexing:
```
print doc['name']
print(doc['name'])
```
You can also use a frame value to access roles:
```
print doc[name]
print(doc[name])
```
You can test if a frame has a role:
```
if 'name' in doc: print "doc has 'name'"
if name in doc: print "doc has name"
if 'name' in doc: print("doc has 'name'")
if name in doc: print("doc has name")
```
You can iterate over all the named frames (i.e. frames with an `id:` slot)
in a store:
```
for f in store: print f.id
for f in store: print(f.id)
```
The `parse()` method can be used for adding new frames to the store:
```
Expand Down Expand Up @@ -121,24 +116,24 @@ f.extend([('foo', 10), ('bar': 20)])
All the slots in a frame can be iterated:
```
for name, value in f:
print "slot", name,"=", value
print("slot", name,"=", value)
```
or just the roles with a particular name:
```
for r in doc('role'):
print "doc role", r
print("doc role", r)
```
Frames can be encoded in text format with the `data()` method:
```
print f.data()
print(f.data())
```
and with indentation:
```
print f.data(pretty=True)
print(f.data(pretty=True))
```
or with binary encoding:
```
print len(f.data(binary=True))
print(len(f.data(binary=True)))
```
Arrays can be created with the `array()` method:
```
Expand All @@ -154,9 +149,9 @@ a[2] = 3
SLING arrays work much in the same way as Python lists except that they have
a fixed size:
```
print len(a)
print a[1]
for item in a: print item
print(len(a))
print(a[1])
for item in a: print(item)
```
Finally, a store can be saved to a file in textual encoding:
```
Expand All @@ -181,7 +176,7 @@ import sling

recin = sling.RecordReader("test.rec")
for key,value in recin:
print key, value
print(key, value)
recin.close()
```
The `RecordReader` class has the following methods:
Expand Down Expand Up @@ -236,7 +231,7 @@ writer.close()
# Look up each record in record database.
db = sling.RecordDatabase("/tmp/test.rec")
for i in range(N):
print db.lookup(str(i))
print(db.lookup(str(i)))
db.close()
```

Expand Down Expand Up @@ -310,7 +305,7 @@ for _,rec in corpus:
num_docs += 1
num_tokens += len(doc.tokens)

print "docs:", num_docs, "tokens:", num_tokens
print("docs:", num_docs, "tokens:", num_tokens)
```

Example: read text from a file and create a corpus of tokenized documents:
Expand Down Expand Up @@ -464,7 +459,7 @@ The `Corpus` class can be used for iterating over a corpus of documents stored i
record files:
```
for document in sling.Corpus("local/data/e/wiki/en/documents@10.rec"):
print document.text
print(document.text)
```
This will create a global store with the document schema symbols and create
a local store for each document. If you have a global store you can use this
Expand All @@ -474,7 +469,7 @@ kb = sling.Store()
corpus = sling.Corpus("local/data/e/wiki/en/documents@10.rec", commons=kb)
kb.freeze()
for document in corpus:
print document.text
print(document.text)
```
### LEX format

Expand Down Expand Up @@ -560,11 +555,11 @@ kb.freeze()

# Lookup entities with name 'Annette Stroyberg'.
for entity in names.lookup("Annette Stroyberg"):
print entity.id, entity.name
print(entity.id, entity.name)

# Query all entities named 'Funen' with frequency counts.
for m in names.query("Funen"):
print m.count(), m.id(), m.item().name, "(", m.item().description, ")"
print(m.count(), m.id(), m.item().name, "(", m.item().description, ")")
```

The `lookup()` and `query()` methods return the matches in decreasing
Expand Down Expand Up @@ -593,7 +588,7 @@ for Annette Stroyberg ([Q2534120](https://www.wikidata.org/wiki/Q2534120)):
```
entity = kb["Q2534120"]
dob = sling.Date(entity["P569"])
print dob.year, dob.month, dob.day
print(dob.year, dob.month, dob.day)
```

The `Date` class has the following properties and methods:
Expand Down Expand Up @@ -688,7 +683,7 @@ The `flags.define()` function takes the same arguments as the standard Python
method. You can then access the flags as variables in the flags module, e.g.:
```
if flags.verbose:
print "verbose output..."
print("verbose output...")
```

The flags parser must be initialized in the main method of your Python program:
Expand All @@ -712,5 +707,5 @@ url = "https://www.wikidata.org/wiki/Special:EntityData/" + qid + ".json"
json = urllib2.urlopen(url).read()[len(qid) + 16:-2]

item = wikiconv.convert_wikidata(store, json)
print item.data(pretty=True)
print(item.data(pretty=True))
```
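
The context lines in this last hunk still call `urllib2`, which Python 3 replaced with `urllib.request`. A minimal sketch of the same fetch under Python 3 follows. The URL and the slice are taken from the example above; `entity_url`, `strip_envelope` and `fetch_item_json` are hypothetical helper names. Note that `read()` returns `bytes` in Python 3, and whether `wikiconv.convert_wikidata` expects `bytes` or `str` is not shown in this diff, so any decoding step is left to the caller.

```python
import urllib.request


def entity_url(qid):
    """Build the Wikidata EntityData URL used in the example above."""
    return "https://www.wikidata.org/wiki/Special:EntityData/" + qid + ".json"


def strip_envelope(payload, qid):
    """Drop the {"entities":{"<qid>": ... }} wrapper around the item JSON."""
    return payload[len(qid) + 16:-2]


def fetch_item_json(qid):
    """Download the raw item JSON as bytes (requires network access)."""
    with urllib.request.urlopen(entity_url(qid)) as response:
        return strip_envelope(response.read(), qid)
```

The slice works the same on `bytes` and `str` as long as the item id is plain ASCII, which is why `strip_envelope` does not need to decode first.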
2 changes: 1 addition & 1 deletion doc/guide/training.md
Expand Up @@ -195,7 +195,7 @@ the same script to create the commons store behind the scenes. But we mention
this here in case one wishes to inspect the automatically created commons.

```shell
python sling/nlp/parser/tools/commons_from_corpora.py \
python3 sling/nlp/parser/tools/commons_from_corpora.py \
--input=<path to train.rec>,<path to dev.rec>,<any other rec files> \
--output=<path where commons will be written>
```
1 change: 1 addition & 0 deletions python/__init__.py
@@ -1,4 +1,5 @@
import sling.pysling as api

from sling.log import *
from sling.nlp.document import *
from sling.nlp.parser import *
2 changes: 1 addition & 1 deletion python/flags.py
Expand Up @@ -15,7 +15,7 @@
"""Command-line flags"""

import argparse
import pysling as api
import sling.pysling as api

# Command line flag arguments.
arg = argparse.Namespace()
Expand Down
2 changes: 1 addition & 1 deletion python/log.py
Expand Up @@ -16,7 +16,7 @@

import inspect
import os
import pysling as api
import sling.pysling as api

INFO = 0
WARNING = 1
Expand Down
4 changes: 2 additions & 2 deletions python/myelin/__init__.py
@@ -1,7 +1,7 @@
import sling.pysling as api

from builder import *
from flow import *
from .builder import *
from .flow import *

Compiler=api.Compiler
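
The change from `from builder import *` to `from .builder import *` is needed because Python 3 dropped implicit relative imports: a bare `import builder` inside a package is now treated as an absolute import and no longer finds the sibling module. A minimal sketch, using a throwaway package named `demo_pkg` (hypothetical, created in a temporary directory) to show the explicit form working:

```python
import importlib
import os
import sys
import tempfile

# Create a throwaway package: demo_pkg/__init__.py and demo_pkg/builder.py.
root = tempfile.mkdtemp()
pkg_dir = os.path.join(root, "demo_pkg")
os.mkdir(pkg_dir)

with open(os.path.join(pkg_dir, "builder.py"), "w") as f:
    f.write("def make():\n    return 'built'\n")

with open(os.path.join(pkg_dir, "__init__.py"), "w") as f:
    # Explicit relative import: resolved against the enclosing package.
    f.write("from .builder import *\n")

# Make the temporary directory importable and load the package.
sys.path.insert(0, root)
importlib.invalidate_caches()
demo_pkg = importlib.import_module("demo_pkg")

result = demo_pkg.make()
```

The same reasoning applies to the `import sling.pysling as api` changes in `flags.py` and `log.py`, which spell out the full package path instead of relying on the importing module's location.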

9 changes: 3 additions & 6 deletions python/myelin/builder.py
Expand Up @@ -15,10 +15,7 @@

"""Myelin function builder and expression evaluator."""

import flow
from flow import Variable
from flow import Function
from flow import Flow
from .flow import set_builder_factory, Variable

DT_FLOAT32 = "float32"
DT_FLOAT64 = "float64"
Expand Down Expand Up @@ -160,7 +157,7 @@ def split(self, x, splits, axis=0, name=None):
shape = x.shape[:]
shape[axis] = x.shape[axis] / splits
results = []
for n in xrange(splits):
for n in range(splits):
o = self.var(op.name + ":" + str(n), x.type, shape)
op.add_output(o)
results.append(o)
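
One point worth checking when porting code like the `split` context lines above: in Python 3 the `/` operator always performs true division, so `x.shape[axis] / splits` yields a float even when the sizes divide evenly, whereas Python 2 returned an int for int operands. A minimal sketch of the difference, with `split_shape` as a hypothetical helper (not part of Myelin) that uses floor division to keep dimensions integral:

```python
def split_shape(shape, splits, axis=0):
    """Return the shape of each piece when `shape` is split evenly along `axis`."""
    if shape[axis] % splits != 0:
        raise ValueError("dimension is not divisible by the number of splits")
    piece = shape[:]                     # copy, as the original code does
    piece[axis] = shape[axis] // splits  # floor division keeps the size an int
    return piece


true_div = 64 / 4    # float in Python 3
floor_div = 64 // 4  # int
piece = split_shape([1, 64], 4, axis=1)
```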
Expand Down Expand Up @@ -379,5 +376,5 @@ def rank(self, x, name=None):
def builder_factory(flow, name):
return Builder(flow, name)

flow.builder_factory = builder_factory
set_builder_factory(builder_factory)
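
With `import flow` gone, `builder.py` no longer has a name bound to the module object, so it cannot assign `flow.builder_factory` directly and goes through a setter instead. The body of `set_builder_factory` is not part of this diff; the following is a sketch of how such a hook is commonly written, with `new_builder` and the stand-in `Builder` class as illustrative assumptions.

```python
# Stand-in for a module like flow.py: holds a factory that another module registers.
_builder_factory = None


def set_builder_factory(factory):
    """Register the callable used to create builders."""
    global _builder_factory
    _builder_factory = factory


def new_builder(flow, name):
    """Create a builder through the registered factory (hypothetical helper)."""
    if _builder_factory is None:
        raise RuntimeError("no builder factory registered")
    return _builder_factory(flow, name)


# Stand-in for a module like builder.py: defines the class and registers the factory.
class Builder:
    def __init__(self, flow, name):
        self.flow = flow
        self.name = name


def builder_factory(flow, name):
    return Builder(flow, name)


set_builder_factory(builder_factory)

b = new_builder("my_flow", "f")
```

A setter is the safer choice here because `from .flow import builder_factory` followed by an assignment would only rebind the name inside `builder.py` and leave the variable in `flow.py` untouched.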
