MLE-1053 rebase asapp fixes #4

Open
wants to merge 29 commits into
base: ASAPP-fixes
Changes from 1 commit
29 commits
c0e7a13
merge computeSubwords functions
Celebio Sep 13, 2018
711f513
added FAQ on how to get reproducible results (#633)
Sep 13, 2018
a5d22ab
Fix broken link (#590)
EmilStenstrom Sep 13, 2018
8e68462
Conforming to Facebook c++ style
Celebio Oct 24, 2018
be1e597
Compute precision/recall for each label
Celebio Oct 24, 2018
25c3994
fixing python binding for `predict` function
Celebio Oct 26, 2018
6efad35
duplicate import removed
edenbaus Nov 2, 2018
58fe650
fixing missing include in productquantizer.cc that is causing compila…
Celebio Nov 2, 2018
2e52f53
Refactor model testing and metrics code. (#672)
Nov 6, 2018
4a3b5af
meter class refactoring for per-label stats, some function deprecatio…
Celebio Nov 6, 2018
d759dd1
adding python binding for `test-label`
Celebio Nov 6, 2018
0ddcd5f
adding coverage option for Makefile and setup.py
Celebio Nov 6, 2018
41a0f39
putting back the usage of vector to loop in C++ in multiline prediction
Celebio Nov 6, 2018
c180783
fix circleci errors
Celebio Nov 7, 2018
5c229ab
Fixed typos at readme.md (#662)
schneiderl Nov 8, 2018
ead7911
fix support for older C++11 compilers for python bindings
Celebio Nov 20, 2018
4aee63d
Add circleci build badges to the README.md
Celebio Nov 21, 2018
256032b
remove printing functions from fasttext class
Celebio Nov 23, 2018
b8022b5
python install, a more robust pybind11 include
Celebio Nov 27, 2018
a84a6a4
add argument names in fasttext.h
Celebio Nov 27, 2018
71b4101
Normalize buffer vector in analogy queries
Celebio Nov 27, 2018
8850c51
One-vs-all cross-entropy loss
Celebio Nov 27, 2018
7deac6d
adding ova loss option to python bindings
Celebio Dec 4, 2018
501b9b1
Better default for number of threads
whiletruelearn Dec 4, 2018
7842495
Re-licensing fasttext to MIT
Dec 18, 2018
3c4a3ea
footer language : default to EN (#581)
Dec 20, 2018
67e8950
set version to have an ASAPP suffix, add Cython to install_requires
fwph May 22, 2018
f74aad6
bump version after publish script change
cdfox-asapp Sep 7, 2018
b7fa4e7
Update setup.py
cdfox-asapp Oct 18, 2018
Re-licensing fasttext to MIT
Summary: Re-licensing fastText to MIT

Reviewed By: piotr-bojanowski

Differential Revision: D13415080

fbshipit-source-id: 6708849531fe7559cde273a3024660bc8b3b3750
Edouard Grave authored and facebook-github-bot committed Dec 18, 2018
commit 7842495a4d64c7a3bb4339d45d6e64321d002ed8
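The same header edit repeats in most of the files below: the three-line BSD notice becomes a two-line MIT notice, which is where the recurring "2 additions & 3 deletions" comes from. A minimal sketch of that substitution follows. It assumes the `# ` comment prefix seen in the scripts in this diff; the function name `relicense_header` is hypothetical and is not part of the commit.

```python
# Old three-line notice and new two-line notice, copied from the hunks below.
# Assumption: the file uses a "# " comment prefix, as the shell/Python/CMake
# files in this diff do. Other comment styles are not handled here.
OLD_NOTICE = (
    "# This source code is licensed under the BSD-style license found in the\n"
    "# LICENSE file in the root directory of this source tree. An additional grant\n"
    "# of patent rights can be found in the PATENTS file in the same directory.\n"
)
NEW_NOTICE = (
    "# This source code is licensed under the MIT license found in the\n"
    "# LICENSE file in the root directory of this source tree.\n"
)


def relicense_header(text):
    """Return `text` with the first BSD notice swapped for the MIT notice.

    Text that does not contain the old notice is returned unchanged, so the
    function is safe to run twice on the same file.
    """
    return text.replace(OLD_NOTICE, NEW_NOTICE, 1)
```

Applying this to a file's contents and writing the result back would reproduce the per-file header change; the LICENSE, README.md, CONTRIBUTING.md and PATENTS changes in this commit are separate edits and are not covered by the sketch.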
5 changes: 2 additions & 3 deletions .circleci/cmake_test.sh
Original file line number Diff line number Diff line change
@@ -3,9 +3,8 @@
# Copyright (c) 2016-present, Facebook, Inc.
# All rights reserved.
#
# This source code is licensed under the BSD-style license found in the
# LICENSE file in the root directory of this source tree. An additional grant
# of patent rights can be found in the PATENTS file in the same directory.
# This source code is licensed under the MIT license found in the
# LICENSE file in the root directory of this source tree.
#

RESULTDIR=result
5 changes: 2 additions & 3 deletions .circleci/config.yml
@@ -5,9 +5,8 @@
# Copyright (c) 2016-present, Facebook, Inc.
# All rights reserved.
#
# This source code is licensed under the BSD-style license found in the
# LICENSE file in the root directory of this source tree. An additional grant
# of patent rights can be found in the PATENTS file in the same directory.
# This source code is licensed under the MIT license found in the
# LICENSE file in the root directory of this source tree.
#

# Maybe one day this will work
5 changes: 2 additions & 3 deletions .circleci/gcc_test.sh
@@ -3,9 +3,8 @@
# Copyright (c) 2016-present, Facebook, Inc.
# All rights reserved.
#
# This source code is licensed under the BSD-style license found in the
# LICENSE file in the root directory of this source tree. An additional grant
# of patent rights can be found in the PATENTS file in the same directory.
# This source code is licensed under the MIT license found in the
# LICENSE file in the root directory of this source tree.
#

RESULTDIR=result
5 changes: 2 additions & 3 deletions .circleci/pip_test.sh
@@ -3,9 +3,8 @@
# Copyright (c) 2016-present, Facebook, Inc.
# All rights reserved.
#
# This source code is licensed under the BSD-style license found in the
# LICENSE file in the root directory of this source tree. An additional grant
# of patent rights can be found in the PATENTS file in the same directory.
# This source code is licensed under the MIT license found in the
# LICENSE file in the root directory of this source tree.
#

sudo pip install --index-url https://test.pypi.org/simple/ fasttext
5 changes: 2 additions & 3 deletions .circleci/pull_data.sh
@@ -3,9 +3,8 @@
# Copyright (c) 2016-present, Facebook, Inc.
# All rights reserved.
#
# This source code is licensed under the BSD-style license found in the
# LICENSE file in the root directory of this source tree. An additional grant
# of patent rights can be found in the PATENTS file in the same directory.
# This source code is licensed under the MIT license found in the
# LICENSE file in the root directory of this source tree.
#

myshuf() {
5 changes: 2 additions & 3 deletions .circleci/python_test.sh
@@ -3,9 +3,8 @@
# Copyright (c) 2016-present, Facebook, Inc.
# All rights reserved.
#
# This source code is licensed under the BSD-style license found in the
# LICENSE file in the root directory of this source tree. An additional grant
# of patent rights can be found in the PATENTS file in the same directory.
# This source code is licensed under the MIT license found in the
# LICENSE file in the root directory of this source tree.
#

sudo pip install .
5 changes: 2 additions & 3 deletions .circleci/run_locally.sh
@@ -3,9 +3,8 @@
# Copyright (c) 2016-present, Facebook, Inc.
# All rights reserved.
#
# This source code is licensed under the BSD-style license found in the
# LICENSE file in the root directory of this source tree. An additional grant
# of patent rights can be found in the PATENTS file in the same directory.
# This source code is licensed under the MIT license found in the
# LICENSE file in the root directory of this source tree.
#

# This script illustrates how to run the build tests locally
5 changes: 2 additions & 3 deletions .circleci/setup_circleimg.sh
@@ -3,9 +3,8 @@
# Copyright (c) 2016-present, Facebook, Inc.
# All rights reserved.
#
# This source code is licensed under the BSD-style license found in the
# LICENSE file in the root directory of this source tree. An additional grant
# of patent rights can be found in the PATENTS file in the same directory.
# This source code is licensed under the MIT license found in the
# LICENSE file in the root directory of this source tree.
#

sudo apt-get update
5 changes: 2 additions & 3 deletions .circleci/setup_debian.sh
@@ -3,9 +3,8 @@
# Copyright (c) 2016-present, Facebook, Inc.
# All rights reserved.
#
# This source code is licensed under the BSD-style license found in the
# LICENSE file in the root directory of this source tree. An additional grant
# of patent rights can be found in the PATENTS file in the same directory.
# This source code is licensed under the MIT license found in the
# LICENSE file in the root directory of this source tree.
#

apt-get update
5 changes: 2 additions & 3 deletions CMakeLists.txt
@@ -2,9 +2,8 @@
# Copyright (c) 2016-present, Facebook, Inc.
# All rights reserved.
#
# This source code is licensed under the BSD-style license found in the
# LICENSE file in the root directory of this source tree. An additional grant
# of patent rights can be found in the PATENTS file in the same directory.
# This source code is licensed under the MIT license found in the
# LICENSE file in the root directory of this source tree.
#

cmake_minimum_required(VERSION 2.8.9)
2 changes: 1 addition & 1 deletion CONTRIBUTING.md
@@ -29,4 +29,4 @@ to do this once to work on any of Facebook's open source projects.
Complete your CLA here: <https://code.facebook.com/cla>

## License
By contributing to fastText, you agree that your contributions will be licensed under its BSD license.
By contributing to fastText, you agree that your contributions will be licensed under its MIT license.
43 changes: 17 additions & 26 deletions LICENSE
@@ -1,30 +1,21 @@
BSD License
MIT License

For fastText software
Copyright (c) 2016-present, Facebook, Inc.

Copyright (c) 2016-present, Facebook, Inc. All rights reserved.
Permission is hereby granted, free of charge, to any person obtaining a copy
of this software and associated documentation files (the "Software"), to deal
in the Software without restriction, including without limitation the rights
to use, copy, modify, merge, publish, distribute, sublicense, and/or sell
copies of the Software, and to permit persons to whom the Software is
furnished to do so, subject to the following conditions:

Redistribution and use in source and binary forms, with or without modification,
are permitted provided that the following conditions are met:
The above copyright notice and this permission notice shall be included in all
copies or substantial portions of the Software.

* Redistributions of source code must retain the above copyright notice, this
list of conditions and the following disclaimer.

* Redistributions in binary form must reproduce the above copyright notice,
this list of conditions and the following disclaimer in the documentation
and/or other materials provided with the distribution.

* Neither the name Facebook nor the names of its contributors may be used to
endorse or promote products derived from this software without specific
prior written permission.

THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS AND CONTRIBUTORS "AS IS" AND
ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT LIMITED TO, THE IMPLIED
WARRANTIES OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE ARE
DISCLAIMED. IN NO EVENT SHALL THE COPYRIGHT HOLDER OR CONTRIBUTORS BE LIABLE FOR
ANY DIRECT, INDIRECT, INCIDENTAL, SPECIAL, EXEMPLARY, OR CONSEQUENTIAL DAMAGES
(INCLUDING, BUT NOT LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS OR SERVICES;
LOSS OF USE, DATA, OR PROFITS; OR BUSINESS INTERRUPTION) HOWEVER CAUSED AND ON
ANY THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY, OR TORT
(INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE OF THIS
SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE.
THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR
IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY,
FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE
AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER
LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM,
OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE
SOFTWARE.
5 changes: 2 additions & 3 deletions Makefile
@@ -2,9 +2,8 @@
# Copyright (c) 2016-present, Facebook, Inc.
# All rights reserved.
#
# This source code is licensed under the BSD-style license found in the
# LICENSE file in the root directory of this source tree. An additional grant
# of patent rights can be found in the PATENTS file in the same directory.
# This source code is licensed under the MIT license found in the
# LICENSE file in the root directory of this source tree.
#

CXX = c++
33 changes: 0 additions & 33 deletions PATENTS

This file was deleted.

2 changes: 1 addition & 1 deletion README.md
@@ -336,4 +336,4 @@ See the CONTRIBUTING file for information about how to help out.

## License

fastText is BSD-licensed. We also provide an additional patent grant.
fastText is MIT-licensed.
5 changes: 2 additions & 3 deletions classification-example.sh
@@ -3,9 +3,8 @@
# Copyright (c) 2016-present, Facebook, Inc.
# All rights reserved.
#
# This source code is licensed under the BSD-style license found in the
# LICENSE file in the root directory of this source tree. An additional grant
# of patent rights can be found in the PATENTS file in the same directory.
# This source code is licensed under the MIT license found in the
# LICENSE file in the root directory of this source tree.
#

myshuf() {
7 changes: 3 additions & 4 deletions classification-results.sh
@@ -3,12 +3,11 @@
# Copyright (c) 2016-present, Facebook, Inc.
# All rights reserved.
#
# This source code is licensed under the BSD-style license found in the
# LICENSE file in the root directory of this source tree. An additional grant
# of patent rights can be found in the PATENTS file in the same directory.
# This source code is licensed under the MIT license found in the
# LICENSE file in the root directory of this source tree.
#

# This script produces the results from Table 1 in the following paper:
# This script produces the results from Table 1 in the following paper:
# Bag of Tricks for Efficient Text Classification, arXiv 1607.01759, 2016

myshuf() {
5 changes: 2 additions & 3 deletions eval.py
@@ -4,9 +4,8 @@
# Copyright (c) 2016-present, Facebook, Inc.
# All rights reserved.
#
# This source code is licensed under the BSD-style license found in the
# LICENSE file in the root directory of this source tree. An additional grant
# of patent rights can be found in the PATENTS file in the same directory.
# This source code is licensed under the MIT license found in the
# LICENSE file in the root directory of this source tree.
#

from __future__ import absolute_import
9 changes: 4 additions & 5 deletions get-wikimedia.sh
@@ -3,9 +3,8 @@
# Copyright (c) 2016-present, Facebook, Inc.
# All rights reserved.
#
# This source code is licensed under the BSD-style license found in the
# LICENSE file in the root directory of this source tree. An additional grant
# of patent rights can be found in the PATENTS file in the same directory.
# This source code is licensed under the MIT license found in the
# LICENSE file in the root directory of this source tree.
#

set -e
@@ -30,7 +29,7 @@ read -r -p "Choose a language (e.g. en, bh, fr, etc.): " choice
LANG="$choice"
echo "Chosen language: ""$LANG"
read -r -p "Continue to download (WARNING: This might be big and can take a long time!)(y/n)? " choice
case "$choice" in
case "$choice" in
y|Y ) echo "Starting download...";;
n|N ) echo "Exiting";exit 1;;
* ) echo "Invalid answer";exit 1;;
@@ -77,4 +76,4 @@ while (<>) {
print $_;
}
}
' | normalize_text | awk '{if (NF>1) print;}' | tr -s " " | shuf > "${ROOT}"/wiki."${LANG}".txt
' | normalize_text | awk '{if (NF>1) print;}' | tr -s " " | shuf > "${ROOT}"/wiki."${LANG}".txt
7 changes: 3 additions & 4 deletions python/README.md
@@ -63,9 +63,8 @@ DESCRIPTION
# Copyright (c) 2017-present, Facebook, Inc.
# All rights reserved.
#
# This source code is licensed under the BSD-style license found in the
# LICENSE file in the root directory of this source tree. An additional grant
# of patent rights can be found in the PATENTS file in the same directory.
# This source code is licensed under the MIT license found in the
# LICENSE file in the root directory of this source tree.
FUNCTIONS
load_model(path)
@@ -78,7 +77,7 @@ FUNCTIONS

## IMPORTANT: Preprocessing data / enconding conventions

In general it is important to properly preprocess your data. In particular our example scripts in the [root folder](https://github.com/facebookresearch/fastText) do this.
In general it is important to properly preprocess your data. In particular our example scripts in the [root folder](https://github.com/facebookresearch/fastText) do this.

fastText assumes UTF-8 encoded text. All text must be [unicode for Python2](https://docs.python.org/2/library/functions.html#unicode) and [str for Python3](https://docs.python.org/3.5/library/stdtypes.html#textseq). The passed text will be [encoded as UTF-8 by pybind11](https://pybind11.readthedocs.io/en/master/advanced/cast/strings.html?highlight=utf-8#strings-bytes-and-unicode-conversions) before passed to the fastText C++ library. This means it is important to use UTF-8 encoded text when building a model. On Unix-like systems you can convert text using [iconv](https://en.wikipedia.org/wiki/Iconv).
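
As a rough Python counterpart to the `iconv` step mentioned above, the following sketch re-encodes a text file to UTF-8 before it is used for training. The helper name `to_utf8` and the default source encoding `latin-1` are assumptions for illustration only; you need to know the actual encoding of your data, and this helper is not part of the fastText package.

```python
import io


def to_utf8(src_path, dst_path, src_encoding="latin-1"):
    """Re-encode the text file at src_path as UTF-8 and write it to dst_path.

    src_encoding is an assumption about the input; decoding fails with
    UnicodeDecodeError if the file is not valid in that encoding.
    """
    # newline="" leaves line endings untouched on every platform.
    with io.open(src_path, "r", encoding=src_encoding, newline="") as src:
        with io.open(dst_path, "w", encoding="utf-8", newline="") as dst:
            for line in src:
                dst.write(line)
```

`io.open` is used so the same code runs on Python 2 and Python 3, matching the two versions the paragraph above refers to.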

5 changes: 2 additions & 3 deletions python/README.rst
@@ -70,9 +70,8 @@ For example
# Copyright (c) 2017-present, Facebook, Inc.
# All rights reserved.
#
# This source code is licensed under the BSD-style license found in the
# LICENSE file in the root directory of this source tree. An additional grant
# of patent rights can be found in the PATENTS file in the same directory.
# This source code is licensed under the MIT license found in the
# LICENSE file in the root directory of this source tree.

FUNCTIONS
load_model(path)
7 changes: 3 additions & 4 deletions python/benchmarks/get_word_vector.py
@@ -1,9 +1,8 @@
# Copyright (c) 2017-present, Facebook, Inc.
# All rights reserved.
#
# This source code is licensed under the BSD-style license found in the
# LICENSE file in the root directory of this source tree. An additional grant
# of patent rights can be found in the PATENTS file in the same directory.
# This source code is licensed under the MIT license found in the
# LICENSE file in the root directory of this source tree.

from __future__ import absolute_import
from __future__ import division
@@ -33,7 +32,7 @@ def get_word_vector(data, model):
t3 = time.time()
i = 0
for t in tokens:
vec = f.get_word_vector(t)
f.get_word_vector(t)
i += 1
if i % 10000 == 0:
sys.stderr.write("\ri: " + str(float(i / len(tokens))))
5 changes: 2 additions & 3 deletions python/doc/examples/FastTextEmbeddingBag.py
@@ -3,9 +3,8 @@
# Copyright (c) 2017-present, Facebook, Inc.
# All rights reserved.
#
# This source code is licensed under the BSD-style license found in the
# LICENSE file in the root directory of this source tree. An additional grant
# of patent rights can be found in the PATENTS file in the same directory.
# This source code is licensed under the MIT license found in the
# LICENSE file in the root directory of this source tree.

# NOTE: This requires PyTorch! We do not provide installation scripts to install PyTorch.
# It is up to you to install this dependency if you want to execute this example.
5 changes: 2 additions & 3 deletions python/doc/examples/bin_to_vec.py
@@ -3,9 +3,8 @@
# Copyright (c) 2017-present, Facebook, Inc.
# All rights reserved.
#
# This source code is licensed under the BSD-style license found in the
# LICENSE file in the root directory of this source tree. An additional grant
# of patent rights can be found in the PATENTS file in the same directory.
# This source code is licensed under the MIT license found in the
# LICENSE file in the root directory of this source tree.

from __future__ import absolute_import
from __future__ import division
5 changes: 2 additions & 3 deletions python/doc/examples/compute_accuracy.py
@@ -3,9 +3,8 @@
# Copyright (c) 2017-present, Facebook, Inc.
# All rights reserved.
#
# This source code is licensed under the BSD-style license found in the
# LICENSE file in the root directory of this source tree. An additional grant
# of patent rights can be found in the PATENTS file in the same directory.
# This source code is licensed under the MIT license found in the
# LICENSE file in the root directory of this source tree.

from __future__ import absolute_import
from __future__ import division
5 changes: 2 additions & 3 deletions python/doc/examples/get_vocab.py
@@ -3,9 +3,8 @@
# Copyright (c) 2017-present, Facebook, Inc.
# All rights reserved.
#
# This source code is licensed under the BSD-style license found in the
# LICENSE file in the root directory of this source tree. An additional grant
# of patent rights can be found in the PATENTS file in the same directory.
# This source code is licensed under the MIT license found in the
# LICENSE file in the root directory of this source tree.

from __future__ import absolute_import
from __future__ import division