Label_ids "None" upon initiating training of Adaptive Model #92
Comments
Hey Feline, it sounds to me as if there might be some labels missing or malformed in the input. Could you post one full sample output here so we can check?
```
09/18/2019 15:59:47 - INFO - farm.data_handler.processor - 3 examples train set
09/18/2019 16:02:28 - INFO - farm.data_handler.processor - 3 examples train set
```

I don't get the same prints for the "dev" subset. Is this sufficient for you to determine whether missing or malformed labels could be the issue here?
Well, there are no labels passed on to FARM : )

A side note: your texts seem to be very long. Language models have a maximum number of subwords you can put in, and I guess you even set it to a small value to make it work on your machine: I see in your code you set max_seq_len to 50, so the language model will only take in a tiny part of each text. Please also have a look at how your texts are tokenized. The tokenization does not look good, since there are a lot of words the German BERT does not know.
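A quick way to inspect the tokenization, as a minimal sketch (the sample string is a placeholder for one of your documents; the tokenizer wrapper is the one used elsewhere in this thread):

```python
from farm.modeling.tokenization import BertTokenizer

# Load the same German BERT tokenizer that is used for training.
tokenizer = BertTokenizer.from_pretrained(
    pretrained_model_name_or_path="bert-base-german-cased",
    do_lower_case=False)

# Placeholder text: substitute one of your actual documents here.
tokens = tokenizer.tokenize("Hier steht einer Ihrer Texte")
print(tokens)       # many "##" fragments or [UNK] tokens signal poor vocabulary coverage
print(len(tokens))  # everything beyond max_seq_len (50 in your setup) gets cut off
```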
Thanks for your suggestions! max_seq_len is set to 50 for testing purposes only. I have trained a convolutional neural net with tokens as input and a pretrained word-embedding layer against a baseline model with tf-idf features and obtained better performance. Thanks to your continuous support, I'm now glad to be able to use your pretrained BERT model and compare its performance with the CNN.
Ah ok, nice! How about you use the settings that we supplied in our example and update your FARM version to the current master? Based on the code you posted earlier, these TextClassificationProcessor parameters should work:
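As a minimal sketch (data_dir, train_filename, label values and dev_split are placeholders for your setup; the keyword names follow the doc_classification example):

```python
# Sketch only: data_dir, train_filename, label values and dev_split
# are placeholders, not values confirmed for this user's data.
label_list = ["label_a", "label_b"]
processor = TextClassificationProcessor(tokenizer=tokenizer,
                                        max_seq_len=128,
                                        data_dir="../data",
                                        labels=label_list,
                                        metric="f1_macro",
                                        train_filename="data_train.tsv",
                                        dev_split=0.1)
```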
Please verify that the data in data_train.tsv is actually tab-separated and that your texts do not contain tabs. Can you read the tsv files with pandas without errors? Hope that works. I still don't believe this use case is well suited for language models. You may have to clean the text first: it seems to consist of web pages with advertisements or header information surrounding the actual text. You should think about extracting only the text that is of interest and using that in FARM.
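For the pandas check, a minimal sketch (assuming the file name from your earlier messages):

```python
import pandas as pd

# A well-formed tsv parses without a ParserError and yields
# exactly one text column and one label column.
df = pd.read_csv("data_train.tsv", sep="\t")
print(df.columns)
print(df.isnull().sum())  # should be 0 everywhere: no empty cells
```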
I cleaned the texts to the extent that multiple whitespaces and tabs are removed before writing the .tsv files, and I made sure that there aren't any empty cells. I will certainly follow up on your advice to remove such website-specific words. Thank you. After updating FARM to the current master, I'm running into a new error, both when using the exact same code and settings you supply in your example ("doc_classification.py") and when using the parameters you suggested on my own data.
Wow, your error messages are always super cryptic : ) It looks like an environment issue to me. Can you please create a fresh install in a virtual environment? Then run one of our examples without adjustments to verify the installation, and then adapt FARM to your use case with the code I posted earlier.
I have uninstalled and reinstalled FARM in the virtual environment and I'm now running it from your master at commit f6734cb88cb29a872cbe6dcc2ba7cbf81855cd50.

```python
# fmt: off
# NOTE: the argument lists inside the calls below were truncated when this
# post was copied; they have been restored from the stock doc_classification
# example and may differ in detail from what was originally posted.
import logging

from farm.data_handler.data_silo import DataSilo
from farm.data_handler.processor import TextClassificationProcessor
from farm.modeling.optimization import initialize_optimizer
from farm.infer import Inferencer
from farm.modeling.adaptive_model import AdaptiveModel
from farm.modeling.language_model import Bert
from farm.modeling.prediction_head import TextClassificationHead
from farm.modeling.tokenization import BertTokenizer
from farm.train import Trainer
from farm.utils import set_all_seeds, MLFlowLogger, initialize_device_settings

logging.basicConfig(
    format="%(asctime)s - %(levelname)s - %(name)s -   %(message)s",
    datefmt="%m/%d/%Y %H:%M:%S",
    level=logging.INFO)

ml_logger = MLFlowLogger(tracking_uri="https://public-mlflow.deepset.ai/")

##########################
########## Settings
##########################
set_all_seeds(seed=42)
device, n_gpu = initialize_device_settings(use_cuda=True)
n_epochs = 1
batch_size = 32
evaluate_every = 100
lang_model = "bert-base-german-cased"

# 1. Create a tokenizer
tokenizer = BertTokenizer.from_pretrained(
    pretrained_model_name_or_path=lang_model,
    do_lower_case=False)

# 2. Create a DataProcessor that handles all the conversion from raw text
#    into a pytorch Dataset. Here we load GermEval 2018 data.
label_list = ["OTHER", "OFFENSE"]
processor = TextClassificationProcessor(tokenizer=tokenizer,
                                        max_seq_len=128,
                                        data_dir="../data/germeval18",
                                        labels=label_list,
                                        metric="f1_macro")

# 3. Create a DataSilo that loads several datasets (train/dev/test), provides
#    DataLoaders for them and calculates a few descriptive statistics of our datasets
data_silo = DataSilo(
    processor=processor,
    batch_size=batch_size)

# 4. Create an AdaptiveModel
# a) which consists of a pretrained language model as a basis
language_model = Bert.load(lang_model)
# b) and a prediction head on top that is suited for our task => text classification
prediction_head = TextClassificationHead(
    layer_dims=[768, len(processor.tasks["text_classification"]["label_list"])])

model = AdaptiveModel(
    language_model=language_model,
    prediction_heads=[prediction_head],
    embeds_dropout_prob=0.1,
    lm_output_types=["per_sequence"],
    device=device)

# 5. Create an optimizer
optimizer, warmup_linear = initialize_optimizer(
    model=model,
    learning_rate=2e-5,
    warmup_proportion=0.1,
    n_batches=len(data_silo.loaders["train"]),
    n_epochs=n_epochs)

# 6. Feed everything to the Trainer, which takes care of growing our model
#    into a powerful plant and evaluates it from time to time
trainer = Trainer(
    optimizer=optimizer,
    data_silo=data_silo,
    epochs=n_epochs,
    n_gpu=n_gpu,
    warmup_linear=warmup_linear,
    evaluate_every=evaluate_every,
    device=device)

# 7. Let it grow
model = trainer.train(model)

# 8. Hooray! You have a model. Store it:
save_dir = "saved_models/bert-german-doc-tutorial"
model.save(save_dir)
processor.save(save_dir)

# 9. Load it & harvest your fruits (inference)
basic_texts = [
    {"text": "Schartau sagte dem Tagesspiegel, dass Fischer ein Idiot sei"},
    {"text": "Martin Müller spielt Handball in Berlin"},
]
model = Inferencer.load(save_dir)
result = model.run_inference(dicts=basic_texts)
print(result)
# fmt: on
```
Wow, ok. Then it is definitely an environment issue, maybe something specific to your Linux Mint distribution or PyCharm-related.
Yes, I have tried that as well and unfortunately the error persists. It keeps throwing this error message (a printout of a sample is included in case you can work something out based on that):

```
09/20/2019 15:49:04 - INFO - farm.data_handler.processor -
 (/||/ ||
ID: infer - 63-0
torch_shm_manager: error while loading shared libraries: libnvToolsExt.so.1: cannot open shared object file: No such file or directory

Process finished with exit code 1
```

Thank you very much for your help.
A good thing I can read from your printouts: the data processing seems to work now : ) As for the multiprocessing torch_shm_manager error: I have been able to reproduce it with the newest PyTorch version, 1.2.0. It is resolved when using PyTorch 1.1.0. Could you verify which version you are on and downgrade to 1.1.0? Thanks a lot for reporting this bug. I will create a separate issue and have our data engineer fix it.
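To check the installed version and downgrade, something like this should do (sketch):

```python
import torch

print(torch.__version__)  # if this prints 1.2.0, downgrade from the shell:
# pip uninstall torch
# pip install torch==1.1.0
```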
Fantastic, this solved it! Wishing you a well-deserved happy weekend :)
Nice, thanks a lot for the positive feedback. I will close this issue now. |
When calling the train method on the adaptive model, an error is thrown while the losses are collected from the prediction heads (logits_to_loss_per_head in adaptive_model): when logits and labels are combined in the prediction head (logits_to_loss) to compute the per_sample_loss, the variable "label_ids" is None and the "view()" function cannot be called on it. "label_ids" becomes None because the call that assigns its value, "kwargs.get(self.label_tensor_name)", returns None.
However, the documentation does not reveal whether and where "label_ids", or rather "label_tensor_name", ought to be specified.
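In essence, the failing path looks like this (a paraphrased sketch of the relevant lines in prediction_head.py, based on the traceback below, not a verbatim copy):

```python
# Inside the prediction head's logits_to_loss (paraphrased):
label_ids = kwargs.get(self.label_tensor_name)   # returns None when the batch
                                                 # carries no matching label tensor
per_sample_loss = self.loss_fct(logits, label_ids.view(-1))  # None.view(-1) raises
                                                             # the AttributeError below
```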
Error message
```
Train epoch 1/5:   0%| | 0/200 [00:00<?, ?it/s]
Traceback (most recent call last):
  File "/home/f_weise/.local/share/JetBrains/Toolbox/apps/PyCharm-P/ch-0/192.5728.105/helpers/pydev/pydevd.py", line 2060, in <module>
    main()
  File "/home/f_weise/.local/share/JetBrains/Toolbox/apps/PyCharm-P/ch-0/192.5728.105/helpers/pydev/pydevd.py", line 2054, in main
    globals = debugger.run(setup['file'], None, None, is_module)
  File "/home/f_weise/.local/share/JetBrains/Toolbox/apps/PyCharm-P/ch-0/192.5728.105/helpers/pydev/pydevd.py", line 1405, in run
    return self._exec(is_module, entry_point_fn, module_name, file, globals, locals)
  File "/home/f_weise/.local/share/JetBrains/Toolbox/apps/PyCharm-P/ch-0/192.5728.105/helpers/pydev/pydevd.py", line 1412, in _exec
    pydev_imports.execfile(file, globals, locals)  # execute the script
  File "/home/f_weise/.local/share/JetBrains/Toolbox/apps/PyCharm-P/ch-0/192.5728.105/helpers/pydev/_pydev_imps/_pydev_execfile.py", line 18, in execfile
    exec(compile(contents+"\n", file, 'exec'), glob, loc)
  File "/home/f_weise/projects/adup-watchdog/src/scripts/BertBaseCased.py", line 152, in <module>
    model_training = trainer.train(model)
  File "/home/f_weise/projects/adup-watchdog/.venv/src/farm/farm/train.py", line 154, in train
    per_sample_loss = model.logits_to_loss(logits=logits, **batch)
  File "/home/f_weise/projects/adup-watchdog/.venv/src/farm/farm/modeling/adaptive_model.py", line 129, in logits_to_loss
    all_losses = self.logits_to_loss_per_head(logits, **kwargs)
  File "/home/f_weise/projects/adup-watchdog/.venv/src/farm/farm/modeling/adaptive_model.py", line 116, in logits_to_loss_per_head
    all_losses.append(head.logits_to_loss(logits=logits_for_one_head, **kwargs))
  File "/home/f_weise/projects/adup-watchdog/.venv/src/farm/farm/modeling/prediction_head.py", line 263, in logits_to_loss
    return self.loss_fct(logits, label_ids.view(-1))
AttributeError: 'NoneType' object has no attribute 'view'
```
Expected behavior
I expected the variable "label_ids" to be a tensor with one entry per sample in the training set, such that a per_sample_loss can be calculated and training successfully begins.
To Reproduce
```python
class BertBaseCased(object):
```
System: