Language translation with Python

Sometimes you need to automate translation without calling an external API. In that case, you can check whether a pre-trained model exists on HuggingFace and use it in your script.

This is a quick guide that explains how to use AutoModelForSeq2SeqLM from the HuggingFace transformers library. In particular, I will use Helsinki-NLP/opus-mt-it-en and Helsinki-NLP/opus-mt-en-it to translate Italian to English and English to Italian.

I put all the code in this Colab notebook (it is not editable; if you want to edit it, copy it into your own Colab notebook).

One application is to run summarization on a language that the summarization model does not support, by translating the text first, as I do in this Colab notebook.
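The summarization idea can be sketched as a small pipeline: translate the text into English, summarize it there, then translate the summary back. This is only a sketch of the idea, not the code from the notebook. The function name summarize_via_english and its three callable parameters are hypothetical; in practice they would wrap the translation models loaded later in this guide and whatever English summarization model you choose.

```python
from typing import Callable


def summarize_via_english(
    text: str,
    to_english: Callable[[str], str],
    summarize: Callable[[str], str],
    from_english: Callable[[str], str],
) -> str:
    """Summarize text in an unsupported language by pivoting through English.

    All three callables are placeholders (assumptions): plug in your own
    translation and summarization functions.
    """
    english_text = to_english(text)             # e.g. Italian -> English
    english_summary = summarize(english_text)   # summarize in English
    return from_english(english_summary)        # e.g. English -> Italian
```

Keeping the three steps as plain functions makes it easy to swap models, or to try the pipeline with simple stand-in functions before downloading anything.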

  • You need Python and PyTorch.
  • Then install transformers together with sentencepiece (on Colab you sometimes have to restart the runtime after the installation):
pip install transformers transformers[sentencepiece] sentencepiece
  • (You may need TensorFlow instead of PyTorch if the model you load was written in TensorFlow.)

Load model

from transformers import AutoTokenizer, AutoModelForSeq2SeqLM

it_en_tokenizer = AutoTokenizer.from_pretrained("Helsinki-NLP/opus-mt-it-en")
it_en_model = AutoModelForSeq2SeqLM.from_pretrained("Helsinki-NLP/opus-mt-it-en")

Translate a sentence

text = "Questo è un testo italiano"
encoded_input = it_en_tokenizer(text, return_tensors="pt")
output = it_en_model.generate(**encoded_input)
out_text = it_en_tokenizer.batch_decode(output, skip_special_tokens=True)
print(out_text)
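The same calls also work on several sentences at once: the tokenizer accepts a list of strings, and padding=True pads them to a common length so they fit in one tensor. Below is a minimal sketch of translating a list in small batches. The helper names (batched, translate_all, make_translate_batch) and the batch size of 8 are my own assumptions, not part of transformers.

```python
from typing import Callable, Iterator, List


def batched(items: List[str], batch_size: int) -> Iterator[List[str]]:
    """Yield consecutive slices of `items` with at most `batch_size` elements."""
    for start in range(0, len(items), batch_size):
        yield items[start:start + batch_size]


def translate_all(
    sentences: List[str],
    translate_batch: Callable[[List[str]], List[str]],
    batch_size: int = 8,  # arbitrary default, tune it to your memory
) -> List[str]:
    """Translate all sentences batch by batch, keeping the input order."""
    results: List[str] = []
    for batch in batched(sentences, batch_size):
        results.extend(translate_batch(batch))
    return results


def make_translate_batch(tokenizer, model) -> Callable[[List[str]], List[str]]:
    """Wrap a tokenizer/model pair into a function that translates a batch."""
    def translate_batch(batch: List[str]) -> List[str]:
        # padding=True aligns sentence lengths; truncation=True cuts inputs
        # that exceed the model's maximum length instead of failing.
        encoded = tokenizer(batch, return_tensors="pt", padding=True, truncation=True)
        output = model.generate(**encoded)
        return tokenizer.batch_decode(output, skip_special_tokens=True)
    return translate_batch
```

With the model loaded above, this could be used as translate_all(sentences, make_translate_batch(it_en_tokenizer, it_en_model)). Note that truncation silently drops the end of very long inputs, so long documents are better split into sentences first.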

We can do the same thing with the other model, and with any other translation model of this kind.

Load model

from transformers import AutoTokenizer, AutoModelForSeq2SeqLM

en_it_tokenizer = AutoTokenizer.from_pretrained("Helsinki-NLP/opus-mt-en-it")
en_it_model = AutoModelForSeq2SeqLM.from_pretrained("Helsinki-NLP/opus-mt-en-it")

Translate a sentence

text = "This is an Italian text"
encoded_input = en_it_tokenizer(text, return_tensors="pt")
output = en_it_model.generate(**encoded_input)
out_text = en_it_tokenizer.batch_decode(output, skip_special_tokens=True)
print(out_text)
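Since the two examples differ only in the model name, the loading step can be wrapped in a small helper. This is a sketch under one assumption: that the model you want follows the same Helsinki-NLP/opus-mt-<source>-<target> naming as the two models used here. Not every language pair is published, so check the Hub first. The function names opus_mt_name and load_translator are hypothetical.

```python
def opus_mt_name(source_lang: str, target_lang: str) -> str:
    """Build a model name following the pattern of the two models in this guide.

    Assumption: the pair exists on the Hub under this name; many do, not all.
    """
    return f"Helsinki-NLP/opus-mt-{source_lang}-{target_lang}"


def load_translator(source_lang: str, target_lang: str):
    """Download (or load from cache) the tokenizer and model for a pair."""
    # Imported here so that building a model name does not require transformers.
    from transformers import AutoTokenizer, AutoModelForSeq2SeqLM

    name = opus_mt_name(source_lang, target_lang)
    tokenizer = AutoTokenizer.from_pretrained(name)
    model = AutoModelForSeq2SeqLM.from_pretrained(name)
    return tokenizer, model
```

For example, load_translator("it", "en") would load the same Italian to English model used earlier; if the pair does not exist, from_pretrained raises an error.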

HuggingFace hosts a very large number of AI models for text, so you can use them… and you can also contribute your own!

Share your trained model so that the hours of training are not wasted: it can help the world and protect nature!

Linux user and Open Source fan. Deep learning PhD student, full stack web developer, mobile developer.