MLProject

Under Guidance of: Dr. Rajesh Kumar Mundotiya

Simultaneous Machine Translation (SiMT) aims to generate translations simultaneously with the reading of the source sentence, balancing translation quality with latency. Most SiMT models currently require training multiple models for different latency levels, thus increasing computational costs and, more importantly, limiting flexibility. The new approach is, like Mixture- of-Experts Wait-k policy, training multiple wait-k values in balance between the considerations of both latency and translation quality, leaving the determination of the optimal value of k for unseen data as an open challenge. Moreover, variability in the structure of structure between different languages makes the problem even more complicated because the application of a fixed policy becomes rather ineffective.

Base Model: The project will utilize the Mixture-of-Experts Wait-k policy as the backbone model. This policy allows each head of the multi-head attention mechanism to perform translation with different levels of latency.

Project Objectives

Develop a Dynamic Wait-k Policy that adaptively balances latency and quality in real-time.
Integrate Self-Critical Sequence Training (SCST) to optimize the quality-latency trade-off using reinforcement learning.
Evaluate translation quality using BLEU and ROUGE metrics while minimizing latency.

Workflow

a. Dataset Preparation

Data Format: JSON files with input-output sentence pairs.
Tokenization: Utilized AutoTokenizer from Hugging Face for sentence processing.
Padding and Alignment: Padded source sentences and shifted decoder input for alignment.

b. Dynamic Wait-k Policy Implementation

Utilized a flexible wait-k strategy to dynamically adjust latency based on remaining input length.
Enhanced with HMT to predict sequence likelihoods, improving token generation decisions.
Wait-K Policy Formula: $$g(t; k) = \min(k + t - 1, |Z|)$$

c. SCST Fine-Tuning and RL Integration

Reward Function: Optimized using BLEU and ROUGE metrics.
Policy Optimization: RL agent trained via policy gradients.
Advantage Calculation: Based on the difference between sampled and baseline rewards.
SCST Reward Formula: $$R(\theta) = \sum_{t=1}^T (r_t - b_t) \log P(y_t | x; \theta)$$

d. Model Architecture and Optimization

Base Model: LLaMA-7B fine-tuned with LoRA, supported by HMT for improved sequence prediction.
Optimization: Adam optimizer with cross-entropy loss.
Device Compatibility: Supports GPU (CUDA), MPS (Apple Silicon), and CPU.

e. Evaluation Metrics

BLEU Score: Measures translation quality using n-gram overlaps.
ROUGE-L Score: Assesses informativeness and coverage.
Latency: Quantified via read-write sequence length ratio.
Latency Metric (AL): $$AL = \frac{1}{\tau} \sum_{t=1}^\tau \left[g(t) - t - 1\right] \cdot \frac{|y|}{|x|}$$

Evaluation and Findings

Dynamic Wait-k Policy significantly improved latency-quality trade-offs.
SCST Fine-Tuning optimized performance through reinforcement learning.
HMT Integration enhanced real-time adaptability.
LoRA-enhanced LLaMA model ensured resource-efficient translations.
BLEU and ROUGE scores provided robust evaluation metrics.

References

Zhang, S., & Feng, Y. (2021).
Universal Simultaneous Machine Translation with Mixture-of-Experts Wait-k Policy.
Proceedings of the 2021 Conference on Empirical Methods in Natural Language Processing, pp. 7306–7317.
Available here
Gu, J., et al. (2017).
Learning to translate in real-time with neural machine translation.
Proceedings of the 15th Conference of the European Chapter of the Association for Computational Linguistics.
Available here
Grissom II, A., He, H., Boyd-Graber, J., Morgan, J., & Daumé III, H. (2014).
Don’t Until the Final Verb Wait: Reinforcement Learning for Simultaneous Machine Translation.
Proceedings of the 2014 Conference on Empirical Methods in Natural Language Processing (EMNLP), pp. 1342–1352.
Available here

Pending Tasks & Challenges

We are diligently working on implementing the project, leveraging the latest advancements in simultaneous machine translation, including the Mixture-of-Experts approach and the dynamic Wait-k policy. While we strive to achieve the best possible outcomes, we acknowledge that the code and methodology are still evolving. As we proceed, there may be adjustments to our approach based on practical challenges, insights gained during experimentation, and efforts to optimize performance. Our aim is to ensure that the final implementation aligns with the project objectives while maintaining flexibility for improvements. I also found that the code does not seem to work properly as some part of the code or requirement is missing to run the code though the paper explained everything.

Name		Name	Last commit message	Last commit date
Latest commit History 37 Commits
HMT-SiLLM_2.py		HMT-SiLLM_2.py
ML Project 2024-25 (1).pdf		ML Project 2024-25 (1).pdf
MLProject_pytorch+SCST+SiMT.ipynb		MLProject_pytorch+SCST+SiMT.ipynb
Mathematical_Implication.pdf		Mathematical_Implication.pdf
Phase2_ML.zip		Phase2_ML.zip
README.md		README.md
__init__.py		__init__.py
prompter.py		prompter.py
requirement.txt		requirement.txt
test_dataset.json		test_dataset.json

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Repository files navigation

MLProject

Under Guidance of: Dr. Rajesh Kumar Mundotiya

Project Objectives

Workflow

a. Dataset Preparation

b. Dynamic Wait-k Policy Implementation

c. SCST Fine-Tuning and RL Integration

d. Model Architecture and Optimization

e. Evaluation Metrics

Evaluation and Findings

References

Pending Tasks & Challenges

About

Releases

Packages

Contributors 3

Languages

Tanmay-IITDSAI/MLProject

Folders and files

Latest commit

History

Repository files navigation

MLProject

Under Guidance of: Dr. Rajesh Kumar Mundotiya

Project Objectives

Workflow

a. Dataset Preparation

b. Dynamic Wait-k Policy Implementation

c. SCST Fine-Tuning and RL Integration

d. Model Architecture and Optimization

e. Evaluation Metrics

Evaluation and Findings

References

Pending Tasks & Challenges

About

Topics

Resources

Stars

Watchers

Forks

Releases

Packages 0

Contributors 3

Languages

Packages