fix: remove extra char from configmap (#526)
This PR removes an extra newline/space character (probably left over from a different encoding) from the YAML. Because of that extra character, the whole config file gets serialized into the ConfigMap as a double-quoted string with `\n` escapes standing in for newlines, and the resulting ConfigMap is not readable.

Before the change:

```
k get configmap/lora-params-template -n workspace -o yaml
apiVersion: v1
data:
  training_config.yaml: "training_config:\n ModelConfig: # Configurable Parameters: https://huggingface.co/docs/transformers/v4.40.2/en/model_doc/auto#transformers.AutoModelForCausalLM.from_pretrained\n
    \ torch_dtype: \"bfloat16\"\n local_files_only: true\n device_map: \"auto\"\n\n
    \ QuantizationConfig: # Configurable Parameters: https://huggingface.co/docs/transformers/v4.40.2/en/main_classes/quantization#transformers.BitsAndBytesConfig\n
    \ load_in_4bit: false\n\n LoraConfig: # Configurable Parameters: https://huggingface.co/docs/peft/v0.8.2/en/package_reference/lora#peft.LoraConfig\n
    \ r: 8\n lora_alpha: 8\n lora_dropout: 0.0\n\n TrainingArguments: # Configurable Parameters: https://huggingface.co/docs/transformers/v4.40.2/en/main_classes/trainer#transformers.TrainingArguments\n
    \ output_dir: \"/mnt/results\"\n # num_train_epochs: <Defaults to 3, adjustable>\n
    \ ddp_find_unused_parameters: false # Default to false to prevent errors during distributed training.\n save_strategy: \"epoch\" # Default to save at end of each epoch\n per_device_train_batch_size: 1\n\n DataCollator: # Configurable Parameters: https://huggingface.co/docs/transformers/v4.40.2/en/main_classes/data_collator#transformers.DataCollatorForLanguageModeling\n
    \ mlm: true # Default setting; included to show DataCollator can be updated.\n\n
    \ DatasetConfig: # Configurable Parameters: https://github.com/Azure/kaito/blob/main/presets/tuning/text-generation/cli.py#L44\n
    \ shuffle_dataset: true\n train_test_split: 1 # Default to using all data for fine-tuning due to strong pre-trained baseline and typically limited fine-tuning data.\n # Expected Dataset format: \n"
```

After the change:

```
k get configmap/lora-params-template -n workspace -o yaml
apiVersion: v1
data:
  training_config.yaml: |
    training_config:
      ModelConfig: # Configurable Parameters: https://huggingface.co/docs/transformers/v4.40.2/en/model_doc/auto#transformers.AutoModelForCausalLM.from_pretrained
        torch_dtype: "bfloat16"
        local_files_only: true
        device_map: "auto"

      QuantizationConfig: # Configurable Parameters: https://huggingface.co/docs/transformers/v4.40.2/en/main_classes/quantization#transformers.BitsAndBytesConfig
        load_in_4bit: false

      LoraConfig: # Configurable Parameters: https://huggingface.co/docs/peft/v0.8.2/en/package_reference/lora#peft.LoraConfig
        r: 8
        lora_alpha: 8
        lora_dropout: 0.0

      TrainingArguments: # Configurable Parameters: https://huggingface.co/docs/transformers/v4.40.2/en/main_classes/trainer#transformers.TrainingArguments
        output_dir: "/mnt/results"
        # num_train_epochs: <Defaults to 3, adjustable>
        ddp_find_unused_parameters: false # Default to false to prevent errors during distributed training.
        save_strategy: "epoch" # Default to save at end of each epoch
        per_device_train_batch_size: 1

      DataCollator: # Configurable Parameters: https://huggingface.co/docs/transformers/v4.40.2/en/main_classes/data_collator#transformers.DataCollatorForLanguageModeling
        mlm: true # Default setting; included to show DataCollator can be updated.

      DatasetConfig: # Configurable Parameters: https://github.com/Azure/kaito/blob/main/presets/tuning/text-generation/cli.py#L44
        shuffle_dataset: true
        train_test_split: 1 # Default to using all data for fine-tuning due to strong pre-trained baseline and typically limited fine-tuning data.
        # {"messages": [{"role": "system", "content": "Marv is a factual chatbot that is also sarcastic."}, {"role": "user", "content": "What's the capital of France?"}, {"role": "assistant", "content": "Paris, as if everyone doesn't know that already."}]}
        # e.g. https://huggingface.co/datasets/philschmid/dolly-15k-oai-style
```
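For reference, the underlying behavior is a YAML serialization rule rather than anything Kaito-specific: an emitter may only use the readable `|` block-literal style when no line of the string ends in whitespace and the string contains no special characters; otherwise it falls back to a double-quoted scalar full of `\n` escapes. The sketch below illustrates the fallback with `gopkg.in/yaml.v3` and a trailing space as a stand-in for the stray character; both the library choice and the trailing space are assumptions for illustration, not necessarily the exact code path or character involved here.

```go
package main

import (
	"fmt"

	"gopkg.in/yaml.v3"
)

func main() {
	// Clean multi-line value: the emitter is free to use the "|" block-literal
	// style, which is what the fixed ConfigMap shows.
	clean := map[string]string{
		"training_config.yaml": "training_config:\n  ModelConfig:\n    torch_dtype: \"bfloat16\"\n",
	}

	// Same value with a trailing space before a newline (a stand-in for the
	// stray character): block style is no longer legal YAML for this string,
	// so the emitter falls back to a double-quoted, \n-escaped scalar.
	dirty := map[string]string{
		"training_config.yaml": "training_config: \n  ModelConfig:\n    torch_dtype: \"bfloat16\"\n",
	}

	out, err := yaml.Marshal(clean)
	if err != nil {
		panic(err)
	}
	fmt.Print(string(out)) // training_config.yaml: |  (readable block scalar)

	out, err = yaml.Marshal(dirty)
	if err != nil {
		panic(err)
	}
	fmt.Print(string(out)) // training_config.yaml: "training_config: \n  ModelConfig: ..."
}
```

With the clean input the marshaled output matches the readable "after" form above, while the input containing the trailing space degrades to the escaped string seen in the "before" form.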