Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Apptainer merging (#47) #48

Merged
merged 2 commits into from
May 19, 2023
Merged

Apptainer merging (#47) #48

merged 2 commits into from
May 19, 2023

Conversation

esseivaju
Copy link
Contributor

  • updated json state example

  • fixed issue where only one merge task per input file was created per call to check_mergeable_files

  • stage in queuedata from cvmfs instead of Harvester cacher

  • update for ray 2.0.0

  • create state.json with task output directory instead of expecting Harvester to create it. Fixed type in job parameter name

  • rename job file instead of deleting it

  • parse job parameters to replace input and output files

  • update for new job definition parameters

  • stop threads in case of exception

  • check if threads are alive

  • par --outputHITSFile before forwarding jobdef to pilot

  • inject eventservice flags in jobdef for pilot

  • support for merging multiple input files into the same output file, generates correct output filenames

  • taskdir from config, fixed issue with cleaner thread not having most up to date taskdir path

  • flake8

  • Updated test conf

  • Fix typo in test jobdef

  • Generate job definition with appropriate number of input / output files

  • update generated event ranges filename, pass an array to set_file_merged

  • updated queuedata file

  • remove obsolete autoConfiguration parameter

  • parse hostname output for multiple IP addresses

  • don't use bash arrays; Harvester will interpret {} as parameters to substitute

  • assign event ranges by file

  • revert queuedata file

  • fixed issue with status of event range status before update rangesID_by_file

  • install missing module

  • fixed issue where results list was incorrectly indexed

  • update actions version

  • keep pre-allocating the list

  • don't print entire merge transform command

  • account for failed ranges when checking if a file is ready to be merged

  • fixed failed events not being correctly accounted for when creating merge tasks

  • fixe evnt_range not defined for failed ranges

  • account for failed ranges in the assert

  • handle failed event ranges by flagging as failed all ranges that should be merged with it

  • only launch merge jobs if we have enough events, not counting failed events

  • updated booking of ranges in the same file of a failed range

  • missing braces

  • formatting

  • fix unit tests

  • Unpin protobuf

  • removed _system_config argument which fails with ray 2.3.0

  • removed syste-config incompatible with ray 2.3.0

  • updated harvester template script

  • fixed issues when generating failed event ranges

  • update bookkeeping correctly when receiving a failed env range from pilot

  • flake8, remove unused method

  • removed test for obsolete methods, generate correct event range ids

  • Validate job output (Validate job output #45)

  • Script to validate job output. Check for duplicates and completeness

  • install validation script

  • flake8

  • flake8

  • Fix unit tests

  • Athenamp jobpar (Athenamp jobpar #46)

  • force multiprocess option

  • Only add multiprocess for release23

  • reads container type and options from panda queue config

  • remove multithread argument

  • flake8

  • Support for merge transform in apptainer

  • unused import

esseivaju added 2 commits May 19, 2023 12:12
* updated json state example

* fixed issue where only one merge task per input file was created per call to check_mergeable_files

* stage in queuedata from cvmfs instead of Harvester cacher

* update for ray 2.0.0

* create state.json with task output directory instead of expecting Harvester to create it. Fixed type in job parameter name

* rename job file instead of deleting it

* parse job parameters to replace input and output files

* update for new job definition parameters

* stop threads in case of exception

* check if threads are alive

* par --outputHITSFile before forwarding jobdef to pilot

* inject eventservice flags in jobdef for pilot

* support for merging multiple input files into the same output file, generates correct output filenames

* taskdir from config, fixed issue with cleaner thread not having most up to date taskdir path

* flake8

* Updated test conf

* Fix typo in test jobdef

* Generate job definition with appropriate number of input / output files

* update generated event ranges filename, pass an array to set_file_merged

* updated queuedata file

* remove obsolete autoConfiguration parameter

* parse hostname output for multiple IP addresses

* don't use bash arrays; Harvester will interpret {} as parameters to substitute

* assign event ranges by file

* revert queuedata file

* fixed issue with status of event range status before update rangesID_by_file

* install missing module

* fixed issue where results list was incorrectly indexed

* update actions version

* keep pre-allocating the list

* don't print entire merge transform command

* account for failed ranges when checking if a file is ready to be merged

* fixed failed events not being correctly accounted for when creating merge tasks

* fixe evnt_range not defined for failed ranges

* account for failed ranges in the assert

* handle failed event ranges by flagging as failed all ranges that should be merged with it

* only launch merge jobs if we have enough events, not counting failed events

* updated booking of ranges in the same file of a failed range

* missing braces

* formatting

* fix unit tests

* Unpin protobuf

* removed `_system_config` argument which fails with ray 2.3.0

* removed syste-config incompatible with ray 2.3.0

* updated harvester template script

* fixed issues when generating failed event ranges

* update bookkeeping correctly when receiving a failed env range from pilot

* flake8, remove unused method

* removed test for obsolete methods, generate correct event range ids

* Validate job output (#45)

* Script to validate job output. Check for duplicates and completeness

* install validation script

* flake8

* flake8

* Fix unit tests

* Athenamp jobpar (#46)

* force multiprocess option

* Only add multiprocess for release23

* reads container type and options from panda queue config

* remove multithread argument

* flake8

* Support for merge transform in apptainer

* unused import
@esseivaju esseivaju merged commit 38174c3 into master May 19, 2023
esseivaju added a commit that referenced this pull request May 23, 2023
* Apptainer merging (#47) (#48)

* updated json state example

* fixed issue where only one merge task per input file was created per call to check_mergeable_files

* stage in queuedata from cvmfs instead of Harvester cacher

* update for ray 2.0.0

* create state.json with task output directory instead of expecting Harvester to create it. Fixed type in job parameter name

* rename job file instead of deleting it

* parse job parameters to replace input and output files

* update for new job definition parameters

* stop threads in case of exception

* check if threads are alive

* par --outputHITSFile before forwarding jobdef to pilot

* inject eventservice flags in jobdef for pilot

* support for merging multiple input files into the same output file, generates correct output filenames

* taskdir from config, fixed issue with cleaner thread not having most up to date taskdir path

* flake8

* Updated test conf

* Fix typo in test jobdef

* Generate job definition with appropriate number of input / output files

* update generated event ranges filename, pass an array to set_file_merged

* updated queuedata file

* remove obsolete autoConfiguration parameter

* parse hostname output for multiple IP addresses

* don't use bash arrays; Harvester will interpret {} as parameters to substitute

* assign event ranges by file

* revert queuedata file

* fixed issue with status of event range status before update rangesID_by_file

* install missing module

* fixed issue where results list was incorrectly indexed

* update actions version

* keep pre-allocating the list

* don't print entire merge transform command

* account for failed ranges when checking if a file is ready to be merged

* fixed failed events not being correctly accounted for when creating merge tasks

* fixe evnt_range not defined for failed ranges

* account for failed ranges in the assert

* handle failed event ranges by flagging as failed all ranges that should be merged with it

* only launch merge jobs if we have enough events, not counting failed events

* updated booking of ranges in the same file of a failed range

* missing braces

* formatting

* fix unit tests

* Unpin protobuf

* removed `_system_config` argument which fails with ray 2.3.0

* removed syste-config incompatible with ray 2.3.0

* updated harvester template script

* fixed issues when generating failed event ranges

* update bookkeeping correctly when receiving a failed env range from pilot

* flake8, remove unused method

* removed test for obsolete methods, generate correct event range ids

* Validate job output (#45)

* Script to validate job output. Check for duplicates and completeness

* install validation script

* flake8

* flake8

* Fix unit tests

* Athenamp jobpar (#46)

* force multiprocess option

* Only add multiprocess for release23

* reads container type and options from panda queue config

* remove multithread argument

* flake8

* Support for merge transform in apptainer

* unused import

* use PyROOT instead AthFile to validate output

* formatting
esseivaju added a commit that referenced this pull request May 23, 2023
* Apptainer merging (#47)

* updated json state example

* fixed issue where only one merge task per input file was created per call to check_mergeable_files

* stage in queuedata from cvmfs instead of Harvester cacher

* update for ray 2.0.0

* create state.json with task output directory instead of expecting Harvester to create it. Fixed type in job parameter name

* rename job file instead of deleting it

* parse job parameters to replace input and output files

* update for new job definition parameters

* stop threads in case of exception

* check if threads are alive

* par --outputHITSFile before forwarding jobdef to pilot

* inject eventservice flags in jobdef for pilot

* support for merging multiple input files into the same output file, generates correct output filenames

* taskdir from config, fixed issue with cleaner thread not having most up to date taskdir path

* flake8

* Updated test conf

* Fix typo in test jobdef

* Generate job definition with appropriate number of input / output files

* update generated event ranges filename, pass an array to set_file_merged

* updated queuedata file

* remove obsolete autoConfiguration parameter

* parse hostname output for multiple IP addresses

* don't use bash arrays; Harvester will interpret {} as parameters to substitute

* assign event ranges by file

* revert queuedata file

* fixed issue with status of event range status before update rangesID_by_file

* install missing module

* fixed issue where results list was incorrectly indexed

* update actions version

* keep pre-allocating the list

* don't print entire merge transform command

* account for failed ranges when checking if a file is ready to be merged

* fixed failed events not being correctly accounted for when creating merge tasks

* fixe evnt_range not defined for failed ranges

* account for failed ranges in the assert

* handle failed event ranges by flagging as failed all ranges that should be merged with it

* only launch merge jobs if we have enough events, not counting failed events

* updated booking of ranges in the same file of a failed range

* missing braces

* formatting

* fix unit tests

* Unpin protobuf

* removed `_system_config` argument which fails with ray 2.3.0

* removed syste-config incompatible with ray 2.3.0

* updated harvester template script

* fixed issues when generating failed event ranges

* update bookkeeping correctly when receiving a failed env range from pilot

* flake8, remove unused method

* removed test for obsolete methods, generate correct event range ids

* Validate job output (#45)

* Script to validate job output. Check for duplicates and completeness

* install validation script

* flake8

* flake8

* Fix unit tests

* Athenamp jobpar (#46)

* force multiprocess option

* Only add multiprocess for release23

* reads container type and options from panda queue config

* remove multithread argument

* flake8

* Support for merge transform in apptainer

* unused import

* Root validation (#49)

* Apptainer merging (#47) (#48)

* updated json state example

* fixed issue where only one merge task per input file was created per call to check_mergeable_files

* stage in queuedata from cvmfs instead of Harvester cacher

* update for ray 2.0.0

* create state.json with task output directory instead of expecting Harvester to create it. Fixed type in job parameter name

* rename job file instead of deleting it

* parse job parameters to replace input and output files

* update for new job definition parameters

* stop threads in case of exception

* check if threads are alive

* par --outputHITSFile before forwarding jobdef to pilot

* inject eventservice flags in jobdef for pilot

* support for merging multiple input files into the same output file, generates correct output filenames

* taskdir from config, fixed issue with cleaner thread not having most up to date taskdir path

* flake8

* Updated test conf

* Fix typo in test jobdef

* Generate job definition with appropriate number of input / output files

* update generated event ranges filename, pass an array to set_file_merged

* updated queuedata file

* remove obsolete autoConfiguration parameter

* parse hostname output for multiple IP addresses

* don't use bash arrays; Harvester will interpret {} as parameters to substitute

* assign event ranges by file

* revert queuedata file

* fixed issue with status of event range status before update rangesID_by_file

* install missing module

* fixed issue where results list was incorrectly indexed

* update actions version

* keep pre-allocating the list

* don't print entire merge transform command

* account for failed ranges when checking if a file is ready to be merged

* fixed failed events not being correctly accounted for when creating merge tasks

* fixe evnt_range not defined for failed ranges

* account for failed ranges in the assert

* handle failed event ranges by flagging as failed all ranges that should be merged with it

* only launch merge jobs if we have enough events, not counting failed events

* updated booking of ranges in the same file of a failed range

* missing braces

* formatting

* fix unit tests

* Unpin protobuf

* removed `_system_config` argument which fails with ray 2.3.0

* removed syste-config incompatible with ray 2.3.0

* updated harvester template script

* fixed issues when generating failed event ranges

* update bookkeeping correctly when receiving a failed env range from pilot

* flake8, remove unused method

* removed test for obsolete methods, generate correct event range ids

* Validate job output (#45)

* Script to validate job output. Check for duplicates and completeness

* install validation script

* flake8

* flake8

* Fix unit tests

* Athenamp jobpar (#46)

* force multiprocess option

* Only add multiprocess for release23

* reads container type and options from panda queue config

* remove multithread argument

* flake8

* Support for merge transform in apptainer

* unused import

* use PyROOT instead AthFile to validate output

* formatting
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

1 participant