Feature: command line parse xml #197
pyhf/commandline.py
Outdated
@click.command()
@click.option('--entrypoint-xml', required=True, prompt='Top-level XML', help='The top-level XML file for the workspace definition.', type=click.Path(exists=True))
@click.option('--workspace', required=True, prompt='Workspace directory', help='The location of workspace.', type=click.Path(exists=True))
I think `workspace` is a bit of a misnomer here. The workspace is the result of parsing the XML and ROOT files via `hist2workspace`. Should we use the same name as the `parse` function, i.e. `--rootdir`?
maybe top-level directory?
import pyhf

# see test_import.py for the same (detailed) test
def test_import_prepHistFactory(tmpdir, script_runner):
where is this fixture defined?
Comes from `pytest-console-scripts` (#198), which adds the fixture (see readme).
ah ok. click has some built-in testing capabilities in `click.testing`. We could use that and avoid the dependency, unless `pytest-console-scripts` adds some nice features (I haven't used it).
Example usage: https://github.com/yadage/yadage/blob/master/tests/test_maincli.py#L5
CliRunner doesn't isolate stdout/stderr. It's probably only specific to running click-enabled commands. The pytest-console-scripts is much more generic (runs any script). I would use CliRunner if I spent more time figuring out stderr extraction.
ok yes, testing stdout/stderr separately is important, especially if we want to do e.g. `> bla.json`, where we must ensure that stdout is JSON-deserializable. Let's go with `pytest-console-scripts` then.
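The point about keeping stdout clean can be illustrated with the standard library alone. This is only a sketch: the inline child script is a hypothetical stand-in for a command such as `pyhf xml2json` (it is not the real CLI, and it does not use `pytest-console-scripts`). It writes a progress message to stderr and JSON to stdout, and the parent confirms that stdout on its own deserializes.

```python
import json
import subprocess
import sys

# Hypothetical stand-in for a CLI: progress goes to stderr, JSON goes to stdout.
CHILD = (
    "import json, sys\n"
    "print('parsing channels...', file=sys.stderr)\n"
    "print(json.dumps({'channels': []}))\n"
)


def run_and_capture():
    """Run the child script and return (stdout, stderr) as separate strings."""
    proc = subprocess.run(
        [sys.executable, "-c", CHILD],
        stdout=subprocess.PIPE,
        stderr=subprocess.PIPE,
        universal_newlines=True,
        check=True,
    )
    return proc.stdout, proc.stderr


out, err = run_and_capture()
# If anything other than JSON had leaked into stdout, this would raise.
spec = json.loads(out)
```

Because the two streams are captured separately, a redirect such as `> bla.json` would receive only the JSON document.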
pyhf/commandline.py
Outdated
import json
from . import readxml

@click.command()
What do you think about a top-level `pyhf` command, with `xml2json` as one of its subcommands?

    > pyhf --help
    > pyhf xml2json --help

    @click.group()
    @click.option(...)  # some global opts
    def pyhf():
        pass

    @pyhf.command()
    @click.option('--entrypoint-xml')
    ...
    def xml2json(...): ...
One global option could be the logging verbosity:

    @click.option('--verbosity', default='INFO')
    def pyhf(verbosity):
        logging.basicConfig(level=getattr(logging, verbosity), format=LOGFORMAT)

btw I kinda like log formats with fixed-width sections, like the one below. Thoughts?

    LOGFORMAT = '%(asctime)s | %(name)20.20s | %(levelname)6s | %(message)s'
I thought about that too. I'm fine with doing that as well.
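Putting the two suggestions together, a minimal runnable sketch might look like the following. The function name `cli`, the `click.Choice` restriction, and the placeholder body of `xml2json` are assumptions made for illustration; they are not taken from the PR.

```python
import logging

import click

LOGFORMAT = '%(asctime)s | %(name)20.20s | %(levelname)6s | %(message)s'


@click.group(name='pyhf')
@click.option('--verbosity', default='INFO',
              type=click.Choice(['DEBUG', 'INFO', 'WARNING', 'ERROR']),
              help='Logging level applied to all subcommands.')
def cli(verbosity):
    """Top-level command; subcommands are attached below."""
    # logging.basicConfig writes to stderr by default, so stdout stays clean.
    logging.basicConfig(level=getattr(logging, verbosity), format=LOGFORMAT)


@cli.command()
def xml2json():
    """Placeholder subcommand; a real one would call readxml.parse."""
    logging.getLogger('pyhf.commandline').info('converting')
    click.echo('{}')
```

With this layout, `pyhf --help` lists the subcommands and `pyhf --verbosity DEBUG xml2json` sets the log level before the subcommand runs.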
setup.py
Outdated
@@ -52,6 +55,7 @@
        ]
    },
    entry_points = {
        'console_scripts': ['pyhf_xml2json=pyhf.commandline:xml2json']
possibly renamed after comment above
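If the top-level group proposed above were adopted, the entry point could collapse to a single script. The fragment below is a guess at what that rename might look like, assuming the group function is called `pyhf` and lives in `pyhf/commandline.py`; the actual names chosen in the PR may differ.

```python
# Hypothetical setup.py fragment: one `pyhf` console script pointing at the
# click group, instead of a separate `pyhf_xml2json` script per command.
entry_points = {
    'console_scripts': ['pyhf=pyhf.commandline:pyhf'],
}
```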
pyhf/readxml.py
Outdated
@@ -11,6 +11,7 @@ def import_root_histogram(rootdir, filename, path, name):
    #import pdb; pdb.set_trace()
    #assert path == ''
    # strip leading slashes as uproot doesn't use "/" for top-level
    if path is None: path = ''
This was needed to handle situations where `HistoPath` wasn't included; in these cases, it's equivalent to `''`. This code does need to be fixed up more to normalize the XMLs better...
maybe `path = path or ''` is more pythonic?
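For reference, the two spellings behave the same for `None` and for ordinary strings. The helper name `normalize_path` below is hypothetical and only serves to make the behaviour testable.

```python
def normalize_path(path):
    """Return '' when path is missing; otherwise return it unchanged."""
    # `or` falls through to '' for any falsy value, not just None.
    return path or ''


print(repr(normalize_path(None)))
print(repr(normalize_path('hists/SR')))
```

The one difference from `if path is None` is that `path or ''` also replaces other falsy values (such as `0`) with `''`, which is harmless as long as `path` is always a string or `None`.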
pyhf/commandline.py
Outdated
@pyhf.command()
@click.option('--entrypoint-xml', required=True, prompt='Top-level XML', help='The top-level XML file for the PDF definition.', type=click.Path(exists=True))
@click.option('--basedir', required=True, prompt='Base directory', help='The base directory for the XML files to point relative to.', type=click.Path(exists=True))
@click.option('--output-file', required=True, prompt='Output file', help='The location of the output json file. If not specified, prints to screen.')
What do you think about making `entrypoint-xml` a `click.argument`? There is really no way to convert without input, so `pyhf xml2json input.xml` seems to be a good command line. `--basedir` could default to `os.getcwd()`.
Also, maybe it's somewhat more unixy to print to stdout if the output file is not provided, e.g. `pyhf xml2json input.xml > test.json`?
> pyhf xml2json input.xml > test.json ?

The good part is that `tqdm` writes to stderr, so we can definitely do that.
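A sketch of the command shape discussed here, under stated assumptions: `parse_stub` is a hypothetical placeholder for `readxml.parse`, and the `click.Path(exists=True)` check from the diff is dropped so the example runs without any input files.

```python
import json
import os

import click


def parse_stub(entrypoint_xml, basedir):
    """Hypothetical stand-in for readxml.parse; returns a JSON-serializable dict."""
    return {'entrypoint': entrypoint_xml, 'basedir': basedir}


@click.command()
@click.argument('entrypoint_xml')
@click.option('--basedir', default=os.getcwd(), show_default=True,
              help='Base directory that paths in the XML files are relative to.')
@click.option('--output-file', default=None,
              help='Write JSON to this file; if omitted, print to stdout.')
def xml2json(entrypoint_xml, basedir, output_file):
    """Convert the top-level XML into JSON."""
    # Status messages go to stderr so that stdout carries only the JSON.
    click.echo('parsing {0:s}'.format(entrypoint_xml), err=True)
    spec = parse_stub(entrypoint_xml, basedir)
    if output_file is None:
        click.echo(json.dumps(spec, indent=4))
    else:
        with open(output_file, 'w') as handle:
            json.dump(spec, handle, indent=4)
```

Invoked as `pyhf xml2json input.xml > test.json`, only the JSON reaches the file while the status line stays on the terminal.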
pyhf/commandline.py
Outdated
@click.option('--entrypoint-xml', required=True, prompt='Top-level XML', help='The top-level XML file for the PDF definition.', type=click.Path(exists=True))
@click.option('--basedir', required=True, prompt='Base directory', help='The base directory for the XML files to point relative to.', type=click.Path(exists=True))
@click.option('--output-file', required=True, prompt='Output file', help='The location of the output json file. If not specified, prints to screen.')
@click.option('--tqdm/--no-tqdm', default=True)
suggestion: `--track/--no-track` or `--track-progress/--no-track-progress`
pyhf/readxml.py
Outdated
@@ -26,7 +28,7 @@ def import_root_histogram(rootdir, filename, path, name):

     raise KeyError('Both {0:s} and {1:s} were tried and not found in {2:s}'.format(name, os.path.join(path, name), os.path.join(rootdir, filename)))

-def process_sample(sample,rootdir,inputfile, histopath, channelname):
+def process_sample(sample,rootdir,inputfile, histopath, channelname, enable_tqdm=False):
     if 'InputFile' in sample.attrib:
Here and in the other cases, I'd also suggest renaming to `track_progress` instead of `enable_tqdm`.
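One way the renamed keyword could be wired through to `tqdm` is via its `disable` argument, so the loop body stays identical whether or not the bar is shown. The function `process_channels` and its toy body are hypothetical and are not the `readxml` code.

```python
import sys

from tqdm import tqdm


def process_channels(channels, track_progress=False):
    """Hypothetical stand-in for the channel loop in readxml."""
    results = []
    # disable=True turns tqdm into a plain pass-through iterator;
    # the bar, when shown, is written to stderr rather than stdout.
    for channel in tqdm(channels, disable=not track_progress,
                        file=sys.stderr, unit='channel'):
        results.append(channel.upper())
    return results
```

Naming the parameter after the behaviour (`track_progress`) rather than the library keeps callers unaffected if the progress-bar implementation changes later.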
looks good to me. we have a cli! ✨
Description
This allows someone to use `pyhf` from the command line to parse the XML workspaces using the `readxml.parse` functionality. This uses `click` to set up options/arguments.
This also updates `readxml` with some slight reorganization to allow for a `tqdm` progress bar that can be enabled (default: disabled) for reading in the channels in various XML files.
NB: both `click` and `tqdm` are already installed via our dependencies, but I've explicitly listed them in `setup.py` now.
Checklist Before Requesting Approver