Feature: command line parse xml #197

kratsg · 2018-08-23T17:46:18Z

Description

This allows someone to use pyhf from the command line to parse XML workspaces using the readxml.parse functionality. It uses click to set up options and arguments.

This also updates readxml with some slight reorganization to allow for a tqdm progress bar that can be enabled (default: disabled) for reading in the channels in various XML files.

pyhf_xml2json --entrypoint-xml validation/multibin_multibjets/config/NormalMeasurement.xml --workspace validation/multibin_multibjets/ --output_file test.json

NB: both click and tqdm are already installed via our dependencies, but I've explicitly listed them in setup.py now.

Checklist Before Requesting Approver

Tests are passing
"WIP" removed from the title of the pull request

coveralls · 2018-08-23T17:54:41Z

Coverage decreased (-0.0001%) to 96.878% when pulling 29bca28 on feature/commandLineParseXML into 4b04b01 on master.

lukasheinrich · 2018-08-23T22:51:43Z

pyhf/commandline.py

+
+@click.command()
+@click.option('--entrypoint-xml', required=True, prompt='Top-level XML', help='The top-level XML file for the workspace definition.', type=click.Path(exists=True))
+@click.option('--workspace', required=True, prompt='Workspace directory', help='The location of workspace.', type=click.Path(exists=True))


I think workspace is a bit of a misnomer here. The workspace is the result of parsing the xml and root files via hist2workspace. Should we use the same name as the parse function, i.e. --rootdir?

maybe top-level directory?

lukasheinrich · 2018-08-23T22:58:36Z

tests/test_scripts.py

+import pyhf
+
+# see test_import.py for the same (detailed) test
+def test_import_prepHistFactory(tmpdir, script_runner):


where is this fixture defined?

Comes from pytest-console-scripts (#198) which adds the fixture (see readme)

ah ok click has some built in testing capabilities from click.testing. Could use that and avoid the dependency unless pytest-console-scripts adds some nice features (haven't used it)

example usage
https://github.com/yadage/yadage/blob/master/tests/test_maincli.py#L5

CliRunner doesn't isolate stdout/stderr. It's probably only specific to running click-enabled commands. The pytest-console-scripts is much more generic (runs any script). I would use CliRunner if I spent more time figuring out stderr extraction.

ok yes, testing stdout/stderr separately is important, especially if we want to do e.g. > bla.json where we must ensure that the stdout is json deserializable. let's go with pytest-console-scripts then.
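The concern above (stdout must stay valid JSON while progress output goes elsewhere) can be sketched with only the standard library. This is a minimal illustration, not the pytest-console-scripts API: the child script is a hypothetical stand-in for a CLI that writes progress to stderr and JSON to stdout.

```python
import json
import subprocess
import sys

# Hypothetical stand-in for a CLI: progress goes to stderr, the JSON payload to stdout.
CHILD = (
    "import json, sys\n"
    "sys.stderr.write('processing channels...\\n')\n"
    "json.dump({'channels': ['channel1']}, sys.stdout)\n"
)


def run_child():
    """Run the stand-in script and capture both streams separately."""
    return subprocess.run(
        [sys.executable, "-c", CHILD],
        stdout=subprocess.PIPE,
        stderr=subprocess.PIPE,
        universal_newlines=True,
        check=True,
    )


result = run_child()
# json.loads raises if any progress output leaked into stdout.
payload = json.loads(result.stdout)
```

Because the two streams are captured independently, a test can assert on the JSON payload without having to filter progress text out of it.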

lukasheinrich · 2018-08-23T23:00:54Z

pyhf/commandline.py

+import json
+from . import readxml
+
+@click.command()


what do you think about a toplevel pyhf command and xml2json could be one of the subcommands

> pyhf --help
> pyhf xml2json --help

@click.group()
@click.option(...)  # some global opts
def pyhf():
    pass

@pyhf.command()
@click.option('--entrypoint-xml')
...
def xml2json(...)

one global option could be the logging verbosity

@click.option('--verbosity', default='INFO')
def pyhf(verbosity):
    logging.basicConfig(level=getattr(logging, verbosity), format=LOGFORMAT)
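The group-plus-subcommand layout proposed in this comment could look roughly like the following. It is a sketch under assumptions: the subcommand body is a placeholder that only echoes its input, the log format is a simple stand-in, and click's own `CliRunner` is used just to exercise the commands in-process.

```python
import logging

import click
from click.testing import CliRunner

# Placeholder format for this sketch only.
LOGFORMAT = "%(levelname)s: %(message)s"


@click.group()
@click.option("--verbosity", default="INFO", help="Logging level name, e.g. DEBUG or INFO.")
def pyhf(verbosity):
    """Top-level command; global options live here."""
    logging.basicConfig(level=getattr(logging, verbosity), format=LOGFORMAT)


@pyhf.command()
@click.argument("entrypoint_xml")
def xml2json(entrypoint_xml):
    """Placeholder subcommand: echoes its input instead of parsing it."""
    click.echo("would parse {}".format(entrypoint_xml))


runner = CliRunner()
# Global options come before the subcommand name.
result = runner.invoke(pyhf, ["--verbosity", "DEBUG", "xml2json", "input.xml"])
help_result = runner.invoke(pyhf, ["--help"])
```

Note that the option has to be declared as `--verbosity`; click rejects an option declared with a bare name and no dashes.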

btw I kinda like using log formats with fixed-width sections like this, thoughts?

LOGFORMAT = '%(asctime)s | %(name)20.20s | %(levelname)6s | %(message)s'
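To see what the fixed-width fields in this format do, one can format a record by hand with the standard `logging` module. This is only an illustration; the logger names and messages are made up.

```python
import logging

LOGFORMAT = "%(asctime)s | %(name)20.20s | %(levelname)6s | %(message)s"

formatter = logging.Formatter(LOGFORMAT)


def render(name, level, message):
    """Format one record directly so the column layout can be inspected."""
    record = logging.LogRecord(name, level, "sketch.py", 0, message, None, None)
    return formatter.format(record)


# Made-up logger names: one shorter and one longer than the 20-character column.
short = render("pyhf.readxml", logging.INFO, "parsed channel")
long = render("a.very.long.logger.name.that.overflows", logging.WARNING, "truncated")

print(short)
print(long)
```

`%(name)20.20s` both pads and truncates the logger name to exactly 20 characters (right-aligned), whereas `%(levelname)6s` only pads, so a 7-character level such as `WARNING` still pushes its column one character wider.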

I thought about that too. I'm fine with doing that as well.

lukasheinrich · 2018-08-23T23:01:24Z

setup.py

@@ -52,6 +55,7 @@
    ]
  },
  entry_points = {
+      'console_scripts': ['pyhf_xml2json=pyhf.commandline:xml2json']


possibly renamed after comment above
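For context, a `console_scripts` entry has the form `name=module:attribute`, and the installed script imports the module and calls the named attribute. A simplified sketch of that lookup follows; it is not setuptools' actual implementation, and `resolve_entry_point` is a hypothetical helper name.

```python
import importlib


def resolve_entry_point(spec):
    """Resolve a 'module:attribute' string to the object it names."""
    module_name, _, attr = spec.partition(":")
    module = importlib.import_module(module_name)
    return getattr(module, attr)


# 'json:dumps' is used only because it is importable everywhere;
# the spec in this PR is 'pyhf.commandline:xml2json'.
func = resolve_entry_point("json:dumps")
```

With a click group, the spec would point at the group object rather than at an individual subcommand.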

kratsg · 2018-08-24T01:09:52Z

This now spawns three lines of progress bars when running from command line. The first one is overall progress (how many channels left to process), the next line is the number of samples for the given channel, and the last (third) line is the modifiers for the given sample for the given channel.
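The three stacked bars described here can be sketched with tqdm's `position`, `leave`, and `disable` arguments. The nested dictionary is a hypothetical stand-in for the parsed XML, and the import falls back to a no-op wrapper so the sketch runs even without tqdm installed.

```python
try:
    from tqdm import tqdm
except ImportError:  # no-op fallback so the sketch still runs without tqdm
    def tqdm(iterable, **kwargs):
        return iterable

# Hypothetical stand-in for parsed XML: channel -> sample -> modifiers.
CHANNELS = {
    "channel1": {"signal": ["mu"], "background": ["syst1", "syst2"]},
    "channel2": {"signal": ["mu"]},
}


def walk(channels, enable_tqdm=False):
    """Visit every modifier, optionally drawing one bar per nesting level."""
    hidden = not enable_tqdm
    visited = []
    for channel, samples in tqdm(channels.items(), desc="channels", position=0, disable=hidden):
        for sample, modifiers in tqdm(samples.items(), desc="samples", position=1, leave=False, disable=hidden):
            for modifier in tqdm(modifiers, desc="modifiers", position=2, leave=False, disable=hidden):
                visited.append((channel, sample, modifier))
    return visited


# Bars are disabled by default, matching the default described in the PR.
visited = walk(CHANNELS)
```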

kratsg · 2018-08-24T01:11:06Z

pyhf/readxml.py

@@ -11,6 +11,7 @@ def import_root_histogram(rootdir, filename, path, name):
    #import pdb; pdb.set_trace()
    #assert path == ''
    # strip leading slashes as uproot doesn't use "/" for top-level
+    if path is None: path = ''


This was needed to handle situations where HistoPath wasn't included -- and in these cases, it's equivalent to ''. This code does need to be fixed up more to normalize the XMLs better...

maybe

path = path or ''

is more pythonic?
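The two spellings agree whenever `path` is a string or `None`, but they are not strictly equivalent: `or` replaces every falsy value, not just `None`. A small comparison, with hypothetical function names:

```python
def normalize_explicit(path):
    """Replace only None with an empty string."""
    if path is None:
        path = ''
    return path


def normalize_or(path):
    """Replace any falsy value (None, '', 0, ...) with an empty string."""
    return path or ''


# Both map a missing HistoPath to '' and leave a real path untouched.
same_for_none = normalize_explicit(None) == normalize_or(None) == ''
same_for_path = normalize_explicit('hists') == normalize_or('hists') == 'hists'
```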

lukasheinrich · 2018-08-24T04:41:35Z

pyhf/commandline.py

+@pyhf.command()
+@click.option('--entrypoint-xml', required=True, prompt='Top-level XML', help='The top-level XML file for the PDF definition.', type=click.Path(exists=True))
+@click.option('--basedir', required=True, prompt='Base directory', help='The base directory for the XML files to point relative to.', type=click.Path(exists=True))
+@click.option('--output-file', required=True, prompt='Output file', help='The location of the output json file. If not specified, prints to screen.')


what do you think about making the entrypoint-xml a click.argument? there is really no way to convert without input, so pyhf xml2json input.xml seems to be a good cmd line

--basedir could default to os.getcwd()

also, maybe it's somewhat more unixy to print to stdout if the output file is not provided?

pyhf xml2json input.xml > test.json ?

The good part is that tqdm writes to stderr, so we can definitely do that.
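Putting the three suggestions together (positional input, `--basedir` defaulting to the current directory, JSON on stdout unless an output file is given) might look like this. It is a sketch, not the merged code: the command body is a placeholder that records its inputs instead of calling readxml.

```python
import json
import os

import click
from click.testing import CliRunner


@click.command()
@click.argument("entrypoint_xml", type=click.Path(exists=True))
@click.option("--basedir", default=os.getcwd(), type=click.Path(exists=True),
              help="Directory that paths inside the XML are relative to.")
@click.option("--output-file", default=None,
              help="Where to write the JSON; if omitted, print to stdout.")
def xml2json(entrypoint_xml, basedir, output_file):
    # Placeholder for the real parsing step: only records the inputs.
    spec = {"entrypoint": entrypoint_xml, "basedir": basedir}
    if output_file is None:
        click.echo(json.dumps(spec))
    else:
        with open(output_file, "w") as handle:
            json.dump(spec, handle)


runner = CliRunner()
with runner.isolated_filesystem():
    # Create a dummy input so click.Path(exists=True) is satisfied.
    with open("input.xml", "w") as handle:
        handle.write("<Combination/>")
    to_stdout = runner.invoke(xml2json, ["input.xml"])
    to_file = runner.invoke(xml2json, ["input.xml", "--output-file", "test.json"])
    with open("test.json") as handle:
        written = json.load(handle)
```

Since progress output goes to stderr, redirecting stdout with `> test.json` would capture only the JSON.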

lukasheinrich · 2018-08-24T04:43:03Z

pyhf/commandline.py

+@click.option('--entrypoint-xml', required=True, prompt='Top-level XML', help='The top-level XML file for the PDF definition.', type=click.Path(exists=True))
+@click.option('--basedir', required=True, prompt='Base directory', help='The base directory for the XML files to point relative to.', type=click.Path(exists=True))
+@click.option('--output-file', required=True, prompt='Output file', help='The location of the output json file. If not specified, prints to screen.')
+@click.option('--tqdm/--no-tqdm', default=True)


suggestion: --track/--no-track or --track-progress/--no-track-progress
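A boolean flag pair like the one suggested here is declared in click with a slash between the two names. A minimal sketch, with a placeholder command body:

```python
import click
from click.testing import CliRunner


@click.command()
@click.option("--track-progress/--no-track-progress", default=True,
              help="Show or hide progress bars.")
def xml2json(track_progress):
    # click passes the flag pair as one boolean named after the first flag.
    click.echo("progress bars {}".format("on" if track_progress else "off"))


runner = CliRunner()
default_run = runner.invoke(xml2json, [])
quiet_run = runner.invoke(xml2json, ["--no-track-progress"])
```

Naming the flag after the behaviour rather than the library keeps the command line stable if the progress-bar implementation ever changes.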

lukasheinrich · 2018-08-24T04:49:26Z

pyhf/readxml.py

@@ -26,7 +28,7 @@ def import_root_histogram(rootdir, filename, path, name):

    raise KeyError('Both {0:s} and {1:s} were tried and not found in {2:s}'.format(name, os.path.join(path, name), os.path.join(rootdir, filename)))

-def process_sample(sample,rootdir,inputfile, histopath, channelname):
+def process_sample(sample,rootdir,inputfile, histopath, channelname, enable_tqdm=False):
    if 'InputFile' in sample.attrib:


here and in the other cases, I'd also suggest renaming to track_progress instead of enable_tqdm

lukasheinrich · 2018-08-24T09:05:25Z

looks good to me. we have a cli! ✨

kratsg added 2 commits August 23, 2018 10:00

add entrypoint for readxml

150756e

use click for commandline, tqdm for readxml (disabled by default)

a54b77f

kratsg added the feat/enhancement New feature or request label Aug 23, 2018

kratsg requested review from lukasheinrich and matthewfeickert August 23, 2018 17:46

no need for checking if path exists for output json file

2c12f09

kratsg mentioned this pull request Aug 23, 2018

Use pytest-console-scripts to test entrypoint scripts in future #198

Closed

kratsg added 2 commits August 23, 2018 11:37

add more coverage for testing the entry point script. see #198.

54b378c

make pyflakes happy

de9a382

lukasheinrich reviewed Aug 23, 2018

kratsg added 5 commits August 23, 2018 16:17

change to click group

eb94579

workspace to basedir

b9b294c

update tests

aeffd4f

update tests again

616f5c6

more tqdms. it's soo pretty

6a28836

kratsg commented Aug 24, 2018

lukasheinrich reviewed Aug 24, 2018

lukasheinrich mentioned this pull request Aug 24, 2018

reorg: submodules "lib" "visualization" "cli" #199

Closed

kratsg added 4 commits August 23, 2018 22:56

tqdm -> track-progress/hide-progress

f591c19

more pythonic

a904111

entrypoint-xml is an argument. basedir defaults to cwd. output json i…

2dc2a02

…f not input specified

broke click, no help in argument

29bca28

lukasheinrich approved these changes Aug 24, 2018

lukasheinrich merged commit a3834a4 into master Aug 24, 2018

matthewfeickert deleted the feature/commandLineParseXML branch August 24, 2018 09:29

matthewfeickert mentioned this pull request Aug 24, 2018

Add notebook demoing pyhf xml2json #202

Open

kratsg mentioned this pull request Aug 25, 2018

Add documentation on cli commands (including piping) #212

Open
