Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[Feature Request] Support testing string or dictionary API outputs in test YAML #938

Closed
alexsoble opened this issue Jan 21, 2020 · 7 comments
Assignees
Labels
kind:feat A feature request, a feature deprecation

Comments

@alexsoble
Copy link

Hi there!

I really enjoy OpenFisca. I recently encountered an issue.

Here is what I did:

I am working on a prototype API with the USA Head Start early childhood program. You can see a work-in-progress prototype here.

The API includes a head_start_eligibility_status output variable that returns string information about Head Start eligibility. Example API outputs can look like this:

"Eligible for Head Start. Slot in a program not guaranteed. Eligible because the family is experiencing homelessness."

"Eligible for Head Start. Slot in a program not guaranteed. Eligible because the child is in foster care. Eligible because of eligibility for TANF or SSI. Eligible because family is below the federal poverty line."

"May be Eligible, depending on the child's needs and the slots available."

Here is what I expected to happen:

I was hoping that I could use the YAML test infrastructure provided by OpenFisca-Core to test values of the API's string outputs against my expected values.

Ideally, I would also like to use the YAML test infrastructure to test JSON objects returned from the API that might look something like this:

{
  "status": "ELIGIBLE",
  "description": "Eligible for Head Start. Slot in a program not guaranteed. Eligible because the family is experiencing homelessness."
}

At present, the API does not return JSON objects in this format. It does return string outputs as described above.

Here is what actually happened:

When I try to test the value of the API's string output against a short string like "hello", I see the following error:

ValueError: could not convert string to float: 'hello'

See failing test on CircleCI.

When I try to test the value of the API's string output against a longer string, I see the following error:

SyntaxError: invalid syntax

See failing test on CircleCI.

Here is what I see in the code:

Looking at the OpenFisca-Core code's assert_near function, I see that the code is currently set up to test Floats and Enums.

def assert_near(value, target_value, absolute_error_margin = None, message = '', relative_error_margin = None):

If a string is sent to the test runner, the code attempts to cast it to an arithmetical expression.

target_value = eval_expression(target_value)

I understand that this is likely a non-trivial feature request, since the test runner would have to be able to distinguish between arithmetical expressions which reduce to floats, and string values which would be compared to other strings.

The dictionary use case I describe above might be simpler to implement.

Here is data (or links to it) that can help you reproduce this issue:

See the above CircleCI links. Also, feel free to clone the prototype-openfisca-usa-headstart repo and run make test locally.

Context

I identify more as a: Developer (I create tools that use the existing OpenFisca code).

@alexsoble alexsoble added the kind:feat A feature request, a feature deprecation label Jan 21, 2020
@sandcha sandcha self-assigned this Jan 27, 2020
@sandcha
Copy link
Collaborator

sandcha commented Jan 30, 2020

Thank you @alexsoble for this well structured feature request!

The API includes a head_start_eligibility_status output variable that returns string information (...)

Did you try using the Enum type ? 🤔
It’s the usual way to define status variables in openfisca models.

It will look like this:

class HeadStartEligibilityStatus(Enum):
    __order__ = 'eligible may_be homelessness'
    eligible = 'Eligible for Head Start. Slot in a program not guaranteed.'
    may_be = 'May be Eligible, depending on the child\'s needs and the slots available.'
    homelessness = 'Eligible because the family is experiencing homelessness.'
    # …

class head_start_eligibility_status(Variable):
    value_type = Enum
    possible_values = HeadStartEligibilityStatus
    default_value = HeadStartEligibilityStatus.eligible
    entity = Family
    definition_period = YEAR
    label = u"Head Start Eligibility Status"

    def formula(family, period, parameters):
	# …	

I was hoping that I could use the YAML test infrastructure provided by OpenFisca-Core to test values of the API's string outputs against my expected values.

Using Enum type would allow you to test the eligibility values in your YAML tests.
Here is an example from the country-template that tests enums in the expected outputs.

Besides, it's less sensible than strings to small modifications (adding a comma, updating the wording...) that might occur but doesn't impact the eligibility status by itself.

Looking at the OpenFisca-Core code's assert_near function, I see that the code is currently set up to test Floats and Enums.

The thing is, the YAML testing should also be able to test expressions. I think that introducing string checks might lead us to find a way to check whether a text is an expression or a string text. It could be a bit tricky.

Here is an example of expressions from the French model (where ppa stands for employment bonus).

So, does the Enum solution work for you?
If it doesn't, I will see what could be done to help you. 🙂

@alexsoble
Copy link
Author

Thank you @sandcha, I will investigate the Enum route!

@alexsoble
Copy link
Author

@sandcha Hello! So, I got the Enum example in country-template working locally, and opened a PR to surface that example in the examples JSON and in the README: openfisca/country-template#83.

Where I'm still a bit stuck — in the housing.json example, the country-template API returns the following:

"housing_occupancy_status":{"2017-01":"tenant"}}

How could I get the API to return value string ("Tenant") instead of the name string ("tenant")? The name string is short, and the value string can be more thorough and descriptive.

I'm thinking it's either one of two things here:

  1. It might be that this is relatively straightforward, but I'm just not seeing how I would do this in the OpenFisca Enum docs...

  2. Or it might be that the OpenFisca "way" is to return names from Enums rather than full values, and have client-side applications map from names to full string descriptions...

Either is fine, and either way I would be happy to help add to the documentation to explain to newer OpenFisca users!

Thanks again for your help!

@Morendil
Copy link
Contributor

Morendil commented Feb 11, 2020

Hi @alexsoble - yes, (2) is the intended way; where OpenFisca shines is in dealing with numbers, with enumerated values a later addition and string values very much second-class citizens. (This is largely due to the way we rely on NumPy to efficiently process large collections of values and not just single values.) The item values of Enum variables are essentially deprecated since Core 21.0 for performance reasons.

I'd advise handling representation of computed values as strings in your presentation layer (i.e. the client-side app).

Updating the documentation to reflect current capabilities and intended use is definitely a good idea :)

@alexsoble
Copy link
Author

@Morendil Thank you, I appreciate the explanation!

Opened that documentation PR here: openfisca/openfisca-doc#225.

Cheers!

@bonjourmauko
Copy link
Member

Thank you @alexsoble @Morendil @sandcha ! Please let me know if we can close this issue, as it seems an answer was given and @alexsoble's contribution to the doc has been merged.

@alexsoble
Copy link
Author

I'm comfortable closing this — thanks all for the help and explanation!

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
kind:feat A feature request, a feature deprecation
Projects
None yet
Development

No branches or pull requests

4 participants