Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Data importer - parsing units should be case sensitive #2019

Closed
PavelBal opened this issue May 30, 2023 · 14 comments · Fixed by #2024 or #2057
Closed

Data importer - parsing units should be case sensitive #2019

PavelBal opened this issue May 30, 2023 · 14 comments · Fixed by #2024 or #2057

Comments

@PavelBal
Copy link
Member

A while ago I created an issue that parsing of units during import should be case-insensitive #1949

Unfortunately, I have a case where this leads to wrong dimension - when trying to import µm of the dimension Lenght, the importer regodnizes it as µM of the dimension Concentration (molar).

@msevestre
Copy link
Member

well that's ok. We cannot have it both ways. So when you import, it's your job as a user to verify that units and dimensions are recognized properly.

@PavelBal
Copy link
Member Author

Yes but then I would vote for having it case-sensitive. Otherwise there is no way to import anything other than dm for length without having to manually change the dimension (== no way to automatize it through R)

@Yuri05
Copy link
Member

Yuri05 commented May 30, 2023

use µmol/L instead

@Yuri05
Copy link
Member

Yuri05 commented May 30, 2023

The most flexible way:

  1. Try to find the unit case sensitive first. If found: use it
  2. If not found: try to find the unit case insensitive

@msevestre
Copy link
Member

I think this is the way it's actually implemented. First case sensitive, then case insensitive

@Yuri05
Copy link
Member

Yuri05 commented May 31, 2023

I think this is the way it's actually implemented. First case sensitive, then case insensitive

But then µm should be recognized as µm and not as µM
@PavelBal Can you describe the steps to reproduce you problem?

@georgeDaskalakis
Copy link
Contributor

georgeDaskalakis commented May 31, 2023

It is not an issue of unit recognition but of automatically detecting the correct dimension. So (if I am not mistaken by what Pavel describes), in the calculation of the dimension from a unit given in an excel file:

https://github.com/Open-Systems-Pharmacology/OSPSuite.Core/blob/50a7aa040611e25be40541b8a8748976946808a7/src/OSPSuite.Infrastructure.Import/Core/DataFormat/AbstractColumnsDataFormat.cs#LL159C1-L167C8

we are not trying case sensitive first, we just get the first dimension that has such a unit. In Pavel's case with µm , the first dimension found that has such a unit ( µM in this case, since case insensitive) would be Concentration (molar).

@georgeDaskalakis
Copy link
Contributor

I think this is the way it's actually implemented. First case sensitive, then case insensitive

Actually implementing this would be a fix that would keep the correct behaviour also for #1949

@Yuri05
Copy link
Member

Yuri05 commented May 31, 2023

It is not an issue of unit recognition but of automatically detecting the correct dimension. So (if I am not mistaken by what Pavel describes), in the calculation of the dimension from a unit given in an excel file:

I think it would be better to get the dimension from unit via DimensionTask.DimensionForUnit() and then check if the retrieved dimension is not null and is contained in supportedDimensions

@georgeDaskalakis
Copy link
Contributor

georgeDaskalakis commented May 31, 2023

@Yuri05 yeap this is a better solution to avoid code duplication - and it would have exactly the same result. If no objections I would implement this.

@PavelBal
Copy link
Member Author

No objections.

@msevestre
Copy link
Member

Yes. I thought this was the code used indeed. Let's update

msevestre pushed a commit that referenced this issue Jun 6, 2023
* initial implementation

* incomplete unit test

* correction

* adding test

* test corrected

* creating new functions

* test corrected
@PavelBal
Copy link
Member Author

This is NOT fixed

UnitCaseInsensitive.xlsx

the unit is "µm", which should be recognized as the dimension "Length". however, this is being recognized as "Concentration (molar)", probably because it thinks it is "µM"

@PavelBal PavelBal reopened this Jun 29, 2023
@georgeDaskalakis
Copy link
Contributor

Reopened because currently the fix only works for units in the headers, but the behaviour is still the old faulty one when the unit comes from the content of a column.

msevestre pushed a commit that referenced this issue Jul 2, 2023
* initial not building fix with helper

* existing test are green

* avoiding code duplication

* correction

* resharper code cleanup

* adding integration test

* removing redundant function
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
4 participants