This issue was moved to a discussion.
Ability to differentiate decimal and floating point numbers #709
Comments
This does open the door to quite a bit more, e.g. float vs double etc. (we
should look at other systems). I've wondered whether we should use the
"format" argument on number to qualify things, or something similar, rather
than having lots of number types.
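A minimal sketch of what qualifying `number` through "format" could look like on the consumer side. The format values `"decimal"` and `"float"` and the helper `cast_number` are hypothetical, made up for illustration; nothing like this is defined in the specs.

```python
from decimal import Decimal

# Hypothetical "format" values for a `number` field; not part of the spec.
_CASTERS = {
    "decimal": Decimal,  # exact, but slower
    "float": float,      # binary floating point: faster, may lose precision
}


def cast_number(field, text):
    """Parse a cell with the caster selected by the field's (hypothetical) format."""
    # Assumption: fall back to the current Decimal behaviour when no format is given.
    fmt = field.get("format", "decimal")
    try:
        caster = _CASTERS[fmt]
    except KeyError:
        raise ValueError(f"unsupported number format: {fmt!r}") from None
    return caster(text)


exact = cast_number({"name": "price", "type": "number"}, "0.1")
fast = cast_number({"name": "price", "type": "number", "format": "float"}, "0.1")
```

With this shape there is still a single `number` type, and a consumer that does not know the format can keep treating the field as a decimal.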
On Wed, Oct 14, 2020 at 10:42 AM roll ***@***.***> wrote:
Overview
At the moment we have (using Python as a platform example, but it's
similar in other languages):
- integer -> int (Python)
- number -> decimal (Python)
We don't have a type to represent numbers when it's ok to lose some
precision in favor of calculation speed. Decimals are really slow in
Python/JavaScript/etc
Here is a root issue with benchmarking -
frictionlessdata/frictionless-py#461
Extending It can have something like
I didn't mention it in the first place just because I'm not sure whether it's OK to have data types that map to N native types, e.g.
BTW, I think it's a really important thing to consider, as our dependency on decimals without an opt-out option makes Frictionless really useless at crunching numbers, and we get more and more people coming from Pandas/Numpy (cc @akariv)
Out of curiosity, which way is this leaning: "precision" or "format"? I'm asking because I'm pondering ways to augment my own datapackages with the above "single", "double", "infinite" numbers, and, even more important to me, sized integers (=> (s)int[8|16|32|64]).
My thoughts thus far have centered on making these custom formats since, from what I understand, unknown formats are ignored during validation, so it would be transparent to other consumers.
edit: one perhaps even better solution could be to leverage the maxLength constraint and define it as the number of bytes for numbers and integers. That would limit the number of necessary new symbols, avoid language-specific names, and let implementors pick the most suitable mapping for fp32, fp64, fp80 and int types.
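The maxLength-as-byte-width idea can be sketched with Python's `struct` module. This is only an illustration: reading `maxLength` as a byte count is the proposal above, not spec behaviour, and the `_STRUCT_CODES` table, `pack_value` and `fits` are hypothetical names.

```python
import struct

# Hypothetical reading of `maxLength` as a byte width; the spec defines no such meaning.
_STRUCT_CODES = {
    ("integer", 1): "b",
    ("integer", 2): "h",
    ("integer", 4): "i",
    ("integer", 8): "q",
    ("number", 2): "e",  # IEEE 754 half precision
    ("number", 4): "f",  # single precision
    ("number", 8): "d",  # double precision
}


def pack_value(field, value):
    """Pack a value into the native representation implied by the field."""
    width = field["constraints"]["maxLength"]
    code = _STRUCT_CODES[(field["type"], width)]
    # "<" selects little-endian with standard sizes, so widths do not depend on the platform.
    return struct.pack("<" + code, value)


def fits(field, value):
    """Return True if the value can be stored in the field's native type."""
    try:
        pack_value(field, value)
    except (struct.error, OverflowError):
        return False
    return True


int16 = {"name": "count", "type": "integer", "constraints": {"maxLength": 2}}
in_range = fits(int16, 32767)
out_of_range = fits(int16, 32768)
```

An 80-bit float has no `struct` code, so an implementation following this idea would need its own mapping for that width.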
@drunkcod No support from the specs yet
Any movement on this, even in terms of an unofficial pattern people have been using? I'm building some tooling using the table schema, where it would be very useful to specify a currency field, i.e. a number field with two-decimal-place precision. I can make up my own pattern but would like to use something with some precedent.
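For reference, a two-decimal-place check is straightforward on the tooling side today, whatever the descriptor ends up looking like. A minimal sketch, assuming values arrive as strings and are parsed to `decimal.Decimal`; `decimal_places` and `check_currency` are hypothetical helpers, not part of any Frictionless library.

```python
from decimal import Decimal


def decimal_places(value):
    """Number of digits after the decimal point of a finite Decimal."""
    exponent = value.as_tuple().exponent
    return max(0, -exponent)


def check_currency(text, places=2):
    """Parse text as a Decimal and reject values with more than `places` decimals."""
    value = Decimal(text)  # raises decimal.InvalidOperation on malformed input
    if not value.is_finite():
        raise ValueError(f"not a finite number: {text!r}")
    if decimal_places(value) > places:
        raise ValueError(f"{text!r} has more than {places} decimal places")
    # Pad to exactly `places` decimals; no rounding happens after the check above.
    return value.quantize(Decimal(1).scaleb(-places))


amount = check_currency("12.5")
```

Rejecting extra digits instead of rounding them is a design choice made here for the sketch; a tool could equally choose to round.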
I think we need to prioritize this feature as it's critical for a lot of cases
Consider existing standards for low-precision real number encoding, such as FP16 or Linear11. Also, Q notation is one option for flexible fixed-point encoding. See Also in the following article has more info: If you are encoding [0...1) values such as covariance coefficients, you can fix the exponent and use an 8-bit mantissa.
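As a rough illustration of the fixed-exponent idea, here is a sketch of an unsigned fixed-point encoding with 8 fractional bits for values in [0, 1). The function names `to_q` and `from_q` are made up for this example, and treating the suggestion above as an unsigned Q0.8 layout is an assumption.

```python
def to_q(value, frac_bits=8):
    """Encode a value in [0, 1) as an unsigned integer with `frac_bits` fractional bits."""
    if not 0 <= value < 1:
        raise ValueError("value must be in [0, 1)")
    scale = 1 << frac_bits
    # Clamp so values just below 1 do not round up to `scale`, which would need an extra bit.
    return min(round(value * scale), scale - 1)


def from_q(raw, frac_bits=8):
    """Decode an integer produced by to_q back to a float."""
    return raw / (1 << frac_bits)


encoded = to_q(0.3)
decoded = from_q(encoded)
```

The round trip is lossy: with 8 fractional bits, the decoded value can differ from the input by up to 1/512, or slightly more for values clamped near 1.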
We may be trying to solve too many problems at once here, and there may be a reason so many standards just avoid dealing with this problem, but that doesn't exactly help people who need to represent something specific like currency that has two decimal places. If we wanted to just address the decimal problem, it could be with a new constraint of Another solution I've seen suggested is
@roll I'd prefer doing the simplest thing possible at present and use
@roll I'm happy for you to run with a recommendation on this one, trial it in the Frictionless framework, and, if you have something working well, for us to "standardize" it, as it were.
Hello - any current guidance on the use-case I described above? We have still not committed to a particular strategy for our currency columns.
There is another issue for currency types: #352. In general I'd prefer to stick to an existing standard of data types, such as XML Schema Datatypes or spreadsheet datatypes, instead of inventing our own solution.
Overview
At the moment we have (using Python as a platform example, but it's similar in other languages):
- integer -> int (Python)
- number -> decimal.Decimal (Python) - guaranteed to have 100% precision
We don't have a type to represent numbers when it's OK to lose some precision in favor of calculation speed. Decimals are really slow in Python/JavaScript/etc.
Here is a root issue with benchmarking - frictionlessdata/frictionless-py#461
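To make the trade-off concrete, here is a small standard-library sketch contrasting the two representations. It is not the benchmark from the linked issue; the sample data and repetition count are arbitrary, and timings will vary by machine.

```python
from decimal import Decimal
import timeit

# Precision: binary floats cannot represent 0.1 or 0.2 exactly, Decimal can.
float_sum = 0.1 + 0.2
decimal_sum = Decimal("0.1") + Decimal("0.2")
print(float_sum == 0.3)                # False
print(decimal_sum == Decimal("0.3"))   # True

# Speed: parse and sum the same strings with each type.
values = [str(i / 100) for i in range(10_000)]
float_time = timeit.timeit(lambda: sum(float(v) for v in values), number=20)
decimal_time = timeit.timeit(lambda: sum(Decimal(v) for v in values), number=20)
print(f"float: {float_time:.4f}s, Decimal: {decimal_time:.4f}s")
```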