data.table principles #5693
Comments
Low memory usage.
Thanks, @jangorecki. I guess 6 was more of a goal than a current practice? I really like the addition of low memory usage. Will include that.
I don't think we want to maintain multiple languages. Chinese, Russian, Spanish/Portuguese is a reasonable maximum IMO.
Good list.
@DavidArenburg only error/warning/verbose messages. We have them translated to Chinese for the Chinese locale in a user session. Point 4 is about backward compatibility (in our API). Point 5 extends it to running on old R versions.
I would possibly add comprehensive documentation to the list. I haven't actually seen a package documented better than DT. Many just write a minimal manual and put more info into vignettes, which is indirect documentation when it comes to the description of a function, its return value, etc. Vignettes should be accompanying documentation, not the main documentation.
About international/multilingual/translations, it is true that only Chinese is supported in current message translations. Going forward in the next two years, I plan to invite more translators (of messages and docs), and I actually have money to pay them (20 translation projects, US$500 each). I expect that whoever contributes the initial translation may be interested in maintaining it in the future. The goal of the translation effort is to increase the number of potential users and contributors in the data.table ecosystem.
I wonder if you could please clarify point 4? Maybe change "Few breaking changes" to "Few breaking changes, to make it easy for other packages to use data.table". Is that what you meant?
On point 4: yes, in my experience DT was always very careful about any releases with breaking changes that would require changes to other packages/code bases. There could be a better way of phrasing it, but that was the idea behind it.
@jangorecki I agree. I'll add comprehensive documentation to the list as number 8.
Maybe consider including something about readability/usability. This could be its own principle, or part of principle 3, e.g. "Concise syntax (minimal redundancy in code), while maintaining readability and ease of use". The reason I suggest this is that data.table seems to have a reputation for being fast but relatively difficult to learn and use. I sometimes see comments like (paraphrasing) "tidyverse is fantastic, and in situations where speed is really important, there's data.table", as though the only advantage of data.table is its speed. Maybe also consider adding something about extensive functionality, unless this goes without saying.
It boils down to where you as a user are coming from. I understand your point well, and am observing the same. It's just that if we want to counter some judgments about data.table syntax, marketed at some point by a new project that was targeting a less technical audience, then we should try to do it very precisely. @arunsrinivasan made a nice comment on syntax in his SO answer here: https://stackoverflow.com/a/27718317/2490497
Another good point.
@markseeto for the extensive functionality, I think that makes a lot of sense. As I think about it, there is definitely some overlap with concise syntax as there are a bunch of things that can be done without going away from the @jangorecki thanks for that link. Feel like that answer should be turned into a blog post or something too. So much gold in that. Also, I think your point of it naturally fitting with SQL (relational databases) is one of its immediate strengths in learning the code. Was wondering if there is a principle there, potentially? Like "syntactic overlap with data analytics, engineering, and mathematics" or something like that?
@TysonStanley For "extensive functionality", what I'm thinking of is separate from concise syntax, although I agree that there is some overlap. I'm thinking of the ability to do an extensive range of useful operations with the data, whether that's with
I would say 'Comprehensive and accessible documentation'. I think we strive for Rd/vignettes, error/warning messages, and NEWS entries that are both technically complete and user-friendly.
Is the list meant to be numbered? I.e. are these principles ranked? If so, putting computational & memory efficiency in the same bullet makes sense to me.
@MichaelChirico thanks! It's not necessarily ranked, so I made it bullets instead. And updated it with your suggestions.
I feel like the international/multilingual bullet point could be deleted, since that is the "accessible" part of "Comprehensive and accessible documentation"?
I think it would clarify/simplify to combine "Few breaking changes" with "Backward compatibility", since they both are about stability of the code. How about "Stable code base (easy for users to upgrade to new data.table, and compatible with old R versions)"?
@MichaelChirico "I made sure to make note of other community members offering translations in other languages, those are: Vietnamese, French, Russian, Portuguese, Farsi, Turkish, Hindi. That's already 4 years ago, so of course would need to check their interest again." -> could you please send me their contact info, so I can ask if they would be interested in applying for translation project awards?
I shared the Google doc with you. It was mostly twitter replies.
I'd definitely add clear error messages that provide underlying causes, explanations and possible solutions. I.e. not Your errors have helped me convert a few users.
If Brazilian Portuguese is to be one of the languages, please contact me. I used to translate GNOME to pt_BR and even coordinated the national l10n team until I decided to focus on activities closer to my profession (medicine), which eventually came to mean doing research, which is how I know data.table. I'm not necessarily offering myself (although the money is tempting), but I can find one or another competent free software translator here and help them as needed. Edit: now I see someone else volunteered already, so I guess they should probably be the first choice.
FWIW Mandarin took a team of 26 translators. It's a rather sizeable pool of messages to translate, so having >1 hand available will be appreciated.
As part of #5676 we would also like to compile a list of principles tied to data.table. This will be incorporated into other material over the next few months, but we wanted to see what you all thought about the list we have initially put together. Anything you'd add to this list? Anything you'd argue does not belong on the list?
Thanks!