API CLI output formatting that is unix-idiomatic and machine-friendly #4348

Closed
dmos62 opened this issue Jul 1, 2020 · 7 comments

@dmos62
Contributor

dmos62 commented Jul 1, 2020

The new gRPC API CLI client will output various types of data, and we want the output format, like the rest of the CLI client, to be unix-idiomatic. By unix-idiomatic, I mean following the unix philosophy, which most notably includes interoperating well with other standard unix tools. The idea is to keep the output logic simple and to cover complex use cases by piping to other simple, standard tools.

Table format

Currently, the output of the getoffers call looks more or less like this (the output below is a bit dated; some things, like the datetime format, have changed, but I currently don't have access to my dev workspace, so I'm using this):

Buy/Sell  Price in USD for 1 BTC  BTC(min - max)            USD(min - max)  Payment Method  Creation Date             ID
BUY                   9,485.9500  0.03250000 - 0.06250000        308 - 593  Cash Deposit    Jun 21, 2020 10:49:46 PM  nywdxnx-e77596af-c7e9-4df0-b2cd-86a83eb717d1-135
BUY                   9,675.6690  0.07860000                           761  Zelle           Jun 22, 2020 10:27:40 AM  CCQZCI-fa41db95-f002-4ebe-83b0-c2be2d5573a6-134

The above output is not easy to destructure automatically. For example, splitting columns on spaces does not work, because some cell values (e.g. Cash Deposit or the creation date) contain spaces themselves:

#!/bin/bash

function drop_header {
	tail -n +2
}

function squeeze_spaces {
	tr -s " "
}

function space_delimited_table {
	column -t -s " "
}

cat current-getoffers-output.txt \
| drop_header \
| squeeze_spaces \
| space_delimited_table

Outputs a garbled table:

BUY  9,485.9500  0.03250000  -    0.06250000  308  -    593   Cash      Deposit  Jun                                              21,  2020  10:49:46  PM  nywdxnx-e77596af-c7e9-4df0-b2cd-86a83eb717d1-135
BUY  9,675.6690  0.07860000  761  Zelle       Jun  22,  2020  10:27:40  AM       CCQZCI-fa41db95-f002-4ebe-83b0-c2be2d5573a6-134

The motivation behind the current formatting is human friendliness, but, after giving it more thought, true developer friendliness requires output that is easy to interpret programmatically.

We can use a simpler columnar format with explicit delimiters (only whitespace between columns has been changed):

Buy/Sell|Price in USD for 1 BTC|BTC(min - max)|USD(min - max)|Payment Method|Creation Date|ID
BUY|9,485.9500|0.03250000 - 0.06250000|308 - 593|Cash Deposit|Jun 21, 2020 10:49:46 PM|nywdxnx-e77596af-c7e9-4df0-b2cd-86a83eb717d1-135    
BUY|9,675.6690|0.07860000|761|Zelle|Jun 22, 2020 10:27:40 AM|CCQZCI-fa41db95-f002-4ebe-83b0-c2be2d5573a6-134

To format into a human friendly table:

#!/bin/bash

function custom_delimiter_table {
	column -t -s "|"
}

cat proposed-getoffers-output.txt \
| custom_delimiter_table

Output:

Buy/Sell  Price in USD for 1 BTC  BTC(min - max)           USD(min - max)  Payment Method  Creation Date             ID
BUY       9,485.9500              0.03250000 - 0.06250000  308 - 593       Cash Deposit    Jun 21, 2020 10:49:46 PM  nywdxnx-e77596af-c7e9-4df0-b2cd-86a83eb717d1-135    
BUY       9,675.6690              0.07860000               761             Zelle           Jun 22, 2020 10:27:40 AM  CCQZCI-fa41db95-f002-4ebe-83b0-c2be2d5573a6-134

This requires that the data not contain the delimiter, which might seem fiddly, but it can be solved by letting the user specify a custom delimiter (e.g. |||||). Unless the data in question might contain malicious input, sanitization shouldn't be necessary.
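
As a minimal sketch of what the delimited format makes possible, assuming the proposed output arrives exactly as shown above (seven |-separated columns), selecting the IDs of all BUY offers becomes a single awk call. The helper name ids_for_direction is hypothetical and only used for illustration:

```bash
#!/bin/bash

# Skip the header row.
function drop_header {
	tail -n +2
}

# Print the ID (7th field) of every row whose first field equals $1.
# The field positions assume the proposed 7-column layout shown above.
function ids_for_direction {
	awk -F '|' -v dir="$1" '$1 == dir { print $7 }'
}

# Sample input copied from the proposed format above.
sample='Buy/Sell|Price in USD for 1 BTC|BTC(min - max)|USD(min - max)|Payment Method|Creation Date|ID
BUY|9,485.9500|0.03250000 - 0.06250000|308 - 593|Cash Deposit|Jun 21, 2020 10:49:46 PM|nywdxnx-e77596af-c7e9-4df0-b2cd-86a83eb717d1-135
BUY|9,675.6690|0.07860000|761|Zelle|Jun 22, 2020 10:27:40 AM|CCQZCI-fa41db95-f002-4ebe-83b0-c2be2d5573a6-134'

printf '%s\n' "$sample" | drop_header | ids_for_direction BUY
```

This prints the two offer IDs, one per line. Cells that contain spaces (Cash Deposit, the creation date) no longer interfere, because only | separates fields.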

Column complexity

Another concern is that some columns are difficult to machine parse, because they're optimized for space efficiency and general human readability.

For example, the column titled BTC(min - max) can contain either the offer quantity or the offer quantity range, which means it can contain two different types of data: a hyphen-delimited tuple of floats (0.03250000 - 0.06250000) or a single float (0.03250000).

The column BTC(min - max) can be unpacked into two columns: minimum BTC quantity and maximum BTC quantity. That way we have the same information, but the complexity is reduced (both columns are just floats). When the quantity is a range, the new columns will have different values; otherwise, they will be the same. The same goes for USD(min - max).
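
To illustrate that the two-column form carries the same information, here is a rough sketch that unpacks one "min - max" cell into separate minimum and maximum fields. It assumes |-delimited data rows without a header; the function name unpack_range_column is made up for this example:

```bash
#!/bin/bash

# Replace field number $1 (1-based) with two fields: minimum and maximum.
# A cell holding a single value yields the same value for both.
function unpack_range_column {
	awk -F '|' -v OFS='|' -v col="$1" '{
		n = split($col, parts, " - ")
		min = parts[1]
		max = (n == 2) ? parts[2] : parts[1]
		$col = min OFS max
		print
	}'
}

printf '%s\n' \
	'BUY|9,485.9500|0.03250000 - 0.06250000|308 - 593' \
	'BUY|9,675.6690|0.07860000|761' \
| unpack_range_column 3
```

The first row becomes BUY|9,485.9500|0.03250000|0.06250000|308 - 593 and the second BUY|9,675.6690|0.07860000|0.07860000|761, so every quantity cell is a plain float.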

More data in rows

There's some information, like the offer pair (e.g. BTC and USD), that's embedded in the header but not in the data rows. That makes it impossible to mix multiple currency pairs in the same table. If the pair information were in the rows, you could query getoffers for multiple pairs at once, or greatly simplify subscribing to push updates (one subscription for any number of pairs); in short, a table that stores all information in the rows is easier to work with.

Instead of Price in USD for 1 BTC|BTC(min - max)|USD(min - max), we could have:

Price|Base currency|Base currency quantity(min - max)|Quote currency|Quote currency quantity(min - max)

or, with the column unpacking proposed above:

Price|Base currency|Minimum base currency quantity|Maximum base currency quantity|Quote currency|Minimum quote currency quantity|Maximum quote currency quantity

Some notes:

  • some abbreviations might be useful;
  • also, notice that Price in USD for 1 BTC implies it is the value of base currency being quoted against the quote currency, so I shortened it to Price;
  • see this article for definitions of currency pairs, base currency and quote currency: https://www.investopedia.com/terms/c/currencypair.asp ;
  • notice that the last example is considerably more verbose than the current table format, but it is much easier to machine parse (all cells are simple types with simple semantics); ease and simplicity of setting up machine parsing is the priority.
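
A small sketch of why pair information in the rows helps: with the seven-column layout above, one table can hold several pairs and still be filtered trivially. The function name offers_for_pair is hypothetical, and the EUR row is invented purely for illustration (the two USD rows reuse the values from the earlier examples):

```bash
#!/bin/bash

# Keep only rows whose base currency (field 2) and quote currency
# (field 5) match the given pair. Assumes the 7-column layout:
# Price|Base|Min base qty|Max base qty|Quote|Min quote qty|Max quote qty
function offers_for_pair {
	awk -F '|' -v base="$1" -v quote="$2" '$2 == base && $5 == quote'
}

# The last row is a made-up example, not real offer data.
rows='9,485.9500|BTC|0.03250000|0.06250000|USD|308|593
9,675.6690|BTC|0.07860000|0.07860000|USD|761|761
8,400.0000|BTC|0.01000000|0.02000000|EUR|84|168'

printf '%s\n' "$rows" | offers_for_pair BTC USD
```

Only the two BTC/USD rows are printed; the same table could be filtered for BTC/EUR without a second query.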
@dmos62
Contributor Author

dmos62 commented Jul 7, 2020

The above thinking led me to a human-unfriendly format, but @ghubstan pointed out that some of us will want to use the API interactively (as in calling it from a terminal emulator; see [0] for a definition of the term). That leads to slightly more complicated output logic, where you can choose between human-optimized and machine-optimized output.

The simplest solution I see is to have the machine-optimized output as outlined above, but have an optional post-processing step (in terms of implementation) that would make the output more terse, merging multiple simple columns into one complex column, and thus more human friendly.

So, for a human, you'd have something close to Price in USD for 1 BTC|BTC(min - max)|USD(min - max), and, for a machine, something closer to Price|Base currency|Minimum base currency quantity|Maximum base currency quantity|Quote currency|Minimum quote currency quantity|Maximum quote currency quantity.

The "optional post-processing step" part is important, because it's the fact that the human friendly version is a derivative of the simple, machine-friendly version that makes the implementation simple (no duplicated logic).
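
As a sketch of such a post-processing step, assuming |-delimited machine output in which a minimum column is immediately followed by its maximum column, the two can be merged back into one terse cell and then aligned for reading. The names pack_range_columns and human_table are hypothetical:

```bash
#!/bin/bash

# Merge field $1 (minimum) and the field after it (maximum) into one cell:
# "min - max" when they differ, a single value when they are equal.
function pack_range_columns {
	awk -F '|' -v OFS='|' -v col="$1" '{
		cell = ($col == $(col + 1)) ? $col : $col " - " $(col + 1)
		out = ""; sep = ""
		for (i = 1; i <= NF; i++) {
			if (i == col + 1) continue   # maximum is already part of cell
			out = out sep (i == col ? cell : $i)
			sep = OFS
		}
		print out
	}'
}

# Align the delimited rows into a table for a human reader.
function human_table {
	column -t -s "|"
}

printf '%s\n' \
	'BUY|9,485.9500|0.03250000|0.06250000' \
	'BUY|9,675.6690|0.07860000|0.07860000' \
| pack_range_columns 3
```

The rows come out as BUY|9,485.9500|0.03250000 - 0.06250000 and BUY|9,675.6690|0.07860000; piping the result through human_table gives the aligned view. The human-friendly form is derived entirely from the machine-friendly one, so no formatting logic is duplicated.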

Another alternative to keep in mind

Another option is to use a headerless table format, which is common amongst unix tools (like mount or git remote -v). It puts keys and values on the same row, which often makes the table easier to read (rows look more like terse sentences than rows in a table) and makes it easy to query with very basic parsing:

> mount | grep "type cgroup" | grep "cpu"
cgroup on /sys/fs/cgroup/cpu,cpuacct type cgroup (rw,nosuid,nodev,noexec,relatime,cpu,cpuacct)
cgroup on /sys/fs/cgroup/cpuset type cgroup (rw,nosuid,nodev,noexec,relatime,cpuset)

Supporting both a headered table and a headerless table would be simple enough, but we'll only investigate adopting it if the headered table format described earlier is not human friendly enough.
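
One possible way to derive a headerless view from the headered one, sketched under the assumption that the first input line is the |-delimited header. The key/value layout chosen here (name, space, value, separated by "; ") is only an illustration, not a proposed format, and headerless_rows is a made-up name:

```bash
#!/bin/bash

# Turn a headered, pipe-delimited table into headerless rows in which
# every value is preceded by its column name.
function headerless_rows {
	awk -F '|' '
		NR == 1 { for (i = 1; i <= NF; i++) key[i] = $i; next }
		{
			row = ""
			for (i = 1; i <= NF; i++)
				row = row (i > 1 ? "; " : "") key[i] " " $i
			print row
		}'
}

printf '%s\n' \
	'Price|Base currency|Quote currency' \
	'9,485.9500|BTC|USD' \
| headerless_rows
```

This prints Price 9,485.9500; Base currency BTC; Quote currency USD, which can be searched with plain grep in the same way as the mount output above.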

"What would be more unix-idiomatic?"

Unix has prominent tools that use headered tables and others that use headerless tables. The mount example above is a headerless table; ps and ps aux are good examples of headered tables. Which format fits a use case better seems to be a function of the number of columns. For example, ps could easily be a headerless table, but ps aux has so many columns that a headerless table would be too verbose for a human:

> ps
    PID TTY          TIME CMD
   4958 pts/1    00:00:00 fish
   9527 pts/1    00:00:00 ps
> ps aux
USER         PID %CPU %MEM    VSZ   RSS TTY      STAT START   TIME COMMAND
root           1  0.0  0.2 167768 10240 ?        Ss   15:29   0:06 /sbin/init splash
root           2  0.0  0.0      0     0 ?        S    15:29   0:00 [kthreadd]
root           3  0.0  0.0      0     0 ?        I<   15:29   0:00 [rcu_gp]

[0] https://unix.stackexchange.com/questions/43385/what-do-you-mean-by-interactive-shell

@stale

stale bot commented Oct 10, 2020

This issue has been automatically marked as stale because it has not had recent activity. It will be closed if no further activity occurs. Thank you for your contributions.

@stale stale bot added the was:dropped label Oct 10, 2020
@cd2357
Contributor

cd2357 commented Oct 10, 2020

One way to cover both is with an optional parameter like --format json or --format csv.

Outputs could then be human-readable by default for the interactive shell, but could be made machine-readable with just one flag.

@stale stale bot removed the was:dropped label Oct 10, 2020
@cd2357 cd2357 added the in:api label Oct 10, 2020
@stale

stale bot commented Jan 10, 2021

This issue has been automatically marked as stale because it has not had recent activity. It will be closed if no further activity occurs. Thank you for your contributions.

@stale

stale bot commented Jun 11, 2021

This issue has been automatically marked as stale because it has not had recent activity. It will be closed if no further activity occurs. Thank you for your contributions.

@stale

stale bot commented Apr 16, 2022

This issue has been automatically marked as stale because it has not had recent activity. It will be closed if no further activity occurs. Thank you for your contributions.

@stale stale bot added the was:dropped label Apr 16, 2022
@stale

stale bot commented Apr 28, 2022

This issue has been automatically closed because of inactivity. Feel free to reopen it if you think it is still relevant.

@stale stale bot closed this as completed Apr 28, 2022

3 participants