©2022,2023 by Tim Menzies
data.lua : an example csv reader script
(c)2022, Tim Menzies <timm@ieee.org>, BSD-2
USAGE: data.lua [OPTIONS] [-g ACTION]

OPTIONS:
  -d  --dump  on crash, dump stack = false
  -f  --file  name of file         = ../etc/data/auto93.csv
  -g  --go    start-up action      = data
  -h  --help  show help            = false
  -s  --seed  random number seed   = 937162211

ACTIONS:
  -g  the    show settings
  -g  sym    check syms
  -g  num    check nums
  -g  csv    read from csv
  -g  data   read DATA csv
  -g  stats  stats from DATA
- obj(s:str) ⇒ t
-
create a klass and a constructor
Summarize a stream of symbols.
- SYM.new(i, at, txt) ⇒ SYM
-
constructor
- SYM.add(i, x) ⇒ nil
-
update counts of things seen so far
- SYM.mid(i, x) ⇒ n
-
return the mode
- SYM.div(i, x) ⇒ n
-
return the entropy
- SYM.rnd(i, x, n:num) ⇒ s
-
return x unchanged (SYMs do not get rounded)
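As an illustration only, a symbol summary along these lines might be sketched as follows. This is not the author's code: the field names (n, has, most, mode), the use of "?" as a missing-value marker, and a constructor that takes (at, txt) directly instead of going through obj are all assumptions made for the sketch.

```lua
-- Hypothetical stand-alone SYM; field names are assumptions, not data.lua's.
local SYM = {}
SYM.__index = SYM

function SYM.new(at, txt)
  return setmetatable({at = at or 0, txt = txt or "",
                       n = 0, has = {}, most = 0, mode = nil}, SYM)
end

-- Update the counts of things seen so far.
function SYM.add(i, x)
  if x ~= "?" then                      -- assumption: "?" means "missing"
    i.n = i.n + 1
    i.has[x] = 1 + (i.has[x] or 0)
    if i.has[x] > i.most then i.most, i.mode = i.has[x], x end
  end
end

-- Central tendency of symbols: the most frequent one.
function SYM.mid(i) return i.mode end

-- Diversity of symbols: entropy in bits.
function SYM.div(i)
  local e = 0
  for _, count in pairs(i.has) do
    local p = count / i.n
    e = e - p * math.log(p) / math.log(2)
  end
  return e
end

-- Symbols are returned as-is; the number of places n is ignored.
function SYM.rnd(i, x, n) return x end
```

For the symbols a,a,a,a,b,b,c this sketch reports mode "a" and an entropy of about 1.379 bits.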
Summarizes a stream of numbers.
- NUM.new(i, at, txt) ⇒ NUM
-
constructor
- NUM.add(i, n:num) ⇒ NUM
-
add n; update lo, hi, and the running totals needed for standard deviation
- NUM.mid(i, x) ⇒ n
-
return mean
- NUM.div(i, x) ⇒ n
-
return standard deviation using Welford's algorithm http://.ly/nn_W
- NUM.rnd(i, x, n:num) ⇒ n
-
return number, rounded
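A minimal sketch of a numeric summary using Welford's incremental update is shown below. It is an illustration under stated assumptions rather than the script's actual code: the field names (n, mu, m2, lo, hi), the "?" missing-value marker, the default of 3 decimal places in rnd, and the (at, txt) constructor are all hypothetical.

```lua
-- Hypothetical stand-alone NUM; field names are assumptions, not data.lua's.
local NUM = {}
NUM.__index = NUM

function NUM.new(at, txt)
  return setmetatable({at = at or 0, txt = txt or "", n = 0, mu = 0, m2 = 0,
                       lo = math.huge, hi = -math.huge}, NUM)
end

-- Welford's update: one pass, no need to store the numbers themselves.
function NUM.add(i, n)
  if n ~= "?" then                      -- assumption: "?" means "missing"
    i.n  = i.n + 1
    local d = n - i.mu                  -- distance to the old mean
    i.mu = i.mu + d / i.n
    i.m2 = i.m2 + d * (n - i.mu)        -- uses old and new mean
    i.lo = math.min(i.lo, n)
    i.hi = math.max(i.hi, n)
  end
  return i
end

function NUM.mid(i) return i.mu end

-- Sample standard deviation (zero when fewer than two numbers were seen).
function NUM.div(i)
  if i.n < 2 then return 0 end
  return (i.m2 / (i.n - 1)) ^ 0.5
end

-- Round x to n decimal places (default of 3 is an assumption).
function NUM.rnd(i, x, n)
  if x == "?" then return x end
  local mult = 10 ^ (n or 3)
  return math.floor(x * mult + 0.5) / mult
end
```

For the numbers 1,1,1,1,2,2,3 this sketch gives a mean of 11/7 and a standard deviation of about 0.787.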
Factory for managing a set of NUMs or SYMs
- COLS.new(i, t:tab) ⇒ COLS
-
generate NUMs and SYMs from column names
- COLS.add(i, row:ROW) ⇒ nil
-
update the (not skipped) columns with details from row
Store one record.
- ROW.new(i, t:tab) ⇒ ROW
Store many rows, summarized into columns
- DATA.new(i, src:str) ⇒ DATA
-
A container of i.rows, to be summarized in i.cols
- DATA.add(i, t:tab) ⇒ nil
-
add a new row, update column headers
- DATA.clone(i, init?) ⇒ DATA
-
return a DATA with the same structure as i.
- DATA.stats(i, what?, cols:tab?, nPlaces:{num}?) ⇒ t
-
reports mid or div of cols (defaults to i.cols.y)
- rint(lo, hi) ⇒ n
-
an integer in lo..hi-1
- rand(lo, hi) ⇒ n
-
a float x where lo <= x < hi
Note the following conventions for functions passed to map or kap.
- If the function returns nil as its first result, that means "skip this result".
- If it returns nil as its second result, that means place the result at position size+1 in the output.
- Else, the second result is the key where we store the function output.
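The three conventions above can be sketched in a few lines. This is one possible reading of them, not necessarily how data.lua implements map and kap; in particular, building map on top of kap is a design choice made here for brevity.

```lua
-- Sketch: apply fun(k, v) to every entry of t.
local function kap(t, fun)
  local u = {}
  for k, v in pairs(t) do
    local v1, k1 = fun(k, v)
    if v1 ~= nil then              -- nil first result: skip this entry
      u[k1 or (1 + #u)] = v1       -- nil second result: store at size+1
    end                            -- otherwise: store under the key k1
  end
  return u
end

-- Sketch: apply fun(v) to every value of t, same conventions as kap.
local function map(t, fun)
  return kap(t, function(_, v) return fun(v) end)
end

-- Keep only even numbers (odd ones return nothing, so they are skipped).
local evens = map({1, 2, 3, 4}, function(v)
  if v % 2 == 0 then return v end
end)

-- Re-key the results by returning a second value.
local shout = kap({a = 1, b = 2}, function(k, v)
  return v * 10, k:upper()
end)
```

Here evens holds two items and shout holds the keys A and B. Because pairs does not promise an order, a caller that cares about order would sort the result afterwards.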
- map(t:tab, fun:fun) ⇒ t
-
map a function fun(v) over list (skip nil results)
- kap(t:tab, fun:fun) ⇒ t
-
map function fun(k,v) over list (skip nil results)
- sort(t:tab, fun:fun) ⇒ t
-
return t, sorted by fun (default= <)
- keys(t:tab) ⇒ ss
-
return list of table keys, sorted
- push(t:tab, x) ⇒ any
-
push x to end of list; return x
- fmt(sControl:str, ...) ⇒ str
-
emulate printf
- oo(t:tab) ⇒ t
-
print t then return it
- o(t:tab, isKeys:{bool}) ⇒ s
-
convert t to a string. sort named keys.
- coerce(s:str) ⇒ any
-
return int or float or bool or string from s
- csv(sFilename:str, fun:fun) ⇒ nil
-
call fun on rows (after coercing cell text)
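To make coerce and csv concrete, here is a small sketch of both. It assumes comma-separated cells with no quoting or escaping, and it relies on tonumber to tell integers from floats (true in Lua 5.3 and later); the real script may handle more cases.

```lua
-- Sketch: turn cell text into a number, a boolean, or a trimmed string.
local function coerce(s)
  local function other(s1)
    if s1 == "true"  then return true  end
    if s1 == "false" then return false end
    return s1
  end
  return tonumber(s) or other(s:match("^%s*(.-)%s*$"))
end

-- Sketch: call fun on each row of the file, one table of cells per line.
-- Assumption: plain commas only; empty cells are skipped by this pattern.
local function csv(sFilename, fun)
  local src = assert(io.open(sFilename))
  for line in src:lines() do
    local t = {}
    for cell in line:gmatch("([^,]+)") do t[1 + #t] = coerce(cell) end
    fun(t)
  end
  src:close()
end
```

A caller might collect rows with `csv(name, function(t) rows[#rows+1] = t end)`; the header line arrives as a row of strings, and later lines arrive with numbers already converted.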
- settings(s:str) ⇒ t
-
parse help string to extract a table of options
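One way to read options out of a help string is to look for lines that contain a long flag and end in "= value". The sketch below does that with a single Lua pattern; the pattern and the small coerce helper are assumptions for illustration, not the script's own code.

```lua
-- Sketch helper: text to number, boolean, or trimmed string.
local function coerce(s)
  if s == "true"  then return true  end
  if s == "false" then return false end
  return tonumber(s) or s:match("^%s*(.-)%s*$")
end

-- Sketch: for every "--key ... = value" on a line, store t[key] = value.
local function settings(s)
  local t = {}
  for k, v in s:gmatch("%-%-(%S+)[^\n]-=%s*(%S+)") do t[k] = coerce(v) end
  return t
end

local help = [[
OPTIONS:
  -d  --dump  on crash, dump stack = false
  -f  --file  name of file         = ../etc/data/auto93.csv
  -s  --seed  random number seed   = 937162211
]]
local the = settings(help)
```

With this help text, the.dump is the boolean false, the.file is a string, and the.seed is a number. Lines without a long flag (such as the ACTIONS list) are ignored, which keeps the help text as the single source of defaults.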
- cli(options:tab) ⇒ t
-
update key,vals in options from command-line flags
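A sketch of how such an update might work is given below. The second parameter, args, is hypothetical and added only so the sketch can be exercised without a real command line; the rule that boolean flags flip while other flags take the next argument is also an assumption.

```lua
-- Sketch helper: text to number, boolean, or trimmed string.
local function coerce(s)
  if s == "true"  then return true  end
  if s == "false" then return false end
  return tonumber(s) or s:match("^%s*(.-)%s*$")
end

-- Sketch: match each option key against "-k" (first letter) or "--key".
local function cli(options, args)
  args = args or arg or {}        -- `arg` is set by the stand-alone interpreter
  for k, v in pairs(options) do
    local s = tostring(v)
    for n, x in ipairs(args) do
      if x == "-" .. k:sub(1, 1) or x == "--" .. k then
        if     s == "false" then s = "true"      -- boolean flags flip
        elseif s == "true"  then s = "false"
        else   s = args[n + 1] or s end          -- others read the next word
      end
    end
    options[k] = coerce(s)
  end
  return options
end

local the = cli({dump = false, seed = 937162211, file = "a.csv"},
                {"-d", "--seed", "42", "-f", "b.csv"})
```

After this call the.dump is true, the.seed is 42, and the.file is "b.csv". Short flags use only the first letter of the key, so two options that start with the same letter would clash in this sketch.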
main fills in the settings, updates them from the command line, and runs
the start-up actions (before each run, it resets the random number seed and the settings).
Finally, it returns the number of tests that crashed to the operating system.
- main(options:tab, help, funs:{fun}) ⇒ nil
-
main program
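The run loop described for main could be sketched roughly as below. Everything here is an assumption made for illustration: the name run, the "all" action, the use of pcall and math.randomseed, and treating a false return value as a failure. A real main would pass the returned count to os.exit.

```lua
-- Sketch: run the selected actions, resetting state before each one.
local function run(options, funs)
  local saved, fails = {}, 0
  for k, v in pairs(options) do saved[k] = v end       -- remember defaults
  local names = {}
  for name in pairs(funs) do names[#names + 1] = name end
  table.sort(names)                                    -- stable run order
  for _, name in ipairs(names) do
    if options.go == "all" or options.go == name then
      for k, v in pairs(saved) do options[k] = v end   -- reset settings
      math.randomseed(options.seed)                    -- reset random seed
      local ok, result = pcall(funs[name])
      if not ok or result == false then
        fails = fails + 1
        io.stderr:write("FAIL ", name, "\n")
      end
    end
  end
  return fails                                         -- e.g. os.exit(fails)
end
```

Resetting the settings before each action means one action cannot leak a changed option into the next, so failures stay reproducible from the seed alone.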
- eg(key, str:str, fun:fun) ⇒ nil
-
register an example.