home roadmap issues ©2022,2023 by tim menzies

data.lua

   
data.lua : an example csv reader script
(c)2022, Tim Menzies <timm@ieee.org>, BSD-2 

USAGE:   data.lua  [OPTIONS] [-g ACTION]

OPTIONS:
  -d  --dump  on crash, dump stack = false
  -f  --file  name of file         = ../etc/data/auto93.csv
  -g  --go    start-up action      = data
  -h  --help  show help            = false
  -s  --seed  random number seed   = 937162211

ACTIONS:
  -g  the	show settings
  -g  sym	check syms
  -g  num	check nums
  -g  csv	read from csv
  -g  data	read DATA csv
  -g  stats	stats from DATA

Classes

obj(s:str) ⇒ t

create a klass and a constructor
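As a rough illustration of what "create a klass and a constructor" can mean in Lua, here is a minimal sketch. It assumes the klass is a metatable whose call operator builds an instance and then runs `new` on it. The `POINT` klass and the `"{}"` printer are hypothetical and are not taken from data.lua.

```lua
-- Sketch only: one way to build a klass whose call operator is its constructor.
local function obj(s)
  local t = {}
  t.__index = t                                    -- instances look up methods in t
  t.__tostring = function(_) return s .. "{}" end  -- hypothetical printer
  -- Calling the klass, e.g. POINT(3, 4), makes an instance and runs t.new on it.
  return setmetatable(t, {__call = function(klass, ...)
    local i = setmetatable({}, klass)
    return setmetatable(klass.new(i, ...) or i, klass)
  end})
end

local POINT = obj("POINT")                 -- hypothetical example klass
function POINT.new(i, x, y) i.x, i.y = x, y end
function POINT.sum(i) return i.x + i.y end

local p = POINT(3, 4)
print(p:sum())  --> 7
```

This would explain why the constructors documented below take the instance `i` as their first argument: the instance is allocated before `new` runs.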

SYM

Summarize a stream of symbols.

SYM.new(i, at, txt) ⇒ SYM

constructor

SYM.add(i, x) ⇒ nil

update counts of things seen so far

SYM.mid(i, x) ⇒ n

return the mode

SYM.div(i, x) ⇒ n

return the entropy

SYM.rnd(i, x, n:num) ⇒ s

return x unchanged (SYMs do not get rounded)
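The SYM methods above can be sketched as follows. This is not the code from data.lua. It uses a plain constructor rather than `obj`, and it assumes that `"?"` marks a missing value that should be ignored. The field names (`has`, `most`, `mode`) are illustrative.

```lua
-- Sketch of a symbol summarizer: counts, mode and entropy.
local SYM = {}
SYM.__index = SYM

function SYM.new(at, txt)
  return setmetatable({at = at or 0, txt = txt or "", n = 0,
                       has = {}, most = 0, mode = nil}, SYM)
end

function SYM.add(i, x)
  if x ~= "?" then                      -- assumption: "?" means "missing"
    i.n = i.n + 1
    i.has[x] = 1 + (i.has[x] or 0)
    if i.has[x] > i.most then i.most, i.mode = i.has[x], x end
  end
end

function SYM.mid(i) return i.mode end   -- the most frequent symbol

function SYM.div(i)                     -- entropy in bits: -sum(p * log2(p))
  local e = 0
  for _, n in pairs(i.has) do
    local p = n / i.n
    e = e - p * math.log(p) / math.log(2)
  end
  return e
end

local s = SYM.new()
for _, x in ipairs{"a", "a", "a", "a", "b", "b", "c"} do s:add(x) end
print(s:mid(), s:div())  -- the mode is "a"; the entropy is about 1.38 bits
```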

NUM

Summarizes a stream of numbers.

NUM.new(i, at, txt) ⇒ NUM

constructor

NUM.add(i, n:num) ⇒ NUM

add n, update lo,hi and stuff needed for standard deviation

NUM.mid(i, x) ⇒ n

return mean

NUM.div(i, x) ⇒ n

return standard deviation using Welford's algorithm http://.ly/nn_W

NUM.rnd(i, x, n:num) ⇒ n

return number, rounded
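For readers unfamiliar with Welford's algorithm, the sketch below shows how `add` can keep a running mean and a running sum of squared differences, so that `div` never needs the raw numbers. It is an illustration under assumptions: a plain constructor, `"?"` as the missing-value marker, and the field names `mu` and `m2` are all choices made here, not taken from data.lua.

```lua
-- Sketch of a number summarizer using Welford's incremental update.
local NUM = {}
NUM.__index = NUM

function NUM.new(at, txt)
  return setmetatable({at = at or 0, txt = txt or "", n = 0, mu = 0, m2 = 0,
                       lo = math.huge, hi = -math.huge}, NUM)
end

function NUM.add(i, n)
  if n ~= "?" then                      -- assumption: "?" means "missing"
    i.n = i.n + 1
    local d = n - i.mu                  -- distance to the old mean
    i.mu = i.mu + d / i.n               -- new mean
    i.m2 = i.m2 + d * (n - i.mu)        -- uses both the old and new mean
    i.lo = math.min(n, i.lo)
    i.hi = math.max(n, i.hi)
  end
  return i
end

function NUM.mid(i) return i.mu end

function NUM.div(i)                     -- sample standard deviation
  if i.n < 2 or i.m2 < 0 then return 0 end
  return (i.m2 / (i.n - 1)) ^ 0.5
end

function NUM.rnd(i, x, n)               -- round x to n decimal places
  if x == "?" then return x end
  local mult = 10 ^ (n or 0)
  return math.floor(x * mult + 0.5) / mult
end

local num = NUM.new()
for _, x in ipairs{1, 1, 1, 1, 2, 2, 3} do num:add(x) end
print(num:mid(), num:div())  -- mean is 11/7; standard deviation is about 0.787
```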

COLS

Factory for managing a set of NUMs or SYMs

COLS.new(i, t:tab) ⇒ COLS

generate NUMs and SYMs from column names

COLS.add(i, row:ROW) ⇒ nil

update the (not skipped) columns with details from row
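The text above does not say how column names decide between NUM and SYM, or which columns are skipped. The sketch below therefore assumes a hypothetical naming convention, stated in the comments, purely to show the shape of such a factory. The header names in the example are also made up.

```lua
-- Assumed (hypothetical) header conventions, not confirmed by the text above:
--   leading upper-case letter  => numeric column, else symbolic
--   trailing "X"               => skip this column
--   trailing "+", "-" or "!"   => goal (y) column, else independent (x) column
local function cols(names)
  local out = {names = names, all = {}, x = {}, y = {}}
  for at, txt in ipairs(names) do
    local col = {at = at, txt = txt, isNum = txt:find("^[A-Z]") ~= nil}
    out.all[#out.all + 1] = col
    if not txt:find("X$") then                     -- skipped columns stay out of x and y
      local isGoal = txt:find("[!+%-]$") ~= nil
      local dest = isGoal and out.y or out.x
      dest[#dest + 1] = col
    end
  end
  return out
end

local c = cols{"Age", "name", "IdX", "Weight-", "Score+"}
print(#c.all, #c.x, #c.y)  --> 5  2  2
```

Under this reading, updating the columns from a row means visiting only the columns held in `x` and `y` and feeding each one the cell found at its `at` position.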

ROW

Store one record.

ROW.new(i, t:tab) ⇒ ROW

DATA

Store many rows, summarized into columns

DATA.new(i, src:str) ⇒ DATA

A container of i.rows, to be summarized in i.cols

DATA.add(i, t:tab) ⇒ nil

add a new row, update column headers

DATA.clone(i, init?) ⇒ DATA

return a DATA with the same structure as `i`.

DATA.stats(i, what?, cols:tab?, nPlaces:{num}?) ⇒ t

reports mid or div of cols (defaults to i.cols.y)

Misc support functions

Numerics

rint(lo, hi) ⇒ n

an integer in lo..hi-1

rand(lo, hi) ⇒ n

a float x where lo <= x < hi
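A seeded generator makes runs repeatable. The sketch below assumes a Park-Miller style multiplicative generator (multiplier 16807, modulus 2147483647). The text above does not say which generator data.lua uses, so treat this as one plausible choice. The starting seed is the default shown in the OPTIONS list.

```lua
-- Sketch of a seeded random number generator (assumed, not confirmed, algorithm).
local Seed = 937162211

local function rand(lo, hi)
  lo, hi = lo or 0, hi or 1
  Seed = (16807 * Seed) % 2147483647          -- next state, always in 1..2147483646
  return lo + (hi - lo) * Seed / 2147483647   -- scale into lo <= x < hi
end

local function rint(lo, hi)                   -- an integer in lo..hi-1
  return math.floor(rand(lo, hi))
end

print(rand(), rint(1, 10))
```

Resetting `Seed` to the same value before each run replays the same sequence, which is what makes a failing example reproducible.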

Lists

Note the following conventions for functions passed to map or kap.

  • If a nil first argument is returned, that means "skip this result".
  • If a nil second argument is returned, that means place the result at position size+1 in the output.
  • Else, the second argument is the key where we store the function output.

map(t:tab, fun:fun) ⇒ t

map a function fun(v) over list (skip nil results)

kap(t:tab, fun:fun) ⇒ t

map function fun(k,v) over list (skip nil results)
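The three conventions listed above can be sketched as follows. This is a minimal reading of those rules, not necessarily how data.lua implements them. Here `map` is written as a thin wrapper around `kap`.

```lua
-- kap: call fun(k, v) on every entry; fun returns (value, optional key).
local function kap(t, fun)
  local u = {}
  for k, v in pairs(t) do
    local v1, k1 = fun(k, v)
    if v1 ~= nil then               -- nil first result: skip this entry
      u[k1 or (1 + #u)] = v1        -- nil second result: append at size+1
    end
  end
  return u
end

-- map: same conventions, but fun only sees the value.
local function map(t, fun)
  return kap(t, function(_, v) return fun(v) end)
end

-- Keep only the even numbers, multiplied by 10 (odd ones return nil and are skipped).
local evens = map({1, 2, 3, 4}, function(x) if x % 2 == 0 then return x * 10 end end)

-- Return a second value to choose the key under which the result is stored.
local doubled = kap({a = 1, b = 2}, function(k, v) return v * 2, k end)
print(#evens, doubled.a, doubled.b)  --> 2  2  4
```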

sort(t:tab, fun:fun) ⇒ t

return t, sorted by fun (default= <)

keys(t:tab) ⇒ ss

return list of table keys, sorted

push(t:tab, x) ⇒ any

push x to end of list; return x
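The list helpers `sort`, `keys` and `push` are small enough to sketch in a few lines. This version assumes that all keys passed to `keys` are mutually comparable (for example, all strings), since `table.sort` raises an error on mixed key types.

```lua
local function push(t, x) t[#t + 1] = x; return x end

-- Sort in place and return the table, so that calls can be chained.
local function sort(t, fun) table.sort(t, fun); return t end

local function keys(t)
  local u = {}
  for k in pairs(t) do push(u, k) end
  return sort(u)
end

local ks = keys({b = 1, a = 2, c = 3})
print(table.concat(ks, ","))  --> a,b,c

local desc = sort({3, 1, 2}, function(x, y) return x > y end)
print(table.concat(desc, ","))  --> 3,2,1
```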

Strings

fmt(sControl:str, ...) ⇒ str

emulate printf

oo(t:tab) ⇒ t

print t then return it

o(t:tab, isKeys:{bool}) ⇒ s

convert t to a string. sort named keys.

coerce(s:str) ⇒ any

return int or float or bool or string from s
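One way to write such a coercion is sketched below. It trims surrounding whitespace, checks for the two boolean words, and otherwise lets `tonumber` decide. In Lua 5.3 and later, `tonumber` already returns an integer for `"42"` and a float for `"3.5"`. The exact rules used in data.lua may differ.

```lua
local function coerce(s)
  local s1 = s:match("^%s*(.-)%s*$")    -- trim leading and trailing whitespace
  if s1 == "true" then return true end
  if s1 == "false" then return false end
  return tonumber(s1) or s1             -- a number if it parses, else the trimmed string
end

print(coerce("42"), coerce(" 3.5 "), coerce("true"), coerce("hello"))
```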

csv(sFilename:str, fun:fun) ⇒ nil

call fun on rows (after coercing cell text)
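A minimal sketch of such a reader follows. It assumes cells are separated by plain commas with no quoting, and it silently drops empty cells, both of which are simplifications. The local `coerce` repeats the idea described earlier so that the block is self-contained. The temporary file and its contents are made up for the demonstration.

```lua
local function coerce(s)
  local s1 = s:match("^%s*(.-)%s*$")
  if s1 == "true" then return true end
  if s1 == "false" then return false end
  return tonumber(s1) or s1
end

-- Call fun once per line, passing a table of coerced cells.
local function csv(sFilename, fun)
  local src = assert(io.open(sFilename), "cannot open " .. sFilename)
  for line in src:lines() do
    local t = {}
    for cell in line:gmatch("([^,]+)") do t[#t + 1] = coerce(cell) end
    fun(t)
  end
  src:close()
end

-- Demonstration on a small, made-up file.
local name = os.tmpname()
local f = assert(io.open(name, "w"))
f:write("Age,name,Score+\n31,ann,4.5\n27,bob,?\n")
f:close()

local rows = {}
csv(name, function(t) rows[#rows + 1] = t end)
os.remove(name)
print(#rows, rows[2][1], rows[3][3])  --> 3  31  ?
```

Passing a callback rather than returning a list means a large file never has to be held in memory all at once.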

Main

settings(s:str) ⇒ t

parse help string to extract a table of options
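Because every option line in the help text has the shape `-x  --name  description = default`, the settings can be pulled out with one pattern. The sketch below shows one such pattern. It is an illustration and may not match the exact pattern used in data.lua. The help text is the OPTIONS block shown at the top of this page.

```lua
local function coerce(s)
  local s1 = s:match("^%s*(.-)%s*$")
  if s1 == "true" then return true end
  if s1 == "false" then return false end
  return tonumber(s1) or s1
end

local help = [[
OPTIONS:
  -d  --dump  on crash, dump stack = false
  -f  --file  name of file         = ../etc/data/auto93.csv
  -g  --go    start-up action      = data
  -h  --help  show help            = false
  -s  --seed  random number seed   = 937162211
]]

-- For each line like "  -s  --seed  ...  = 937162211", store t.seed = 937162211.
local function settings(s)
  local t = {}
  for k, v in s:gmatch("\n%s+%-%S+%s+%-%-(%S+)[^\n]+= (%S+)") do
    t[k] = coerce(v)
  end
  return t
end

local the = settings(help)
print(the.file, the.seed, the.dump)
```

Keeping the defaults inside the help string means the documentation and the configuration cannot drift apart.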

cli(options:tab) ⇒ t

update key,vals in options from command-line flags
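A sketch of the command-line update follows. It assumes that a flag matches a key either as `--key` or as `-` plus the key's first letter, that boolean settings are flipped by naming their flag, and that other settings take the next argument as their value. The second parameter `args` is a hypothetical addition that makes the function easy to try without a real command line.

```lua
local function coerce(s)
  local s1 = s:match("^%s*(.-)%s*$")
  if s1 == "true" then return true end
  if s1 == "false" then return false end
  return tonumber(s1) or s1
end

local function cli(options, args)
  args = args or arg or {}                 -- default to Lua's global arg table
  for k, v in pairs(options) do
    local s = tostring(v)
    for n, x in ipairs(args) do
      if x == "-" .. k:sub(1, 1) or x == "--" .. k then
        if s == "false" then s = "true"        -- boolean flags take no argument
        elseif s == "true" then s = "false"
        else s = args[n + 1] or s end          -- other flags use the next argument
      end
    end
    options[k] = coerce(s)
  end
  return options
end

local opts = cli({dump = false, seed = 1, file = "a.csv"},
                 {"-d", "--seed", "42", "-f", "b.csv"})
print(opts.dump, opts.seed, opts.file)  --> true  42  b.csv
```

One consequence of the short-flag rule is that two settings with the same first letter would collide, so setting names need distinct initials.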

main fills in the settings, updates them from the command line, and runs the start-up actions (before each run, it resets the random number seed and the settings). Finally, it returns the number of crashed tests to the operating system.

main(options:tab, help, funs:{fun}) ⇒ nil

main program

eg(key, str:str, fun:fun) ⇒ nil

register an example.
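The control flow described for main can be sketched as below. Several things here are assumptions: `run` is a hypothetical name that returns the failure count instead of exiting, `eg` is simplified to drop the help string, a `go` value of `"all"` is assumed to mean "run everything", crashes are caught with `pcall`, and `math.randomseed` stands in for whatever generator the script uses.

```lua
local egs = {}                                    -- registered examples: name -> function
local function eg(key, fun) egs[key] = fun end    -- simplified: no help string

local function run(options, funs)
  local saved, fails = {}, 0
  for k, v in pairs(options) do saved[k] = v end  -- remember the initial settings
  local names = {}
  for name in pairs(funs) do names[#names + 1] = name end
  table.sort(names)                               -- run in a stable, sorted order
  for _, name in ipairs(names) do
    if options.go == "all" or options.go == name then
      for k, v in pairs(saved) do options[k] = v end   -- reset the settings
      math.randomseed(options.seed)                    -- reset the seed
      local ok, result = pcall(funs[name])
      if not ok or result == false then fails = fails + 1 end
    end
  end
  return fails
end

eg("ok",   function() return true end)
eg("bad",  function() return false end)
eg("boom", function() error("crash") end)

local fails = run({go = "all", seed = 937162211}, egs)
print(fails)  --> 2
```

A real main would finish with `os.exit(fails)`, so that a shell or a continuous-integration job sees a non-zero status whenever an example fails.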