ales-t/quicklines

quicklines

A simple CLI tool for efficient sampling of lines from large files.

Lines are selected at random. By default, the output may contain the same line more than once (i.e. sampling is done with replacement).

Usage:

quicklines -c HOW_MANY_LINES my-huge-file.txt

quicklines will return the requested number of lines from random positions in the input file.

Optionally, you can sample without replacement by using --no-duplicates. Be careful with this option: if you ask for a sample that is too large, the program may run forever (or for a very long time).
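To illustrate why a too-large request can stall, here is a minimal Python sketch of one common way to sample without replacement: draw at random and reject anything already seen. This is an assumption about the general technique, not a description of how quicklines itself is implemented. The names sample_distinct, draw and max_attempts are hypothetical and are not quicklines options.

```python
import random


def sample_distinct(draw, count, max_attempts=None):
    """Collect `count` distinct values by calling `draw()` and rejecting repeats.

    Hypothetical helper for illustration only. With max_attempts=None the loop
    never terminates if fewer than `count` distinct values can be drawn.
    """
    seen = set()
    picked = []
    attempts = 0
    while len(picked) < count:
        if max_attempts is not None and attempts >= max_attempts:
            raise RuntimeError(
                f"gave up after {attempts} draws with "
                f"{len(picked)} of {count} distinct values"
            )
        attempts += 1
        value = draw()
        if value not in seen:  # reject duplicates, keep first occurrences
            seen.add(value)
            picked.append(value)
    return picked


if __name__ == "__main__":
    lines = ["a", "b", "c"]
    rng = random.Random()
    # Three distinct lines exist, so asking for three succeeds.
    print(sample_distinct(lambda: rng.choice(lines), 3, max_attempts=10_000))
    # Asking for four can never succeed; the cap turns the endless loop into an error.
    try:
        sample_distinct(lambda: rng.choice(lines), 4, max_attempts=10_000)
    except RuntimeError as exc:
        print(exc)
```

In a scheme like this, the closer the requested sample size gets to the number of distinct lines in the file, the more draws are rejected. A request that exceeds that number can never be satisfied.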

The implementation relies on mmap to work efficiently.
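As a rough illustration of the idea, the following Python sketch maps a file into memory, picks random byte offsets, and returns the line that contains each offset. It is a sketch under assumptions, not the actual quicklines code: the function names line_at and sample_lines are hypothetical, and whether quicklines picks positions this way is not confirmed by this README.

```python
import mmap
import random


def line_at(buf, offset):
    """Return the line containing byte `offset`, without its trailing newline."""
    # rfind returns -1 when no newline precedes the offset, so start becomes 0.
    start = buf.rfind(b"\n", 0, offset) + 1
    end = buf.find(b"\n", offset)
    if end == -1:  # last line has no trailing newline
        end = len(buf)
    return buf[start:end]


def sample_lines(path, count, rng=random):
    """Sample `count` lines (with replacement) from random byte positions.

    Assumes a non-empty file; mmap cannot map a zero-length file.
    """
    with open(path, "rb") as f:
        with mmap.mmap(f.fileno(), 0, access=mmap.ACCESS_READ) as mm:
            size = len(mm)
            return [line_at(mm, rng.randrange(size)) for _ in range(count)]
```

Only the pages around each chosen offset need to be read from disk, which is what makes this approach cheap on very large files. One caveat of this particular sketch is that choosing uniform byte offsets makes longer lines more likely to be picked than shorter ones.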
