From 25d696a0adb94e875bad7887ad34ac4be4bd6d3f Mon Sep 17 00:00:00 2001
From: Gertjan van den Burg
Date: Sat, 8 Apr 2023 09:29:13 +0100
Subject: [PATCH] CleverCSV Release v0.8.0

* Improve median runtime by ~68% (~52% on average) by: 1) more caching, 2)
  implementing a heavy function in C.
* Redesign computation of consistency measure to a class:
  `ConsistencyDetector`.
* Fix potential memory leak in C code for base abstraction
* Fixes to escape sequences in regexes (thanks to @JakobGM!)
* Various improvements to code quality
* Switch documentation style to [furo](https://pypi.org/project/furo/).
---
 CHANGELOG.md                | 11 +++++++++++
 README.md                   |  6 ++++--
 docs/CHANGELOG.rst          | 13 +++++++++++++
 docs/README.rst             |  6 ++++--
 man/clevercsv-code.1        |  4 ++--
 man/clevercsv-detect.1      |  4 ++--
 man/clevercsv-explore.1     |  4 ++--
 man/clevercsv-help.1        |  4 ++--
 man/clevercsv-standardize.1 |  4 ++--
 man/clevercsv-view.1        |  4 ++--
 man/clevercsv.1             |  4 ++--
 11 files changed, 46 insertions(+), 18 deletions(-)

diff --git a/CHANGELOG.md b/CHANGELOG.md
index 474dc91e..2a33e2ba 100644
--- a/CHANGELOG.md
+++ b/CHANGELOG.md
@@ -1,5 +1,16 @@
 # Changelog
 
+## Version 0.8.0
+
+* Improve median runtime by ~68% (~52% on average) by: 1) more caching, 2)
+  implementing a heavy function in C.
+* Redesign computation of consistency measure to a class:
+  `ConsistencyDetector`.
+* Fix potential memory leak in C code for base abstraction
+* Fixes to escape sequences in regexes (thanks to @JakobGM!)
+* Various improvements to code quality
+* Switch documentation style to [furo](https://pypi.org/project/furo/).
+
 ## Version 0.7.7
 
 * Use r-prefix for regex patterns (thanks to @JakobGM!)
diff --git a/README.md b/README.md
index f67d7827..f9365c4b 100644
--- a/README.md
+++ b/README.md
@@ -226,8 +226,10 @@ with open("data.csv", "r", newline="") as fp:
     rows = list(reader)
 ```
 
-For **large files**, you can speed up detection by supplying a smaller sample
-to the sniffer, for example:
+Since CleverCSV v0.8.0, dialect detection is a lot faster than in previous
+versions. However, for **large files**, you can speed up detection even more
+by supplying a sample of the document to the sniffer instead of the whole
+file, for example:
 ```python
 dialect = clevercsv.Sniffer().sniff(fp.read(10000))
 ```
diff --git a/docs/CHANGELOG.rst b/docs/CHANGELOG.rst
index 97c97a1a..477c3b96 100644
--- a/docs/CHANGELOG.rst
+++ b/docs/CHANGELOG.rst
@@ -2,6 +2,19 @@
 Changelog
 =========
 
+Version 0.8.0
+-------------
+
+
+* Improve median runtime by ~68% (~52% on average) by: 1) more caching, 2)
+  implementing a heavy function in C.
+* Redesign computation of consistency measure to a class:
+  ``ConsistencyDetector``.
+* Fix potential memory leak in C code for base abstraction
+* Fixes to escape sequences in regexes (thanks to @JakobGM!)
+* Various improvements to code quality
+* Switch documentation style to `furo <https://pypi.org/project/furo/>`_.
+
 Version 0.7.7
 -------------
 
diff --git a/docs/README.rst b/docs/README.rst
index daefbc70..23a8218e 100644
--- a/docs/README.rst
+++ b/docs/README.rst
@@ -242,8 +242,10 @@ the Python CSV module:
        reader = clevercsv.reader(fp, dialect)
        rows = list(reader)
 
-For **large files**\ , you can speed up detection by supplying a smaller sample
-to the sniffer, for example:
+Since CleverCSV v0.8.0, dialect detection is a lot faster than in previous
+versions. However, for **large files**\ , you can speed up detection even more
+by supplying a sample of the document to the sniffer instead of the whole
+file, for example:
 
 .. code-block:: python
 
diff --git a/man/clevercsv-code.1 b/man/clevercsv-code.1
index 8ad13240..46742eb9 100644
--- a/man/clevercsv-code.1
+++ b/man/clevercsv-code.1
@@ -2,12 +2,12 @@
 .\" Title: clevercsv-code
 .\" Author: G.J.J. van den Burg
 .\" Generator: Wilderness
-.\" Date: 2023-04-07
+.\" Date: 2023-04-08
 .\" Manual: clevercsv Manual
 .\" Source: clevercsv 0.8.0
 .\" Language: English
 .\"
-.TH "CLEVERCSV-CODE" "1" "2023\-04\-07" "Clevercsv 0\&.8\&.0" "Clevercsv Manual"
+.TH "CLEVERCSV-CODE" "1" "2023\-04\-08" "Clevercsv 0\&.8\&.0" "Clevercsv Manual"
 .\" -----------------------------------------------------------------
 .\" * Define some portability stuff
 .\" -----------------------------------------------------------------
diff --git a/man/clevercsv-detect.1 b/man/clevercsv-detect.1
index 7233bb60..7a0c1a00 100644
--- a/man/clevercsv-detect.1
+++ b/man/clevercsv-detect.1
@@ -2,12 +2,12 @@
 .\" Title: clevercsv-detect
 .\" Author: G.J.J. van den Burg
 .\" Generator: Wilderness
-.\" Date: 2023-04-07
+.\" Date: 2023-04-08
 .\" Manual: clevercsv Manual
 .\" Source: clevercsv 0.8.0
 .\" Language: English
 .\"
-.TH "CLEVERCSV-DETECT" "1" "2023\-04\-07" "Clevercsv 0\&.8\&.0" "Clevercsv Manual"
+.TH "CLEVERCSV-DETECT" "1" "2023\-04\-08" "Clevercsv 0\&.8\&.0" "Clevercsv Manual"
 .\" -----------------------------------------------------------------
 .\" * Define some portability stuff
 .\" -----------------------------------------------------------------
diff --git a/man/clevercsv-explore.1 b/man/clevercsv-explore.1
index c358c20a..7dcb5c42 100644
--- a/man/clevercsv-explore.1
+++ b/man/clevercsv-explore.1
@@ -2,12 +2,12 @@
 .\" Title: clevercsv-explore
 .\" Author: G.J.J. van den Burg
 .\" Generator: Wilderness
-.\" Date: 2023-04-07
+.\" Date: 2023-04-08
 .\" Manual: clevercsv Manual
 .\" Source: clevercsv 0.8.0
 .\" Language: English
 .\"
-.TH "CLEVERCSV-EXPLORE" "1" "2023\-04\-07" "Clevercsv 0\&.8\&.0" "Clevercsv Manual"
+.TH "CLEVERCSV-EXPLORE" "1" "2023\-04\-08" "Clevercsv 0\&.8\&.0" "Clevercsv Manual"
 .\" -----------------------------------------------------------------
 .\" * Define some portability stuff
 .\" -----------------------------------------------------------------
diff --git a/man/clevercsv-help.1 b/man/clevercsv-help.1
index c0fb558a..c5ada246 100644
--- a/man/clevercsv-help.1
+++ b/man/clevercsv-help.1
@@ -2,12 +2,12 @@
 .\" Title: clevercsv-help
 .\" Author: G.J.J. van den Burg
 .\" Generator: Wilderness
-.\" Date: 2023-04-07
+.\" Date: 2023-04-08
 .\" Manual: clevercsv Manual
 .\" Source: clevercsv 0.8.0
 .\" Language: English
 .\"
-.TH "CLEVERCSV-HELP" "1" "2023\-04\-07" "Clevercsv 0\&.8\&.0" "Clevercsv Manual"
+.TH "CLEVERCSV-HELP" "1" "2023\-04\-08" "Clevercsv 0\&.8\&.0" "Clevercsv Manual"
 .\" -----------------------------------------------------------------
 .\" * Define some portability stuff
 .\" -----------------------------------------------------------------
diff --git a/man/clevercsv-standardize.1 b/man/clevercsv-standardize.1
index 3e4b03e8..ca33bb0e 100644
--- a/man/clevercsv-standardize.1
+++ b/man/clevercsv-standardize.1
@@ -2,12 +2,12 @@
 .\" Title: clevercsv-standardize
 .\" Author: G.J.J. van den Burg
 .\" Generator: Wilderness
-.\" Date: 2023-04-07
+.\" Date: 2023-04-08
 .\" Manual: clevercsv Manual
 .\" Source: clevercsv 0.8.0
 .\" Language: English
 .\"
-.TH "CLEVERCSV-STANDARDIZE" "1" "2023\-04\-07" "Clevercsv 0\&.8\&.0" "Clevercsv Manual"
+.TH "CLEVERCSV-STANDARDIZE" "1" "2023\-04\-08" "Clevercsv 0\&.8\&.0" "Clevercsv Manual"
 .\" -----------------------------------------------------------------
 .\" * Define some portability stuff
 .\" -----------------------------------------------------------------
diff --git a/man/clevercsv-view.1 b/man/clevercsv-view.1
index 971b3aa5..602aefec 100644
--- a/man/clevercsv-view.1
+++ b/man/clevercsv-view.1
@@ -2,12 +2,12 @@
 .\" Title: clevercsv-view
 .\" Author: G.J.J. van den Burg
 .\" Generator: Wilderness
-.\" Date: 2023-04-07
+.\" Date: 2023-04-08
 .\" Manual: clevercsv Manual
 .\" Source: clevercsv 0.8.0
 .\" Language: English
 .\"
-.TH "CLEVERCSV-VIEW" "1" "2023\-04\-07" "Clevercsv 0\&.8\&.0" "Clevercsv Manual"
+.TH "CLEVERCSV-VIEW" "1" "2023\-04\-08" "Clevercsv 0\&.8\&.0" "Clevercsv Manual"
 .\" -----------------------------------------------------------------
 .\" * Define some portability stuff
 .\" -----------------------------------------------------------------
diff --git a/man/clevercsv.1 b/man/clevercsv.1
index f2d5e784..42b73643 100644
--- a/man/clevercsv.1
+++ b/man/clevercsv.1
@@ -2,12 +2,12 @@
 .\" Title: clevercsv
 .\" Author: G.J.J. van den Burg
 .\" Generator: Wilderness
-.\" Date: 2023-04-07
+.\" Date: 2023-04-08
 .\" Manual: clevercsv Manual
 .\" Source: clevercsv 0.8.0
 .\" Language: English
 .\"
-.TH "CLEVERCSV" "1" "2023\-04\-07" "Clevercsv 0\&.8\&.0" "Clevercsv Manual"
+.TH "CLEVERCSV" "1" "2023\-04\-08" "Clevercsv 0\&.8\&.0" "Clevercsv Manual"
 .\" -----------------------------------------------------------------
 .\" * Define some portability stuff
 .\" -----------------------------------------------------------------