diff --git a/NEWS.md b/NEWS.md index f4202b21b1..c07f4ee3ca 100644 --- a/NEWS.md +++ b/NEWS.md @@ -112,6 +112,11 @@ 7. Clarified that "data.table always sorts in `C-locale`" means that upper-case letters are sorted before lower-case letters by ordering in data.table (e.g. `setorder`, `setkey`, `DT[order(...)]`). Thanks to @hughparsonage for the pull request editing the documentation. Note this makes no difference in most cases of data; e.g. ids where only uppercase or lowercase letters are used (`"AB123"<"AC234"` is always true, regardless), or country names and words which are consistently capitalized. For example, `"America" < "Brazil"` is not affected (it's always true), and neither is `"america" < "brazil"` (always true too); since the first letter is consistently capitalized. But, whether `"america" < "Brazil"` (the words are not consistently capitalized) is true or false in base R depends on the locale of your R session. In America it is true by default and false if you i) type `Sys.setlocale(locale="C")`, ii) the R session has been started in a C locale for you which can happen on servers/services (the locale comes from the environment the R session is started in). However, `"america" < "Brazil"` is always, consistently false in data.table which can be a surprise because it differs to base R by default in most regions. It is false because `"B"<"a"` is true because all upper-case letters come first, followed by all lower case letters (the ascii number of each letter determines the order, which is what is meant by `C-locale`). +### Changes in v1.10.4-3 (on CRAN 20 Oct 2017) + +1. Fixed crash/hang on MacOS when `parallel::mclapply` is used and data.table is merely loaded, [#2418](https://github.com/Rdatatable/data.table/issues/2418). Oddly, all tests including test 1705 (which tests `mclapply` with data.table) passed fine on CRAN. It appears to be some versions of MacOS or some versions of libraries on MacOS, perhaps. Many thanks to Martin Morgan for reporting and confirming this fix works. Thanks also to @asenabouth, Joe Thorley and Danton Noriega for testing, debugging and confirming that automatic parallelism inside data.table (such as `fwrite`) works well even on these MacOS installations. See also news items below for 1.10.4-1 and 1.10.4-2. + + ### Changes in v1.10.4-2 (on CRAN 12 Oct 2017) 1. OpenMP on MacOS is now supported by CRAN and included in CRAN's package binaries for Mac. But installing v1.10.4-1 from source on MacOS failed when OpenMP was not enabled at compile time, [#2409](https://github.com/Rdatatable/data.table/issues/2409). Thanks to Liz Macfie and @fupangpangpang for reporting. The startup message when OpenMP is not enabled has been updated. diff --git a/src/openmp-utils.c b/src/openmp-utils.c index d6e3b70316..cd85ee9e01 100644 --- a/src/openmp-utils.c +++ b/src/openmp-utils.c @@ -56,23 +56,28 @@ SEXP setDTthreads(SEXP threads) { return ScalarInteger(old); } -// auto avoid deadlock when data.table called from parallel::mclapply -void when_fork() { +/* + Auto avoid deadlock when data.table is used from within parallel::mclapply + GNU OpenMP seems ok with just setting DTthreads to 1 which limits the next parallel region + if data.table is used within the fork'd proceess. This is tested by test 1705. - // attempted workaround for Intel's OpenMP implementation which leaves threads running after - // parallel region; these crash when forked. -#ifdef _OPENMP - omp_set_num_threads(1); -#endif + We used to have an after_fork() callback too, to return to multi-threaded mode after parallel's + fork completes. But now in an attempt to alleviate problems propogating (likely Intel's OpenMP only) + we now leave data.table in single-threaded mode after parallel's fork. User can call setDTthreads(0) + to return to multi-threaded as we do in tests on Linux. - // GNU OpenMP seems ok with just setting DTthreads to 1 which limits the next parallel region - // if data.table is used within the fork'd proceess. - DTthreads = 1; + DO NOT call omp_set_num_threads(1) inside when_fork()!! That causes a different crash/hang on MacOS + upon mclapply's fork even if data.table is merely loaded and neither used yet in the session nor by + what mclapply is calling. Even when passing on CRAN's MacOS all-OK. As discovered by several MacOS + and RStudio users here when 1.10.4-2 went to CRAN : + https://github.com/Rdatatable/data.table/issues/2418 + v1.10.4-3 removed the call to omp_set_num_threads(). + We do not know why calling (mere) omp_set_num_threads(1) in the when_fork() would cause a problem + on MacOS. +*/ - // We used to have an after_fork() callback too, to return to multi-threaded mode after parallel's - // fork completes. But now in an attempt to alleviate problems propogating (likely Intel's OpenMP only) - // we now leave data.table in single-threaded mode after parallel's fork. User can call setDTthreads(0) - // to return to multi-threaded as we do in tests on Linux. +void when_fork() { + DTthreads = 1; } void avoid_openmp_hang_within_fork() {