-
Notifications
You must be signed in to change notification settings - Fork 1
/
Copy pathCHANGELOG
214 lines (151 loc) · 8.93 KB
/
CHANGELOG
1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
35
36
37
38
39
40
41
42
43
44
45
46
47
48
49
50
51
52
53
54
55
56
57
58
59
60
61
62
63
64
65
66
67
68
69
70
71
72
73
74
75
76
77
78
79
80
81
82
83
84
85
86
87
88
89
90
91
92
93
94
95
96
97
98
99
100
101
102
103
104
105
106
107
108
109
110
111
112
113
114
115
116
117
118
119
120
121
122
123
124
125
126
127
128
129
130
131
132
133
134
135
136
137
138
139
140
141
142
143
144
145
146
147
148
149
150
151
152
153
154
155
156
157
158
159
160
161
162
163
164
165
166
167
168
169
170
171
172
173
174
175
176
177
178
179
180
181
182
183
184
185
186
187
188
189
190
191
192
193
194
195
196
197
198
199
200
201
202
203
204
205
206
207
208
209
210
211
212
213
214
xx. yyyy 2016
Version 1.6: 4th public release.
- Added phase-periodic boundary conditions in the space directions for
the quark fields (thanks to Isabel Campos). See doc/dirac.pdf and
doc/parms.pdf for details.
- The function chs_ubnd() [lattice/bcnds.c] has been removed since its
functionality is now contained in set_ud_phase() and unset_ud_phase()
[uflds/uflds.c].
- The error functions have been moved to a new module utils/error.c and
the error_loc() function has been changed so that it calls MPI_Abort()
when an error occurs in the calling MPI process. There is then no use
for the error_chk() function anymore and so it has been removed.
- There is a new module utils/hsum.c providing generic hierarchical
summation routines. These are now used in most programs where such
sums are performed, thus leading to an important simplification of
the programs (see linalg/salg_dble.c, for example).
- Corrected flags/rw_parms.c so as to get rid of compiler warnings when
NPROC=1.
- Corrected notes in utils/endian.c.
- In main/ym1.c delete superfluous second call of alloc_data() in the
main program.
- In main/ms4.c replaced "x0=(N0-1)" by "x0==(N0-1)" on line 185 and
"if (sp.solver==SAP_GCR)" by "else if (sp.solver==SAP_GCR)" on line
491 (thanks to Abdou Abdel-Rehim for noting this bug).
- In random/ranlux.c inserted error message in check_machine() to
make sure the number of actual MPI processes matches NPROC.
- In the main programs qcd1.c and ym1.c interchanged the order of init_ud()
and init_rng().
- Corrected avx.h in order to get rid of some Intel compiler warnings.
- Corrected a bug in cmat_vec() and cmat_vec_assign() [linalg/cmatrix.c]
that could lead to a segmentation fault if the option -DAVX is set,
the argument "n" is odd and if the matrix "a" is not aligned to a 16
byte boundary (thanks to Agostino Patella).
- Corrected the use of alignment data attributes so as to fully comply
with the syntactic rules described in the gcc manuals.
- Reworked the module su3fcts/chexp.c in order to make it safer against
aggressively optimizing compilers.
- Included the AVX su3_multiply_dble and su3_inverse_multiply_dble macros
in the module su3fcts/su3prod.c. Modified the inline assembly code in
prod2su3alg() in order to make it safer against aggressively optimizing
compilers.
- Included new inline assembly code in avx.h that makes use of fused
multiply-add (FMA3) instructions. These can be activated by setting
the compiler option -DFMA3 together with -DAVX. Several modules now
include such instructions, notably cmatrix.c, cmatrix_dble.c, salg.c
and salg_dble.c.
- Adapted all check programs to the changes in the modules.
- Modified devel/nompi/su3fcts/time1.c so as to more accurately measure
the time required for su3xsu3_vector multiplications.
- Added an "LFLAGS" variable to the Makefiles that allows the compiler
options in the link step to be set to a value different from CFLAGS.
- Revised the documentation files doc/dirac.pdf and doc/parms.pdf.
- Revised the top README file.
- Revised modules/flags/README.
22. April 2014
Version 1.4: 3rd public release.
- Changed the way SF boundary conditions are implemented so that the time
extent of the lattice is now NPROC0*L0 rather than NPROC0*L0-1.
- Adapted all modules and main programs so as to support four types of
boundary conditions (open, SF, open-SF and periodic). For detailed
explanations see main/README.global, doc/gauge_action.pdf, doc/dirac.pdf,
and doc/parms.pdf.
- The form of the gauge action near the boundaries of the lattice with SF
boundary conditions has been slightly modified with respect to the choice
made in version 1.2 (see doc/gauge_action.pdf; the modification only
concerns actions with double-plaquette terms).
- The programs for the light-quark reweighting factors now support
twisted-mass Hasenbusch decompositions into products of factors. In the
program main/ms1.c, the factorization (if any) can be specified through the
input parameter file. NOTE: the layout of the data on the output data file
produced by ms1.c had to be changed with respect to openQCD-1.2.
- Slightly modified the program main/ms2.c so as to allow for different
power-method iteration numbers when estimating the lower and upper
end of the spectrum of the Dirac operator (see main/README.ms2).
- Removed flags/sf_parms.c since the functionality of this module is now
included in lat_parms.c.
- Updated documentation files gauge_action.pdf, dirac.pdf, parms.pdf and
rhmc.pdf.
- In main/ms4.in and main/qcd1.in replaced "[Deflation projectors]" by
"[Deflation projection]".
- Removed main/qcd2.c since main/qcd1.c now includes the case of SF boundary
conditions.
- Corrected all check programs in ./devel/* so as to take into account the
different choices of boundary conditions. Many check programs now have a
command line option -bc <type> that allows the type of boundary condition to
be specified at run time.
- Corrected a bug in Dwee_dble() [modules/dirac/Dw_dbl.c] that shows up in
some check programs if none of the local lattice sizes L1,L2,L3 is divisible
by 4. The functionality of the other modules and the main programs in ./main
was not affected by this bug, because Dwee_dble() is not called in any of
these programs.
- Corrected modules/flags/rw_parms.c so as to allow for Hasenbusch factorized
reweighting factors.
- Corrected and improved the descriptions at the top of many module files.
- Corrected devel/ratfcts/INDEX.
- Added forgotten "plots" directory in devel/nompi/main.
- Replaced &irat in MPI_Bcast(&irat,3,MPI_INT,0,MPI_COMM_WORLD) by irat in
flags/force_parms.c [read_forc_parms() and read_force_parms2()]. This is not
a mistake but an unnatural and unintended use of the C language. Corrected
analogous cases in a number of check programs (thanks to Hubert Simma and
Georg Engel for noting these misprints).
- Corrected check program block/check1.c (the point labeling does not need to
respect any time ordering).
12. May 2013
Version 1.2: 2nd public release.
- Added AVX inline-assembly to the time-critical functions (Dirac operator,
linear algebra, SAP preconditioner, SU(3) functions). See the README file in
the top directory of the distribution.
- Added support for blocked MPI process ranking, as is likely to be profitable
on parallel computers with mult-core nodes (see main/README.global).
- Made the field import/export functions more efficient by avoiding the
previously excessive use of MPI_Barrier().
- Added import/export functions for the state of the random number generators.
Modified the initialization of the generators so as to be independent of the
ranking of the MPI processes. See the notes in modules/random/ranlux.c. Added
a check program in devel/random.
- Continuation runs of qcd1,qcd2,ym1 and ms1 now normally reset the random
number generators to their state at the end of the previous run. The
programs initialize the generators in the traditional way if the option
-norng is set (see README.qcd1, for example).
- Modified the deflated SAP+GCR solver (dfl/dfl_sap_gcr.c) by replacing the
deflation projectors through an inaccurate projection in the preconditioner
(as suggested by Frommer et al. [arXiv:1303:1377]; the deflation subspace
type and subspace generation algorithm are unchanged). This leads to a
structural simplification and, after some parameter tuning, to a slight
performance gain. NOTE: the deflation parameter set is changed too and the
number of status variables is reduced by 1 (see modules/flags/dfl_parms.c,
modules/dfl/dfl_sap_gcr.c and doc/parms.pdf).
- Included a program (devel/dfl/check4.c) that allows the parameters of the
deflated SAP+GCR solver to be tuned on a given lattice.
- Deleted the now superfluous module/dfl/dfl_projectors.c.
- Added the function fdigits() [utils/mutils.c] that allows double-precision
floating point numbers to be printed with all significant decimal digits
(and only these). The main programs make use of this function to ensure that
the values of the decimal parameters are printed to the log files with as
many significant digits as were given on the input parameter file (assuming
not more digits were specified than can be represented by a double number).
- Replaced "if" by "else if" on line 379 of main/ms2.c. This bug stopped the
program with an error message when the CGNE solver was used. It had no
effect when other solvers were used.
- Changed the type of the variable "sf" to "int" in lines 257 and 440 of
forces/force0.c. This bug had no effect in view of the automatic type
conversions performed by the compiler.
- Corrected sign in line 174 of devel/sap/check2.c. This bug led to wrong
check results, thus incorrectly suggesting that the SAP modules were
incorrect.
- Corrected a mistake in devel/tcharge/check2.c and devel/tcharge/check5.c
that gave rise to wrong results suggesting that the tested modules were
incorrect.
14. June 2012
Version 1.0: Initial public release.