You signed in with another tab or window. Reload to refresh your session.You signed out in another tab or window. Reload to refresh your session.You switched accounts on another tab or window. Reload to refresh your session.Dismiss alert
As mentioned in ebuhle/LCRchumIPM#24, this is an annotated pseudocode describing the steps that will need to change from the existing LCR chum IPM in order to model the broodstock-to-smolt piece of the hatchery life cycle. Key assumptions discussed in that Issue are highlighted with italics below. As noted there, some of the notation is subject to change. In particular, I have already anticipated changing:
B_rate, B_rate_all -> b, b_all (the latter is padded with 0s b/c there is no b to estimate when no broodstock were taken) P_D -> p_D (it's an array container, but each element is a simplex vector) P_D_all -> P_D (the actual dispersal / straying matrix, a padded version of the contents of p_D)
Other planned changes are noted in the pseudocode itself.
In R data-wrangling:
Add columns to fish_data named n_B[o]_obs, where the os are the numeric factor codes corresponding to the set of hatcheries / spawning channels to which adult recruits were ever transferred / translocated (which are the same as the set of known origins represented in n_O_obs[-1,]). The values in row i are the observed counts of fish that returned to pop[i] (initial location) in year[i] and were then transferred to o (final disposition).
Do not filter age-frequency data to screen out hatchery fish. Age composition now includes all spawners regardless of origin. (This is already the case for sex composition, though I don't remember why I did it that way.)
In salmonIPM::stan_data():
Use n_B_obs to construct a N_year x N_pop x N_pop array P_B_obs where P_B_obs[t,j,k] for j != k is the observed proportion of the adults taken as broodstock from pop k in year t whose disposition was j. If no broodstock were taken from pop k, P_B_obs[t,j,] is a column of 0s; the diagonal is also 0. This is the conditional disposition distribution, given that a fish was taken as broodstock (cf. the logic of conditional age-at-return).
In IPM_LCRchum_pp.stan:
data
Declare an integer array which_O_pop, the indices of known-origin / transfer-recipient populations. Redefine the current which_H_pop to truly refer only to hatcheries.
Read in the N_year array of N x N_pop matrices P_B_obs from step (3).
transformed data
Rename which_W_pop (the complement of which_H_pop) to which_D_pop (the complement of which_O_pop), i.e. the set of pops that receive natural dispersal but not adult broodstock transfers, and whose origins are untraceable.
transformed parameters
Change the declaration and assignment of P_D, the padded length-N_pop array containing the primitive dispersal simplexes p_D for hatchery pops and vectors of 0s elsewhere. Instead, make it a N_pop x N_pop matrix where P_D[j,k] is the probability that a recruit from pop k (origin) returns to pop j (location). Thus if pop k is a known origin indexed by which_O_pop, P_D[which_D_pop,k] is the primitive simplex-array parameter p_D[k], otherwise it is a vector with P_D[k,k] = 1; all other elements are 0. That is, we assume unknown-origin pops do not stray and known-origin pops (hatcheries / channels) do not receive natural strays -- just like we do now.
Declare a N_year array of length-N_pop vectors b_all where b_all[t][j] is the corresponding element of the length-N_B primitive parameter vector b if broodstock removal from pop j occurred in year t, and 0 otherwise. This replaces salmonIPM's usual B_rate_all, a length-N vector similarly padded with 0s, b/c now we need to be able to index b by pop and year. Hopefully there's a more clever way to populate b_all than by looping over rows 1:N.
Declare a N_year array of N_pop x N_pop matrices P_T. Construct it by looping over years and then P_T[t] = diag_post_multiply(P_B[t], b_all[t]) + diag_matrix(1 - b_all[t]). These are the unconditional probabilities that a fish returning to location k (cols) was then translocated / transferred to disposition j (rows), including the probability of escaping broodstock collection and staying at k (the 1 - b_all on the diagonal).
Declare S_a, a N_year array of N_pop x N_age matrices of spawners-at-age. (Currently row vectors S_W_a and S_H_a are declared as local variables for each i in 1:N in the respective main loops, but now they need to be persistent so we can incrementally populate them.) Similarly, declare S_MF, a N_year array of N_pop x 2 matrices of spawners-by-sex. (Currently we have a local variable q_F_a, the proportion of spawners of each age that are female.) These new variables are analogous to S_O, the existing array of annual N_pop (to) x N_pop (from) spawners-by-origin matrices.
Adult recruitment, dispersal and translocation. Loop over rows i in 1:N and then over ages a in 1:N_age:
i. Recruitment. Calculate age-a recruits as we currently do, except use the orphan formulation of initial states for all pops (currently ignored in the "hatchery loop" b/c it was a distraction when getting the thing to work). Apply fishery mortality (currently all 0 in fish_data) but not broodstock removal yet. (Assume any fishing occurs en route to the spawning grounds, before broodstock is taken.) Assign the result to local scalar R_a.
ii. Natural dispersal and transfers / translocations. First we distribute the age-a recruits from pop[i] to all populations, including pop[i] itself (the only destination unless pop[i] is in which_O_pop), with P_D[,pop[i]]. Then we redistribute them through broodstock transfers to the populations in which_O_pop, which are also the hatcheries / channels. Assuming transfers are random w.r.t. origin, we can apply the annual pop-specific (non)transfer probabilities contained in P_T[year[i]] to the vector of pop[i]]-origin, age-a recruits that returned to each pop. After dispersal and transfers, age-a spawners from pop[i] are given by P_T[year[i]][,] * P_D[,pop[i]] * R_a. Assign this to local N_pop vector S_i. Increment spawners-by-origin P_O[year[i]][,pop[i]] += S_i. Then, assuming that both dispersal and transfers are random w.r.t. age and sex such that the recruits from pop[i] carry their demographics with them, increment S_a[year[i]][,a] += S_i and S_MF[year[i]][,] += S_i * [1 - pf, pf], where pf is shorthand for p_F[i-ocean_ages[a]] (that last bit is how we get the age-specific recruit sex structure now).
Spawner abundance and composition, smolt production. Loop over rows i in 1:N a second time:
i. Compute total spawner abundance S[i] by summing S_a[year[i]][pop[i],]. Compute the spawner age distribution q[i,] by normalizing that row by S[i]. Proportion female is similarly found by normalizing S_MF[year[i]][pop[i],]. Origin-composition comes from normalizing S_O[year[i]][pop[i],], but it's a little trickier b/c you need to sum all the unknown natural origins in which_D_pop and make that the first element of q_O[i,], followed by the which_O_pop.
ii. Compute E_hat, M_hat and M0 as usual, with one exception: if pop[i] is in which_H_pop, use the discrete exponential S-R function. (There's your use case, @tbuehrens. 😎) This assumes hatchery rearing survival is density-independent.
model
Observation model.
i. Spawner abundance. Now includes hatchery populations, with some suitably precise tau_S_obs TBD.
ii. Smolt abundance. Ditto for hatchery smolts and tau_M_obs.
iii. Age, sex and origin composition. Ditto again, these likelihoods now include all populations. Also, age and sex composition now include all spawners regardless of origin; previously salmonIPM models have always excluded hatchery-origin spawners from the age-comp likelihood.
In post-processing:
get_prior_info() and plot_prior_posterior()
Adapt these functions to either accommodate, or more likely ignore, parameters P_D and P_T which consist of multiple columns of simplexes that have Dir(1) priors.
One last comment for now. I'm sure @tbuehrens will notice that both natural dispersal and broodstock transfers / translocations are described by transition matrices, but we use them by looping over rows and/or columns (the dispersal matrix P_D is this way already). With all this matrix stuff, we're getting tantalizingly close to a Newman et al.-style formulation; as soon as you have interaction among populations, the model naturally wants to be multivariate. But I still loop over rows of fish_data to calculate recruitment because this part is much harder to reformulate as matrix algebra. (The basic reason is that, with our parameterization of SAR and conditional age-at-return, the model is no longer first-order Markovian. Newman et al. estimated annual marine survival and maturation rates; how well, I'm not sure.) Years ago I spent a significant amount of time trying unsuccessfully to bridge this gap. Still, it is tempting to imagine going the multivariate route with this model someday, but we're not there yet. For once I'm following the mantra "get the code to work, then optimize it".
The text was updated successfully, but these errors were encountered:
ebuhle
changed the title
Modeling the hatchery broodstock-to-smolt transition in IPM_LCRchum_pp
Modeling the hatchery broodstock-to-smolt transition in IPM_LCRchum_ppAug 13, 2024
As mentioned in ebuhle/LCRchumIPM#24, this is an annotated pseudocode describing the steps that will need to change from the existing LCR chum IPM in order to model the broodstock-to-smolt piece of the hatchery life cycle. Key assumptions discussed in that Issue are highlighted with italics below. As noted there, some of the notation is subject to change. In particular, I have already anticipated changing:
B_rate
,B_rate_all
->b
,b_all
(the latter is padded with 0s b/c there is nob
to estimate when no broodstock were taken)P_D
->p_D
(it's an array container, but each element is a simplex vector)P_D_all
->P_D
(the actual dispersal / straying matrix, a padded version of the contents ofp_D
)Other planned changes are noted in the pseudocode itself.
In R data-wrangling:
Add columns to
fish_data
namedn_B[o]_obs
, where theo
s are the numeric factor codes corresponding to the set of hatcheries / spawning channels to which adult recruits were ever transferred / translocated (which are the same as the set of known origins represented inn_O_obs[-1,]
). The values in rowi
are the observed counts of fish that returned topop[i]
(initiallocation
) inyear[i]
and were then transferred too
(finaldisposition
).Do not filter age-frequency data to screen out hatchery fish. Age composition now includes all spawners regardless of origin. (This is already the case for sex composition, though I don't remember why I did it that way.)
In
salmonIPM::stan_data()
:n_B_obs
to construct aN_year x N_pop x N_pop
arrayP_B_obs
whereP_B_obs[t,j,k]
forj != k
is the observed proportion of the adults taken as broodstock from popk
in yeart
whose disposition wasj
. If no broodstock were taken from popk
,P_B_obs[t,j,]
is a column of 0s; the diagonal is also 0. This is the conditional disposition distribution, given that a fish was taken as broodstock (cf. the logic of conditional age-at-return).In
IPM_LCRchum_pp.stan
:data
Declare an integer array
which_O_pop
, the indices of known-origin / transfer-recipient populations. Redefine the currentwhich_H_pop
to truly refer only to hatcheries.Read in the
N_year
array ofN x N_pop
matricesP_B_obs
from step (3).transformed data
which_W_pop
(the complement ofwhich_H_pop
) towhich_D_pop
(the complement ofwhich_O_pop
), i.e. the set of pops that receive natural dispersal but not adult broodstock transfers, and whose origins are untraceable.transformed parameters
Change the declaration and assignment of
P_D
, the padded length-N_pop
array containing the primitive dispersal simplexesp_D
for hatchery pops and vectors of 0s elsewhere. Instead, make it aN_pop x N_pop
matrix whereP_D[j,k]
is the probability that a recruit from popk
(origin
) returns to popj
(location
). Thus if popk
is a known origin indexed bywhich_O_pop
,P_D[which_D_pop,k]
is the primitive simplex-array parameterp_D[k]
, otherwise it is a vector withP_D[k,k] = 1
; all other elements are 0. That is, we assume unknown-origin pops do not stray and known-origin pops (hatcheries / channels) do not receive natural strays -- just like we do now.Declare a
N_year
array of length-N_pop
vectorsb_all
whereb_all[t][j]
is the corresponding element of the length-N_B
primitive parameter vectorb
if broodstock removal from popj
occurred in yeart
, and 0 otherwise. This replaces salmonIPM's usualB_rate_all
, a length-N
vector similarly padded with 0s, b/c now we need to be able to indexb
by pop and year. Hopefully there's a more clever way to populateb_all
than by looping over rows1:N
.Declare a
N_year
array ofN_pop x N_pop
matricesP_T
. Construct it by looping over years and thenP_T[t] = diag_post_multiply(P_B[t], b_all[t]) + diag_matrix(1 - b_all[t])
. These are the unconditional probabilities that a fish returning to locationk
(cols) was then translocated / transferred to dispositionj
(rows), including the probability of escaping broodstock collection and staying atk
(the1 - b_all
on the diagonal).Declare
S_a
, aN_year
array ofN_pop x N_age
matrices of spawners-at-age. (Currently row vectorsS_W_a
andS_H_a
are declared as local variables for eachi in 1:N
in the respective main loops, but now they need to be persistent so we can incrementally populate them.) Similarly, declareS_MF
, aN_year
array ofN_pop x 2
matrices of spawners-by-sex. (Currently we have a local variableq_F_a
, the proportion of spawners of each age that are female.) These new variables are analogous toS_O
, the existing array of annualN_pop (to) x N_pop (from)
spawners-by-origin matrices.Adult recruitment, dispersal and translocation. Loop over rows
i in 1:N
and then over agesa in 1:N_age
:i. Recruitment. Calculate age-
a
recruits as we currently do, except use the orphan formulation of initial states for all pops (currently ignored in the "hatchery loop" b/c it was a distraction when getting the thing to work). Apply fishery mortality (currently all 0 infish_data
) but not broodstock removal yet. (Assume any fishing occurs en route to the spawning grounds, before broodstock is taken.) Assign the result to local scalarR_a
.ii. Natural dispersal and transfers / translocations. First we distribute the age-
a
recruits frompop[i]
to all populations, includingpop[i]
itself (the only destination unlesspop[i]
is inwhich_O_pop
), withP_D[,pop[i]]
. Then we redistribute them through broodstock transfers to the populations inwhich_O_pop
, which are also the hatcheries / channels. Assuming transfers are random w.r.t. origin, we can apply the annualpop
-specific (non)transfer probabilities contained inP_T[year[i]]
to the vector ofpop[i]]
-origin, age-a
recruits that returned to eachpop
. After dispersal and transfers, age-a
spawners frompop[i]
are given byP_T[year[i]][,] * P_D[,pop[i]] * R_a
. Assign this to localN_pop
vectorS_i
. Increment spawners-by-originP_O[year[i]][,pop[i]] += S_i
. Then, assuming that both dispersal and transfers are random w.r.t. age and sex such that the recruits frompop[i]
carry their demographics with them, incrementS_a[year[i]][,a] += S_i
andS_MF[year[i]][,] += S_i * [1 - pf, pf]
, wherepf
is shorthand forp_F[i-ocean_ages[a]]
(that last bit is how we get the age-specific recruit sex structure now).Spawner abundance and composition, smolt production. Loop over rows
i in 1:N
a second time:i. Compute total spawner abundance
S[i]
by summingS_a[year[i]][pop[i],]
. Compute the spawner age distributionq[i,]
by normalizing that row byS[i]
. Proportion female is similarly found by normalizingS_MF[year[i]][pop[i],]
. Origin-composition comes from normalizingS_O[year[i]][pop[i],]
, but it's a little trickier b/c you need to sum all the unknown natural origins inwhich_D_pop
and make that the first element ofq_O[i,]
, followed by thewhich_O_pop
.ii. Compute
E_hat
,M_hat
andM0
as usual, with one exception: ifpop[i]
is inwhich_H_pop
, use the discrete exponential S-R function. (There's your use case, @tbuehrens. 😎) This assumes hatchery rearing survival is density-independent.model
Observation model.
i. Spawner abundance. Now includes hatchery populations, with some suitably precise
tau_S_obs
TBD.ii. Smolt abundance. Ditto for hatchery smolts and
tau_M_obs
.iii. Age, sex and origin composition. Ditto again, these likelihoods now include all populations. Also, age and sex composition now include all spawners regardless of origin; previously salmonIPM models have always excluded hatchery-origin spawners from the age-comp likelihood.
In post-processing:
get_prior_info()
andplot_prior_posterior()
P_D
andP_T
which consist of multiple columns of simplexes that have Dir(1) priors.One last comment for now. I'm sure @tbuehrens will notice that both natural dispersal and broodstock transfers / translocations are described by transition matrices, but we use them by looping over rows and/or columns (the dispersal matrix
P_D
is this way already). With all this matrix stuff, we're getting tantalizingly close to a Newman et al.-style formulation; as soon as you have interaction among populations, the model naturally wants to be multivariate. But I still loop over rows offish_data
to calculate recruitment because this part is much harder to reformulate as matrix algebra. (The basic reason is that, with our parameterization of SAR and conditional age-at-return, the model is no longer first-order Markovian. Newman et al. estimated annual marine survival and maturation rates; how well, I'm not sure.) Years ago I spent a significant amount of time trying unsuccessfully to bridge this gap. Still, it is tempting to imagine going the multivariate route with this model someday, but we're not there yet. For once I'm following the mantra "get the code to work, then optimize it".The text was updated successfully, but these errors were encountered: