* Setting all random seeds to  0 *
Loading model: out_models_control/net_retrieve_rep8_best.pt on cuda
 Loading model that has completed (or started) 49 of 50 epochs
  test episode_type: few_shot_human
  batch size: 25
  max eval length: 10
  number of steps: 192700
  best val loss achieved: -0.0000

BIML specs:
 nparams= 1393801
 nlayers_encoder= 3
 nlayers_decoder= 3
 nhead= 8
 hidden_size= 128
 dim_feedforward= 512
 act_feedforward= gelu
 dropout= 0.1
 

Fitting for the best value of p_lapse use log-like...
  Each value is replicated across 100 random runs/permutations
 p_lapse 0.01 :
* Setting all random seeds to  0 *
  run 0
  run 20
  run 40
  run 60
  run 80
  loglike: M = -2246.7001 (SD= 107.3863 , Nrep= 100 ) for 990 symbol predictions
    ave LL:  -2.2694
 p_lapse 0.02 :
* Setting all random seeds to  0 *
  run 0
  run 20
  run 40
  run 60
  run 80
  loglike: M = -2156.1603 (SD= 96.3517 , Nrep= 100 ) for 990 symbol predictions
    ave LL:  -2.1779
 p_lapse 0.03 :
* Setting all random seeds to  0 *
  run 0
  run 20
  run 40
  run 60
  run 80
  loglike: M = -2092.3258 (SD= 89.6357 , Nrep= 100 ) for 990 symbol predictions
    ave LL:  -2.1135
 p_lapse 0.04 :
* Setting all random seeds to  0 *
  run 0
  run 20
  run 40
  run 60
  run 80
  loglike: M = -2042.1891 (SD= 84.8289 , Nrep= 100 ) for 990 symbol predictions
    ave LL:  -2.0628
 p_lapse 0.05 :
* Setting all random seeds to  0 *
  run 0
  run 20
  run 40
  run 60
  run 80
  loglike: M = -2000.6841 (SD= 81.1046 , Nrep= 100 ) for 990 symbol predictions
    ave LL:  -2.0209
 p_lapse 0.06 :
* Setting all random seeds to  0 *
  run 0
  run 20
  run 40
  run 60
  run 80
  loglike: M = -1965.2179 (SD= 78.0743 , Nrep= 100 ) for 990 symbol predictions
    ave LL:  -1.9851
 p_lapse 0.07 :
* Setting all random seeds to  0 *
  run 0
  run 20
  run 40
  run 60
  run 80
  loglike: M = -1934.2587 (SD= 75.5242 , Nrep= 100 ) for 990 symbol predictions
    ave LL:  -1.9538
 p_lapse 0.08 :
* Setting all random seeds to  0 *
  run 0
  run 20
  run 40
  run 60
  run 80
  loglike: M = -1906.8158 (SD= 73.3239 , Nrep= 100 ) for 990 symbol predictions
    ave LL:  -1.9261
 p_lapse 0.09 :
* Setting all random seeds to  0 *
  run 0
  run 20
  run 40
  run 60
  run 80
  loglike: M = -1882.2064 (SD= 71.3884 , Nrep= 100 ) for 990 symbol predictions
    ave LL:  -1.9012
 p_lapse 0.1 :
* Setting all random seeds to  0 *
  run 0
  run 20
  run 40
  run 60
  run 80
  loglike: M = -1859.9384 (SD= 69.6593 , Nrep= 100 ) for 990 symbol predictions
    ave LL:  -1.8787
 p_lapse 0.2 :
* Setting all random seeds to  0 *
  run 0
  run 20
  run 40
  run 60
  run 80
  loglike: M = -1713.3451 (SD= 58.074 , Nrep= 100 ) for 990 symbol predictions
    ave LL:  -1.7307
 p_lapse 0.3 :
* Setting all random seeds to  0 *
  run 0
  run 20
  run 40
  run 60
  run 80
  loglike: M = -1637.9745 (SD= 50.5725 , Nrep= 100 ) for 990 symbol predictions
    ave LL:  -1.6545
 p_lapse 0.4 :
* Setting all random seeds to  0 *
  run 0
  run 20
  run 40
  run 60
  run 80
  loglike: M = -1599.3693 (SD= 44.3701 , Nrep= 100 ) for 990 symbol predictions
    ave LL:  -1.6155
 p_lapse 0.5 :
* Setting all random seeds to  0 *
  run 0
  run 20
  run 40
  run 60
  run 80
  loglike: M = -1586.3521 (SD= 38.6092 , Nrep= 100 ) for 990 symbol predictions
    ave LL:  -1.6024
 p_lapse 0.6 :
* Setting all random seeds to  0 *
  run 0
  run 20
  run 40
  run 60
  run 80
  loglike: M = -1595.0569 (SD= 32.8474 , Nrep= 100 ) for 990 symbol predictions
    ave LL:  -1.6112
* BEST FIT * p_lapse= 0.5 with loglike score of -1586.3521
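
The selection reported above can be reproduced from the logged means alone. The following is a minimal sketch, not the script that produced this log: it assumes the best fit is simply the grid value with the highest mean log-likelihood, and that "ave LL" is that mean divided by the 990 symbol predictions. The names mean_loglike, N_SYMBOLS, and best_fit are hypothetical.

```python
# Mean log-likelihood per p_lapse value, copied from the log above
# (each is the mean M over 100 random runs/permutations).
mean_loglike = {
    0.01: -2246.7001, 0.02: -2156.1603, 0.03: -2092.3258,
    0.04: -2042.1891, 0.05: -2000.6841, 0.06: -1965.2179,
    0.07: -1934.2587, 0.08: -1906.8158, 0.09: -1882.2064,
    0.1: -1859.9384, 0.2: -1713.3451, 0.3: -1637.9745,
    0.4: -1599.3693, 0.5: -1586.3521, 0.6: -1595.0569,
}
N_SYMBOLS = 990  # number of symbol predictions reported in the log


def best_fit(scores):
    """Return the (p_lapse, loglike) pair with the highest log-likelihood."""
    return max(scores.items(), key=lambda kv: kv[1])


best_p, best_ll = best_fit(mean_loglike)
ave_ll = best_ll / N_SYMBOLS  # per-symbol average, assumed to match "ave LL"
print(f"best p_lapse={best_p} loglike={best_ll:.4f} ave LL={ave_ll:.4f}")
```

The mean log-likelihood rises up to p_lapse 0.5 and falls again at 0.6, so the maximum lies inside the searched grid rather than at its edge.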
