* Setting all random seeds to  0 *
Loading model: out_models_paper/net_joint_cross3_rep3.pt on cuda
 Loading model that has completed (or started) 50 of 50 epochs
  test episode_type: few_shot_human
  batch size: 25
  max eval length: 10
  number of steps: 600000
  best val loss achieved: inf

BIML specs:
 nparams= 1393801
 nlayers_encoder= 3
 nlayers_decoder= 3
 nhead= 8
 hidden_size= 128
 dim_feedforward= 512
 act_feedforward= gelu
 dropout= 0.1
 

Fitting for the best value of p_lapse using log-likelihood...
  Each value is replicated across 100 random runs/permutations
 p_lapse 0.01 :
* Setting all random seeds to  0 *
  run 0
  run 20
  run 40
  run 60
  run 80
  loglike: M = -357.8896 (SD= 2.44 , Nrep= 100 ) for 990.0 symbol predictions on average
    ave LL by cell:  -0.3615
 p_lapse 0.02 :
* Setting all random seeds to  0 *
  run 0
  run 20
  run 40
  run 60
  run 80
  loglike: M = -351.3545 (SD= 2.3036 , Nrep= 100 ) for 990.0 symbol predictions on average
    ave LL by cell:  -0.3549
 p_lapse 0.03 :
* Setting all random seeds to  0 *
  run 0
  run 20
  run 40
  run 60
  run 80
  loglike: M = -349.2256 (SD= 2.2208 , Nrep= 100 ) for 990.0 symbol predictions on average
    ave LL by cell:  -0.3528
 p_lapse 0.04 :
* Setting all random seeds to  0 *
  run 0
  run 20
  run 40
  run 60
  run 80
  loglike: M = -349.3483 (SD= 2.161 , Nrep= 100 ) for 990.0 symbol predictions on average
    ave LL by cell:  -0.3529
* BEST FIT * p_lapse= 0.03 with mean loglike score of -349.2256 (or -0.3528 per cell)
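
The selection above can be re-derived from the reported means. The sketch below is not taken from the evaluation script: it copies the four mean log-likelihoods and the average count of 990 symbol predictions from this log, and assumes that "per cell" means the mean log-likelihood divided by that count. The names mean_loglike, n_predictions, best_p and per_cell are illustrative only.

```python
# Mean log-likelihood per p_lapse value, copied from the log above.
mean_loglike = {
    0.01: -357.8896,
    0.02: -351.3545,
    0.03: -349.2256,
    0.04: -349.3483,
}

# Average number of symbol predictions per run, as reported in the log.
n_predictions = 990.0

# The best fit is the p_lapse with the highest (least negative) mean log-likelihood.
best_p = max(mean_loglike, key=mean_loglike.get)

# Assumption: "per cell" is the mean log-likelihood divided by the prediction count.
per_cell = mean_loglike[best_p] / n_predictions

print(f"best p_lapse = {best_p}, mean loglike = {mean_loglike[best_p]:.4f}, "
      f"per cell = {per_cell:.4f}")
```

Under that assumption the per-cell values for all four settings match the "ave LL by cell" lines. The gap between p_lapse 0.03 and 0.04 is about 0.12 log-likelihood units, well inside the reported SD of roughly 2.2, so the optimum is shallow.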
