You think the difficult part is merging observations with the last forecast? I guess it's a very underdetermined problem, but isn't the loss function (compare the forecast grid with later observations) the same whether you're doing grid_t0 -> grid_t1 or (observations, grid_t0) -> grid'_t0 -> grid_t1? I don't know enough about ML to know how much complexity the extra step adds, but it doesn't seem like a massive difference.
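To make the comparison concrete, here is a toy sketch in Python/NumPy. Everything in it is an assumption for illustration: `forecast_step` (a simple shift of the field) stands in for whatever model maps grid_t0 to grid_t1, `assimilate` (nudging observed cells halfway toward the observations) stands in for the merge step that produces grid'_t0, and the 4x4 grid and observation locations are made up. The only point is that the same `loss` function scores both pipelines against later observations.

```python
import numpy as np

def forecast_step(grid):
    # Hypothetical stand-in for a forecast model: shift the field one cell east.
    return np.roll(grid, 1, axis=1)

def assimilate(obs_values, obs_idx, grid, weight=0.5):
    # Hypothetical stand-in for the merge step: nudge the observed cells
    # toward the observations, leave unobserved cells unchanged.
    analysis = grid.copy()
    rows, cols = obs_idx
    analysis[rows, cols] = (1 - weight) * grid[rows, cols] + weight * obs_values
    return analysis

def loss(forecast, later_obs, later_idx):
    # Same loss for both pipelines: mean squared error between the forecast
    # grid and later observations, evaluated only where observations exist.
    rows, cols = later_idx
    return float(np.mean((forecast[rows, cols] - later_obs) ** 2))

rng = np.random.default_rng(0)
truth_t0 = rng.normal(size=(4, 4))
truth_t1 = forecast_step(truth_t0)   # toy "true" evolution
grid_t0 = truth_t0 + 1.0             # previous forecast, biased by +1 everywhere

# Sparse observations at t0 (the diagonal) and at t1 (where those cells moved to).
obs_idx = (np.array([0, 1, 2, 3]), np.array([0, 1, 2, 3]))
obs_t0 = truth_t0[obs_idx]
later_idx = (np.array([0, 1, 2, 3]), np.array([1, 2, 3, 0]))
later_obs = truth_t1[later_idx]

# Pipeline A: grid_t0 -> grid_t1
loss_direct = loss(forecast_step(grid_t0), later_obs, later_idx)

# Pipeline B: (observations, grid_t0) -> grid'_t0 -> grid_t1
grid_t0_prime = assimilate(obs_t0, obs_idx, grid_t0)
loss_assim = loss(forecast_step(grid_t0_prime), later_obs, later_idx)

print(loss_direct, loss_assim)
```

In this toy setup the extra step only changes what gets fed into the forecast, not how the result is scored. It says nothing about how hard the merge step is to learn in a real system, where observations are sparse, noisy, and irregularly located.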
dannyz|1 year ago