train this model on a downstream task