Investigate using DVC to store data
We could use a folder on hpc2020 as a repository for the data and fetch it using DVC. When running on hpc2020, DVC would use a filesystem (local) remote pointing at that folder; when running on any other platform, it would use an SSH remote connected to the same folder on hpc2020.
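A minimal sketch of how the remote selection could work, assuming two DVC remotes have already been configured for the same folder (for example with `dvc remote add hpc_local <path>` and `dvc remote add hpc_ssh ssh://<user>@<host><path>`). The storage path, the remote names `hpc_local` and `hpc_ssh`, and the helper functions are hypothetical placeholders, not something decided yet. The idea is to use the filesystem remote whenever the shared folder is visible locally and fall back to the SSH remote otherwise.

```python
import os
import subprocess

# Hypothetical values: the path and remote names are placeholders for illustration.
DATA_DIR = "/path/to/dvc-storage"  # shared folder on hpc2020
LOCAL_REMOTE = "hpc_local"         # filesystem remote pointing at DATA_DIR
SSH_REMOTE = "hpc_ssh"             # SSH remote pointing at the same folder


def choose_remote(data_dir=DATA_DIR, isdir=os.path.isdir):
    """Return the filesystem remote if the folder is visible, else the SSH remote."""
    return LOCAL_REMOTE if isdir(data_dir) else SSH_REMOTE


def pull_command(remote):
    """Build the `dvc pull` command line for the given remote name."""
    return ["dvc", "pull", "--remote", remote]


def pull_data():
    """Fetch the tracked data from whichever remote applies on this machine."""
    subprocess.run(pull_command(choose_remote()), check=True)
```

Calling `pull_data()` from a setup script would then behave the same on hpc2020 and elsewhere. Checking for the folder rather than the hostname is only one option; a hostname check or an environment variable would work equally well.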