Image and Video Compression Standards: Algorithms and Architectures