This is quite a large topic you are getting into. I would suggest starting with some basic computer vision, which will give you the foundation for everything else you will need. A good place to start is OpenCV’s tutorials: Tutorial
This might also help you with the part about automatically recognizing letters. You might need some background information for that, and Cyrill Stachniss’s YouTube channel often has good material. An example is this
After this, as you said yourself, I would look into SLAM and maybe also optical flow, which can be used for navigation. Of the two, optical flow is probably the more straightforward place to start. Both are quite advanced topics if you have no experience with them, so it will take quite some time, but they are also very interesting. Unfortunately, I don’t have any good links to tutorials for these.
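To give a feel for what optical flow actually computes, here is a minimal sketch of the Lucas-Kanade idea in plain Python. It is only an illustration under simplifying assumptions: it estimates a single motion vector for the whole frame (real implementations work on many small windows and image pyramids), and the synthetic test image and the helper names `make_frame` and `lucas_kanade_global` are made up for this example.

```python
import math


def make_frame(width, height, dx=0.0, dy=0.0):
    """Hypothetical synthetic image: a smooth pattern shifted by (dx, dy) pixels."""
    return [
        [math.sin(0.3 * (x - dx)) + math.cos(0.2 * (y - dy)) for x in range(width)]
        for y in range(height)
    ]


def lucas_kanade_global(prev, curr):
    """Estimate one flow vector (u, v) for the whole frame.

    Uses the brightness-constancy equation Ix*u + Iy*v + It = 0 at every
    interior pixel and solves the resulting 2x2 least-squares system.
    """
    height, width = len(prev), len(prev[0])
    sxx = sxy = syy = sxt = syt = 0.0
    for y in range(1, height - 1):
        for x in range(1, width - 1):
            ix = (prev[y][x + 1] - prev[y][x - 1]) / 2.0  # spatial gradient in x
            iy = (prev[y + 1][x] - prev[y - 1][x]) / 2.0  # spatial gradient in y
            it = curr[y][x] - prev[y][x]                  # temporal difference
            sxx += ix * ix
            sxy += ix * iy
            syy += iy * iy
            sxt += ix * it
            syt += iy * it
    det = sxx * syy - sxy * sxy
    if abs(det) < 1e-12:
        raise ValueError("not enough texture to estimate motion")
    # Solve [[sxx, sxy], [sxy, syy]] @ [u, v] = [-sxt, -syt]
    u = (-syy * sxt + sxy * syt) / det
    v = (sxy * sxt - sxx * syt) / det
    return u, v


prev = make_frame(32, 32)
curr = make_frame(32, 32, dx=0.5, dy=-0.3)  # pattern moved by half a pixel or less
u, v = lucas_kanade_global(prev, curr)
print(f"estimated flow: u={u:.3f}, v={v:.3f}")  # should land near (0.5, -0.3)
```

This only works for small, smooth motions, which is why practical versions add pyramids and per-feature windows. Once the idea is clear, OpenCV’s `cv2.calcOpticalFlowPyrLK` (sparse) and `cv2.calcOpticalFlowFarneback` (dense) are the functions to look at for real camera images.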