Scraping Canvas (LMS)

Because my time at university is ending I thought it best to archive the canvas pages available to me for later reference should I not be able to access canvas later if they change platforms or disable my account. I should probably add this is for archival purposes and I will not be able to share the data I was able to collect. Thankfully I was able to get the whole thing going in a few minutes and downloading took a lot longer.

The first snippet I got from here, didn’t complete the first time, it seemed some image was causing issues so I moved to another gist, at this rate we could be done in half an hour 😊.

Unfortunately it also borked out on a similar place,

FileNotFoundError

I think it is because there’s something missing or I don’t have access to it. But the real problem is that its downloading content for a course I didn’t care about because I was enrolled in it but it’s full of junk I’m not interested in, so we can remove it by using the second scrapers code and specifying the course id’s which I had to manually go through, there was about 15 of them but it didn’t take too long. Which gave me the full command.

F:\Downloads\canvas>python canvas.py https://canvas.hull.ac.uk/ 4738~DUI9Nha9weSuemu1M2qsmhljoBcQtR0zghXTs3QA7ECHDHQkpsgBQ9RllbaEwySf output 52497,56148,56149,52493,54499,54452,54456,53441,52496,22257,22274,22276,22277,22278,22279,22280,50664,50656,22275,50652

The access token you can see above should be expired by now. You can do it yourself by downloading the same file and installing python3, pathvalidate and pycanvas. You need to generate a security token from /profile/settings and you can get the course id by clicking on the course like this /courses/56149 . When you generate a new token you should receive an email about it.

Canvas online with our starred modules displated.

I decided to make a small adaptation to catch the FileNotFoundError and went off to the races. It took over an hour so I decided it was best to leave it running overnight, when I returned in the morning I had 116 errors (failed downloads) and the rest is the course content!

Our Canvas Modules saved to Windows File Explorer.

Unfortunately I don’t seem to have the submissions for each of these courses so I needed to manually download them aswell and then our archive was completed.

Thanks for reading.