Lamont Library

Web Scraping (Part 1 of 2)

Friday, April 5, 2024, 9:00am - 12:00pm
Lamont Library - Collaborative Learning Space (Room #B-30)
Harvard ID required, Workshop
Registration has closed.

This two-day workshop teaches participants how to automate the extraction of data from websites and other online repositories into a well-formatted, locally stored dataset for later analysis. Web scraping tools make collecting large amounts of online information more efficient and help automate an otherwise tedious, time-consuming, and error-prone process.

The workshop includes an introduction to web structures and provides direct, hands-on experience with a series of scraping techniques that range from simple to complex, including tools for batch file downloading, a full workflow that uses only browser extensions, and advanced HTML and DOM parsing techniques using Python.

This workshop meets in person from 9 am to 12 pm on two Fridays, April 5th and April 19th, in Lamont Library Room B-30.

Event Organizer

Jessica Cohen-Tanugi