Schedule
Note that the schedule is subject to change. Please check the course website frequently for the latest version.
For your reference: How to read & review a paper? How to give a talk?
Week | Date | Topics | References | Notes |
1 | 09/02 | Introduction | lecture | |
2 | 09/09 | Partition-based Method for String Similarity Join | lecture | |
3 | 09/16 | Partition-based Method for Set Similarity Join | lecture | |
4 | 09/23 | Heap-based Method for Overlap Set Similarity Join | lecture | |
5 | 09/30 | Heap-based Method for Approximate Entity Extraction | lecture | |
6 | 10/07 | Prefix Filtering for Set Similarity Search | lecture | |
7 | 10/14 | Prefix Filtering for String Similarity Search | lecture | |
8 | 10/21 | Product Quantization for Nearest Neighbor Search | | |
9 | 10/28 | Proximity Graph for Nearest Neighbor Search | e.g., Pandas | |
10 | 11/04 | LSH and R-Tree for Nearest Neighbor Search | e.g., Trifacta, OpenRefine | |
11 | 11/11 | Near-duplicate Passage Detection | e.g., Magellan, BigGorilla | |
12 | 11/18 | Program Synthesis for Data Wrangling | | |
13 | 11/25 | Thanksgiving - No Class | | |
14 | 12/02 | Data Cleaning | Guest Lecture | |
15 | 12/09 | Final Exam | | |