An organization that conducts product research has two data science projects. Student researchers could take on either one:
- In a large set of photographs, identify the general type of product pictured (e.g., phone, television, car, bedroom) and sort the photos accordingly.
- In many product reviews and advertisements, perform keyword disambiguation for the terms mentioned (e.g., determine whether "apple" in an advertisement refers to juice or to phones).
Sort various media related to products, including advertisements, reviews, and user feedback, to categorize the media by its subject.
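The keyword disambiguation task described above can be illustrated with a very small sketch. It is not the project's method, only one simple baseline: pick the sense whose cue words overlap most with the surrounding text. The `SENSES` table, its cue words, and the function names are hypothetical and chosen purely for illustration; a real project would likely derive senses from a resource such as Wikidata.

```python
import re

# Hypothetical sense inventory (an assumption for illustration only):
# each sense of a keyword is described by a handful of cue words.
SENSES = {
    "apple": {
        "juice": {"juice", "drink", "fruit", "fresh", "sweet", "bottle"},
        "phone": {"phone", "screen", "battery", "camera", "app", "charger"},
    }
}


def tokenize(text):
    """Lowercase the text and return its set of alphabetic words."""
    return set(re.findall(r"[a-z]+", text.lower()))


def disambiguate(keyword, text, senses=SENSES):
    """Return the sense whose cue words overlap most with the text.

    Returns None when the keyword is unknown or no cue word appears.
    """
    words = tokenize(text)
    candidates = senses.get(keyword.lower(), {})
    best_sense, best_score = None, 0
    for sense, cues in candidates.items():
        score = len(words & cues)  # number of shared cue words
        if score > best_score:
            best_sense, best_score = sense, score
    return best_sense
```

For example, `disambiguate("apple", "The apple battery died and the screen cracked")` would select the `"phone"` sense, because two of that sense's cue words appear in the text and none of the `"juice"` cue words do.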
- Late August 2019
- Students select research projects from an available pool
- Late September 2019
- Proposal presentation
- May 2020
- Project ends
- https://dumps.wikimedia.org/, "A complete copy of all Wikimedia wikis, in the form of wikitext source and metadata embedded in XML."
- d:Wikidata:Data access
- d:Wikidata:How to use data on Wikimedia projects
- Research:Quarry, a tool with a support community which could assist with presenting the list of users who received a block
- Similar efforts
- For image recognition
- For text disambiguation
- Tool: given arbitrary text, return the matching Wikidata IDs
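One possible starting point for such a tool is sketched below. It assumes the public Wikidata `wbsearchentities` API action and looks up a single term at a time; splitting arbitrary text into candidate terms is left out. The helper names (`build_search_url`, `extract_ids`, `lookup`) and the User-Agent string are hypothetical, and this is a sketch rather than a description of any existing tool.

```python
import json
import urllib.parse
import urllib.request

API_URL = "https://www.wikidata.org/w/api.php"


def build_search_url(term, language="en", limit=5):
    """Build a wbsearchentities request URL for one search term."""
    params = {
        "action": "wbsearchentities",
        "search": term,
        "language": language,
        "limit": limit,
        "format": "json",
    }
    return API_URL + "?" + urllib.parse.urlencode(params)


def extract_ids(response):
    """Pull (id, label) pairs out of a decoded wbsearchentities response."""
    return [(hit["id"], hit.get("label", "")) for hit in response.get("search", [])]


def lookup(term, language="en", limit=5):
    """Query Wikidata for a term and return candidate (id, label) pairs.

    Requires network access; the User-Agent value is a placeholder.
    """
    request = urllib.request.Request(
        build_search_url(term, language, limit),
        headers={"User-Agent": "product-research-sketch/0.1 (example)"},
    )
    with urllib.request.urlopen(request, timeout=10) as resp:
        return extract_ids(json.load(resp))
```

Calling `lookup("apple")` would return several candidate items for the term, which is exactly where the disambiguation step becomes necessary: the caller still has to decide which candidate the text refers to.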
- Research Proposal
- Data Product
- Technical Paper
- Research Poster
- Presentation of research at local conference in Charlottesville, Virginia
- Video presentation?
- Essay on ethics?
- Method documentation?