Movement Insights/Movement metrics process
The instructions for the monthly movement metrics report are mostly in the readme of the code repository.
Data dependencies
Normally, you only have to wait for these dependencies to arrive. However, failures sometimes happen, leaving you blocked while waiting for one of them. In that case, contact the people responsible and ask them to fix the problem.
Most of these dependencies are produced by Airflow jobs. To check their status, follow the instructions at wikitech:Data Engineering/Systems/Airflow/Instances#Access.
Datasets owned by other teams
| dataset | expected arrival (day of the month) | Airflow job | notes |
|---|---|---|---|
| mediawiki_history | day 3-5 | main:mediawiki_history_denormalize | We receive an email alert when it is done (T357472) |
| editors_daily | | main:editors_daily_monthly | |
| pageview_hourly | | main:pageview_hourly | |
| virtualpageview_hourly | | main:virtualpageview_hourly | |
| net new pages API | day 5-10 | done if contributor and content data for the new month is available on Wikistats | |
| wmf_ | | main:unique_devices_per_project_family_monthly | |
| research.article_features, research.article_quality_scores | updated daily | research:article_features (code) | Used to generate content gaps data. Depends on mediawiki_content_history_v1 |
| content_ | day 11-13 | research:knowledge_gaps (code) | The notebooks can be safely re-run to incorporate these without affecting previously generated metrics |
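
As a rough aid for spotting a blocked dependency, the arrival windows in the table above can be compared against the current day of the month. The following is a minimal sketch and not part of the code repository: the `ARRIVAL_WINDOWS` mapping and the `overdue_dependencies` helper are hypothetical names, and only rows whose dataset name and arrival window are both fully stated in the table are included.

```python
from datetime import date

# Expected arrival windows as (first day, last day) of the month, copied from
# the table above. Only rows with a complete name and window are included.
ARRIVAL_WINDOWS = {
    "mediawiki_history": (3, 5),
    "net new pages API": (5, 10),
}


def overdue_dependencies(today, arrived):
    """Return datasets whose arrival window has passed but which have not arrived.

    `arrived` is a set of dataset names you have already confirmed as present,
    for example by checking the status of the Airflow job by hand.
    """
    return sorted(
        name
        for name, (_first_day, last_day) in ARRIVAL_WINDOWS.items()
        if today.day > last_day and name not in arrived
    )


if __name__ == "__main__":
    # Hypothetical example: on day 8, only mediawiki_history is past its window.
    print(overdue_dependencies(date(2024, 6, 8), arrived=set()))
```

Anything this helper returns is a candidate for contacting the responsible team. It does not check Airflow itself, so it only reflects what you pass in as `arrived`.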
Datasets owned by Movement Insights
Our movement_metrics job, which is scheduled to run on day 7 of the month, generates the following intermediate datasets.
- wmf_product.active_editors
- wmf_product.content_interactions
- wmf_product.global_markets_pageviews
- wmf_product.editor_month
- wmf_product.new_editors
- wmf_product.pageviews_corrected
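
To estimate when these intermediate datasets should next be refreshed, the day-7 schedule can be turned into a concrete date. This is a small sketch under the assumption that "day 7 of the month" refers to the calendar date on which the job starts. The `RUN_DAY` constant and the `next_scheduled_run` helper are hypothetical names and do not come from the job's actual configuration.

```python
from datetime import date

# Day of the month on which movement_metrics is scheduled, per the text above.
RUN_DAY = 7


def next_scheduled_run(today):
    """Return the next date (today included) on which the job is scheduled to run."""
    if today.day <= RUN_DAY:
        return today.replace(day=RUN_DAY)
    # This month's run has passed: roll over to the next month,
    # and to January of the next year when today is in December.
    if today.month == 12:
        return date(today.year + 1, 1, RUN_DAY)
    return date(today.year, today.month + 1, RUN_DAY)


if __name__ == "__main__":
    print(next_scheduled_run(date.today()))
```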