Wikimedia Italia/Web app Wiki Loves Monuments/Backend
Overview
This document describes how to configure the web application "app.wikilovesmonuments" and the related data scraping processes. The administration interface is built on the "Django" web framework, and in particular on its "django-admin" component, which provides a web-based interface for searching, viewing, and modifying the entities managed within the project.
Access to the administration platform is available at the URL:
https://wlm-it-visual.wmcloud.org/admin/
The wikilovesmonuments app allows the management of WLM contests in various geographical contexts, managing the related scraping and configuration data separately.
To manage a WLM contest, it will be necessary to perform a series of configurations related to the specific geographical context (i.e., the countries for which you want to manage the contests). Each country can be configured independently both in terms of data and contests and their respective dates.
In the following sections, all the entities that need to be configured to manage the contest in a country will be illustrated.
Workflow
The WLM application is based on a server-side web application that:
- defines a data model and the SQL database structure
- schedules and runs a data scraping process that periodically updates WLM data
- provides a REST API to access data from the web frontend
The administration interface described in this document controls the contents of the SQL database and the scheduling of the scraping processes and is contained in the server application.
Structure of the editing interface
The editing interface, based on the "django-admin" component, manages the configuration by filling in records in the database. The interface has a sidebar on the left that lists the manageable data models. Clicking a data model opens a page listing its instances, which allows:
- listing of instances, through which it is possible to manage the single instance
- searching (through the filter bar above the list). This feature is enabled only for some data models
- filtering (through the right sidebar). This feature is enabled only for some data models.
Country definition (GeoContext)
The definition of the characteristics of countries, or more generally "geographical contexts," is the basis for managing a contest related to the country itself.
The configuration involves defining an instance of the "GeoContext" data model, selectable from the left menu of the administrative panel. In the rest of the documentation, the concept of "Country" or "GeoContext" will be referred to equivalently.
As with all entities managed by the interface, access to the list of defined geographical contexts is done using the left sidebar.
To create the country, the following fields must be populated:
Fields for basic configuration
- label: label of the GeoContext within the system (reference from other entities, filters) - mandatory
- description: optional description of the GeoContext.
- country code: two-letter country code
- monument definition: definition of the monument (e.g., for Italy: "Italian monument")
- app domain: web domain (e.g., for Italy: "WLM.it")
- language code: language code used when proposing the "upload wizard" mode within the web app. This code is used to create the correct link to the wizard.
- commons category label: label to be used in commons categorization when images are uploaded.
- flag: unicode symbol of the GeoContext flag
- Timezone name: name of the time zone (field "TZ Identifier" from this table: https://en.wikipedia.org/wiki/List_of_tz_database_time_zones). This data is used to correctly determine the start date and time of the contests and activate them in the web app.
Fields for configuring the map on the public web app
- centroid: point for the initial positioning of the map center on the public web app
- zoom level: initial map level on the public web app (5-20)
Fields for configuring the default intro text on public web app
The introduction text "what is WLM" is configured to change some contents based on the currently selected country. Two fields in the country configuration control this behavior:
- reference email
- reference website
Fields for configuring geographical entities
Geographical entities are defined at 3 levels:
- region
- province
- municipality
These entities are populated with a procedure that queries the OpenStreetMap database using the "Overpass" query language (https://wiki.openstreetmap.org/wiki/Overpass_API). The procedure is configured through the following fields:
- regions overpass query: overpass query for selecting entities at the region level
- regions name tag: name of the tag to be used in the overpass query result of the regions from which to take the label
- provinces overpass query: overpass query for selecting entities at the province level
- provinces name tag: name of the tag to be used in the overpass query result of the provinces from which to take the label
- municipalities overpass query: overpass query for selecting entities at the municipality level
- municipalities name tag: name of the tag to be used in the overpass query result of the municipalities from which to take the label
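As a rough illustration of how a "name tag" field could be applied, the sketch below reads the label of each entity from a response in the Overpass API JSON format. The sample query, the function name and the sample payload are hypothetical and are not taken from the application.

```python
# Illustrative Overpass QL query for region-level entities in Italy.
# ASSUMPTION: this is only an example, not the query configured in production.
REGIONS_OVERPASS_QUERY = """
[out:json];
area["ISO3166-1"="IT"]->.country;
relation["boundary"="administrative"]["admin_level"="4"](area.country);
out tags;
"""


def extract_labels(overpass_json, name_tag):
    """Return {osm_id: label}, taking the label from the configured name tag."""
    labels = {}
    for element in overpass_json.get("elements", []):
        tags = element.get("tags", {})
        # Entities without the configured tag are skipped (an assumption).
        if name_tag in tags:
            labels[element["id"]] = tags[name_tag]
    return labels


# Hypothetical response fragment in the Overpass JSON output format.
sample = {
    "elements": [
        {"type": "relation", "id": 1, "tags": {"name": "Lombardia", "admin_level": "4"}},
        {"type": "relation", "id": 2, "tags": {"admin_level": "4"}},
    ]
}
print(extract_labels(sample, "name"))
```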
Fields related to the configuration of the donation popup
The following fields govern the functionalities related to the donation popup, a popup window that can appear on the web app to invite users to make a donation. The activation of this functionality and the related operating parameters are specific to each country and are managed by the following fields:
- enable donations text: flag that enables the donation popup mechanism
- donations popup probability: probability of displaying the donation popup after selecting a monument in the web app. It is a number from 0 to 100 that indicates the probability that the popup will be shown.
- Geo context donation texts: HTML texts to be displayed in various languages within the donation popup
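The interaction between the enabling flag and the 0-100 probability can be sketched as follows. This is a minimal illustration, not the web app's actual code; the function name and the injectable `rng` parameter are assumptions made for the example.

```python
import random


def should_show_donation_popup(enabled, probability, rng=random.random):
    """Decide whether to show the popup after a monument is selected.

    enabled: value of the "enable donations text" flag.
    probability: the 0-100 value from the country configuration.
    rng: callable returning a float in [0, 1); injectable for testing.
    """
    if not enabled:
        return False
    # Scale the random draw to [0, 100) and compare with the configured value.
    return rng() * 100 < probability
```

With this reading, a probability of 0 never shows the popup and a probability of 100 always shows it.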
Fields for Limesurvey configuration
The following field controls the "survey" link in the frontend app (used to report problems about a monument):
- survey link: the base link for the Limesurvey form associated with the country. If this field is filled in, the survey will be accessible from the frontend application in two places: from the main menu, without reference to a single monument, and from the single monument detail, with a link that embeds information about the monument
Fields for Commons scraping configuration
The following fields control the configuration of the "Commons scraping" process that can be enabled for each country:
- Commons scraping enabled: if flagged, a full Commons scraping process will be performed after each data scraping process.
- Commons scraping parameter: the number of days needed to complete a full cycle of "partial" Commons scraping (the parameter "A" described in the "Commons scraping" section)
Geographical contexts administrators
From the "Geo context admins" section, it is possible to create new administrators for geographical contexts.
To do this, it is necessary to create a new record specifying the reference GeoContext and the user.
The user must already be registered in the system and have a password.
If the user has registered through the web app, i.e., by logging in on Commons, they will not have a password to access the administration site. In this case, the system will generate a temporary password that will be shown after enabling the user. This password must be communicated to the user who can subsequently change it, once they have logged in for the first time, through the page:
https://wlm-it-visual.wmcloud.org/admin/password_change/
It is also possible to force the generation of a new password (even if one is already defined) through the "Generate or regenerate password" flag present in the administrator creation form. It should be noted that this password is not linked to access to the web app, but only to the administration interface.
The generated password will be shown only once after saving. If the password needs to be recovered, edit the record and save it with the "Generate or regenerate password" checkbox flagged; a new password will be generated.
Note that each user who is an administrator of one or more geographical contexts will only be able to create administrators for those contexts.
Icons
The management of icons allows generating the set of icons that can be associated with each category of the WLM contest and are used in the contest web application to display monuments on maps and lists.
The icons are associated with each geographical context, allowing independent management.
Specifically, for each category of monument, a graphic symbol is defined, which is used to identify the type, plus several "themes":
- number of photos: symbol on background color, with 3 colors indicating the absence of photos, the presence of a number of photos from 1 to 10, or the presence of more than 10 photos
- contest: symbol filled with white if the monument is in the contest and black if the monument is out of the contest
The combinations of these two themes lead to the pre-generation of 6 icons to be used on maps, plus other "partial" renderings, such as white, black, and the primary color symbol of the web app theme on a transparent background.
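The count of 6 map icons follows from combining the 3 "number of photos" states with the 2 "contest" states. The sketch below enumerates them; the theme identifiers and the naming scheme are hypothetical and chosen only for illustration.

```python
from itertools import product

# ASSUMPTION: identifiers are illustrative, not the names used by the backend.
PHOTO_THEMES = ["no_photos", "1_to_10_photos", "more_than_10_photos"]
CONTEST_THEMES = ["in_contest", "out_of_contest"]


def photo_theme(photo_count):
    """Map a photo count to one of the three background-color themes."""
    if photo_count == 0:
        return "no_photos"
    if photo_count <= 10:
        return "1_to_10_photos"
    return "more_than_10_photos"


def icon_variants(symbol):
    """List the pre-generated map icon variants for one category symbol."""
    return [f"{symbol}-{p}-{c}" for p, c in product(PHOTO_THEMES, CONTEST_THEMES)]


print(len(icon_variants("castle")))  # 3 photo themes x 2 contest themes
```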
The generation of these icons starts from uploading an .svg file containing the symbol. This file must contain a symbol with black fill on a transparent background. For most of the icons configured for the 2024 contest, icons from the "Maki" icon set released under the "Creative Commons CC0 Public Domain Dedication" license were used, but any icon in svg format can be used.
Management is done through the "Icons" section of the administration panel with the usual interface for listing, creating, and modifying/deleting.

App categories
App categories are defined for each geographical context and are applied to monuments during the scraping phase. Specifically, each monument will be associated with a single app category.
For each category, the following fields must be defined:
- geo context: selection of the geographical context for which the category is defined
- name: name of the category (both in the web app and in the administration interface)
- sector: optional string related to the management of local contests (see the relevant section)
- order: display order of the category in the web app
- icon: icon to be associated with the category (see the section on icons)
- is municipality: special flag indicating that the category is related to the "overview" of municipalities
- is other monuments: special flag indicating that the category is generic and indicates non-belonging to a specific category. If a category with this flag exists, it is used by the scraping process to assign this category if it is not possible to assign another one based on the configured categorization rules.
App categories, in addition to enabling filtering and theming on the public web app, are used in the scraping process along with "Category Rules".
Contests
Through the "Contests" section, it is possible to configure the contests for the various managed countries.
Managing a contest involves defining the following fields:
- label: label used within the administration interface
- start date: start date of the contest
- end date: end date of the contest
The dates determine whether a given contest is active or not in the web app. For each GeoContext, it is not possible to enter two overlapping contests in terms of dates (only one contest can be active).
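A minimal sketch of how the contest dates and the GeoContext "Timezone name" could be combined is shown below. It assumes that start and end are stored as local date-times of the country and that both bounds are inclusive; the function names are hypothetical.

```python
from datetime import datetime, timezone
from zoneinfo import ZoneInfo  # standard library, Python 3.9+


def is_contest_active(start, end, tz_name, now_utc):
    """start/end: naive datetimes in the country's local time.

    tz_name: "TZ identifier" from the GeoContext, e.g. "Europe/Rome".
    now_utc: timezone-aware datetime for the current instant.
    """
    tz = ZoneInfo(tz_name)
    local_now = now_utc.astimezone(tz)
    return start.replace(tzinfo=tz) <= local_now <= end.replace(tzinfo=tz)


def contests_overlap(a_start, a_end, b_start, b_end):
    """True if two contests of the same GeoContext share any instant."""
    return a_start <= b_end and b_start <= a_end


start = datetime(2025, 9, 1, 0, 0)
end = datetime(2025, 9, 30, 23, 59, 59)
# 22:30 UTC on 31 August is already 00:30 on 1 September in Rome (UTC+2).
now = datetime(2025, 8, 31, 22, 30, tzinfo=timezone.utc)
print(is_contest_active(start, end, "Europe/Rome", now))
```

The same instant evaluated with a different time zone name can give a different answer, which is why the time zone must be set correctly for each country.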
The following additional fields are not currently used within the web app:
- description
- link
Local contests
The concept of "Local Contest" was defined so that, when a user submits images during the contest, they are also categorized for participation in local contests.
Defining "Local contests" requires filling in the following fields:
- contest: the contest to which the local contest refers
- label: a label that will be used in the categorization of images in case of a match with the local contest
- has award: indicates whether the local contest offers a prize
- sparql: optional SPARQL query to determine Q numbers for which the local contest is active
- regions: any regions for which the local contest is defined
- provinces: any provinces for which the local contest is defined
- municipalities: any municipalities for which the local contest is defined
Note that for the local contest to be significant, at least one of the fields sparql, regions, provinces, or municipalities must be populated.
It is also possible to define a series of exceptions by entering a series of Q numbers, for which, even if there is a match for the previously defined parameters, the monument is excluded from the local contest.
At the time of image submission, the inclusion of a monument in a local contest is determined by analyzing the "Local contests" configured for the reference country.
Specifically, the procedure is as follows: for each defined local contest associated with the current contest, at the time of selecting the monument for image submission, the following conditions are evaluated:
- the inclusion of the monument's Q number in the local contest's exclusion list. If positive, the monument will not be part of the contest, and the next contest is evaluated.
- the inclusion of the monument with any geographical entities defined for the contest. Additionally, if a SPARQL query is defined for the local contest, the monument's Q number is compared with the query results. If the monument belongs to the geographical areas or its Q number is among the SPARQL query results of the local contest, the commons category constructed with the following template is added: "Images from Wiki Loves Monuments ||year|| in ||country|| - ||local_contest.label||". Additionally, if the monument is associated with an "App category" and this app category has a defined "sector" property, the commons category generated by the template "Images from Wiki Loves Monuments ||year|| in ||country|| - ||local_contest.label|| - ||monument.app_category.sector||" is also added.
- if the monument does not fall into any local contest or falls into local contests that do not have the "has award" flag set to "True", the commons category "Images from Wiki Loves Monuments ||year|| in ||country|| - without local award" is added.
In the previous paragraphs, when referring to the "country" placeholder within category templates, the "commons_category_label" field defined on the referenced GeoContext will be used.
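The procedure above can be sketched as follows. This is one possible reading of the rules, not the backend's implementation: the data classes, their field names and the function name are hypothetical, and the SPARQL query is represented by its already-computed set of Q numbers.

```python
from dataclasses import dataclass, field
from typing import Optional


@dataclass
class LocalContest:
    label: str
    has_award: bool = False
    excluded_q: set = field(default_factory=set)   # exception list
    sparql_q: set = field(default_factory=set)     # Q numbers returned by the SPARQL query
    regions: set = field(default_factory=set)
    provinces: set = field(default_factory=set)
    municipalities: set = field(default_factory=set)


@dataclass
class Monument:
    q_number: str
    region: Optional[str] = None
    province: Optional[str] = None
    municipality: Optional[str] = None
    sector: Optional[str] = None  # "sector" of the monument's app category, if any


def local_contest_categories(monument, local_contests, year, country):
    """country is the GeoContext "commons category label"."""
    base = f"Images from Wiki Loves Monuments {year} in {country}"
    categories = []
    has_award = False
    for lc in local_contests:
        if monument.q_number in lc.excluded_q:
            continue  # excluded: evaluate the next local contest
        in_area = (
            monument.region in lc.regions
            or monument.province in lc.provinces
            or monument.municipality in lc.municipalities
        )
        if in_area or monument.q_number in lc.sparql_q:
            categories.append(f"{base} - {lc.label}")
            if monument.sector:
                categories.append(f"{base} - {lc.label} - {monument.sector}")
            has_award = has_award or lc.has_award
    if not has_award:
        categories.append(f"{base} - without local award")
    return categories
```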
Blacklisted monuments
The administration panel allows defining a list of "Blacklisted monuments" by adding records to the relevant admin section. Monuments included in this list will not be part of the current contest, even if their properties classify them as participants. The feature is available for each GeoContext.
Single item insertion
In order to add an item to this list, a new record should be created. The blacklist records have the following fields:
- Geo context: the country for which the blacklisting should take effect (automatically limited to the administered country for administrator of a single country)
- Q number: the Q number identifying the monument on Wikidata
- Notes: optional text notes
Once the record is saved, another field named "label" will be populated by a lookup on the Wikimedia API that determines the label for the referenced page.
Bulk insertion
To make it easier to enter a list of blacklisted monuments, the administration panel provides a dedicated interface for uploading a text file. The text file should contain one Q number per row.
All the Q numbers found in the file (and not already present in the list) will be added as blacklisted records. The bulk insertion is available by visiting the link named "Upload a text file" present on the list page of the black listed monuments.
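The handling of an uploaded file can be sketched as below. The function name is hypothetical, and silently skipping blank or malformed rows is an assumption; the actual interface may report them instead.

```python
import re

# A Q number is "Q" followed by a positive integer.
Q_NUMBER = re.compile(r"^Q[1-9]\d*$")


def new_blacklist_entries(text, existing):
    """Return the Q numbers in the text (one per row) not already in the list."""
    added = []
    seen = set(existing)
    for row in text.splitlines():
        q = row.strip()
        if Q_NUMBER.match(q) and q not in seen:
            seen.add(q)  # also de-duplicates rows repeated inside the file
            added.append(q)
    return added


print(new_blacklist_entries("Q1\nQ2\n\nQ1\nfoo\n Q3 ", {"Q2"}))
```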
Queries
The "queries" section defines the queries that are executed during the data scraping phase. When periodic data scraping is performed for a country (GeoContext), all queries are processed.
As defined by the fields described below, some are used for contest-related categorization (i.e., assignment of "App category"), while others are processed without performing this categorization. In the latter case, if defined, the app category with the "is other monuments" flag is associated.
Queries are defined for each GeoContext and are essentially SPARQL queries executed against the "Wikidata Query Service" (https://query.wikidata.org/).
The fields to be filled in to define a query are as follows:
- geo context: the GeoContext for which the query is defined
- label: query label in the administration system and in the web app scraping panel
- sparql: SPARQL code to be executed, see the following sections of this page for a more detailed explanation
- description: free description field of the query
- categorize for app: boolean flag to indicate that the query results should be evaluated through the "Categorization Rules" to classify the monuments according to the "App categories" and related rules defined in the system
- data categories: "Data categories" to be associated with the monuments. The selected categories are assigned to all monuments resulting from the query
- placeholder: any placeholder to be expanded (see the next section)
SPARQL Syntax and "Query Placeholders"
To prevent particularly complex queries from timing out, either during the search or during result serialization, a feature for splitting a run into parts has been implemented. It repeats the same query based on a placeholder, which is replaced at run time by each value in a list. One query is executed for each value defined on the placeholder, and the results are then concatenated.
In the following example, this strategy was used to execute the "WLM" query on Italy, actually performing a query for each Italian region. The placeholder in question in this case is the string ITA_REGION.
To function correctly, this placeholder must be present in the system and selected among the query configuration fields (field "placeholder").
Example of a query with placeholders:
SELECT DISTINCT
?mon ?monLabel
?locationLabel
?article ?commonsCat ?geo ?wlm
(group_concat(DISTINCT ?instanceOf; separator=";") as ?instanceOf_n)
(group_concat(DISTINCT ?parent; separator=";") as ?parent_n)
(group_concat(DISTINCT ?children; separator=";") as ?children_n)
(group_concat(DISTINCT ?place; separator=";") as ?place_n)
(group_concat(DISTINCT ?start; separator=";") as ?start_n)
(group_concat(DISTINCT ?end; separator=";") as ?end_n)
(group_concat(DISTINCT ?approvedBy; separator=";") as ?approvedBy_n)
(group_concat(DISTINCT ?endorsedBy; separator=";") as ?endorsedBy_n)
(group_concat(DISTINCT ?accreditedBy; separator=";") as ?accreditedBy_n)
(group_concat(DISTINCT ?relevantImage; separator=";") as ?relevantImage_n)
(SAMPLE(?address) as ?address)
(SAMPLE(?place) as ?adminEntity)
(SAMPLE(?location) as ?location)
WHERE {
SERVICE wikibase:label { bd:serviceParam wikibase:language "it, [AUTO_LANGUAGE]". }
#-FILTER-MONUMENT-PLACEHOLDER-#
# select monuments participating in WLM in Italy
?mon wdt:P17 wd:Q38;
#region selection parameterized at run time;
wdt:P131* wd:ITA_REGION;
p:P2186 ?wlms.
?wlms ps:P2186 ?wlm.
?mon wdt:P31 ?instanceOf .
OPTIONAL { ?wlms pq:P580 ?start . }
OPTIONAL { ?wlms pq:P582 ?end . }
OPTIONAL { ?wlms pq:P790 ?approvedBy. }
OPTIONAL { ?wlms pq:P8001 ?endorsedBy. }
OPTIONAL { ?wlms pq:P5514 ?accreditedBy. }
OPTIONAL { ?mon wdt:P625 ?geo . }
OPTIONAL { ?mon wdt:P131 ?place . }
OPTIONAL { ?mon wdt:P361 ?parent }
OPTIONAL { ?mon wdt:P527 ?children }
OPTIONAL { ?mon wdt:P18 ?relevantImage . }
OPTIONAL { ?mon wdt:P373 ?commonsCat . }
OPTIONAL { ?mon wdt:P276 ?location . }
OPTIONAL {
?article schema:about ?mon ;
schema:isPartOf <https://it.wikipedia.org/> .
}
OPTIONAL { ?mon wdt:P6375 ?address . }
}
GROUP BY ?mon ?monLabel ?locationLabel ?article ?commonsCat ?geo ?wlm
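The expansion of a placeholder such as ITA_REGION can be sketched as a plain string substitution followed by concatenation of the partial results. The function names are hypothetical, `run_query` stands for whatever callable sends a query to the SPARQL endpoint, and the Q numbers used in the example are made up.

```python
def expand_placeholder(sparql, symbol, values):
    """Return one query per placeholder value."""
    return [sparql.replace(symbol, value) for value in values]


def run_partialized(sparql, symbol, values, run_query):
    """Run each expanded query and concatenate the result rows.

    run_query: caller-supplied callable taking a SPARQL string and
    returning a list of rows (hypothetical interface).
    """
    results = []
    for query in expand_placeholder(sparql, symbol, values):
        results.extend(run_query(query))
    return results


# Made-up placeholder values, for illustration only.
queries = expand_placeholder("?mon wdt:P131* wd:ITA_REGION;", "ITA_REGION", ["Q111", "Q222"])
print(queries)
```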
Configuration of "Query Placeholders"
The "Query placeholders" introduced in the previous section are entities managed on the administration site in the appropriate section.
Each placeholder is characterized by the following fields:
- geo context: the GeoContext for which it is defined
- symbol: the text string that will be used in the SPARQL queries to call this placeholder
- query placeholder values: the list of values that will be expanded in place of the placeholder in the SPARQL query. For each value, it is possible to add a comment that helps in management.
Below is an example of defining placeholders for the query mentioned earlier:

Placeholder for monument update
The web application has an additional function that scrapes a single monument to update its data. To perform this update, the system looks, among the queries defined for the relevant GeoContext, for a query containing the special placeholder:
#-FILTER-MONUMENT-PLACEHOLDER-#
The single monument update procedure requires that within the query code, this special placeholder is replaced with the code
FILTER(?mon = wd:|q_number|).
where |q_number| represents the q number of the monument for which the update is requested.
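A minimal sketch of this substitution is shown below. The function name and the validation of the Q number are assumptions added for the example.

```python
import re

MONUMENT_PLACEHOLDER = "#-FILTER-MONUMENT-PLACEHOLDER-#"


def single_monument_query(sparql, q_number):
    """Restrict a scraping query to one monument."""
    # Validate the Q number before splicing it into the query text.
    if not re.fullmatch(r"Q[1-9]\d*", q_number):
        raise ValueError(f"not a Q number: {q_number!r}")
    if MONUMENT_PLACEHOLDER not in sparql:
        raise ValueError("query does not contain the monument placeholder")
    return sparql.replace(MONUMENT_PLACEHOLDER, f"FILTER(?mon = wd:{q_number}).")


print(single_monument_query("WHERE {\n#-FILTER-MONUMENT-PLACEHOLDER-#\n}", "Q42"))
```

Because the placeholder starts with "#", it is a SPARQL comment and has no effect when the query is run unmodified during the periodic scraping.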

WLM Category Rules
The "WLM Category Rules" are entities that serve to manage the categorization of SPARQL query results to apply the correct category for the web app to the monuments resulting from the query.
As indicated in the query configuration section, not all queries are subject to this categorization process. The process may be "opted in" by activating the flag "Categorize for app" on the query.
Categorization is specific to each GeoContext and is managed by the "WLM Category Rules" section of the administration site.
The definition of a categorization rule is done through the following fields:
- Geo context: the GeoContext for which it is defined
- App category: the category among the "App Category" entities defined in the system for which the rule is defined
- order: order of rule evaluation
- WLM category rules predicates: a list in which it is possible to indicate a set of properties (selectable from a predefined set) and their respective values, which are used to evaluate whether the monument subject to categorization satisfies the rule.
The procedure for categorizing a monument is as follows:
- the categorization rules in the system for the GeoContext on which the scraping is being performed are evaluated in order (based on the "order" field defined above)
- if all the predicates defined for a rule are satisfied, the category associated with the rule is selected, and the evaluation ends
- if no rule is satisfied, the monument is categorized with the category that has the is other monuments flag, if defined for the GeoContext
This mechanism is based on the fact that the SPARQL queries candidate for categorization provide an output set of fields compatible with those evaluated by the predicates specified for each rule.
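The evaluation order described above can be sketched as follows. The data shapes are assumptions: a monument is modeled as a mapping from property name to a list of values (for example, a semicolon-separated query field split into a list), and a predicate is a (property, value) pair that is satisfied when the value is among the monument's values.

```python
def categorize(monument, rules, other_category=None):
    """Return the app category of the first rule whose predicates all match.

    other_category: the category flagged "is other monuments", if defined.
    """
    for rule in sorted(rules, key=lambda r: r["order"]):
        predicates = rule["predicates"]
        if all(value in monument.get(prop, []) for prop, value in predicates):
            return rule["app_category"]  # first satisfied rule wins
    return other_category


# Illustrative rules; property names and Q numbers are examples only.
rules = [
    {"order": 2, "app_category": "Churches", "predicates": [("instanceOf", "Q16970")]},
    {"order": 1, "app_category": "Castles", "predicates": [("instanceOf", "Q23413")]},
]
print(categorize({"instanceOf": ["Q23413", "Q16970"]}, rules))
```

Since evaluation stops at the first satisfied rule, more specific rules should be given a lower "order" than generic ones.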
Scraping processes
For each country, there is a set of "scraping processes" that must be enabled (and run at least once) in order to update the internal database. The processes are the following:
- data scraping: the process that runs the queries against the Wikidata SPARQL endpoint to retrieve the relevant monuments for the country and saves their properties to the platform database.
- commons scraping: the process that updates the current images on Commons available for each monument in the database
- OSM scraping: the process that creates or updates the geographical entities related to the country (regions, provinces, municipalities)
All the listed processes are run by an asynchronous task runner (Celery).
Data scraping
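The data scraping process is responsible for updating the database with all relevant monuments. The process is controlled by a set of SPARQL queries, described in the "Queries" section of this documentation. The data scraping process may be activated by scheduling a "data_scraping" job (see the "Jobs scheduling" section) or manually from the frontend application (see the "Manual runs - frontend application interface" section).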
Commons scraping
The commons scraping process is responsible for updating the database with references to the existing Commons images for each monument in the database. This task is quite expensive, as a couple of API calls are made against the Wikimedia API for each monument.
There are 3 ways of triggering a commons scraping job:
- with data scraping: if in the country configuration the "Commons scraping enabled" field is flagged, a full commons scraping will be performed after that task
- by scheduling a "full" commons scraping job in the Jobs configuration (see the "Jobs scheduling" section below)
- by scheduling a "partial" commons scraping job in the Jobs configuration (see the "Jobs scheduling" section below)
The "partial" commons scraping updates the Commons information for only a subset of the monuments on any given day. On day D, only the Q numbers that satisfy the following test are included:
Qnumber mod A = Julian(Day) mod A (default value: A = 30)
Example: day = 1/4/2025, hence Julian day (the day of the year) = 91, hence Julian(Day) mod 30 = 1. Q94142816 is excluded because 94142816 mod 30 = 26, whereas Q55056361 is included because 55056361 mod 30 = 1.
This method guarantees that all items are refreshed from Commons after A days, and that each single day requires only 1/A of the total effort. The parameter "A" can be controlled by setting the "Commons scraping parameter" field in the country administration interface.
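The selection test can be sketched in a few lines. The sketch assumes that "Julian(Day)" means the ordinal day of the year (1 to 366), which is what a value such as 91 suggests; the function name is hypothetical.

```python
from datetime import date


def is_scraped_on(q_number, day, a=30):
    """True if the monument is part of the partial Commons scraping on `day`.

    ASSUMPTION: Julian(Day) is the ordinal day of the year.
    """
    numeric_id = int(q_number.lstrip("Q"))
    day_of_year = day.timetuple().tm_yday
    return numeric_id % a == day_of_year % a


day = date(2025, 4, 1)  # day of the year 91, and 91 mod 30 = 1
print(is_scraped_on("Q55056361", day))  # 55056361 mod 30 = 1
print(is_scraped_on("Q94142816", day))  # 94142816 mod 30 = 26
```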
OSM scraping
This process is responsible for updating the geographical information for a country. Normally there is no need to run this task periodically, but it should be run at least once per country. This process is started manually from the WLM app (see the relevant documentation in the app section).
Jobs scheduling
The scheduling of the long-running scraping jobs is configured in the "Jobs" section of the admin interface, exposed under the "Cron tools" section of the django admin.
This interface allows scheduling a job to run periodically or once, by creating a "Job" record. Each job is described by the following fields:
- execution time: date and time of execution for a single-run job. If a periodic run is configured, this field must be left blank
- cron expression: unix cron-syntax expression for describing a periodical job run. If a single run is being configured this field must be left blank
- job type: the job being run, selectable from a list of available jobs
- kwargs: arguments for the job, in form of a JSON object
- tag: a free text that can be used to describe the job
The available jobs are the following:
- data_scraping: executes a data scraping for a country. The argument that must be passed is the "geo_context_id", which refers to the id of the country for which the job is performed (visible in the country administration section, in the list of available countries)
- commons_full_scraping: executes a full commons scraping for a country. The argument that must be passed is the "geo_context_id", which refers to the id of the country for which the job is performed (visible in the country administration section, in the list of available countries)
- commons_partial_scraping: executes a partial commons scraping for a country. The argument that must be passed is the "geo_context_id", which refers to the id of the country for which the job is performed (visible in the country administration section, in the list of available countries)
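As an illustration of the constraints on a job record, the sketch below checks that exactly one of the two scheduling fields is set and that the kwargs contain "geo_context_id". The dictionary keys, the validation function and the sample values (cron expression, id 1) are hypothetical; only the job type names and the "geo_context_id" argument come from the text above.

```python
import json

JOB_TYPES = {"data_scraping", "commons_full_scraping", "commons_partial_scraping"}


def validate_job(job):
    """Return a list of problems found in a job record (empty if valid)."""
    errors = []
    # A job is either single-run (execution time) or periodic (cron), never both.
    if bool(job.get("execution_time")) == bool(job.get("cron_expression")):
        errors.append("set exactly one of execution_time and cron_expression")
    if job.get("job_type") not in JOB_TYPES:
        errors.append("unknown job type")
    try:
        kwargs = json.loads(job.get("kwargs") or "{}")
    except json.JSONDecodeError:
        kwargs = None
    if not isinstance(kwargs, dict) or "geo_context_id" not in kwargs:
        errors.append("kwargs must be a JSON object with geo_context_id")
    return errors


# Hypothetical periodic job: partial Commons scraping every day at 03:00.
nightly = {
    "cron_expression": "0 3 * * *",
    "job_type": "commons_partial_scraping",
    "kwargs": '{"geo_context_id": 1}',
    "tag": "nightly partial Commons refresh",
}
print(validate_job(nightly))
```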
Manual runs - frontend application interface
The frontend application hosted at app.wikilovesmonuments.it has some administrative features related to scraping. These features can be used by GeoContext admins to manually control the scraping processes and are available only in the desktop version of the app.
If the logged-in user has administrative permissions for the currently selected country, 3 buttons in the top bar give access to the three scraping processes. Clicking each button opens a dedicated page that shows:
- the date and time of the last run
- the state of the running process, if present
- a button to start and stop the current process.

Please note that the OSM scraping can be triggered only from this interface, while data and commons jobs are normally scheduled.
Single monument update
For country administrators, the frontend application provides a feature for updating a single monument's data from Wikidata. This feature is available from the detail page of a single monument. At the bottom of the monument data, there are two buttons: one for performing a data update and one for temporarily disabling the monument in the app. A disabled monument remains visible in the app to administrators only.

Languages
The administration of languages and related settings can only be managed by super users (not simple country administrators).
In the web app, the concept of language is separate from that of the country where the contest is managed.
In the web app, the current browser language is initially selected. If the browser language is not among those configured in the system, the English language is selected. The user can change the current language at any time.
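The initial language selection can be sketched as below. The function name is hypothetical, and reducing a browser tag such as "it-IT" to its primary subtag is an assumption about how the comparison is made.

```python
def initial_language(browser_language, configured_codes, default="en"):
    """Pick the browser language if configured, otherwise fall back to English."""
    # ASSUMPTION: only the primary subtag ("it" in "it-IT") is compared.
    code = browser_language.split("-")[0].lower()
    return code if code in configured_codes else default


print(initial_language("it-IT", {"it", "en"}))
print(initial_language("fr", {"it", "en"}))
```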
To manage languages, it is necessary to use the "Languages" section of the administration panel. Managing a language record involves filling in the following fields:
- code: language code (For example, "it" for Italian and "en" for English)
- name: name of the language
Both fields are mandatory.
Translations
The public web app is available in all configured languages.
For each language, there are some translated contents that must be supplied, partially defined in:
- the country administration panel, for the donation text translations and for the email and website that will be shown in the "WLM intro" of the web app.
- the Translations tokens section of administration interface
The Translation Tokens section contains a set of "Translation tokens", which are placeholders used in the code of the public web app that are replaced with the translation for the current language. The interface allows modifying the translation tokens for each language. Once a token is selected, the administration interface shows its translations in all configured languages and allows editing them.
Translation tokens are directly referenced by the source code of the public web app, so there is no need to create new ones if the code of the app does not change.
When a new language is created, the tokens are automatically updated to include the new language, whose translations are initialized with the value of the token itself.
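The initialization of a new language can be pictured with the following sketch. The nested-dictionary layout and the function name are hypothetical and only illustrate the behavior described above.

```python
def add_language(translations, new_code):
    """Add a language to every token, defaulting to the token string itself.

    translations: {token: {language_code: translated_text}}
    """
    for token, by_language in translations.items():
        # Existing translations for the language, if any, are left untouched.
        by_language.setdefault(new_code, token)
    return translations


# Hypothetical token and translations, for illustration only.
tokens = {"Take a picture": {"en": "Take a picture", "it": "Scatta una foto"}}
print(add_language(tokens, "fr"))
```

Until a translator edits the new entries, users of the new language see the raw token text.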
Full tokens list
The full list of translation tokens can be downloaded from the "Translations tokens" section of the administration interface. Above the list of tokens, a link is available to download the token list with the current translations.