WSoR datasets/dataset form

From Meta, a Wikimedia project coordination wiki

Explain what the dataset is and what it is useful for. Make sure to *at least* include what you used it for.

Location

Specify how someone can find it.

If it is a file on a server, give at least <server name>:<absolute path> (e.g. internproxy.wikimedia.org:/home/halfak/data/user_first_message.tsv).

If it is a table in a database, give at least <server name>:<database>.<table name> (e.g. db42:halfak.user_first_message).

Fields

Display the fields and describe what they mean.

For database tables, run:
    $ mysql -h db42 -e "EXPLAIN <table name>;SELECT * FROM <table name> LIMIT 3" <database>

For files, run:
    $ head -n 3 <filename>

Copy and paste the output, including the command you ran and the prompt, so we can see what directory you were in.

Each row represents a <thing>. There is one row in this table/file for each <thing>.

  • <field name 1>: What do values in this field represent?
  • <field name 2>: What do values in this field represent?
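
For a tab-separated file with a header row, the bullet list above can be started from the header itself instead of being typed by hand. This is only a minimal sketch, assuming the first line of the file holds the column names; the sample file and its two column names are made up for illustration.

```shell
# Hypothetical sample file; point this at the real dataset instead.
sample=$(mktemp)
printf 'user_id\tfirst_message_timestamp\n' > "$sample"
printf '1234\t20110501120000\n' >> "$sample"

# Print one bullet per column name found in the header line.
list_fields() {
    awk -F'\t' 'NR == 1 { for (i = 1; i <= NF; i++) print "  • " $i ":" }' "$1"
}

fields=$(list_fields "$sample")
printf '%s\n' "$fields"
rm -f "$sample"
```

Each printed bullet still needs a description written after the colon.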

Reproduction

Instructions for reproducing the dataset. This could be an SQL query or a script that you (should have) checked into version control.

If reproducing the dataset takes a sequence of operations, describe each one and give examples.
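
One way to record the reproduction step is a small script kept in version control next to the SQL it runs. The sketch below reuses the example host and database from the Location section (db42, halfak); the SQL file name is hypothetical, and the script only prints the command unless RUN=1 is set, so it is safe to try anywhere.

```shell
# Example host and database taken from the Location section.
host="db42"
database="halfak"
# Hypothetical name for the checked-in SQL file that builds the table.
sql_file="user_first_message.sql"

reproduce_cmd="mysql -h $host $database < $sql_file"

if [ "${RUN:-0}" = "1" ]; then
    # Feed the checked-in SQL to the mysql client.
    mysql -h "$host" "$database" < "$sql_file"
else
    # Dry run: show what would be executed.
    echo "$reproduce_cmd"
fi
```

Pasting the printed command into this section, together with a pointer to the SQL file in version control, gives readers everything they need to rerun it.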

Notes

Is the dataset incomplete? Are there known issues? What kind of things should someone who uses this dataset be aware of?
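
A quick count of rows with missing values can help answer the completeness question for a tab-separated file. This is a rough sketch under the assumption that the file has a header row and that an empty field means a missing value; the sample data is invented for illustration.

```shell
# Hypothetical sample file; the second data row has an empty field.
sample=$(mktemp)
printf 'user_id\tfirst_message_timestamp\n' > "$sample"
printf '1234\t20110501120000\n' >> "$sample"
printf '5678\t\n' >> "$sample"

# Count data rows, skipping the header.
total=$(awk -F'\t' 'NR > 1 { n++ } END { print n + 0 }' "$sample")

# Count data rows that contain at least one empty field.
empty=$(awk -F'\t' 'NR > 1 { for (i = 1; i <= NF; i++) if ($i == "") { n++; break } } END { print n + 0 }' "$sample")

echo "rows: $total, rows with empty fields: $empty"
rm -f "$sample"
```

Reporting both numbers here tells later users how much of the dataset they can rely on.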