Research talk:Memcached Optimization

From Meta, a Wikimedia project coordination wiki

Provided we can genuinely anonymise the data before releasing it, this seems innocuous to me. Just to clarify, you aren't requesting any data that isn't individually publicly accessible? WereSpielChequers 13:15, 4 July 2011 (UTC)

Caching algorithms

What caching algorithms do you intend to trial? By caching algorithms, do you mean replacement strategies, like Greedy-Dual-Size-Frequency? -- Tim Starling 02:58, 6 July 2011 (UTC)

We intend to try several caching algorithms (that is, replacement strategies), such as Adaptive Replacement Cache (ARC), LRFU and Greedy-Dual-Frequency. With the data we have requested, we can compare the strategies in terms of their improvement in latency and maximum throughput.
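To illustrate what such a comparison could look like, here is a minimal sketch that replays an access trace through two replacement strategies and reports the hit ratio of each. It is only an assumption about how a comparison might be set up: it uses plain LRU and LFU as simple stand-ins rather than ARC, LRFU or Greedy-Dual-Frequency, it measures hit ratio as a proxy rather than latency or throughput, and the function names and the toy trace are hypothetical.

```python
from collections import OrderedDict


def lru_hit_ratio(trace, capacity):
    """Replay a list of keys through an LRU cache and return the hit ratio."""
    cache = OrderedDict()  # keys ordered from least to most recently used
    hits = 0
    for key in trace:
        if key in cache:
            hits += 1
            cache.move_to_end(key)  # mark as most recently used
        else:
            if len(cache) >= capacity:
                cache.popitem(last=False)  # evict the least recently used key
            cache[key] = True
    return hits / len(trace) if trace else 0.0


def lfu_hit_ratio(trace, capacity):
    """Replay a list of keys through an LFU cache and return the hit ratio."""
    counts = {}  # key -> number of accesses while the key has been cached
    hits = 0
    for key in trace:
        if key in counts:
            hits += 1
            counts[key] += 1
        else:
            if len(counts) >= capacity:
                # Evict the least frequently used key (ties: oldest insertion).
                victim = min(counts, key=counts.get)
                del counts[victim]
            counts[key] = 1
    return hits / len(trace) if trace else 0.0


if __name__ == "__main__":
    # Hypothetical toy trace; a real study would replay the requested logs.
    trace = ["a", "b", "a", "c", "a", "d", "b", "a"]
    print("LRU hit ratio:", lru_hit_ratio(trace, capacity=2))
    print("LFU hit ratio:", lfu_hit_ratio(trace, capacity=2))
```

On a trace with one frequently repeated key, as above, the frequency-based policy keeps that key cached while LRU eventually evicts it, which is the kind of difference a real trace would let us quantify.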

The data we have requested is not publicly available: it is generated by the web server at runtime and is not published because of privacy concerns. We have no interest in the real identity (IP addresses) of the clients, but we do want to be able to tell whether two accesses come from the same client, so we have proposed an anonymization scheme that lets us differentiate between clients without seeing any IP addresses.
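As a rough illustration of the general idea (not necessarily the scheme in the proposal), one common approach is to replace each IP address with a keyed hash before the logs are released. The sketch below assumes the data holder keeps the secret key and discards it afterwards; the function names, the token length and the sample addresses are hypothetical.

```python
import hashlib
import hmac
import secrets


def make_pseudonymizer(secret_key: bytes):
    """Return a function that maps an IP address to a stable opaque token."""

    def pseudonymize(ip: str) -> str:
        # HMAC-SHA256 keyed with a secret: the same IP always gives the same
        # token, but the token cannot be recomputed without the key.
        digest = hmac.new(secret_key, ip.encode("utf-8"), hashlib.sha256)
        return digest.hexdigest()[:16]  # truncated for readability (assumption)

    return pseudonymize


if __name__ == "__main__":
    # The key is generated by the data holder and never shared (assumption).
    pseudonymize = make_pseudonymizer(secrets.token_bytes(32))
    for ip in ["192.0.2.1", "192.0.2.2", "192.0.2.1"]:
        print(pseudonymize(ip))
```

The keyed hash matters here: an unkeyed hash of an IPv4 address could be reversed by simply hashing every possible address, whereas a key that is never released prevents that.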

Thanks,

Yuval Meir Limor Gavish

Legal review

It seems this project is of interest to the ops team, but we need to run it through Legal first to discuss the privacy implications. --DarTar 20:13, 2 December 2011 (UTC)