IP Editing: Privacy Enhancement and Abuse Mitigation/zh

From Meta, a Wikimedia project coordination wiki
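This is an archived version of this page, as edited by Kaganer (talk | contribs) at 19:11, 10 November 2021 (Created page with "We would like your input on the following questions: * When you review edits from an IP address, what information do you look for? Which pages do you use to find it? * What information is most useful to you? * When you share IP-related information with others, what kinds of information do you think could put an anonymous editor at risk?"). It may differ significantly from the current version.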

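Overview

In recent years, internet users have become more aware of the importance of understanding how their personal data is collected and used. Governments in a number of countries have enacted laws to protect user privacy. The Wikimedia Foundation's Legal and Public Policy teams are continually monitoring legal developments around the world and studying how we can best protect user privacy and respect user expectations while upholding the values of the Wikimedia movement. Against this backdrop, we need to investigate and begin making technical improvements to the projects. We need to do this work together with you.

MediaWiki currently publishes the IP addresses of unregistered contributors in page histories and logs. This undoubtedly compromises their anonymity and can even expose them to the risk of government persecution. Although we tell users that their IP address will be made public, few understand the consequences of that disclosure. We are working to strengthen the anonymity of unregistered contributors by de-identifying their IP addresses, just as readers cannot see the IP addresses of registered users. This will introduce an automatically generated, human-readable "IP masked" username. We are still discussing exactly how to implement it. You can leave a comment to tell us what you think.

The Wikimedia projects have good reasons for storing and publishing IP addresses: they play an important role in fighting vandalism and harassment. Patrollers, administrators, and staff use them to identify and block vandals, sockpuppets, editors with conflicts of interest, and other bad actors.

We want to work with you to find ways to protect user privacy while keeping our anti-vandalism tools working as well as they do today. The most important part of this is developing new anti-vandalism tools. Once that work is done, we will shift our focus to hiding IP addresses, including limiting the number of people who can see other users' IP addresses and reducing how long IP addresses are stored in our databases and logs. A critical part of this work is ensuring that our wikis still have access to the same or a better level of anti-vandalism tooling and are not at risk of abuse.

The Wikimedia Foundation's goal is to build a set of anti-vandalism tools that removes the need for everyone to have direct access to IP addresses. As the anti-vandalism tools improve, we will be able to hide the IP addresses of unregistered users. We are well aware that this change will affect current anti-vandalism workflows, and we want to make sure the new tools still work effectively against vandalism, protect the projects from abuse, and support community oversight.

We can only reach this goal by working with CheckUsers, stewards, administrators, and others involved in anti-vandalism work.

This is a very challenging problem. If we get it wrong, our ability to protect the wikis from vandalism will suffer, which is why this project has been put off repeatedly over the years. But in light of evolving data-privacy standards on the internet, new laws, and changing user expectations, the Wikimedia Foundation believes now is the time to tackle it.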

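Updates

30 August 2021

Hello everyone. This is an update on the statistics gathered since Portuguese Wikipedia stopped allowing unregistered users to edit. We have published a detailed report on the impact report page. The report includes metrics derived from the data as well as a survey we conducted among active editors on Portuguese Wikipedia.

Overall, the report shows a positive change. We did not find any significant negative effects during the data-collection period. In light of this, we encourage experimenting on two or more projects to see whether similar changes occur. Every project is different, and what holds on Portuguese Wikipedia may not hold elsewhere. We would like to run a time-limited experiment on two projects in which unregistered users are not allowed to edit. We estimate it will take about 8 months to collect enough data to observe significant changes. After that, we will end the experiment, allow unregistered users to edit again, and analyze the collected data. Once the data is published, the communities can decide for themselves whether to keep disallowing edits by unregistered users.

We are calling this the Disallowing IP Editing experiment. You can find the timeline and details on that page. Please use that page and its talk page to discuss the experiment further.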

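10 June 2021

Hello everyone. It has been a few months since our last update on this project. In that time we have talked to many people, both in the editing community and within the Foundation. We have discussed the project's impact on anti-vandalism work with experienced community members, and we have carefully considered each of the concerns raised in those discussions. Many others have expressed support for the proposal, seeing it as a step toward improving the privacy of unregistered users and reducing the legal risk that exposing IP addresses creates.

We did not initially have a clear picture of what this project would look like; our aim at the time was to understand what IP addresses mean to the communities. Since then we have received a great deal of feedback on this question, in conversations across many languages and communities. We are very grateful to everyone who took the time to help us understand how anti-vandalism work is carried out effectively on their wiki or in a cross-wiki setting.

Proposal for sharing IP addresses only with those who need them

We now have some more concrete proposals for this project that would allow anti-vandalism work to continue while avoiding disclosure of IP addresses to people who do not need them. Please note that this is a proposal, not something that is set to happen or a final decision; we simply want your opinion. Which parts do you think would work well? Which would not work? What other approaches are there?

We developed these ideas in discussion with experienced community members and refined them together with our Legal department. In outline: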

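  • CheckUsers, stewards, and administrators will be able to see complete IP addresses, but only after opting in through a preference in which they agree not to share them with anyone who is not authorized to see them.
  • Editors who take part in anti-vandalism activities can, once vetted by the community, be granted a right to view IP addresses. This could be handled in a similar way to adminship on our projects. These editors would need an account that is at least one year old and at least 500 edits.
  • All users with an account at least one year old and at least 500 edits will be able to access partially unmasked IP addresses, with the tail octet hidden, but only after agreeing in their preferences not to share them with anyone who is not authorized.
  • All other users will not be able to see the IP addresses of unregistered contributors.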
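The four tiers in this proposal can be read as a simple decision rule. The following is a minimal Python sketch of that reading, not MediaWiki code. The one-year and 500-edit thresholds come from the proposal; the class, the field names, and the group names are hypothetical and only serve the illustration.

```python
from dataclasses import dataclass, field
from enum import Enum


class IPAccess(Enum):
    NONE = "none"        # IP address stays fully masked
    PARTIAL = "partial"  # tail octet hidden
    FULL = "full"        # complete IP address


# Thresholds taken from the proposal; all names below are assumptions.
MIN_ACCOUNT_AGE_DAYS = 365
MIN_EDIT_COUNT = 500
FUNCTIONARY_GROUPS = {"checkuser", "steward", "sysop"}


@dataclass
class User:
    account_age_days: int
    edit_count: int
    groups: set = field(default_factory=set)
    community_vetted: bool = False     # holds the proposed new user right
    agreed_not_to_share: bool = False  # opted in via the preference


def ip_access_level(user: User) -> IPAccess:
    """Return the access tier this reading of the proposal gives a user."""
    experienced = (
        user.account_age_days >= MIN_ACCOUNT_AGE_DAYS
        and user.edit_count >= MIN_EDIT_COUNT
    )
    # Tier 1: CheckUsers, stewards, and admins who have opted in.
    if user.groups & FUNCTIONARY_GROUPS and user.agreed_not_to_share:
        return IPAccess.FULL
    # Tier 2: community-vetted vandal fighters who meet the thresholds.
    if user.community_vetted and experienced:
        return IPAccess.FULL
    # Tier 3: experienced users who have opted in see a partial address.
    if experienced and user.agreed_not_to_share:
        return IPAccess.PARTIAL
    # Tier 4: everyone else.
    return IPAccess.NONE
```

For example, `ip_access_level(User(400, 600, agreed_not_to_share=True))` returns `IPAccess.PARTIAL`, while the same user without the opt-in gets `IPAccess.NONE`. The proposal does not say whether the second tier also requires the opt-in preference, so the sketch leaves that check out.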

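Access to IP addresses will be logged so that it can be reviewed when needed, similar to the log that currently records CheckUser actions. This is how we hope to balance users' need for privacy against the community's need for information in its anti-vandalism work. We want to give information to those who need it, but through a process in which only those with a genuine need can obtain it and every access is recorded.

We would like to hear your thoughts on this proposal. Please share them on the talk page.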

  • What do you think will work?
  • What do you think won’t work?
  • What other ideas can make this better?

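Tool development updates

As you may already know, we are developing some new tools, both to soften the impact of masking IP addresses and to give everyone better anti-vandalism tools. It is no secret that the anti-vandalism tools our projects offer the communities fall short in many ways and have a lot of room for improvement. We want to build tools that let community members doing anti-vandalism work do so more efficiently. We also want to lower the barrier to taking part in anti-vandalism work.

We have shared some ideas about these tools before, and I provide recent updates below. Note that work on these tools has slowed over the past few months because our team has been making major changes to SecurePoll to meet the needs of the upcoming WMF Board of Trustees elections.

IP Info feature

An example of IP Info

Information about an IP address is used very often, and we are building a tool to display it. Patrollers, administrators, and CheckUsers currently rely on external websites for this information. We hope to simplify looking up IP information by integrating it into our own sites. We recently finished a prototype of the tool and ran a round of user testing to validate our approach. We found that most of the editors we interviewed considered the tool useful and said they would like to keep using it in the future. We would like your input on the following questions:

  • When you review edits from an IP address, what information do you look for? Which pages do you use to find it?
  • What information is most useful to you?
  • When you share IP-related information with others, what kinds of information do you think could put an anonymous editor at risk?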

Editor matching feature

This project has also been referred to as "Nearby editors" and "Sockpuppet detection" in earlier conversations. We are trying to find a suitable name for it that is understandable even to people who are not familiar with the word "sockpuppetry".

We are in the early stages of this project. Wikimedia Foundation Research has a project that could assist in detecting when two editors exhibit similar editing behaviors. This will help connect different unregistered editors when they edit under different auto-generated account usernames. We heard a lot of support for this project when we started talking about it a year ago. We also heard about the risks of developing such a feature.

We are planning to build a prototype in the near term and share it with the community. There is a malnourished project page for this project. We hope to have an update for it soon. Your thoughts on this project are very welcome on the project talk page.

Data on Portuguese Wikipedia after disabling IP editing

Portuguese Wikipedia banned unregistered editors from making edits to the project last year. Over the last few months, our team has been collecting data about the repercussions of this move on the general health of the project. We have also talked to several community members about their experience. We are working on the final bits to compile all the data that presents an accurate picture of the state of the project. We hope to have an update on this in the near future.

Recent updates

30 October 2020

We have updated the FAQ with more questions that have been asked on the talk page. The Wikimedia Foundation Legal department added a statement on request to the talk page discussion, and we have added it here on the main page too. On the talk page, we have tried to explain roughly how we think about giving the vandal fighters access to the data they need without them having to be CheckUsers or admins.

15 October 2020

This page had become largely out of date and we decided to rewrite parts of it to reflect where we are in the process. This is what it used to look like. We’ve updated it with the latest info on the tools we’re working on, research, fleshed out motivations and added a couple of things to the FAQ. Especially relevant are probably our work on the IP info feature, the new CheckUser tool which is now live on four wikis and our research into the best way to handle IP identification: let us know what you need, the potential problems you see and if a combination of IP and a cookie could be useful for your workflows.

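Tools

As mentioned above, our foremost goal is to give vandal fighters better tools so that they have a better moderation experience, while working to make the raw IP address string less valuable to them. Another reason for doing this is that IP addresses are hard to understand and are really useful only to tech-savvy users. The steep learning curve of working with IP addresses creates a barrier for users without a technical background who want to take on functionary roles. We need moderation tools that anyone can use.

The first step is to make the CheckUser tool more flexible, powerful, and easy to use. CheckUser is an important tool for detecting and blocking bad actors (especially long-term abusers), but it has gone unmaintained for many years, and as a result it now looks dated and lacks necessary features.

We anticipate that more users will become CheckUsers once IP masking takes effect. That makes a better, easier CheckUser experience all the more necessary. With this in mind, the Anti-Harassment Tools team spent the past year improving the CheckUser tool, making it more efficient and more user-friendly. We also brought many excellent feature requests from the community into the scope of this work. Throughout, we consulted CheckUsers and patrollers continuously and did our best to meet their expectations. The new features will be available on all Wikimedia projects in October 2020.

The next feature we are focusing on is IP Info. We settled on this project after consulting six wikis, which helped us narrow down the use cases for IP addresses. IP addresses provide some important information that should be available to patrollers so they can do their work efficiently, so the goal of IP Info is to surface the key information about an IP address quickly and easily. An IP address can reveal, for example, location, organization, the likelihood of being a Tor/VPN node, rDNS, and the IP range it belongs to. Because IP Info presents this information quickly and easily, without relying on external tools that may be hard to use, we hope it will make patrollers' work much easier. The information is high-level enough that showing it does not threaten the anonymous user's privacy, while still allowing patrollers to make a quality judgement about the IP.

After IP Info, we will turn to a feature for finding similar editors. We will use a machine learning model, built in collaboration with CheckUsers and trained on past CheckUser data, to compare user behavior and flag when two or more users appear to behave similarly. The model will consider the pages users are active on, their writing styles, their edit counts, and so on, to predict how similar two users are. We are doing our best to make the model as accurate as possible.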
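To make "comparing user behavior" concrete, here is a deliberately simple Python sketch. It is not the machine learning model described above, which would be trained on CheckUser data and use several signals. It substitutes one hand-written signal, the Jaccard overlap between the sets of pages two users have edited, and flags pairs above a threshold. The function names, the sample data, and the 0.5 threshold are all assumptions made for illustration.

```python
from itertools import combinations


def page_overlap(pages_a: set[str], pages_b: set[str]) -> float:
    """Jaccard similarity: shared pages divided by all pages either user edited."""
    union = pages_a | pages_b
    if not union:
        return 0.0
    return len(pages_a & pages_b) / len(union)


def flag_similar(
    pages_by_user: dict[str, set[str]], threshold: float = 0.5
) -> list[tuple[str, str, float]]:
    """Return (user, user, score) for every pair scoring at or above the threshold."""
    flagged = []
    for a, b in combinations(sorted(pages_by_user), 2):
        score = page_overlap(pages_by_user[a], pages_by_user[b])
        if score >= threshold:
            flagged.append((a, b, score))
    return flagged


# Hypothetical sample data: two masked usernames share two of four pages.
sample = {
    "Anon1": {"Page A", "Page B", "Page C"},
    "Anon2": {"Page B", "Page C", "Page D"},
    "Anon3": {"Page X"},
}
print(flag_similar(sample))  # [('Anon1', 'Anon2', 0.5)]
```

A single overlap score like this would produce many false positives on busy pages, which is one reason a trained model that weighs several signals together is the approach described in the text.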

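Once ready, the model could be used in many places. As a first step, we will use it to help CheckUsers detect sockpuppets and spare them a great deal of manual work. In the future, we can consider how to make the tool available to more people, and how to use it to detect malicious sockpuppeting rings and disinformation campaigns.

You can read more and comment on the tool's project page.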

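Motivation

Our Legal and Public Policy teams have advised that the handling of IP addresses on the Wikimedia projects should be improved to keep up with today's privacy standards, laws, and user expectations. This is the main reason.

We believe there are other compelling reasons as well. When someone wants to help Wikipedia but does not understand the consequences of their IP address being published, their good intention of making the world and Wikipedia better turns into an unintended disclosure of personal information. This is not a new concern; it has existed since Wikipedia was founded. Depending on how and by whom the address was assigned, an IP address can reveal the user's geolocation, institution, and other personally identifiable information. Sometimes, if there are few editors in a given location, an IP address can point directly to the one person who made an edit from there. Concerns about exposing IP addresses have been raised by the community repeatedly, and the wider Wikimedia movement has been discussing how to solve the problem for fifteen years. See the (incomplete) list of previous discussions on this topic.

We acknowledge that this is a thorny issue, one that may disrupt workflows we deeply respect and would much rather not disturb. We take on this work, and spend so much time and energy on it, only for very good reasons. These concerns are each important in their own right, and together they motivated this project: we need, and want, to protect the people who contribute to the wikis, and we need to respond to developments in the world and in the online environment the projects depend on.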

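Research

A Wikimedia Foundation-supported report on the impact that IP masking will have on our community.

Impact of IP masking

IP addresses are valuable as a semi-reliable, partial identifier that the user cannot easily change. However, because of how internet service providers and devices are configured, the information an IP address provides is not always reliable, and using it effectively requires considerable technical knowledge and fluency, although administrators are not currently required to demonstrate such proficiency. This technical information is also used to support additional information (so-called "behavioral information") and can significantly affect the administrative action ultimately taken.

On the community side, whether to allow unregistered users to edit has long been a subject of heated debate. So far, the outcome has leaned toward allowing unregistered users to edit. The debate is usually framed as a desire to stop vandalism versus preserving pseudo-anonymous editing and a low barrier to entry. Because unregistered users are often associated with vandalism, editors tend to be biased against them, and this bias also shows up in the algorithms of tools such as ORES. In addition, there are major problems in communicating with unregistered users, largely because they receive no notifications and because there is no guarantee that the same person keeps reading the messages sent to an IP talk page.

As for the potential impact of masking IP addresses: once IP addresses are hidden, administrators' workflows will be significantly affected, and CheckUsers' workload may increase in the short term. We expect administrators' ability to manage vandalism to be greatly affected. We can reduce this impact by providing equivalent or better anti-vandalism tools, but moving from the old tools to the new ones will take a transition period during which administrators will be less effective against vandalism than before. To give administrators the right tools, we must take care to preserve, or provide replacements for, certain functions that currently depend on IP information:

  • The effectiveness of a block and an estimate of its collateral effects
  • Ways of surfacing similarities or patterns among unregistered users, such as geographic similarity or a shared institution (for example, edits coming from the same high school or university)
  • The ability to flag a group of unregistered users, such as a vandal hopping between IP addresses within a specific range
  • Location- or institution-specific actions (not necessarily blocks), such as determining whether edits come from an open proxy or from a shared public place like a school or library

Depending on how we handle temporary accounts and identify unregistered users, we may be able to improve communication with unregistered users. If we mask IP addresses and keep unregistered editing enabled, the discussions and concerns around unregistered editing, anonymous vandalism, and bias against unregistered users are unlikely to change significantly.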

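CheckUser workflow

While designing the new Special:Investigate tool, we spoke with CheckUsers on multiple projects. Based on those conversations and on walking through real cases, we broke the general CheckUser workflow down into five parts:

  • Triage: assessing whether a case is feasible to check and how complex it is.
  • Profiling: describing a user's patterns of behavior so they can be used to identify the person behind multiple accounts.
  • Checking: examining IP addresses and user agents with the CheckUser tool.
  • Judgement: comparing the technical information with the behavioral information established in the Profiling step to decide what administrative action to take.
  • Closing: reporting the outcome of the check on public and, if needed, private venues, and archiving information appropriately for future use.

We also worked with members of the Trust and Safety team to understand how the CheckUser tool is used in Wikimedia Foundation investigations and in cases escalated to that team.

The most common and widespread pain points revolved around the CheckUser tool's unintuitive presentation of information and the need to open every link in a new tab, which caused a great deal of confusion as the number of tabs quickly got out of hand. On top of that, the information CheckUsers see is highly technical and hard to take in at a glance, which makes it difficult to keep track of all those tabs. Everyone we asked said they resorted to third-party software or pen and paper to keep track of information.

We also carried out a basic analysis of English Wikipedia's sockpuppet investigations pages to get statistics on how many cases are handled, how many are declined, and how many sockpuppets a typical report contains.

Patrollers' use of IP addresses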

Previous research on patrolling on our projects has generally focused on the workload or workflow of patrollers. Most recently, the Patrolling on Wikipedia study focuses on the workflows of patrollers and identifying potential threats to current anti-vandal practices. Older studies, such as the New Page Patrol survey and the Patroller work load study, focused on English Wikipedia. They also look solely at the workload of patrollers, and more specifically on how bot patrolling tools have affected patroller workloads.

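Our research data was gathered from five wikis:

  • Japanese Wikipedia
  • Dutch Wikipedia
  • German Wikipedia
  • Chinese Wikipedia
  • English Wikipedia

These wikis were chosen for their known attitudes toward IP editing, their share of monthly edits made by IPs, and any particular or unusual circumstances that IP editors might encounter there (such as use of a revision patrolling feature or widespread use of proxies). Participants were recruited via the village pump (or a similar venue), and where possible we also posted on the wiki's embassy page. However, although we had translation support for the interviews themselves, we did not translate the messages we posted, which may explain the low response rate. All interviews were conducted over Zoom with a note-taker present.

As in previous research, we found no systematic or unified use of IP information. It is generally looked up only once some suspicion has arisen. Most checks of a user's suspicious activity start from information that is public on-wiki, such as local edits, global edits, or past block records.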

Precision and accuracy were less important qualities for IP information: upon seeing that one chosen IP information site returned three different results for the geographical location of the same IP address, one of our interviewees mentioned that precision in location was not as important as consistency. That is to say, so long as an IP address was consistently exposed as being from one country, it mattered less if it was correct or precise. This fits with our understanding of how IP address information is used: as a semi-unique piece of information associated with a single device or person, that is relatively hard to spoof for the average person. The accuracy or precision of the information attached to the user is less important than the fact that it is attached and difficult to change.

Our findings highlight a few key design aspects for the IP info tool:

  • Provide at-a-glance conclusions over raw data
  • Cover key aspects of IP information:
    • Geolocation (to a city or district level where possible)
    • Registered organization
    • Connection type (high-traffic, such as data center or mobile network versus low-traffic, such as residential broadband)
    • Proxy status as binary yes or no
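The design aspects listed above could map onto a small record type that stores the four key facts and renders them as a single at-a-glance line. This is a hypothetical Python sketch: the class name, field names, and output format are assumptions, and it performs no real IP lookup.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class IPInfoSummary:
    city: Optional[str]          # geolocation, city or district level where known
    country: Optional[str]
    organization: Optional[str]  # registered organization
    connection_type: str         # e.g. "data center", "mobile", "residential"
    is_proxy: bool               # binary yes/no, as in the list above

    def at_a_glance(self) -> str:
        """Render a one-line conclusion instead of raw lookup data."""
        place = ", ".join(p for p in (self.city, self.country) if p)
        place = place or "unknown location"
        org = self.organization or "unknown organization"
        proxy = "proxy: yes" if self.is_proxy else "proxy: no"
        return f"{place} | {org} | {self.connection_type} | {proxy}"


# Made-up values for illustration only.
summary = IPInfoSummary("Utrecht", "Netherlands", "Example ISP", "residential", False)
print(summary.at_a_glance())
# Utrecht, Netherlands | Example ISP | residential | proxy: no
```

Keeping the optional fields explicit makes it easy to show "unknown" rather than a guess, which fits the point about being clear on what the tool cannot tell a patroller.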

As an ethical point, it will be important to be able to explain how any conclusions are reached, and the inaccuracy or imprecisions inherent in pulling IP information. While this was not a major concern for the patrollers we talked to, if we are to create a tool that will be used to provide justifications for administrative action, we should be careful to make it clear what the limitations of our tools are.

IP Masking Implementation Approaches (FAQ)

October 2021

This FAQ helps answer some likely questions the community will have about the various implementation approaches we can take for IP Masking and how each of them will impact the community.

Q: Following implementation of IP Masking, who will be able to see IP addresses?

A: Checkusers, stewards and admins will be able to see complete IP addresses by opting-in to a preference where they agree not to share it with others who don't have access to this information.

Editors who partake in anti-vandalism activities, as vetted by the community, can be granted a right to see IP addresses to continue their work. This user right would be handled like other user rights by the community, and require a minimum number of edits and days spent editing.

All users with accounts over a certain age and with a minimum number of edits (to be determined) will be able to access partially unmasked IPs without permission. This means an IP address will appear with its tail octet(s) – the last parts – hidden. This will be accessible via a preference where they agree not to share it with others who don't have access to this information.

All other users will not be able to access IP addresses for unregistered users.
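As an illustration of "partially unmasked", the Python sketch below hides the tail of an address using the standard `ipaddress` module. The answer above only says the tail octet(s) are hidden, so the number of hidden IPv4 octets, the treatment of IPv6 (hiding the last four 16-bit groups), and the `x` placeholder are all assumptions of this sketch.

```python
import ipaddress


def partially_mask(ip: str, hidden_octets: int = 1, hidden_v6_groups: int = 4) -> str:
    """Return the address with its tail replaced by placeholders."""
    addr = ipaddress.ip_address(ip)  # raises ValueError on malformed input
    if addr.version == 4:
        if not 1 <= hidden_octets <= 4:
            raise ValueError("hidden_octets must be between 1 and 4")
        parts = str(addr).split(".")
        return ".".join(parts[: 4 - hidden_octets] + ["x"] * hidden_octets)
    # IPv6: assumption, hide whole 16-bit groups from the end of the address.
    if not 1 <= hidden_v6_groups <= 8:
        raise ValueError("hidden_v6_groups must be between 1 and 8")
    groups = addr.exploded.split(":")
    return ":".join(groups[: 8 - hidden_v6_groups] + ["xxxx"] * hidden_v6_groups)


# Addresses from the documentation ranges (RFC 5737 and RFC 3849).
print(partially_mask("192.0.2.45"))   # 192.0.2.x
print(partially_mask("2001:db8::1"))  # 2001:0db8:0000:0000:xxxx:xxxx:xxxx:xxxx
```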

Q: What are the potential technical implementation options?

A: Over the last few weeks we have been engaged in multiple discussions about the technical possibilities to accomplish our goal for IP Masking while minimizing impact to our editors and readers. We gathered feedback from across different teams and gained varying perspectives. Below are the two key paths.

  • IP based identity: In this approach, we keep everything as is but replace existing IP addresses with a hashed version of IPs. This preserves a lot of our existing workflows but does not offer any new benefits.
  • Session based identity: In this approach, we create an identity for the unregistered editors based on a browser cookie which identifies their device browser. The cookie persists even when their IP address changes hence their session does not end.

Q: How does IP based identity path work?

A: At present, unregistered editors are identified by their IP addresses. This model has worked for our projects for many years. Users well-versed with IP addresses understand that a single IP address can be used by multiple different users based on how dynamic that IP address is. This is more true for IPv6 IP addresses than IPv4.

An unregistered user may also change IP addresses if they are commuting or editing from a different location. If we pursue the IP-based identity solution for IP Masking, we would be preserving the way IP addresses function today by simply masking them with an encrypted identifier. This solution will keep the IPs distinct while maintaining user privacy. For example, an unregistered user such as User:192.168.1.2 may appear as User:ca1f46.

Benefits of this approach: Preserves existing workflows and models with minimal disruption

Drawbacks of this approach: Does not offer any advantages in a world moving rapidly towards more dynamic/less useful IP addresses
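A rough Python sketch of the IP-based identity idea follows. The text above speaks of both a "hashed version" and an "encrypted identifier" without naming a scheme, so the choice of a keyed hash (HMAC-SHA256) is an assumption of this sketch, as are the function name, the secret key, and the truncation to six hexadecimal characters to resemble the `User:ca1f46` example.

```python
import hashlib
import hmac
import ipaddress


def masked_username(ip: str, secret_key: bytes, length: int = 6) -> str:
    """Derive a stable masked username from an IP address."""
    # Use the packed bytes so different spellings of one address
    # (for example compressed and expanded IPv6) give the same identifier.
    packed = ipaddress.ip_address(ip).packed
    digest = hmac.new(secret_key, packed, hashlib.sha256).hexdigest()
    return f"User:{digest[:length]}"


key = b"hypothetical-server-side-secret"  # placeholder, not a real key
# The same address always maps to the same masked username.
print(masked_username("192.168.1.2", key) == masked_username("192.168.1.2", key))  # True
```

Two design points are worth noting. A plain unkeyed hash would not be enough for IPv4, because the whole address space (about 4.3 billion values) can be hashed exhaustively to reverse any identifier; keeping a secret key on the server avoids that. And six hexadecimal characters give only about 16.7 million distinct names, so a real deployment would need a longer identifier or collision handling.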

Q: How does session-based identity path work?

A: The path is to create a new identity for unregistered editors based on a cookie placed in their browser. In this approach there is an auto-generated username which their edits and actions are attributed to. For example, User:192.168.1.2 might be given the username: User:Anon3406.

In this approach, the user’s session will persist as long as they have the cookie, even when they change IP addresses.

Benefits of this approach:

  • Ties the user-identity to a device browser, offering a more persistent way to communicate with them.
  • User identity does not change with changing IP addresses
  • This approach can offer a way for unregistered editors to have access to certain preferences which are currently only available to registered users
  • This approach can offer a way for unregistered editors to convert to a permanent account while retaining their edit history

Drawbacks of this approach:

  • Significant change in the current model of what an unregistered editor represents
  • The identity for the unregistered editor only persists as long as the browser cookie does
  • Vandals in privacy mode or who delete their cookies would get a new identity without changing their IP
  • May require rethinking of some community workflows and tools
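The session-based path can be sketched as a server-side registry that maps a random cookie token to an auto-generated username. This is a minimal Python illustration under stated assumptions: the class name, the in-memory dictionary, the token length, and the sequential `Anon` numbering are hypothetical, and cookie expiry and storage are left out.

```python
import itertools
import secrets
from typing import Optional


class SessionIdentities:
    """Maps browser cookie tokens to auto-generated usernames."""

    def __init__(self, first_number: int = 1) -> None:
        self._by_token: dict[str, str] = {}
        self._numbers = itertools.count(first_number)

    def identify(self, cookie_token: Optional[str]) -> tuple[str, str]:
        """Return (username, token to store in the browser cookie)."""
        # A known token keeps its identity; the IP address is never consulted.
        if cookie_token is not None and cookie_token in self._by_token:
            return self._by_token[cookie_token], cookie_token
        # A missing or unknown token (new browser, private mode, cleared
        # cookies) gets a fresh identity.
        token = secrets.token_urlsafe(32)
        username = f"User:Anon{next(self._numbers)}"
        self._by_token[token] = username
        return username, token


identities = SessionIdentities(first_number=3406)
name, token = identities.identify(None)         # first visit, no cookie yet
same_name, _ = identities.identify(token)       # later visit, possibly from a new IP
print(name, same_name)                          # User:Anon3406 User:Anon3406
new_name, _ = identities.identify(None)         # cookie deleted
print(new_name)                                 # User:Anon3407
```

The sketch shows both sides of the trade-off listed above: the identity survives an IP change because the address is not part of the lookup, and it is lost as soon as the cookie is.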

Q: Does the Foundation have a preferred path or approach?

A: Our preferred approach will be to go with the session-based identity as that will open up a lot of opportunities for the future. We could address communication issues we’ve had for twenty years. While someone could delete the cookie to get a new identity, the IP would still be visible to all active vandal fighters with the new user right. We do acknowledge that deleting a cookie is easier than switching an IP, of course, and do respect the effects it would have.

You can talk to us about these approaches, on the talk page.

Statement from the Wikimedia Foundation Legal department

July 2021

First of all, we’d like to thank everyone for participating in these discussions. We appreciate the attention to detail, the careful consideration, and the time that has gone into engaging in this conversation, raising questions and concerns, and suggesting ways that the introduction of masked IPs can be successful. Today, we’d like to explain in a bit more detail how this project came about and the risks that inspired this work, answer some of the questions that have been raised so far, and briefly talk about next steps.

Background

To explain how we arrived here, we’d like to briefly look backwards. Wikipedia and its sibling projects were built to last. Sharing the sum of all knowledge isn’t something that can be done in a year, or ten years, or any of our lifetimes. But while the mission of the communities and Foundation was created for the long term, the technical and governance structures that enable that mission were very much of the time they were designed. Many of these features have endured, and thrived, as the context in which they operate has changed. Over the last 20 years, a lot has evolved: the way societies use and relate to the internet, the regulations and policies that impact how online platforms run as well as the expectations that users have for how a website will handle their data.

In the past five years in particular, users and governments have become more and more concerned about online privacy and the collection, storage, handling, and sharing of personal data. In many ways, the projects were ahead of the rest of the internet: privacy and anonymity are key to users’ ability to share and consume free knowledge. The Foundation has long collected little information about users, not required an email address for registration, and recognized that IP addresses are personal data (see, for example, the 2014–2018 version of our Privacy policy). More recently, the conversation about privacy has begun to shift, inspiring new laws and best practices: the European Union’s General Data Protection Regulation, which went into effect in May 2018, has set the tone for a global dialogue about personal data and what rights individuals should have to understand and control its use. In the last few years, data protection laws around the world have been changing—look at the range of conversations, draft bills, and new laws in, for example, Brazil, India, Japan, or the United States.

Legal issues

The Foundation’s Privacy team is consistently monitoring this conversation, assessing our practices, and planning for the future. It is our job to look at the projects of today, and evaluate how we can help prepare them to operate within the legal and societal frameworks of tomorrow. A few years ago, as part of this work, we assessed that the current system of publishing IP addresses of non-logged-in contributors should change. We believe it creates risk to users whose information is published in this way. Many do not expect it: even with the notices explaining how attribution works on the projects, the Privacy team often hears from users who have made an edit and are surprised to see their IP address on the history page. Some of them are in locations where the projects are controversial, and they worry that the exposure of their IP address may allow their government to target them. The legal frameworks that we foresaw are in operation, and the publication of these IP addresses poses real risks to the projects and users today.

We’ve heard from several of you that you want to understand more deeply what the legal risks are that inspired this project, whether the Foundation is currently facing legal action, what consequences we think might result if we do not mask IP addresses, etc. (many of these questions have been collected in the expanded list at the end of this section). We’re sorry that we can’t provide more information, since we need to keep some details of the risks privileged. “Privileged” means that a lawyer must keep something confidential, because revealing it could cause harm to their client. That’s why privilege is rarely waived; it’s a formal concept in the legal systems of multiple countries, and it exists for very practical reasons—to protect the client. Here, waiving the privilege and revealing this information could harm the projects and the Foundation. Generally, the Legal Affairs team works to be as transparent as possible; however, an important part of our legal strategy is to approach each problem on a case by case basis. If we publicly discuss privileged information about what specific arguments might be made, or what risks we think are most likely to result in litigation, that could create a road map by which someone could seek to harm the projects and the communities.

That said, we have examined this risk from several angles, taking into account the legal and policy situation in various countries around the world, as well as concerns and oversight requests from users whose IP addresses have been published, and we concluded that IP addresses of non-logged-in users should no longer be publicly visible, largely because they can be associated with a single user or device, and therefore could be used to identify and locate non-logged-in users and link them with their on-wiki activity.

Despite these concerns, we also understood that IP addresses play a major part in the protection of the projects, allowing users to fight vandalism and abuse. We knew that this was a question we’d need to tackle holistically. That’s why a working group from different parts of the Wikimedia Foundation was assembled to examine this question and make a recommendation to senior leadership. When the decision was taken to proceed with IP masking, we all understood that we needed to do this with the communities—that only by taking your observations and ideas into account would we be able to successfully move through this transition.

I want to emphasize that even when IP addresses are masked and new tools are in place to support your anti-vandalism work, this project will not simply end. It’s going to be an iterative process—we will want feedback from you as to what works and what doesn’t, so that the new tools can be improved and adapted to fit your needs.

Questions

Over the past months, you’ve had questions, and often, we’ve been unable to provide the level of detail you’re hoping for in our answers, particularly around legal issues.

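What specific legal risks are you worried about?

We cannot share details of the specific legal risks we are evaluating. We understand that it is frustrating to ask why and simply be told "that's privileged". We are sorry that we cannot provide more specifics, but as explained above, we need to keep the details of our risk assessment, and of the legal risks we see on the horizon, confidential, because providing those details could help someone work out how to harm the Wikimedia projects, the communities, and the Foundation.

There are some questions we can answer definitively.

Is this project proceeding?

Yes. We are continuing to look for the best way to hide the IP addresses of non-logged-in users while preserving the communities' ability to protect the Wikimedia projects, and we will implement it.

Can this change be rolled out differently by location?

No. We protect the privacy of all users to the same standard. The change will be rolled out on all Wikimedia projects.

If other information about non-logged-in users is revealed (such as location or internet service provider), then it doesn't matter if the IP address is also published, right?

That is not the case. In the new system, the information we make available will be general and will not be linked to an individual or a specific device: for example, a location given only to city level, or a note only that an edit was made by someone at a particular school. This is still information about the user, but it is less precise and less personal than an IP address. So even though we are making some information available to help prevent abuse, we are also doing a better job of protecting the privacy of that specific contributor.

If we tell someone their IP address will be published, isn't that enough?

No. As mentioned above, many people are confused when they see their IP address published. And even when someone is aware of it, the Foundation has a legal responsibility to handle their personal data properly. We have concluded that the IP addresses of non-logged-in users should not be published, because doing so does not reflect current privacy best practices and because of the risks it creates, including risks to the users themselves.

How will IP masking affect CC-BY-SA attribution?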

IP masking will not affect CC license attribution on Wikipedia. The 3.0 license for text on the Wikimedia projects already states that attribution should include “the name of the Original Author (or pseudonym, if applicable)” (see the license at section 4c), and use of an IP masking structure rather than an IP address functions equally well as a pseudonym. IP addresses already may vary or be assigned to different people over time, so using that as a proxy for unregistered editors is not different in quality from an IP masking structure, and both satisfy the license pseudonym structure. In addition, section 7 of our Terms of use specifies that, as part of contributing to Wikipedia, editors agree that links to articles (which include article history) are a sufficient method of attribution.

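But sometimes we do not yet know the answer to a question, because finding it is something we need to work out together with you.

What should the requirements be for someone to be granted this new user right?

There will be an age limit; we have not yet decided on it, but it will likely be at least 16. They should also be active community members in good standing. We would like to hear your thoughts.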

I see that the team preparing these changes is proposing to create a new userright for users to have access to the IP addresses behind a mask. Does Legal have an opinion on whether access to the full IP address associated with a particular username mask constitutes nonpublic personal information as defined by the Confidentiality agreement for nonpublic information, and will users seeking this new userright be required to sign the Access to nonpublic personal data policy or some version of it?
  1. If yes, then will I as a checkuser be able to discuss relationships between registered accounts and their IP addresses with holders of this new userright, as I currently do with other signatories?
  2. If no, then could someone try to explain why we are going to all this trouble for information that we don't consider nonpublic?
  3. In either case, will a checkuser be permitted to disclose connections between registered accounts and unregistered username masks?

This is a great question. The answer is partially yes. First, yes, anyone who has access to the right will need to acknowledge in some way that they are accessing this information for the purposes of fighting vandalism and abuse on the projects. We are working on how this acknowledgement will be made; the process to gain access is likely to be something less complex than signing the access to non-public personal data agreement.

As to how this would impact CUs, right now, the access to non-public personal data policy allows users with access to non-public personal data to share that data with other users who are also able to view it. So a CU can share data with other CUs in order to carry out their work. Here, we are maintaining a distinction between logged-in and logged-out users, so a CU would not be able to share IP addresses of logged-in users with users who have this new right, because users with the new right would not have access to such information.

Presuming that the CU also opts in to see IP addresses of non-logged-in users, under the current scheme, that CU would be able to share IP address information demonstrating connections between logged-in users and non-logged-in users who had been masked with other CUs who had also opted in. They could also indicate to users with the new right that they detected connections between logged-in and non-logged-in users. However, the CU could not directly share the IP addresses of the logged-in users with non-CU users who only have the new right.

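If this does not sound workable, please let us know. As mentioned above, we are still working this out and want your feedback on whether it would work.

Next steps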

Over the next few months, we will be rolling out more detailed plans and prototypes for the tools we are building or planning to build. We’ll want to get your feedback on these new tools that will help protect the projects. We’ll continue to try to answer your questions when we can, and seek your thoughts when we should arrive at the answer together. With your feedback, we can create a plan that will allow us to better protect non-logged-in editors’ personal data, while not sacrificing the protection of Wikimedia users or sites. We appreciate your ideas, your questions, and your engagement with this project.

October 2020

This statement from the Wikimedia Foundation Legal department was written on request for the talk page and comes from that context. For visibility, we wanted you to be able to read it here too.

Hello All. This is a note from the Legal Affairs team. First, we’d like to thank everyone for their thoughtful comments. Please understand that sometimes, as lawyers, we can’t publicly share all of the details of our thinking; but we read your comments and perspectives, and they’re very helpful for us in advising the Foundation.

On some occasions, we need to keep specifics of our work or our advice to the organization confidential, due to the rules of legal ethics and legal privilege that control how lawyers must handle information about the work they do. We realize that our inability to spell out precisely what we’re thinking and why we might or might not do something can be frustrating in some instances, including this one. Although we can’t always disclose the details, we can confirm that our overall goals are to do the best we can to protect the projects and the communities at the same time as we ensure that the Foundation follows applicable law.

Within the Legal Affairs team, the privacy group focuses on ensuring that the Foundation-hosted sites and our data collection and handling practices are in line with relevant law, with our own privacy-related policies, and with our privacy values. We believe that individual privacy for contributors and readers is necessary to enable the creation, sharing, and consumption of free knowledge worldwide. As part of that work, we look first at applicable law, further informed by a mosaic of user questions, concerns, and requests, public policy concerns, organizational policies, and industry best practices to help steer privacy-related work at the Foundation. We take these inputs, and we design a legal strategy for the Foundation that guides our approach to privacy and related issues. In this particular case, careful consideration of these factors has led us to this effort to mask IPs of non-logged-in editors from exposure to all visitors to the Wikimedia projects. We can’t spell out the precise details of our deliberations, or the internal discussions and analyses that lay behind this decision, for the reasons discussed above regarding legal ethics and privilege.

We want to emphasize that the specifics of how we do this are flexible; we are looking for the best way to achieve this goal in line with supporting community needs. There are several potential options on the table, and we want to make sure that we find the implementation in partnership with you. We realize that you may have more questions, and we want to be clear upfront that in this dialogue we may not be able to answer the ones that have legal aspects. Thank you to everyone who has taken the time to consider this work and provide your opinions, concerns, and ideas.