Grants talk:IEG/WikiBrainTools: Difference between revisions

From Meta, a Wikimedia project coordination wiki

Revision as of 05:19, 3 October 2014

Some algorithmically intensive tools that already exist

This is a neat idea! Here are a few tools that already use rich algorithms and might be helpful to look into/talk to their developers:

Useful people to talk to:

Useful places to seek feedback and post notifications:

Looking forward to hearing more about your idea! Cheers, Jake Ocaasi (talk) 17:16, 25 September 2014 (UTC)

Thanks for the suggestions, User:Ocaasi! I've already made some of the content-based changes you suggested (e.g. useful mailing lists to tap). I've also been in touch with User:EpochFail. Once the feedback period is open on Wed, I'll email the remaining people to see what kinds of improvements they'd suggest. Shilad (talk) 19:31, 28 September 2014 (UTC)

Finalize your proposal this week!

Hi Shilad and Brenthecht. Thanks for drafting this proposal!

  • We're hosting one last IEG proposal help session in Google Hangouts this weekend, so please join us if you'd like to get some last-minute help or feedback as you finalize your submission.
  • Once you're ready to submit it for review, please update its status (in your page's Probox markup) from DRAFT to PROPOSED, as the deadline is September 30th.
  • If you have any questions at all, feel free to contact me (IEG committee member) or Siko (IEG program head), or just post a note on this talk page and we'll see it.

Cheers, Ocaasi (talk) 20:04, 25 September 2014 (UTC)

Promoting WikiBrain?

@Shilad and Brenthecht: Hey there. Very pleased to read this as someone who reads up on Wikipedia-related research. I'm not surprised to hear that many researchers shy away from research in this area because of the interface-related obstacles that prevent data extraction. As this is one of the main problems you identify, though, I wanted to ask what this team might consider doing to inform and attract potential researchers to this new treasure trove of algorithms. Presumably (but correct me if I am wrong), many in the WikiTools community are already familiar with extracting data to inform their work. I get that this proposal would make their jobs easier and would open up new research avenues, but how will you be reaching out beyond the WikiTools community? I JethroBT (talk) 22:08, 29 September 2014 (UTC)

Good question! We've already taken some first steps to promote WikiBrain to algorithmic researchers. We've published a paper describing WikiBrain, along with several other papers that use WikiBrain and point algorithmic and community researchers of Wikipedia to it. We have begun to make some inroads, and have received algorithmic contributions from some other research groups. In addition, this grant would support traveling to two major algorithmic conferences (SIGIR and WWW), where we would present demo posters and organize "Birds of a Feather" sessions. I'd also be interested to hear any other ideas you have! Shilad (talk) 03:18, 30 September 2014 (UTC)

Suggestions

Here are a couple of suggestions based on a short poke around the website(s):

  1. It seems like WikiBrain is entirely based on the Wikipedia dumps. If so, it needs to be made clear that data not in the dumps is not accessible via WikiBrain.
  2. It seems like WikiBrain relies on downloading the Wikipedia dump files, which are huge. The pitfalls of this need to be made clear.
  3. https://github.com/shilad/wikibrain has contributions from 13 contributors, which is better than I expected.
  4. It seems to me that making non-trivial use of WikiBrain requires installing an intricate Java development environment. This needs to be made clearer.
  5. Is there a continuous integration server? That seems like the kind of thing that would be very useful.
  6. The mailing list needs to show active use. You may need to encourage your co-located devs to switch to communicating via it.
  7. The beginner's example at https://shilad.github.io/wikibrain/# links to https://github.com/shilad/wikibrain/blob/master/wikibrain-cookbook/src/main/java/org/wikibrain/phrases/cookbook/ResolveExample.java, which returns a 404.
  8. I STRONGLY recommend that you move from a co-located team to a geographically diverse team.

cheers Stuartyeates (talk) 01:41, 3 October 2014 (UTC)

Stuartyeates, thanks for all the great feedback! I want to follow up on a few of your suggestions.
    • WikiBrain installation needs: WikiBrain makes use of a few other data sources (page view data, Natural Earth GIS data, several public NLP datasets), but you are correct that it primarily uses WikiDumps. One of the primary goals of this project is to eliminate the need for tool developers to install WikiBrain at all. We would install WikiBrain on Wikimedia Labs, preprocess the data, and provide a web API for bots and researchers. I think this point should address your first few concerns.
    • Integration tests: At the moment, we do have a continuous unit test server (Travis CI), but not an integration test server. I have a short term (next month) goal to revive our integration tests.
    • Mailing list: Totally agreed! I'll use your suggestion as a catalyst to encourage this change.
    • 404: Thanks for the tip. Looks like the link didn't survive a recent refactoring. I've now fixed it.
    • Geographically diverse team: YES! Are you volunteering? :) I'm only partially kidding. I do hope that a side-effect of the engagement plan for this IEG is to build a broader coalition of developers. I understand we'll need to be better about communication patterns to make this work (e.g. the mailing list).
Shilad (talk) 05:19, 3 October 2014 (UTC)