Develop a ML-based service to detect vandalism on Wikidata
Closed, Resolved · Public

Assigned To

Authored By

	diego
	Feb 3 2023, 10:01 PM

Description

The Research team, in collaboration with the ML-Platform team, is creating a new service to help Wikidata patrollers detect revisions that are likely to be reverted.

Requirements:

Model should be able to run in Lift Wing
Improve performance over existing models (see baselines comparison)

Related Objects

Status	Subtype	Assigned	Task
In Progress		diego	T333892 Develop a new generation of ML models for Wikidata
Resolved		diego	T328813 Develop a ML-based service to detect vandalism on Wikidata
Resolved		diego	T341820 Evaluate and improve the Revert Risk model for Wikidata.
Open		None	T343419 Move Wikidata tools to Lift Wing
Resolved		noarave	T343731 Update API calls from ORES to Lift Wing
Open		MunizaA	T344016 Improvements to Annotool
Resolved	BUG REPORT	MunizaA	T343973 Fix relative links for Qids in Annotool
Resolved		MunizaA	T344152 Allow users to edit annotations in Annotool
Resolved		MunizaA	T348666 Add randomization to the revision order showed in Annotool
Open		None	T349739 [Annotool] Include additional information on private project exports
Open		None	T369371 [Research Engineering Request] Deploy the new Wikidata Revert Risk Model

Event Timeline

diego created this task.Feb 3 2023, 10:01 PM

Restricted Application added a subscriber: Aklapper.Feb 3 2023, 10:01 PM

Updates

We are working on manually evaluating reverts to identify the right data to train the model.

Lydia_Pintscher added a project: Wikidata.Feb 6 2023, 6:11 PM

Lydia_Pintscher added a project: Wikidata data quality and trust.

Update

Still working on the data evaluation. Currently I'm studying the use of tags and user groups and their relation with reverts.

Update

Currently I'm working on feature engineering. The current model has around 72% accuracy on balanced data.
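For readers unfamiliar with "accuracy on balanced data": the sketch below shows one common way to get such a number, by downsampling the majority class so reverted and non-reverted revisions appear in equal numbers before accuracy is computed. This is only an illustration. The update above does not say how the data was balanced, and the `reverted` field, the `balance_by_downsampling` and `accuracy` helpers, and the toy rows are all hypothetical.

```python
import random


def balance_by_downsampling(rows, label_key="reverted", seed=0):
    """Return a shuffled subset with equal numbers of positive and negative rows.

    `label_key` is a hypothetical field name, not taken from the task.
    """
    pos = [r for r in rows if r[label_key]]
    neg = [r for r in rows if not r[label_key]]
    k = min(len(pos), len(neg))
    rng = random.Random(seed)  # fixed seed so the sample is reproducible
    balanced = rng.sample(pos, k) + rng.sample(neg, k)
    rng.shuffle(balanced)
    return balanced


def accuracy(rows, predict, label_key="reverted"):
    """Fraction of rows where predict(row) matches the row's boolean label."""
    correct = sum(1 for r in rows if predict(r) == bool(r[label_key]))
    return correct / len(rows)


# Toy data: 2 reverted revisions and 8 non-reverted ones.
rows = [{"rev_id": i, "reverted": i < 2} for i in range(10)]
balanced = balance_by_downsampling(rows)

# A classifier that always answers "not reverted".
always_negative = lambda row: False
```

On the balanced subset a trivial classifier such as `always_negative` scores exactly 50%, so that is the baseline against which a figure like 72% should be read.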

Update

New features have slightly improved the accuracy (now 75%). I'm still working on improving the model.

Michael subscribed.Mar 14 2023, 4:16 PM

Michael mentioned this in T332021: Wikidata Articlequality ORES/ML model needs updating after MUL.Mar 14 2023, 4:48 PM

Manuel mentioned this in T312097: [EPIC] MUL - Default values for labels and aliases.Mar 15 2023, 1:26 PM

achou mentioned this in T333125: Deploy Revert-risk wikidata model to ml-staging.Mar 27 2023, 8:25 AM

Update

I'm testing a deep learning approach to see if it offers relevant advantages over the current XGBoost model.

leila added a parent task: T333892: Develop a new generation of ML models for Wikidata.Apr 3 2023, 10:32 PM

leila moved this task from FY2022-23-Research-January-March to In Progress on the Research board.

leila edited projects, added Research; removed Research (FY2022-23-Research-January-March).

leila moved this task from In Progress to Epics on the Research board.Jul 25 2023, 5:53 PM

KHernandez-WMF added a subtask: T341820: Evaluate and improve the Revert Risk model for Wikidata..Oct 31 2023, 6:40 PM

diego closed subtask T341820: Evaluate and improve the Revert Risk model for Wikidata. as Resolved.Apr 29 2024, 3:30 PM

diego added a subscriber: Trokhymovych.Jun 28 2024, 7:57 AM

diego added a subscriber: XiaoXiao-WMF.

@Trokhymovych has addressed the comments and submitted the merge request. The model binary can be found here.
I'm going to coordinate with research engineers to decide next steps.

@Trokhymovych, please post the models' performance results here.

Model Performance on Historical Holdout Testset*

Model	AUC	PR@R0.99	PR@R0.90	PR@R0.50
Rule-based	0.767755	0.075708	0.075708	0.482298
ORES	0.867058	0.083206	0.128595	0.567549
Graph2vec model	0.922219	0.102188	0.225654	0.759679

Model Performance on Human Labeled Testset**

Model	AUC	PR@R0.99	PR@R0.90	PR@R0.50
Rule-based	0.894360	0.737920	0.957547	0.957547
ORES	0.926412	0.802880	0.949649	0.963675
Graph2vec model	0.935937	0.838346	0.959763	0.967811

where PR@R is the precision at the given recall level (for example, PR@R0.90 is the precision at a recall of 0.90).
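To make the PR@R columns concrete, here is a minimal sketch of how precision at a recall level can be computed from labels and model scores. It assumes the common definition: the highest precision over all score thresholds whose recall is at least the target. The comment above does not state the exact definition or code used for the tables, so `precision_at_recall` is a hypothetical helper and the toy labels and scores are made up.

```python
def precision_at_recall(labels, scores, min_recall):
    """Highest precision over all score thresholds with recall >= min_recall.

    labels: 1 for reverted, 0 for not reverted.
    scores: model scores, higher means more likely to be reverted.
    """
    total_pos = sum(labels)
    if total_pos == 0:
        raise ValueError("need at least one positive label")

    # Rank revisions from highest to lowest score.
    ranked = sorted(zip(scores, labels), key=lambda p: p[0], reverse=True)

    best = 0.0
    tp = 0
    i = 0
    n = len(ranked)
    while i < n:
        # Take every revision tied at this score: a threshold cannot split ties.
        threshold = ranked[i][0]
        while i < n and ranked[i][0] == threshold:
            tp += ranked[i][1]
            i += 1
        recall = tp / total_pos
        precision = tp / i  # i revisions are flagged at this threshold
        if recall >= min_recall:
            best = max(best, precision)
    return best


# Toy example, not data from this task.
labels = [1, 0, 1, 1, 0, 0]
scores = [0.9, 0.8, 0.7, 0.6, 0.3, 0.1]
pr_at_r50 = precision_at_recall(labels, scores, 0.50)
pr_at_r99 = precision_at_recall(labels, scores, 0.99)
```

Under this definition, a model that can only reach a very high recall by flagging nearly every revision ends up with a precision close to the revert rate of the dataset.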

*Holdout dataset details: 127,489 revisions between 2023-05-01 and 2023-08-01. Only revisions with ORES predictions are included; self-reverts are filtered out. The revert rate is ~7.5%, and the anonymous rate is ~9.2%.

**Labeled testset details: 1,221 revisions between 2022-04-26 and 2023-07-31. Only revisions with ORES predictions are included; self-reverts are filtered out. The revert rate is ~73.8%, and the anonymous rate is ~69.5%.
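The dataset filters described in the two notes above can be sketched roughly as follows. This is an illustration under assumptions, not the pipeline that produced the test sets: the field names (`ores_score`, `is_self_revert`, `reverted`, `is_anonymous`) and the helpers `filter_revisions` and `rate` are hypothetical, and "self-revert" is assumed to mean a revision reverted by its own author.

```python
def filter_revisions(revisions):
    """Keep revisions that have an ORES prediction and are not self-reverts."""
    return [
        r for r in revisions
        if r["ores_score"] is not None and not r["is_self_revert"]
    ]


def rate(revisions, key):
    """Share of revisions for which the boolean field `key` is true."""
    return sum(1 for r in revisions if r[key]) / len(revisions)


# Toy records with hypothetical field names.
revisions = [
    {"ores_score": 0.91, "is_self_revert": False, "reverted": True, "is_anonymous": True},
    {"ores_score": 0.12, "is_self_revert": False, "reverted": False, "is_anonymous": False},
    {"ores_score": None, "is_self_revert": False, "reverted": True, "is_anonymous": True},
    {"ores_score": 0.55, "is_self_revert": True, "reverted": True, "is_anonymous": False},
    {"ores_score": 0.08, "is_self_revert": False, "reverted": False, "is_anonymous": False},
    {"ores_score": 0.30, "is_self_revert": False, "reverted": False, "is_anonymous": True},
]

kept = filter_revisions(revisions)
revert_rate = rate(kept, "reverted")
anonymous_rate = rate(kept, "is_anonymous")
```

Computing the revert rate and anonymous rate after filtering, as done here, matches how the two notes report them for each test set.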