2018 | VASTE – Veracity Assessment in Spatio-TEmporal heterogeneous data. An application on Web animal epidemiological surveillance.

DigiCosme scientific axis & task: DataSense & Task 2
Coordinators: Fatiha Saïs & Juliette Dibie
Candidate's full name: Joana Esther Gonzales Malaverri
Institutions:

  • Paris Saclay:
    • LRI
    • AgroParisTech
    • David
    • Telecom ParisTech
  • Outside the Paris Saclay perimeter:
    • TETIS (Montpellier)

Managing laboratory: LRI
Attached to DigiCosme action: GT D2K
Duration & dates of the mission: 1 year – May 2018/2019

The VASTE project aims to assess the veracity of epidemiological events by exploiting knowledge and data from sources of different origins and different quality levels: structured data derived from expert reports published by official agencies (e.g. OIE, FAO, WHO) and data produced by text mining from unofficial sources (e.g. local newspapers, blogs).
Objective:
To this end, we plan to develop an approach that combines data linking with reasoning mechanisms from argumentation theory. New data linking methods that take data quality indicators (e.g., freshness, reliability) into account will determine the groups of events referring to the same disease, the same species, the same location and the same period, and will thus indicate to what degree an event is true. Argumentation-based reasoning will then strengthen the resulting truthfulness assessment by reasoning over positive and negative expert arguments. To demonstrate the effectiveness and efficiency of the proposed approach, an experimental evaluation will be conducted on datasets already collected by the TETIS lab in the animal epidemiological surveillance domain.
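As an illustration only, the grouping step described above can be sketched in Python. The project does not specify its linking method or scoring function, so everything here is an assumption: the `EventReport` fields, the use of the ISO week as "same period", exact string matching on normalised names, and a noisy-OR combination of per-source reliabilities. The sample reports are made up.

```python
from collections import defaultdict
from dataclasses import dataclass
from datetime import date


@dataclass(frozen=True)
class EventReport:
    """One reported event; all field names are hypothetical."""
    disease: str
    species: str
    location: str
    reported_on: date
    source: str
    reliability: float  # assumed source-reliability indicator in [0, 1]


def period_key(day):
    """Bucket a date by ISO year and week (an assumed notion of 'same period')."""
    iso = day.isocalendar()
    return (iso[0], iso[1])


def group_reports(reports):
    """Group reports sharing disease, species, location and period."""
    groups = defaultdict(list)
    for r in reports:
        key = (r.disease.lower(), r.species.lower(), r.location.lower(),
               period_key(r.reported_on))
        groups[key].append(r)
    return dict(groups)


def veracity_score(group):
    """Noisy-OR over distinct sources: 1 - prod(1 - reliability)."""
    best_per_source = {}
    for r in group:
        previous = best_per_source.get(r.source, 0.0)
        best_per_source[r.source] = max(previous, r.reliability)
    doubt = 1.0
    for reliability in best_per_source.values():
        doubt *= 1.0 - reliability
    return 1.0 - doubt


# Made-up reports, for illustration only.
reports = [
    EventReport("Disease-X", "species-a", "Region-1", date(2018, 5, 14),
                "official-agency", 0.9),
    EventReport("disease-x", "Species-A", "region-1", date(2018, 5, 16),
                "local-newspaper", 0.5),
    EventReport("disease-y", "species-b", "region-2", date(2018, 5, 15),
                "blog", 0.3),
]

groups = group_reports(reports)
for key, members in groups.items():
    print(key, round(veracity_score(members), 2))
```

In this sketch, two reports of the same event from independent sources raise the score above what either source gives alone. Other indicators such as freshness, fuzzy matching of names and places, and the argumentation step would replace or extend these simplifications.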
Work in progress:
As a first step of this project, we have studied the related work on veracity assessment and on the representation of temporal information in Knowledge Graphs (KGs). We are currently defining a two-stage approach. The first stage enriches an existing KG with temporal information by relying on existing knowledge graphs such as Yago and Wikidata. The second stage assesses the temporal veracity of each triple in the KG by combining different kinds of information.
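To make the two stages concrete, here is a minimal Python sketch under strong assumptions. The `TemporalTriple` class, the `enrich` and `holds_at` helpers, and the `ex:` identifiers are all hypothetical. A plain dictionary stands in for a lookup against a reference knowledge graph, and no real Yago or Wikidata API is called. The veracity check is reduced to an interval test, whereas the approach under definition combines several kinds of information.

```python
from dataclasses import dataclass
from datetime import date
from typing import Optional


@dataclass(frozen=True)
class TemporalTriple:
    """A KG triple with an optional validity interval (hypothetical model)."""
    subject: str
    predicate: str
    obj: str
    start: Optional[date] = None
    end: Optional[date] = None


def enrich(triple, reference):
    """Stage 1: copy a validity interval from a reference table, if present."""
    key = (triple.subject, triple.predicate, triple.obj)
    if key not in reference:
        return triple
    start, end = reference[key]
    return TemporalTriple(triple.subject, triple.predicate, triple.obj,
                          start=start, end=end)


def holds_at(triple, day):
    """Stage 2 (simplified): True or False if the interval decides, else None."""
    if triple.start is None and triple.end is None:
        return None  # no temporal information available
    after_start = triple.start is None or triple.start <= day
    before_end = triple.end is None or day <= triple.end
    return after_start and before_end


# Made-up reference intervals standing in for an external knowledge graph.
reference = {
    ("ex:outbreak1", "ex:locatedIn", "ex:regionA"):
        (date(2018, 5, 1), date(2018, 7, 31)),
}

triple = TemporalTriple("ex:outbreak1", "ex:locatedIn", "ex:regionA")
enriched = enrich(triple, reference)
print(holds_at(triple, date(2018, 6, 1)))
print(holds_at(enriched, date(2018, 6, 1)))
```

Returning `None` for triples without temporal information keeps "unknown" distinct from "false", which matters when several sources are combined later.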

To communicate on and disseminate the VASTE project, we are responding to a call for workshops at EGC 2019, where we will target the scientific community working on truth discovery and veracity assessment.

Prior publications:

  • Journal articles
    • MALAVERRI, J. E. G.; SANTANCHÈ, A.; MEDEIROS, C. M. B., 2014. A Provenance-based approach to evaluate data quality in eScience. Int. Journal of Metadata, Semantics, and Ontologies, v. 9, p. 15-28.
  • Conference papers
    • CARVALHO, L. A. M. C.; MALAVERRI, J. E. G.; MEDEIROS, C. M. B., 2017. Implementing W2Share: Supporting Reproducibility and Quality Assessment in eScience. In: 11th BreSci – Brazilian e-Science Workshop.
    • RAIZER, K.; MALAVERRI, J. E. G.; LI, L. T.; DIAS, E. Z. V.; MACEDO, C. P., 2015. Gradual Migration of Legacy Software to the Cloud: a Telco OSS Case Study. In: 45th IEEE/IFIP International Conference on Dependable Systems and Networks.
    • SOUSA, R. B.; CUGLER, D. C.; MALAVERRI, J. E. G.; MEDEIROS, C. M. B., 2014. A Provenance-based Approach to Manage Long Term Preservation of Scientific Data. In: 30th IEEE Int. Conf. on Data Engineering, Workshop on Long Term Preservation for Big Scientific Data.
    • MALAVERRI, J. E. G.; MOTA, M. S.; MEDEIROS, C. M. B., 2013. Estimating the quality of data using provenance: a case study in eScience. In: 19th Americas Conf. on Inf. Systems.
    • MALAVERRI, J. E. G.; MEDEIROS, C. M. B., 2012. Data Quality in Agriculture Applications. In: XIII Brazilian Symposium on GeoInformatics (GeoInfo).
    • MALAVERRI, J. E. G.; MEDEIROS, C. M. B.; LAMPARELLI, R. A. C., 2012. A Provenance Approach to Assess the Quality of Geospatial Data. In: Symposium On Applied Computing.