This repository has been archived by the owner on Oct 22, 2020. It is now read-only.

Provide Data Access From xPub / Libero Reviewer To Data Hub #2347

Open
tayowonibi opened this issue Aug 7, 2019 · 0 comments

Problem / Motivation

Data Hub ideally provides a centralized, integrated, and cleansed data repository for data from the various departments, teams, and silos. The datasets/data entities managed in the repo are either data currently used for known analytics requirements or data which we assess to be potentially useful for some future (known or unknown) analytics requirement.

  • WHO wants this? Data Hub Team
  • WHEN do they want it? ASAP. Data Hub should have implemented a system using xPub as its data source by the time the eLife system finally migrates. Besides, the editorial team has provided us with a list of data requirements, which require data probably present only in xPub
  • WHAT do they want? Access From xPub / Libero Reviewer To Data Hub
  • WHY do they want this?

Based on the current eJP system, some of the entities we manage in the repo include

  • manuscript
  • manuscript versions
  • manuscript reviews
  • generic persons (authors, reviewers, editors etc)
  • person roles
  • Specific person types
    • reviewer information
    • early career researchers information
    • editors

A section of the data schema for the manuscript data is shown below.
[Image: section of the manuscript data schema]

Data Access Requirements
Data

  • ability to access the data in the Libero Reviewer database to satisfy our current dataset requirements (as shown above)
  • ability to quickly, easily, and flexibly add datasets for new entity types, e.g. financial data, because the finance department requires some analytics
  • ability to do both full and incremental data loads (ability to sort and filter by time of last update/insert/delete)

Data Load Frequency

  • presently daily for most datasets, based on the limitations imposed by eJP; however, I imagine having some sub-daily data loading (hourly, perhaps)
  • Possible nice-to-have: message streaming using WebSockets or a pub-sub system to listen for new/deleted/updated entities from the database. THIS IS NOT A REQUIREMENT
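
To illustrate the full and incremental load requirement above, here is a minimal sketch using an in-memory SQLite database. Everything in it is an assumption made for illustration: the `manuscript` table, its `updated_at` and `deleted` columns, and the `load_changes` helper are hypothetical and do not reflect the actual Libero Reviewer schema. It also assumes deletes are recorded as soft deletes, since a hard-deleted row cannot be found by filtering on its last-modified time.

```python
import sqlite3


def load_changes(conn, since=None):
    """Return (rows, new_watermark).

    since=None performs a full load; otherwise only rows modified
    after `since` are returned. Timestamps are ISO 8601 UTC strings,
    which sort correctly as plain text.
    """
    query = "SELECT id, title, updated_at, deleted FROM manuscript"
    params = ()
    if since is not None:
        query += " WHERE updated_at > ?"
        params = (since,)
    query += " ORDER BY updated_at, id"
    rows = conn.execute(query, params).fetchall()
    # The watermark is the latest modification time seen so far.
    watermark = rows[-1][2] if rows else since
    return rows, watermark


def demo():
    # Hypothetical schema: `deleted` marks soft-deleted rows.
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE manuscript ("
        "id INTEGER PRIMARY KEY, title TEXT, "
        "updated_at TEXT, deleted INTEGER DEFAULT 0)"
    )
    conn.executemany(
        "INSERT INTO manuscript VALUES (?, ?, ?, ?)",
        [
            (1, "A", "2019-08-01T10:00:00Z", 0),
            (2, "B", "2019-08-02T10:00:00Z", 0),
        ],
    )
    full, mark = load_changes(conn)  # initial full load

    # Later: one row is updated, another is soft-deleted.
    conn.execute(
        "UPDATE manuscript SET title = 'A v2', "
        "updated_at = '2019-08-03T09:00:00Z' WHERE id = 1"
    )
    conn.execute(
        "UPDATE manuscript SET deleted = 1, "
        "updated_at = '2019-08-03T09:30:00Z' WHERE id = 2"
    )
    delta, mark2 = load_changes(conn, since=mark)  # incremental load
    return full, mark, delta, mark2
```

The watermark returned by one run is passed as `since` to the next, so the same function could serve a daily or an hourly schedule. A strict `>` comparison can miss rows that share the watermark's exact timestamp; a real loader would probably need a tie-breaker or an overlapping window.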

Proposed solution

Possible solutions

  • Provide/use direct access to the database
  • Provide some web API for accessing the entities (REST / GraphQL)
  • Use a messaging system to listen for new/deleted/updated entities
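
As a rough sketch of what the messaging option would mean on the Data Hub side, the snippet below applies new/updated/deleted events to a local copy of the data. The event shape (`entity`, `id`, `action`, `data`) and the `apply_event` helper are hypothetical; no such message format has been defined for Libero Reviewer.

```python
def apply_event(store, event):
    """Apply one change event to `store`, a dict keyed by (entity, id).

    The event format is an assumption for illustration only.
    """
    key = (event["entity"], event["id"])
    action = event["action"]
    if action in ("created", "updated"):
        # Created and updated events both carry the full entity payload.
        store[key] = event["data"]
    elif action == "deleted":
        # Ignore deletes for entities we never saw.
        store.pop(key, None)
    else:
        raise ValueError(f"unknown action: {action!r}")
    return store


def replay(events):
    """Build a store from scratch by applying events in order."""
    store = {}
    for event in events:
        apply_event(store, event)
    return store
```

Treating created and updated events identically keeps the consumer tolerant of replayed messages, at the cost of requiring each event to carry the full entity rather than a partial change.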

BTW, these days I am leaning towards direct access to the database, as it will give the Data Hub team the flexibility and ease of defining data sources independently of the xPub team; however, we will be tightly coupled to the database/data schema definition used by Libero Reviewer.

/cc @de-code @hdrury1
