[PROJECT IDEA] XM-DAC organisation lists #25

Open
stevieflow opened this issue Dec 10, 2020 · 8 comments


@stevieflow

Rationale

Useful for getting consistent org identifiers, which are essential for publication of networked IATI data

Proposal

Publish a list of identifiers minted from the Channel Codes of the OECD DAC CRS list: http://www.oecd.org/development/financing-sustainable-development/development-finance-standards/dacandcrscodelists.htm

Specifically:

  • take the "Channel ID" code for each entry on the "Channel codes" list and prefix with "XM-DAC-"
    -- example: African Medical and Research Foundation (21045) --> XM-DAC-21045

  • In doing this, also include in the list the Year, Acronym and Full Name (in en and fr, when available)

  • take the "Agency" list and do similar:
    -- example: Austrian Federal Ministry of Finance - donorcode=1; AgencyCode=1 --> XM-DAC-1-1

  • In doing this, also include the Acronym and Full Name (in en and fr, when available) and Type of Agency
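
The minting rule above can be sketched in a few lines of Python. This is only an illustration of the prefixing scheme described in the proposal; the function names `mint_channel_id` and `mint_agency_id` are hypothetical and not part of any existing tool.

```python
def mint_channel_id(channel_code):
    """Prefix a "Channel ID" code from the Channel codes list with "XM-DAC-"."""
    return f"XM-DAC-{str(channel_code).strip()}"


def mint_agency_id(donor_code, agency_code):
    """Combine a donor code and an agency code from the Agency list."""
    return f"XM-DAC-{str(donor_code).strip()}-{str(agency_code).strip()}"


# The two examples from the proposal:
print(mint_channel_id(21045))  # African Medical and Research Foundation
print(mint_agency_id(1, 1))    # Austrian Federal Ministry of Finance
```

Codes are converted with `str()` so the helpers accept either integers or strings, since the source lists may store them either way.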

@stevieflow
Author

(Also: in doing this, set up something to periodically sync the list. @andylolz has something?)

@markbrough
Member

Maybe this is potentially some kind of extension of this?
https://org-id-finder.codeforiati.org/

@matmaxgeds
Contributor

Whatever the scraper/generator comes up with, it would be great if org-id could suck it in. I guess we might be able to come up with several sources to scrape for org_IDs. I wondered about scraping a list of all the orgs mentioned in IATI without an org_ID. I also wondered about some tool to link orgs across the different lists where they are the same org; that is on the heuristic de-duplication list of things to do.

@stevieflow
Author

Thanks @markbrough @matmaxgeds

I'm minded to keep this limited in scope / application to begin with.

We know that the OECD Purpose Code and Agency lists are useful to IATI publishers in order to describe other organisations successfully. (They are the basis for this list, which is actually incorrect: https://codelists.codeforiati.org/OrganisationIdentifier/)

The workflow, however (download list; find entity; mint XM-DAC identifier) is long-winded and an unneeded overhead. If we can generate and (auto) maintain a reference list, then it can indeed have all sorts of applications. But for now - I'd just get the list in place....

@andylolz
Member

Just adding a note… Agencies are available as JSON here:
https://datahub.io/core/dac-and-crs-code-lists/r/agencies.json

Channel codes are available as JSON here:
https://datahub.io/core/dac-and-crs-code-lists/r/channel-codes.json

@stevieflow
Author

stevieflow commented Jan 26, 2021

The OECD now have an update of the lists in XML: https://www.oecd.org/dac/financing-sustainable-development/development-finance-topics/crs-xml.htm

For the "Channel of delivery" list, there's (handily) a status field and an activation-date:

<codelist-item status="active" activation-date="2015-01-01" mcd="MCD"><code>12004</code><name><narrative>Other public entities in recipient country</narrative><narrative xml:lang="fr">Autres entité publique dans le pays bénéficiaire</narrative></name><description><narrative></narrative><narrative xml:lang="fr"></narrative></description><category>12000</category><dac:reference>DCD/DAC/STAT(2015)14/REV1</dac:reference></codelist-item>
<codelist-item status="active" activation-date="1998-01-01" mcd="non-MCD"><code>21001</code><name><narrative>Association of Geoscientists for International Development </narrative><narrative xml:lang="fr">Association de géoscientifiques pour le développement international</narrative></name><description><narrative></narrative><narrative xml:lang="fr"></narrative></description><category>21000</category> <dac:coefficient>100</dac:coefficient></codelist-item>
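
As a rough sketch of how entries like these could be turned into list rows, the Python below reads `codelist-item` elements with the standard library and mints an identifier for each. The wrapping `codelist-items` root element and the `dac` namespace URI are assumptions made so the fragment parses on its own (the real file may differ), and `parse_channel_items` is a hypothetical helper name.

```python
import xml.etree.ElementTree as ET

# ElementTree exposes xml:lang under the standard XML namespace.
XML_LANG = "{http://www.w3.org/XML/1998/namespace}lang"

# One entry from the comment above, trimmed. The root element and the
# xmlns:dac URI are placeholders (assumptions), not the OECD's own.
SAMPLE = """<codelist-items xmlns:dac="http://example.org/dac">
<codelist-item status="active" activation-date="1998-01-01" mcd="non-MCD"><code>21001</code><name><narrative>Association of Geoscientists for International Development </narrative><narrative xml:lang="fr">Association de géoscientifiques pour le développement international</narrative></name><category>21000</category><dac:coefficient>100</dac:coefficient></codelist-item>
</codelist-items>"""


def parse_channel_items(xml_text):
    """Return one dict per codelist-item, with a minted XM-DAC identifier."""
    rows = []
    for item in ET.fromstring(xml_text).iter("codelist-item"):
        code = (item.findtext("code") or "").strip()
        names = {}
        for narrative in item.findall("name/narrative"):
            # A narrative without xml:lang is treated as English here.
            lang = narrative.get(XML_LANG, "en")
            names[lang] = (narrative.text or "").strip()
        rows.append({
            "identifier": f"XM-DAC-{code}",
            "status": item.get("status"),
            "activation_date": item.get("activation-date"),
            "name_en": names.get("en", ""),
            "name_fr": names.get("fr", ""),
        })
    return rows


for row in parse_channel_items(SAMPLE):
    print(row["identifier"], row["status"], row["name_en"])
```

Keeping `status` and `activation-date` in each row would let a consumer filter out withdrawn codes later, rather than dropping them at parse time.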

@stevieflow
Author

@andylolz can this be added to the Projects board, please?

@markbrough
Member

This would be cool! I also heard mention of Public Bodies again the other day... made me think back to Tim's paper on govt organisation identifiers!

In terms of how to implement this:

  • codeforIATI/codelist-updater has a bunch of importers that generally just read CSV files and make PRs to other repositories of that data in the IATI CLv3 XML format.
  • It's good to keep most of the logic of scraping out of that repository -- so we then have one repository per scraper that pushes nice CSV files to each repository's gh-pages branch, like codeforIATI/imf-exchangerates. codelist-updater then just has a nightly process that goes and checks each of those CSV files, converts to XML, and makes a PR to the relevant repository as required.
  • It might be worth thinking about the interaction between this and the rest of the OECD codelist scraping (@andylolz might have some updates on this -- I think the scraper is here). It would be good not to have to approve the same changes (e.g. a new Channel Code) multiple times. NB we also have to sort out this PR at some point soon :'(
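
To make the CSV-to-XML step concrete, here is a minimal Python sketch of the kind of conversion described above. It is not the actual codelist-updater code: the CSV column names (`code`, `name_en`, `name_fr`), the function name `csv_to_codelist_xml`, and the simplified element layout are all assumptions for illustration.

```python
import csv
import io
import xml.etree.ElementTree as ET

# ElementTree serialises this qualified name as xml:lang.
XML_LANG = "{http://www.w3.org/XML/1998/namespace}lang"


def csv_to_codelist_xml(csv_text, codelist_name):
    """Convert CSV rows into a simplified codelist XML string."""
    root = ET.Element("codelist", {"name": codelist_name})
    items = ET.SubElement(root, "codelist-items")
    for row in csv.DictReader(io.StringIO(csv_text)):
        item = ET.SubElement(items, "codelist-item")
        ET.SubElement(item, "code").text = row["code"]
        name = ET.SubElement(item, "name")
        ET.SubElement(name, "narrative").text = row["name_en"]
        # Only emit a French narrative when the scraper supplied one.
        if row.get("name_fr"):
            fr = ET.SubElement(name, "narrative")
            fr.set(XML_LANG, "fr")
            fr.text = row["name_fr"]
    return ET.tostring(root, encoding="unicode")


# Hypothetical CSV, as a scraper repository might publish it.
EXAMPLE_CSV = (
    "code,name_en,name_fr\n"
    "XM-DAC-21045,African Medical and Research Foundation,\n"
)
print(csv_to_codelist_xml(EXAMPLE_CSV, "ExampleOrgList"))
```

Because the scraper only has to emit a flat CSV, the XML details stay in one place, which matches the split between per-scraper repositories and the nightly converter suggested above.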
