BSI data source issue: Shop only returning results Amendment supplemented results for "BS EN ISO 14044:2006" #12

Closed
ronaldtse opened this issue Jan 19, 2022 · 13 comments
Labels: data source problem (Authoritative data source problem), help wanted (Extra attention is needed)

Comments

@ronaldtse

ronaldtse commented Jan 19, 2022

A search for "BS EN ISO 14044:2006" only returns the document references that already have Amendments attached. That works for shopping (a buyer would want to get the free amendment) but is not appropriate for fetching references.

Screenshot 2022-01-19 at 8 18 49 PM

We need to ask BSI how we can obtain such a reference.

Originally posted by @ronaldtse in #11 (comment)

ronaldtse added the "help wanted" label on Jan 19, 2022
ronaldtse self-assigned this on Jan 19, 2022

@opoudjis

The same is happening for PAS 2035/2030:2019: https://shop.bsigroup.com/search?query=PAS+2035%3A2019&status=Current&type=products&page=1

I think we're going to need to be able to truncate "+..." in document retrieval.
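
A minimal sketch of what such truncation could look like, assuming supplements always appear as a trailing `+A<n>:<year>` (amendment) or `+C<n>:<year>` (corrigendum). The suffix pattern and the helper name `strip_supplement` are assumptions made for illustration, not anything BSI documents:

```python
import re

# Assumed suffix shape: "+A1:2018", "+C2:2021", or without the year ("+A1").
# This pattern is inferred from identifiers seen on the BSI shop, not from a spec.
_SUPPLEMENT = re.compile(r"\+[AC]\d+(?::\d{4})?$")


def strip_supplement(docid: str) -> str:
    """Drop one trailing amendment/corrigendum suffix, if there is one."""
    return _SUPPLEMENT.sub("", docid.strip())


print(strip_supplement("BS EN ISO 14044:2006+A1:2018"))
print(strip_supplement("PAS 2035/2030:2019"))
```

Anchoring the pattern at the end of the string leaves identifiers without a supplement untouched, but it only removes one suffix and does not handle other decorations the shop may append.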

@Intelligent2013

> I think we're going to need to be able to truncate "+..." in document retrieval.

@opoudjis please be careful about this case: BS EN ISO 14044:2006+A1:2018 (BS 202000):
image

@ronaldtse
Author

I've asked BSI about this problem and am awaiting a reply.

@ronaldtse
Author

From @opoudjis

> The problem is that the BSI site does not have an entry for PAS 2030:2019 at all, distinct from PAS 2035/2030:2019+A1:2022.

In fact, I was able to search for it on the webshop:
https://shop.bsigroup.com/search?page=1&query=PAS%202030%3A2019&type=products

Screenshot 2022-03-09 at 12 50 58 AM

Then I clicked the link with the most similar document identifier:
https://shop.bsigroup.com/products/specification-for-the-installation-of-energy-efficiency-measures-in-existing-dwellings-and-insulation-in-residential-park-homes/standard

Screenshot 2022-03-09 at 12 51 23 AM

And then "Document History":

Screenshot 2022-03-09 at 12 51 43 AM

The whole list of superseded documents is there! They are just not for sale, but they exist!

@ronaldtse
Author

@andrew2net and I found out that a GraphQL JSON query is used to fetch Document History:
e.g.

curl 'https://standards.accord.bsigroup.com/standards/00-3492300904/hierarchy' \
-X 'GET' \
-H 'Accept: */*' \
-H 'Content-Type: application/json' \
-H 'Origin: https://shop.bsigroup.com' \
-H 'Accept-Encoding: gzip, deflate, br' \
-H 'Host: standards.accord.bsigroup.com' \
-H 'User-Agent: Mozilla/5.0 (Macintosh; Intel Mac OS X 10_15_7) AppleWebKit/605.1.15 (KHTML, like Gecko) Version/15.3 Safari/605.1.15' \
-H 'Referer: https://shop.bsigroup.com/' \
-H 'Accept-Language: en-GB,en;q=0.9' \
-H 'Connection: keep-alive'

This gives:

[{
    "level": 3,
    "nodes": [{
        "standard": {
            "assetId": "01-4190858983",
            "title": "Improving the energy efficiency of existing buildings. Specification for installation process, process management and service provision",
            "primaryDesignator": "PAS 2030:2012 Edition 2",
            "status": "Withdrawn",
            "handle": "improving-the-energy-efficiency-of-existing-buildings-specification-for-installation-process-process-management-and-service-provision-1"
        },
        "parentAssetIds": ["01-1721332204"]
    }]
}, {
    "level": 2,
    "nodes": [{
        "standard": {
            "assetId": "00-4175954925",
            "title": "Improving the energy efficiency of existing buildings Specification for installation process, process management and service provision",
            "primaryDesignator": "PAS 2030:2014",
            "status": "Withdrawn",
            "handle": "improving-the-energy-efficiency-of-existing-buildings-specification-for-installation-process-process-management-and-service-provision-6"
        },
        "parentAssetIds": ["01-4190858983"]
    }]
}, {
    "level": 1,
    "nodes": [{
        "standard": {
            "assetId": "00-8691469210",
            "title": "Specification for the installation of energy efficiency measures (EEM) in existing buildings",
            "primaryDesignator": "PAS 2030:2017",
            "status": "Withdrawn",
            "handle": "specification-for-the-installation-of-energy-efficiency-measures-eem-in-existing-buildings"
        },
        "parentAssetIds": ["00-4175954925"]
    }]
}, {
    "level": 0,
    "nodes": [{
        "standard": {
            "assetId": "00-3492300904",
            "title": "Specification for the installation of energy efficiency measures in existing dwellings and insulation in residential park homes",
            "primaryDesignator": "PAS 2030:2019+C2:2021",
            "status": "Withdrawn",
            "handle": "specification-for-the-installation-of-energy-efficiency-measures-in-existing-dwellings-and-insulation-in-residential-park-homes"
        },
        "parentAssetIds": ["00-8691469210"]
    }]
}, {
    "level": -1,
    "nodes": [{
        "standard": {
            "assetId": "00-8446111360",
            "title": "Retrofitting dwellings for improved energy efficiency. Specification and guidance",
            "primaryDesignator": "PAS 2035/2030:2019+A1:2022",
            "status": "Current",
            "handle": "retrofitting-dwellings-for-improved-energy-efficiency-specification-and-guidance-3"
        },
        "parentAssetIds": ["00-3492300904"]
    }]
}]
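
A rough sketch of how this response could be flattened into a document history. It assumes, based only on the sample above, that `level` 0 is the requested asset, positive levels are predecessors and negative levels are successors. The function names are made up for illustration and the embedded body is an abbreviated copy of the response that keeps only the fields used:

```python
import json

# Abbreviated copy of the response above (only the fields used here).
RESPONSE_BODY = """
[
  {"level": 3, "nodes": [{"standard": {"primaryDesignator": "PAS 2030:2012 Edition 2", "status": "Withdrawn"}}]},
  {"level": 2, "nodes": [{"standard": {"primaryDesignator": "PAS 2030:2014", "status": "Withdrawn"}}]},
  {"level": 1, "nodes": [{"standard": {"primaryDesignator": "PAS 2030:2017", "status": "Withdrawn"}}]},
  {"level": 0, "nodes": [{"standard": {"primaryDesignator": "PAS 2030:2019+C2:2021", "status": "Withdrawn"}}]},
  {"level": -1, "nodes": [{"standard": {"primaryDesignator": "PAS 2035/2030:2019+A1:2022", "status": "Current"}}]}
]
"""


def flatten_history(hierarchy):
    """Return (level, designator, status) tuples, oldest first (assumed ordering)."""
    rows = []
    for group in hierarchy:
        for node in group["nodes"]:
            std = node["standard"]
            rows.append((group["level"], std["primaryDesignator"], std["status"]))
    # Higher level is assumed to mean an older edition.
    return sorted(rows, key=lambda row: row[0], reverse=True)


def current_entries(history):
    """Keep only the entries whose status is 'Current'."""
    return [row for row in history if row[2] == "Current"]


history = flatten_history(json.loads(RESPONSE_BODY))
for level, designator, status in history:
    print(level, designator, status)
```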

@ronaldtse
Author

Actually, I realized that the endpoint is just a Shopify GraphQL endpoint. If we had documentation on how to build a proper query on primaryDesignator, we could query it directly to get more matching results.

@ronaldtse
Author

UPDATE: the BSI shop does this:

  • Search: query to Algolia
  • Standard page: GraphQL to Shopify
  • Document History: GraphQL to standards.accord.bsigroup.com

And actually, even in the Document History, we cannot find the "standalone" PAS 2030:2019 document without the C2.

Screenshot 2022-03-09 at 1 37 06 AM

BSI responded that this is because the document is "wrong" without the "Corrigendum", so they only have that entry. We will have to make do with it.

i.e. we have no choice but to return "PAS 2030:2019+C2:2021" if our query is "PAS 2030:2019". Locally, perhaps we can drop the "+C2:2021" portion.

ronaldtse assigned andrew2net and unassigned ronaldtse on Mar 8, 2022

@ronaldtse
Author

> I think we're going to need to be able to truncate "+..." in document retrieval.

As @opoudjis said, this is probably the step we have to take.

@opoudjis

opoudjis commented Mar 9, 2022

> i.e. we have no choice but to return "PAS 2030:2019+C2:2021" if our query is "PAS 2030:2019". Locally, perhaps we can drop the "+C2:2021" portion.

Or substitute it with the previous corrigendum, as required. Ugh.

ronaldtse changed the title from 'BSI Shop only returning results Amendment supplemented results for "BS EN ISO 14044:2006"' to 'BSI data source issue: Shop only returning results Amendment supplemented results for "BS EN ISO 14044:2006"' on Mar 19, 2022
ronaldtse added the "data source problem" label on Mar 19, 2022

@Intelligent2013

Here are other examples of such queries (from metanorma/mnconvert#129):

| Query | Return |
|---|---|
| BS 4592-0:2006 | BS 4592-0:2006+A1:2012 |
| BS 5266-1 | BS 5266-1 ExComm (Fire) |
| BS 5839-1 | BS 5839-1 ExComm |
| BS 7273-4 | BS 7273-4+A1 — SET |
| BS 7273-4:2015 | BS 7273-4:2015+A1:2021 — SET |
| BS 8500-2:2015 | BS 8500-2:2015+A2:2019 |
| BS EN 12973 | BS EN 12973-TC |
| BS EN ISO 13485:2016 | BS EN ISO 13485:2016+A11:2021 |
| BS EN 13659:2004 | BS EN 13659:2004+A1:2008 |
| BS EN 15663 | BS EN 15663+A1 |
| BS ISO 26000 | BS ISO 26000 + IWA 26 |
| BS ISO 26000:2010 | BS ISO 26000:2010 + IWA 26:2017 |
| BS ISO 44001 | BS ISO 44001 + BS ISO 44002 |
| BS ISO 44001:2017 | BS ISO 44001:2017 + BS ISO 44002:2019 |
| BS 9999:2017 | BS 9999:2017 ExComm (Fire) |
| BS EN 12845:2015 | BS EN 12845:2015+A1:2019 |
| BS EN ISO 14040:2006 | BS EN ISO 14040:2006+A1:2020 |
| BS ISO 55000:2014 | BS ISO 55000:2014 Combined IAM |
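
In every pair listed here the returned identifier starts with the query, but what follows is not always a "+..." supplement. A small sketch, using a few of the pairs above and a hypothetical helper `extra_part`, shows what a prefix comparison would leave over:

```python
# A few (query, returned) pairs copied from the list above.
ROWS = [
    ("BS 4592-0:2006", "BS 4592-0:2006+A1:2012"),
    ("BS 5266-1", "BS 5266-1 ExComm (Fire)"),
    ("BS EN 12973", "BS EN 12973-TC"),
    ("BS ISO 26000:2010", "BS ISO 26000:2010 + IWA 26:2017"),
    ("BS ISO 55000:2014", "BS ISO 55000:2014 Combined IAM"),
]


def extra_part(query, returned):
    """Return what was appended to the query, or None if it is not a prefix."""
    if not returned.startswith(query):
        return None
    return returned[len(query):].strip()


extras = [extra_part(query, returned) for query, returned in ROWS]
for (query, _), extra in zip(ROWS, extras):
    print(f"{query!r} -> extra {extra!r}")
```

A plain prefix test is only a starting point: it would also accept an unrelated identifier that happens to share the prefix (a query for "BS 5266-1" is a prefix of any "BS 5266-1x" designator), so some check on the leftover part would still be needed.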

@opoudjis

https://github.com/metanorma/metanorma-bsi/issues/132 is the same set of issues.

@andrew2net
Contributor

> > I think we're going to need to be able to truncate "+..." in document retrieval.
>
> As @opoudjis said, this is probably the step we have to take.

@ronaldtse are we talking about truncating "+..." for all the BSI documents?

@ronaldtse
Author

@andrew2net we have to offer both identifiers (docid) for this one entry:

  • search for "PAS 2030:2019+C2:2021" => return PAS 2030:2019+C2:2021 as primary docid
  • search for "PAS 2030:2019" => return PAS 2030:2019 as primary docid but with metadata taken from PAS 2030:2019+C2:2021
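
The two cases above could be sketched roughly as follows. `BibItem` and `resolve` are hypothetical names used only for illustration, and splitting on the first "+" is an assumption that fits this one entry rather than every BSI identifier:

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class BibItem:
    """Hypothetical result record; not the project's actual data model."""
    primary_docid: str
    docids: List[str]
    title: str


def resolve(query: str, source_docid: str, source_title: str) -> Optional[BibItem]:
    """Build a result for `query` from the single entry the data source offers."""
    # Assumption: everything after the first "+" is the supplement.
    base = source_docid.split("+", 1)[0].strip()
    if query == source_docid:
        # Exact match: the supplemented identifier is the primary docid.
        return BibItem(source_docid, [source_docid], source_title)
    if query == base:
        # Base-only query: answer with the base docid, metadata from the entry.
        return BibItem(query, [query, source_docid], source_title)
    return None


TITLE = ("Specification for the installation of energy efficiency measures "
         "in existing dwellings and insulation in residential park homes")
item = resolve("PAS 2030:2019", "PAS 2030:2019+C2:2021", TITLE)
print(item.primary_docid, item.docids)
```

Keeping both identifiers on the record means a caller can still see which source entry the metadata came from.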

andrew2net added a commit that referenced this issue Apr 28, 2022