Merge pull request #202 from jeffmorin/main
[DS-4301] Added Content Reports section and Filtered Collections report therein
Showing 3 changed files with 291 additions and 0 deletions.
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,149 @@ | ||
# Filtered Collections report | ||
[Back to the list of all defined endpoints](endpoints.md) | ||
|
||
This endpoint provides aggregated statistics about the number of items per collection according to selected filters. | ||
|
||
NOTE: This is currently a beta feature. | ||
|
||
|
||
**GET /api/contentreport/filteredcollections** | ||
|
||
The endpoint takes a `filters` query parameter whose value is a comma-separated list of filters | ||
like the following: | ||
``` | ||
?filters=is_discoverable,has_multiple_originals,has_pdf_original | ||
``` | ||
|
||
Alternatively, the comma-separated list can be replaced by a repetition of the `filters` parameter | ||
for each requested filter: | ||
``` | ||
?filters=is_discoverable&filters=has_multiple_originals&filters=has_pdf_original | ||
``` | ||
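
The two query forms described above can be produced with the Python standard library. This is a minimal sketch, not part of the contract: the helper names are hypothetical, and only the `filters` parameter and the endpoint path come from this document.

```python
from urllib.parse import urlencode

ENDPOINT = "/api/contentreport/filteredcollections"


def comma_separated_query(filters):
    """Build a single `filters` parameter holding a comma-separated list."""
    # safe="," keeps the commas literal instead of percent-encoding them.
    return urlencode({"filters": ",".join(filters)}, safe=",")


def repeated_query(filters):
    """Build one `filters` parameter per requested filter."""
    return urlencode([("filters", name) for name in filters])


selected = ["is_discoverable", "has_multiple_originals", "has_pdf_original"]
print(f"{ENDPOINT}?{comma_separated_query(selected)}")
print(f"{ENDPOINT}?{repeated_query(selected)}")
```

Both strings request the same three filters; which form to use is a matter of client convenience.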
|
||
|
||
Please see [below](#available-filters) for the list of available filters. | ||
|
||
## Report contents | ||
|
||
For each collection, the basic report consists of: | ||
* name (label) and handle of the collection | ||
* name (label) and handle of the parent community | ||
* total number of items | ||
* number of items matching all selected filters | ||
|
||
In addition, a `summary` element provides the total number of items and the total number of items matching all filters | ||
for the whole repository. | ||
|
||
An example JSON response document to `/api/contentreport/filteredcollections`: | ||
```json | ||
{ | ||
"id": "filteredcollections", | ||
"collections": [ | ||
{ | ||
"label": "Collection 1", | ||
"handle": "100/1", | ||
"values": { | ||
"is_discoverable": 23, | ||
"has_multiple_originals": 3, | ||
"has_pdf_original": 14 | ||
}, | ||
"community_label": "Community 1", | ||
"community_handle": "20.500.11794/1", | ||
"nb_total_items": 23, | ||
"all_filters_value": 3 | ||
}, | ||
{ | ||
"label": "Collection 2", | ||
"handle": "100/2", | ||
"values": { | ||
"is_discoverable": 1, | ||
"has_multiple_originals": 0, | ||
"has_pdf_original": 0 | ||
}, | ||
"community_label": "Community 1", | ||
"community_handle": "20.500.11794/1", | ||
"nb_total_items": 1, | ||
"all_filters_value": 0 | ||
}, | ||
{ | ||
"label": "Collection 3", | ||
"handle": "100/3", | ||
"values": { | ||
"is_discoverable": 1, | ||
"has_multiple_originals": 0, | ||
"has_pdf_original": 1 | ||
}, | ||
"community_label": "Community 1", | ||
"community_handle": "20.500.11794/1", | ||
"nb_total_items": 1, | ||
"all_filters_value": 0 | ||
} | ||
], | ||
"summary": { | ||
"label": null, | ||
"handle": null, | ||
"values": { | ||
"is_discoverable": 25, | ||
"has_multiple_originals": 3, | ||
"has_pdf_original": 15 | ||
}, | ||
"community_label": null, | ||
"community_handle": null, | ||
"nb_total_items": 25, | ||
"all_filters_value": 3 | ||
}, | ||
"type": "filtered-collections", | ||
"_links": { | ||
"self": { | ||
"href": "http://localhost:8080/dspace-server/api/contentreport/filteredcollections" | ||
} | ||
} | ||
} | ||
``` | ||
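
As an illustration of how a client might read this response, the sketch below parses a trimmed copy of the example and checks that the `summary` figures equal the sums over the listed collections. That equality holds for the example data; whether it holds in every repository (for instance when items are mapped to several collections) is an assumption. The function names are hypothetical.

```python
import json

# Trimmed copy of the example response: only the fields used below are kept.
SAMPLE = """
{
  "collections": [
    {"label": "Collection 1", "nb_total_items": 23, "all_filters_value": 3,
     "values": {"is_discoverable": 23, "has_multiple_originals": 3, "has_pdf_original": 14}},
    {"label": "Collection 2", "nb_total_items": 1, "all_filters_value": 0,
     "values": {"is_discoverable": 1, "has_multiple_originals": 0, "has_pdf_original": 0}},
    {"label": "Collection 3", "nb_total_items": 1, "all_filters_value": 0,
     "values": {"is_discoverable": 1, "has_multiple_originals": 0, "has_pdf_original": 1}}
  ],
  "summary": {"nb_total_items": 25, "all_filters_value": 3,
              "values": {"is_discoverable": 25, "has_multiple_originals": 3, "has_pdf_original": 15}}
}
"""


def matching_share(entry):
    """Fraction of an entry's items that match all selected filters."""
    total = entry["nb_total_items"]
    return entry["all_filters_value"] / total if total else 0.0


def summary_is_consistent(report):
    """Check that the summary equals the per-collection sums (assumed property)."""
    collections = report["collections"]
    summary = report["summary"]
    if sum(c["nb_total_items"] for c in collections) != summary["nb_total_items"]:
        return False
    if sum(c["all_filters_value"] for c in collections) != summary["all_filters_value"]:
        return False
    return all(
        sum(c["values"][name] for c in collections) == count
        for name, count in summary["values"].items()
    )


report = json.loads(SAMPLE)
for collection in report["collections"]:
    print(f'{collection["label"]}: {matching_share(collection):.1%} match all filters')
print("summary consistent:", summary_is_consistent(report))
```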
|
||
## Available filters | ||
|
||
The available filters are as follows: | ||
|
||
* Item Property Filters | ||
* `is_item`: Is Item - always true | ||
* `is_withdrawn`: Withdrawn Items | ||
* `is_not_withdrawn`: Available Items - Not Withdrawn | ||
* `is_discoverable`: Discoverable Items - Not Private | ||
* `is_not_discoverable`: Not Discoverable - Private Item | ||
* Basic Bitstream Filters | ||
* `has_multiple_originals`: Item has Multiple Original Bitstreams | ||
* `has_no_originals`: Item has No Original Bitstreams | ||
* `has_one_original`: Item has One Original Bitstream | ||
* Bitstream Filters by MIME Type | ||
* `has_doc_original`: Item has a Doc Original Bitstream (PDF, Office, Text, HTML, XML, etc.) | ||
* `has_image_original`: Item has an Image Original Bitstream | ||
* `has_unsupp_type`: Has Other Bitstream Types (not Doc or Image) | ||
* `has_mixed_original`: Item has multiple types of Original Bitstreams (Doc, Image, Other) | ||
* `has_pdf_original`: Item has a PDF Original Bitstream | ||
* `has_jpg_original`: Item has JPG Original Bitstream | ||
* `has_small_pdf`: Has unusually small PDF | ||
* `has_large_pdf`: Has unusually large PDF | ||
* `has_doc_without_text`: Has document bitstream without TEXT item | ||
* Supported MIME Type Filters | ||
* `has_only_supp_image_type`: Item Image Bitstreams are Supported | ||
* `has_unsupp_image_type`: Item has Image Bitstream that is Unsupported | ||
* `has_only_supp_doc_type`: Item Document Bitstreams are Supported | ||
* `has_unsupp_doc_type`: Item has Document Bitstream that is Unsupported | ||
* Bitstream Bundle Filters | ||
* `has_unsupported_bundle`: Has bitstream in an unsupported bundle | ||
* `has_small_thumbnail`: Has unusually small thumbnail | ||
* `has_original_without_thumbnail`: Has original bitstream without thumbnail | ||
* `has_invalid_thumbnail_name`: Has invalid thumbnail name (assumes one thumbnail for each original) | ||
* `has_non_generated_thumb`: Has non-generated thumbnail | ||
* `no_license`: Doesn't have a license | ||
* `has_license_documentation`: Has documentation in the license bundle | ||
* Permission Filters | ||
* `has_restricted_original`: Item has Restricted Original Bitstream | ||
* `has_restricted_thumbnail`: Item has Restricted Thumbnail | ||
* `has_restricted_metadata`: Item has Restricted Metadata | ||
|
||
Possible response statuses: | ||
|
||
* 200 OK - The report data was found and returned. | ||
* 403 Forbidden - The user session is not authorized to access the report. |
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,140 @@ | ||
# Metadata query (aka Filtered Items) report | ||
[Back to the list of all defined endpoints](endpoints.md) | ||
|
||
This endpoint provides a custom query API to select items from existing collections, | ||
according to given Boolean and metadata filters. | ||
|
||
NOTE: This is currently a beta feature. | ||
|
||
|
||
**GET /api/contentreport/filtereditems** | ||
|
||
The report parameters are described [below](#report-parameterization). | ||
|
||
Additionally, a `pageNumber` parameter is available to retrieve results starting at a given page | ||
(according to `pageLimit`, the maximum number of items per page). Page numbering starts at 0. | ||
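
The zero-based page arithmetic can be sketched as follows. The helpers are hypothetical and assume the client already knows the total number of matching items; they only illustrate how `pageNumber` and `pageLimit` relate.

```python
import math


def page_count(item_count, page_limit):
    """Number of pages needed to list item_count items, page_limit per page."""
    if page_limit <= 0:
        raise ValueError("page_limit must be positive")
    return math.ceil(item_count / page_limit)


def page_range(page_number, page_limit, item_count):
    """Zero-based item positions covered by a page (page numbering starts at 0)."""
    first = page_number * page_limit
    return range(min(first, item_count), min(first + page_limit, item_count))


total = 40  # illustrative total, not taken from a live server
for page in range(page_count(total, 15)):
    positions = page_range(page, 15, total)
    print(f"pageNumber={page}: items {positions.start}..{positions.stop - 1}")
```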
|
||
All parameters except `pageNumber` and `pageLimit` are repeatable. Multiple values can be expressed either | ||
by repeating the corresponding parameter, e.g.: | ||
``` | ||
?filters=is_discoverable&filters=has_multiple_originals&filters=has_pdf_original | ||
``` | ||
|
||
or by using a comma-separated value, e.g.: | ||
|
||
``` | ||
?filters=is_discoverable,has_multiple_originals,has_pdf_original | ||
``` | ||
|
||
The exception is the `queryPredicates` parameter, which supports only parameter repetition for multiple values, | ||
to avoid ambiguity when a predicate value contains commas. | ||
|
||
Please see [below](#report-parameterization) for parameterization details. | ||
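
The rules above can be combined in a small query builder. This is a sketch under stated assumptions: the function name is hypothetical, and the predicate strings are placeholders, since the predicate syntax is not described in this document. Only the parameter names come from the text.

```python
from urllib.parse import parse_qs, urlencode

# Parameters whose multiple values may be sent as one comma-separated value.
COMMA_JOINABLE = ("collections", "filters", "additionalFields")


def filtered_items_query(*, page_number, page_limit, query_predicates=(), **multi_valued):
    """Build a query string for /api/contentreport/filtereditems."""
    unknown = set(multi_valued) - set(COMMA_JOINABLE)
    if unknown:
        raise TypeError(f"unexpected parameters: {sorted(unknown)}")
    pairs = []
    for name in COMMA_JOINABLE:
        values = multi_valued.get(name) or ()
        if values:
            pairs.append((name, ",".join(values)))
    # queryPredicates is always repeated, never comma-joined, because a
    # predicate value may itself contain commas.
    for predicate in query_predicates:
        pairs.append(("queryPredicates", predicate))
    pairs.append(("pageNumber", str(page_number)))
    pairs.append(("pageLimit", str(page_limit)))
    return urlencode(pairs, safe=",")


query = filtered_items_query(
    page_number=0,
    page_limit=10,
    filters=["is_discoverable", "has_pdf_original"],
    # Placeholder predicates (hypothetical syntax); the first contains a comma.
    query_predicates=["placeholder predicate, with comma", "second placeholder"],
)
print(f"/api/contentreport/filtereditems?{query}")
print(parse_qs(query)["queryPredicates"])
```

Decoding the query with `parse_qs` shows that each predicate survives as its own value, comma included.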
|
||
## Report contents | ||
|
||
An example JSON response document to `/api/contentreport/filtereditems` (metadata removed for brevity): | ||
```json | ||
{ | ||
"id": "filtereditems", | ||
"items": [ | ||
{ | ||
"id": "07e388ff-f22b-4d4f-8275-acab5c3edacc", | ||
"uuid": "07e388ff-f22b-4d4f-8275-acab5c3edacc", | ||
"name": "Enhancing the lubricity of an environmentally friendly Swedish diesel fuel MK1", | ||
"handle": "20.500.11794/42", | ||
"metadata": { | ||
"dc.contributor.author": [ | ||
{ | ||
"value": "Smith, John", | ||
"language": null, | ||
"authority": "6eee383a-f126-4705-9ffb-b4aa4832070e", | ||
"confidence": 600, | ||
"place": 0 | ||
} | ||
], | ||
"dc.publisher": [ | ||
{ | ||
"value": "Elsevier", | ||
"language": "fr_CA", | ||
"authority": null, | ||
"confidence": -1, | ||
"place": 0 | ||
} | ||
] | ||
}, | ||
"inArchive": true, | ||
"discoverable": true, | ||
"withdrawn": false, | ||
"lastModified": "2015-11-23T17:30:21.463+00:00", | ||
"entityType": "Publication", | ||
"owningCollection": { | ||
"id": "d98a828c-45c2-43d9-9861-6b9800bf14f5", | ||
"uuid": "d98a828c-45c2-43d9-9861-6b9800bf14f5", | ||
"name": "Articles publiés dans des revues avec comité de lecture", | ||
"handle": "100/1", | ||
"metadata": { | ||
"dc.identifier.uri": [ | ||
{ | ||
"value": "http://localhost:4000/handle/100/1", | ||
"language": null, | ||
"authority": null, | ||
"confidence": -1, | ||
"place": 0 | ||
} | ||
], | ||
"dspace.entity.type": [ | ||
{ | ||
"value": "Publication", | ||
"language": null, | ||
"authority": null, | ||
"confidence": -1, | ||
"place": 0 | ||
} | ||
] | ||
}, | ||
"type": "collection" | ||
}, | ||
"type": "item" | ||
}, | ||
{ | ||
... | ||
} | ||
], | ||
"itemCount": 40, | ||
"type": "filtereditemsreport", | ||
"_links": { | ||
"self": { | ||
"href": "http://localhost:8080/dspace-server/api/contentreport/filtereditems" | ||
} | ||
} | ||
} | ||
``` | ||
|
||
## Report parameterization | ||
|
||
The parameters are specified as follows: | ||
|
||
* `collections`: UUIDs of the collections in which to search for items. If none are provided, the whole repository is searched. | ||
* `presetQuery`: This parameter is not used on the REST API side. It identifies a predefined set of query predicates | ||
defined in the Angular layer. | ||
* `queryPredicates`: Predicates used to filter matching items. They can be predefined (see `presetQuery` above) | ||
or defined specifically by the user. As mentioned above, this is the only parameter whose multiple values | ||
cannot be expressed as a comma-separated list. | ||
* `pageLimit`: Maximum number of items per page. | ||
* `filters`: Supplementary filters. These are the same as those available in the Filtered Collections report. | ||
Please see [/api/contentreport/filteredcollections](contentreport-filteredcollections.md#available-filters) for details. | ||
* `additionalFields`: Fields to add to the basic report for each item included in the report. | ||
|
||
The _basic report_ mentioned above includes, for each item: | ||
|
||
* Sequential number (order of appearance in the report) | ||
* UUID | ||
* Parent collection | ||
* Handle | ||
* Title | ||
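
A client could derive these basic report fields from the `items` array of the example response. The sketch below assumes that the title is the item's `name` and the parent collection is `owningCollection`, which matches the example but is not stated explicitly; the sequential number is computed client-side. The function name is hypothetical.

```python
import json

# Trimmed copy of the example response: metadata and other fields omitted.
SAMPLE = """
{
  "items": [
    {
      "uuid": "07e388ff-f22b-4d4f-8275-acab5c3edacc",
      "name": "Enhancing the lubricity of an environmentally friendly Swedish diesel fuel MK1",
      "handle": "20.500.11794/42",
      "owningCollection": {
        "name": "Articles publiés dans des revues avec comité de lecture",
        "handle": "100/1"
      }
    }
  ],
  "itemCount": 40
}
"""


def basic_rows(report, first_number=1):
    """Extract the basic report fields for each item, in order of appearance."""
    rows = []
    for number, item in enumerate(report["items"], start=first_number):
        collection = item.get("owningCollection") or {}
        rows.append({
            "number": number,                       # sequential number (client-side)
            "uuid": item["uuid"],
            "collection": collection.get("name"),   # assumed parent collection
            "handle": item["handle"],
            "title": item["name"],                  # assumed title field
        })
    return rows


for row in basic_rows(json.loads(SAMPLE)):
    print(row)
```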
|
||
Possible response statuses: | ||
|
||
* 200 OK - The report data was found and returned. | ||
* 403 Forbidden - The user session is not authorized to access the report. |