GNIP 100: Assets #12124
Comments
Refactoring data upload procedure
The initial When a user uploads some data, the original data will be saved as an Asset. Then, some heuristic will find the type of the uploaded data:
When the Resource is created, the Asset pointing to the uploaded data is linked to the ResourceBase via a Link (the new nullable
We may need to split the logic
|
Authorization
An improvement comes for free with the Asset refactoring. At the moment downloadable files are public: if the URL of a data resource leaks from someone who has access to the Resource, anybody can use that URL to download the data file. By checking authorization on the URLs that serve the Assets' data, we'll add protection to the published data, allowing the download only to users who have access to the Resource. |
I generally like the idea of assets and forming a resource out of multiple assets. This was also discussed beforehand in a research data infrastructure group and we thought about using the Research Object Crate (RO-Crate) concept or the Annotated Research Object (ARC) concept as our 'assets'. It was just a short discussion and we have yet to do anything in terms of how to incorporate it into the GeoNode architecture. But these parts of the mentioned Motivation are of particular interest to us as research institutes:
Here is an excerpt of the brief discussion the research infrastructure group had regarding RO-Crates. It is a bit outdated, but I think you get the gist of it:

Looking at other data portals like CKAN or OpenAgrar (based on the MyCoRe framework), you can describe a dataset which consists of multiple files/resources. Here are two examples:
https://demo.ckan.org/dataset/sample-dataset-1
https://www.openagrar.de/receive/openagrar_mods_00054877?lang=en

The latter example on a GeoNode instance:
https://atlas.thuenen.de/layers/geonode_data_ingest:geonode:bze_lw_standorte_verschleiert
where the additional files are linked as documents:

There is a GeoNode developer workshop creating a so-called GeoCollection object to link multiple GeoNode ResourceBase objects together:
https://docs.geonode.org/en/master/devel/workshops/index.html#create-your-own-django-app

My idea is to build on top of this concept and try to implement RO-Crate as a collection object:
https://www.researchobject.org/ro-crate/

RO-Crates use a metadata JSON to describe the Crate:
https://www.researchobject.org/ro-crate/1.1/root-data-entity.html
Most (all?) of the listed attributes of those datasets can be read by the GeoNode API for the bundled resources. Therefore, you only need to describe the RO-Crate bundle itself.

I do not propose using RO-Crates as the base implementation for assets! I just wanted to make clear that the underlying motivation is interesting for a part of the GeoNode community. |
@etj in the implementation ERD diagram, the link between In a settings file in which the storing of original data is disabled, there will be no |
@gannebamm, about cardinalities:
Datasets or other ResourceBase can have no associated Assets at all, as in the case of Datasets only related to GeoServer layers. |
@etj I like the idea of making things more flexible here. I took some time to think about the GNIP and want to make some comments, also by sprinkling in questions and personal opinions. However, I cannot foresee what components and workflows (e.g. geonode-importer) have to be touched in the end.
Technical questions
Does this mean that each asset has its own StorageManager/-Handler where actual download is being delegated to? Does this complement or even rescind the changes you did recently to the DownloadHandler?
To me, this is definitely an asset on its own which also could be applied to multiple resources. However, what about differentiating
I could not find a
So From the end user perspective
Opportunities
Besides those opportunities you mentioned already, I see the following:
|
@ridoo thanks for your comments. You touched on several points that we also included in our discussion. Many of them will probably come in the future, since the concept of Assets could bring Copernican changes to GeoNode in many ways... Let me explain the current scope of this proposal first.
We're not going to cover the management of Assets from the GeoNode UI in this initial implementation. For the moment we only want to prepare the models to support present and future functionalities. We assume that the resource has been configured with assets in some way (DB operations, Django Admin, whatever).
@etj can you please confirm, correct, extend the points above? |
Let's say that this is the first step that could lead to more important changes in the future. We don't have a roadmap, actually, but making the relation between Catalog resource and Data source could:
Regarding the details on the points discussed in your comment, we need to wait for @etj, who is working hard these days to connect the dots and prepare more information to share :) |
@ridoo after reviewing with @etj the status of this PR (which is ready for review), we confirm that it neither changes nor adds features to GeoNode for the moment. In terms of public APIs and functionality it behaves exactly the same as before, with The next steps will be the implementation of the "primary" asset concept and the multiplicity of assets that can be assigned/downloaded to/from a resource. It will come with a new GNIP. |
* [Fixes #12124] GNIP 100: Assets * Fix migrations * Add _create_asset_dir method * [Fixes #12124] Fix data retriever to handle not default destination folders * Support for assets in upload * Point geonode-importer to branch assets_data_retriever --------- Co-authored-by: Mattia <[email protected]>
* [Fixes #12124] GNIP 100: Assets --------- Co-authored-by: etj <[email protected]>
…2411) * [Fixes #12124] GNIP 100: Assets (#12335) * [Fixes #12124] GNIP 100: Assets --------- Co-authored-by: etj <[email protected]> * [Fixes #12226] Directory assets (#12337) [Fixes #12226] Directory assets --------- Co-authored-by: etj <[email protected]> * [Fixes #12341] Asset download handler and link generator (#12343) * [Fixes #12341] Download handler fix * [Fixes #12341] Assets: link generation (#12342) --------- Co-authored-by: Emanuele Tajariol <[email protected]> * [Fixes #12326] Assets: implement migration for old uploaded files * [Fixes #12326] Assets: implement migration for old uploaded files * [Fixes #12326] Assets: implement migration for old uploaded files * [Fixes #12326] rollback requirements --------- Co-authored-by: etj <[email protected]> Co-authored-by: Emanuele Tajariol <[email protected]>
…2411) (#12611) * [Fixes #12124] GNIP 100: Assets (#12335) * [Fixes #12124] GNIP 100: Assets --------- * [Fixes #12226] Directory assets (#12337) [Fixes #12226] Directory assets --------- * [Fixes #12341] Asset download handler and link generator (#12343) * [Fixes #12341] Download handler fix * [Fixes #12341] Assets: link generation (#12342) --------- * [Fixes #12326] Assets: implement migration for old uploaded files * [Fixes #12326] Assets: implement migration for old uploaded files * [Fixes #12326] Assets: implement migration for old uploaded files * [Fixes #12326] rollback requirements --------- Co-authored-by: mattiagiupponi <[email protected]> Co-authored-by: etj <[email protected]> Co-authored-by: Emanuele Tajariol <[email protected]>
GNIP 100 - Assets
Overview
We need a way to identify files (local, remote, in the cloud...) per se.
There's no way at the moment to identify data files by themselves: they are only referenced by the field `ResourceBase.files`.
Also, the StorageManager is pluggable, but only allows for a single storage backend at a time.
By having different subclasses of `Asset` (e.g. `LocalAsset`, `S3Asset`, ...) we may have a GeoNode instance handling data files on different data store backends.
Proposed By
Assigned to Release
This proposal is for GeoNode 4.3 (?)
State
Motivation
`ResourceBase` with multiple data files (think for instance about a Document having multiple PDF files for different languages).
Proposal
We introduce the concept of Asset as generic data, that may be linked to a ResourceBase.
A LocalAsset represents data stored in the filesystem (either a single file or a directory tree).
The `Asset` class will replace and augment the information stored at the moment in the `ResourceBase.files` field.
An Asset is associated with a Resource through a Link, which also tells the URL through which the Asset will be available to the GeoNode users.
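The relationships described in this section can be sketched with plain Python dataclasses. This is only an illustration, not GeoNode code: the real models are Django models, and every field name (`title`, `location`, `url`, `asset`), the `assets` property and the example URLs are assumptions made for this sketch.

```python
from dataclasses import dataclass, field
from typing import List, Optional


@dataclass
class Asset:
    """Generic piece of data, identifiable on its own (hypothetical fields)."""
    title: str


@dataclass
class LocalAsset(Asset):
    """Asset stored on the filesystem: a single file or a directory tree."""
    location: List[str] = field(default_factory=list)


@dataclass
class Link:
    """Carries the URL exposed to users; may optionally point to an Asset."""
    url: str
    asset: Optional[Asset] = None  # assumed nullable: not every Link has an Asset


@dataclass
class ResourceBase:
    title: str
    links: List[Link] = field(default_factory=list)

    @property
    def assets(self) -> List[Asset]:
        # Only the links that actually reference an Asset contribute here.
        return [link.asset for link in self.links if link.asset is not None]


# One resource with two data files, e.g. the same PDF in two languages.
pdf_en = LocalAsset(title="Report (EN)", location=["/assets/1/report_en.pdf"])
pdf_it = LocalAsset(title="Report (IT)", location=["/assets/2/report_it.pdf"])
doc = ResourceBase(
    title="Report",
    links=[
        Link(url="/assets/1/download", asset=pdf_en),
        Link(url="/assets/2/download", asset=pdf_it),
        Link(url="/catalogue/report"),  # a link with no Asset attached
    ],
)
print([a.title for a in doc.assets])
```

Other backends (such as the `S3Asset` mentioned in the overview) would be further subclasses of `Asset` with their own location fields.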
Other usages for assets
Since the Asset object is quite simple, we could use it for other purposes as well; for instance, at the moment we use "unadvertised" `ResourceBase` instances for providing simple data to GeoStories (images, PDFs, ...). Instead of using such a heavy object, we could just use LocalAssets for this purpose.
Also, more Assets may be associated with an existing ResourceBase; this behavior replicates what GeoNetwork is already doing, that is, having multiple data resources pointed to by a single metadata record.
Permissions
In the future there could be different permissions for a Resource and its linked Assets. For the sake of simplicity, as a first step we may grant the Asset the very same permissions as the linked ResourceBases.
In the case we want to associate an Asset to more than one Resource, the Asset will be available if the user has download privileges on at least one of the associated Resources.
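The rule above (an Asset is available when the user has download privileges on at least one associated Resource) can be written as a small check. This is a hedged sketch and not the GeoNode permission API: the function name, the in-memory dictionaries and the integer ids are all hypothetical stand-ins for the database lookups a real implementation would perform.

```python
from typing import Dict, Set


def can_download_asset(
    user: str,
    asset_id: int,
    asset_resources: Dict[int, Set[int]],
    download_perms: Dict[str, Set[int]],
) -> bool:
    """Return True iff the user may download at least one resource linked to the asset."""
    linked = asset_resources.get(asset_id, set())   # resources the asset is attached to
    allowed = download_perms.get(user, set())       # resources the user may download
    return bool(linked & allowed)


# Hypothetical data: asset 10 is shared by resources 1 and 2, asset 12 is unassociated.
asset_resources = {10: {1, 2}, 11: {3}, 12: set()}
download_perms = {"alice": {2}, "bob": {4}}

print(can_download_asset("alice", 10, asset_resources, download_perms))  # True
print(can_download_asset("bob", 10, asset_resources, download_perms))    # False
print(can_download_asset("alice", 12, asset_resources, download_perms))  # False
```

Note that under this rule an Asset not associated with any Resource is not downloadable by anyone, which is one possible reading of the proposal.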
Implementation
Model:
`Asset` class, and its subclass `LocalAsset`
`ResourceBase` and `Link` classes
Logic:
`files` with `Asset` instances
DB migration:
API:
Authorization
A user has access to an `Asset`'s data iff such Asset is associated with at least one ResourceBase for which the user has download permissions.
Backwards Compatibility
`files` array can be preserved in output
Future evolution
Decoupled uploads
A user may upload an Asset without having to associate it to a Resource.
Unassociated Assets may be used to automatically create ResourceBases and attach the asset to them.
Deprecate Documents
Once Assets gain their characterization, the Document object will not have much meaning, also considering that users upload as a Document any object that is not published as a Layer.
This means that we will be able to remove the Document class, and convert its instances into ResourceBases with an Asset handling the former document's data.
Cleanup uploaded files
Some old installations have the uploaded data in `/data`.
The recent importer stores the uploaded data in `.../STATIC_ROOT/uploaded`, and GeoServer publishes the GeoTIFF from that directory.
The final migration to Assets will store the files in `.../STATIC_ROOT/assets`, and GeoServer shall publish the files from there.
In order to clean up such obsolete setups, a migration script could be written that:
`uploaded/` to `assets/`
`LocalAsset.location` field
Such a migration script should only be run by hand: since we do not know whether such files are also used for other purposes outside GeoNode, the admin can choose to run it or not.
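A rough sketch of what such a hand-run script might do is shown below. It assumes the two steps are "move the files from `uploaded/` to `assets/`" and "rewrite the paths stored in `LocalAsset.location`"; the `relocate` helper, its `dry_run` flag and the temporary directory layout are hypothetical, and no Django or GeoServer call is involved.

```python
import shutil
import tempfile
from pathlib import Path
from typing import List


def relocate(paths: List[str], old_root: Path, new_root: Path, dry_run: bool = True) -> List[str]:
    """Map paths under old_root to the same relative path under new_root.

    With dry_run=False the files are also moved on disk.
    Paths outside old_root are returned untouched.
    """
    updated = []
    for raw in paths:
        src = Path(raw)
        try:
            rel = src.relative_to(old_root)
        except ValueError:
            updated.append(raw)  # not under uploaded/: leave it alone
            continue
        dst = new_root / rel
        if not dry_run:
            dst.parent.mkdir(parents=True, exist_ok=True)
            shutil.move(str(src), str(dst))
        updated.append(str(dst))
    return updated


# Demo on a throwaway directory standing in for STATIC_ROOT.
with tempfile.TemporaryDirectory() as tmp:
    uploaded = Path(tmp) / "uploaded"
    assets = Path(tmp) / "assets"
    (uploaded / "abc").mkdir(parents=True)
    tif = uploaded / "abc" / "dem.tif"
    tif.write_bytes(b"fake")

    # Stand-in for the value of a LocalAsset.location field.
    location = [str(tif), "/data/elsewhere.tif"]
    new_location = relocate(location, uploaded, assets, dry_run=False)
    moved_ok = (assets / "abc" / "dem.tif").exists() and not tif.exists()

print(new_location, moved_ok)
```

The sketch does not repoint GeoServer at the new directory; a real script would need to handle that as well, which is one more reason to run it by hand.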
Feedback
Update this section with relevant feedback, if any.
Voting
Project Steering Committee:
Links
Remove unused links below.