Use Aardvark as the default format for indexing records #163

Merged
merged 2 commits into from
Jul 24, 2024
31 changes: 26 additions & 5 deletions README.md
@@ -32,18 +32,32 @@ $ gem install geo_combine
## Usage

### Converting metadata

#### Converting metadata into GeoBlacklight JSON
GeoCombine provides several classes representing different metadata standards that implement the `#to_geoblacklight` method for generating records in the [GeoBlacklight JSON format](https://opengeometadata.org/reference/):
```ruby
GeoCombine::Iso19139 # ISO 19139 XML
GeoCombine::OGP # OpenGeoPortal JSON
GeoCombine::Fgdc # FGDC XML
GeoCombine::EsriOpenData # Esri Open Data Portal JSON
GeoCombine::CkanMetadata # CKAN JSON
```
An example for converting an ISO 19139 XML record:
```ruby
# Create a new ISO19139 object
> iso_metadata = GeoCombine::Iso19139.new('./tmp/opengeometadata/edu.stanford.purl/bb/338/jh/0716/iso19139.xml')

-# Convert ISO to GeoBlacklight
+# Convert to GeoBlacklight's metadata format
> iso_metadata.to_geoblacklight

-# Convert that to JSON
+# Output it as JSON instead of a Ruby hash
> iso_metadata.to_geoblacklight.to_json
```
Some formats also support conversion into HTML for display in a web browser:
```ruby
# Create a new ISO19139 object
> iso_metadata = GeoCombine::Iso19139.new('./tmp/opengeometadata/edu.stanford.purl/bb/338/jh/0716/iso19139.xml')

-# Convert ISO (or FGDC) to HTML
+# Convert ISO to HTML
> iso_metadata.to_html
```

@@ -73,7 +87,7 @@ id_map = {
GeoCombine::Migrators::V1AardvarkMigrator.new(v1_hash: record, collection_id_map: id_map).run
```

-### OpenGeoMetadata
+### Downloading metadata from OpenGeoMetadata

#### Logging

@@ -144,6 +158,13 @@ You can also set a the Solr instance URL using `SOLR_URL`:
$ SOLR_URL=http://www.example.com:1234/solr/collection bundle exec rake geocombine:index
```

By default, GeoCombine will index only records using the Aardvark metadata format. If you instead want to index records using an older format (e.g. because your GeoBlacklight instance is version 3 or older), you can set the `SCHEMA_VERSION` environment variable:

```sh
# Only index schema version 1.0 records
$ SCHEMA_VERSION=1.0 bundle exec rake geocombine:index
```
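
The lookup behind this setting can be sketched in plain Ruby. The class below is a hypothetical stand-in (`SchemaVersionSketch` is not part of GeoCombine); it only assumes the same `ENV.fetch('SCHEMA_VERSION', 'Aardvark')` default that this change gives the harvester's constructor, and shows that an explicit keyword argument would take precedence over the environment:

```ruby
# Hypothetical stand-in for illustration only; not GeoCombine's Harvester class.
class SchemaVersionSketch
  attr_reader :schema_version

  # The default is evaluated when the object is built: it reads SCHEMA_VERSION
  # from the environment and falls back to 'Aardvark' if the variable is unset.
  def initialize(schema_version: ENV.fetch('SCHEMA_VERSION', 'Aardvark'))
    @schema_version = schema_version
  end
end

# With the variable unset, the Aardvark default applies.
ENV.delete('SCHEMA_VERSION')
unset_default = SchemaVersionSketch.new.schema_version

# With the variable set, its value is used instead.
ENV['SCHEMA_VERSION'] = '1.0'
from_env = SchemaVersionSketch.new.schema_version

# An explicit keyword argument overrides the environment.
explicit = SchemaVersionSketch.new(schema_version: 'Aardvark').schema_version

puts unset_default, from_env, explicit
```

Because the default is resolved at construction time, the variable has to be set before the rake task (or any code that builds the object) runs.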

### Harvesting and indexing documents from GeoBlacklight sites

GeoCombine provides a Harvester class and rake task to harvest and index content from GeoBlacklight sites (or any site that follows the Blacklight API format). Given that the configurations can change from consumer to consumer and site to site, the class provides a relatively simple configuration API. This can be configured in an initializer, a wrapping rake task, or any other Ruby context where the rake task or class would be invoked.
2 changes: 1 addition & 1 deletion lib/geo_combine/harvester.rb
@@ -31,7 +31,7 @@ def self.ogm_api_uri

 def initialize(
   ogm_path: ENV.fetch('OGM_PATH', 'tmp/opengeometadata'),
-  schema_version: ENV.fetch('SCHEMA_VERSION', '1.0'),
+  schema_version: ENV.fetch('SCHEMA_VERSION', 'Aardvark'),
   logger: GeoCombine::Logger.logger
 )
   @ogm_path = ogm_path
2 changes: 1 addition & 1 deletion spec/lib/geo_combine/harvester_spec.rb
@@ -5,7 +5,7 @@
 require 'spec_helper'

 RSpec.describe GeoCombine::Harvester do
-  subject(:harvester) { described_class.new(ogm_path: 'spec/fixtures/indexing', logger:) }
+  subject(:harvester) { described_class.new(ogm_path: 'spec/fixtures/indexing', schema_version: '1.0') }

   let(:logger) { instance_double(Logger, warn: nil, info: nil, error: nil, debug: nil) }
   let(:repo_name) { 'my-institution' }