Releases: metafacture/metafacture-core

Metafacture Core 3.1.0

04 May 14:58

Changed Behaviour

  • Fix #216 changes the behaviour of Metamorph: Collectors are now reset on flushWith even if the condition was not met.
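
The effect of this change can be illustrated with a deliberately simplified model. The sketch below is not Metamorph's code: the class ConditionalCollectorSketch and all of its methods are hypothetical and only mimic the described behaviour, namely that a flush emits the collected values only if the condition was met but resets the collector in either case.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical, simplified model of a collector with an if-condition.
// It is NOT Metamorph's implementation; all names are illustrative.
class ConditionalCollectorSketch {

    private final List<String> values = new ArrayList<>();
    private final List<String> emitted = new ArrayList<>();
    private boolean conditionMet;

    void add(String value) {
        values.add(value);
    }

    void satisfyCondition() {
        conditionMet = true;
    }

    // Behaviour as described for 3.1.0: emit only if the condition was met,
    // but reset the collected state whether or not it was met.
    void flush() {
        if (conditionMet) {
            emitted.add(String.join(" ", values));
        }
        values.clear();
        conditionMet = false;
    }

    List<String> emitted() {
        return emitted;
    }

    public static void main(String[] args) {
        ConditionalCollectorSketch collector = new ConditionalCollectorSketch();
        collector.add("stale");
        collector.flush();            // condition not met: nothing emitted, state reset
        collector.add("fresh");
        collector.satisfyCondition();
        collector.flush();            // condition met: emits only the new value
        System.out.println(collector.emitted());
    }
}
```

In this model a value collected before an unsuccessful flush no longer leaks into a later output, which is the kind of difference existing scripts may notice.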

Maven Coordinates

Metafacture core is available on Maven Central:

<dependency>
  <groupId>org.culturegraph</groupId>
  <artifactId>metafacture-core</artifactId>
  <version>3.1.0</version>
</dependency>

Changes

The release contains mostly bug fixes but also a couple of new features.

New Features
  • #229: Support for marcxml in test cases
  • #119: a tar reader (thanks to Pascal Christoph)
  • #195: a pica-xml handler and reader (thanks to Pascal Christoph & Fabian Steeg)
  • #222: a Unicode normalizer for handling the different Unicode normalization forms in streams
  • @FluxCommand annotation (see commit eac0926 for details)
  • #197: Support passing a URL to ResourceUtil.getStream(String) (thanks to Fabian Steeg)
  • #196: Support optional FilenameFilter in DirReader (thanks to Fabian Steeg)
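
As background for the Unicode normalizer (#222): the same visible character can be encoded in more than one way, and normalization maps these encodings onto a single form. The sketch below uses only the JDK's java.text.Normalizer; the class UnicodeNormalizationSketch is a hypothetical name and does not show the Metafacture module or its options.

```java
import java.text.Normalizer;

// Illustrative only: demonstrates Unicode normalization forms with the JDK,
// not the Metafacture normalizer module itself.
class UnicodeNormalizationSketch {

    // Normalizes a value to the requested form using the JDK's Normalizer.
    static String normalize(String value, Normalizer.Form form) {
        return Normalizer.normalize(value, form);
    }

    public static void main(String[] args) {
        String composed = "\u00e9";     // one precomposed code point
        String decomposed = "e\u0301";  // "e" followed by a combining acute accent

        // Both strings render identically but differ as Java strings.
        System.out.println(composed.equals(decomposed));

        // Once both are normalized to NFC, they can be compared reliably.
        String a = normalize(composed, Normalizer.Form.NFC);
        String b = normalize(decomposed, Normalizer.Form.NFC);
        System.out.println(a.equals(b));
    }
}
```
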
Bug Fixes
  • #219, #227: Improved X-Include handling in Metamorph
  • #223: The only attribute of the occurrence filter now accepts lessThan and moreThan in addition to lessThen and moreThen
  • #216: Fixed handling of reset in if-conditions in collectors (NOTE: This may change the behaviour of Metamorph scripts)
  • #217: Metamorph lost variable maps passed as constructor arguments.

Metafacture Runner Distribution 3.0.0

26 Jul 17:01

This release updates the metafacture-core dependency to version 3.0.0. Please see the release notes for metafacture-core for a list of changes.

Metafacture Core 3.0.0

12 Dec 12:55

This release is not compatible with the 2.x.x line of metafacture-core.

Changed Behaviour and Interfaces

  • Formeta: Only accept allowed escape sequences (commit 162c1f1)
  • Internal structure of the Metamorph classes changed:
    • Refactored the code for building pipelines in Metamorph (commit 3eca5ba)
    • Refactored code for loading morph scripts (commit 498730b)
    • Simplified from interface NamedValueSource (commit efb39d5)

Maven Coordinates

Metafacture core is available on Maven Central:

<dependency>
  <groupId>org.culturegraph</groupId>
  <artifactId>metafacture-core</artifactId>
  <version>3.0.0</version>
</dependency>

All Changes

Bug Fixes
  • Fixed #177: Condition was not reset when collector was reset (commit a20767d)
  • Fixed #178: Reset conditions if sameEntity is true (commit ed9824c)
  • Fix #179: Output message of wrapped exceptions (commit a55475c)
  • Formeta: Escape leading and trailing whitespace (commit 432aca0)
  • Fix #192: AbstractTripleSort has memory leak (commit dda2343)
  • Improve error message when decoding CSV (commit 3d6fc92)
  • Fix #204: High memory usage of Metamorph tests (commit 446d663)
New Features and Improvements
Metamorph
  • Allow macros in entity statements
    • Allow macros in entity statements (commit 6d9d79b)
    • Fix substitute variables in macro calls (commit ac7a3f7)
  • Added framework for introspecting the data processing within a Metamorph script
    See commit 17710f3 for a brief introduction on how to use this feature.
    • Refactored the code for building pipelines in Metamorph (commit 3eca5ba)
    • Added source file location annotations to Metamorph DOM (commit c16833e)
    • Refactored code for loading morph scripts (commit 498730b)
    • Added source location infos to the morph pipeline (commit 7e766ad)
    • Simplified from interface NamedValueSource (commit efb39d5)
    • Add a system for interception to Metamorph (commit 17710f3)
    • Fix #205: Exception during Metamorph build (commit 3a509d9)
  • Add reverse concatenation to concat in Metamorph (commit 048c6ba)
  • Add new attributes era and removeLeadingZeros to DateFormat (commit cac5b94)
  • Add sameEntity attribute to concat metamorph statement (commit a5d5023)
Stream Modules
  • Logger and Exception catcher:
    • Fix #180: output stack trace (commit 5ab3705)
    • Adds a setter for logPrefix in exception catcher (commit 4c9d814)
  • Added new module for grouping Pica multiscript fields:
    • Added module for grouping Pica multiscript fields (commit 12028c6)
    • Simplified output of PicaMultiscriptRemodeler (commit 4ce0b4c)
  • Added an encoder for Marc21 records
    The implementation covers the full ISO 2709:2008 standard. Only the Flux module is specific
    to Marc21, so encoders for other ISO 2709:2008-based formats can easily be added.
    • Add builder for ISO 2709:2008 records (commit 1c13d5a)
    • Add encoder for MARC21 records (commit e2c15af)
  • Add modules for reading and writing event streams from or into POJOs:
Miscellaneous
  • Support for formeta as input and output format of testcases
    • Adds reader for multiline formeta records
      • Adds reader for multiline formeta records (commit 9ad842e)
      • Fixed typo in one of the metastream readers (commit 53c36b9)
      • Added multiline support to FormetaReader (commit dfac467)
    • Allow other data formats for result type than cg-xml:
      • Allow different result types in test cases (commit 51f5c95)
      • Bug fix: Allow different result types (commit 728e69b)
  • Added utility class for method argument checking
    • Add utility class for method argument checking (commit beac8b5)
    • Add generic check to Require (commit fc5cd8a)
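
A utility class for argument checking typically looks like the sketch below. Only the name Require comes from these notes. The class RequireSketch and the method names notNull, notNegative and that are assumptions made for illustration and may not match the signatures in the commits above.

```java
// Hypothetical sketch of an argument-checking helper. Method names and
// exception types are assumptions, not the API added in the commits above.
final class RequireSketch {

    private RequireSketch() {
        // Static helper methods only; no instances needed.
    }

    // Returns the value unchanged so the check can be used inline.
    static <T> T notNull(T value, String message) {
        if (value == null) {
            throw new IllegalArgumentException(message);
        }
        return value;
    }

    static int notNegative(int value, String message) {
        if (value < 0) {
            throw new IllegalArgumentException(message);
        }
        return value;
    }

    // Generic check for conditions not covered by a dedicated method.
    static void that(boolean condition, String message) {
        if (!condition) {
            throw new IllegalArgumentException(message);
        }
    }

    public static void main(String[] args) {
        String name = notNull("record", "name must not be null");
        int index = notNegative(3, "index must not be negative");
        that(index < 10, "index must be less than 10");
        System.out.println(name + " " + index);
    }
}
```

Returning the checked value lets a constructor validate and assign in one statement, for example this.name = RequireSketch.notNull(name, "name must not be null").
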
Code clean-up/improvements
  • Minor change: removed duplicated code in flush method (commit b06b9a3)
  • Fixed getStream method to not create the File object twice (commit bbc0d63)
  • Fixed coding style (commit 8f3caeb)
  • Added new test case for occurrence-function (commit a5df610)
  • Small code quality improvements (commit 914b58d)
  • Minor code improvements in PicaDecoder (commit e2152f4, commit 72a75ba)
  • Improved code formatting and added documentation (commit 7931ba0)
  • Removed old merge conflict in commit (commit fce4dbd)
  • Improve code style (commit 56a7698)
  • Cleanup: Remove dead code from JndiSqlMap (commit 34cffed)
  • Relax return-count check (commit 7c1612d)
  • Add comments and remove trailing whitespace (commit 97a0d54)
  • Improve StringUtils.copyToBuffer (commit 16c3bcb)
  • Update junit related PMD rules (commit 016c14d)
  • Exclude check for boolean inversion from PMD (commit 973fa51)

Metafacture Core Distribution 1.2.2

14 Mar 13:30

This is a bug fix release for the 1.2.x branch of metafacture-core.

Bug fixes

  • Fixed #178: Reset conditions if sameEntity is true

If a collector has sameEntity="true" set, it is reset whenever
the current entity changes. As described in issue #178 the current
implementation fails to reset the condition during these reset
operations. This commit fixes this.

Additionally, the test cases for the if-condition have been moved
into a separate xml-file as quite a number of tests have
accumulated so that it makes sense to keep them in a separate file.

Metafacture Core Distribution 1.2.1

14 Mar 07:41

This is a bug fix release for the 1.2.x branch of metafacture-core.

Bug fixes

  • Fixed #177: Condition was not reset when collector was reset
    If a collector has a reset="true" attribute one would expect that this
    also resets the state of an if-condition in the collector. However, this
    was not the case. This commit fixes this.

    Please note that this may change the behaviour of existing scripts if
    they relied on the condition not being reset with the reset of the
    collector.

Metafacture Runner Distribution 2.0.0

26 Jul 17:03

This is the initial release of metafacture-runner. The 2.0.0 version number was chosen to keep the versioning scheme in sync with the metafacture-core package.

Please see the release notes of the metafacture-core package for changes of the code which is now in metafacture-runner but was part of metafacture-core before.

Metafacture Core 2.0.0

11 Mar 20:25

This release is not compatible with the 1.x.x line of metafacture-core.

Incompatible changes

  • Removed flux executable and runtime dependencies slf4j-log4j and mysql jdbc
    driver from metafacture-core. The flux command line application is now
    maintained in the culturegraph/metafacture-runner package (see issues #131,
    #130 and #168 and commits 41329a7 and ecdafbc).
  • Removed eclipse project files from repository (see commit 27c2390)
  • Reimplemented PicaDecoder: The records are now properly parsed. The new
    implementation does not do special processing of subfield "S" like the old
    class did. Additionally, multi-line pica records are supported (see issues
    #51, #109, #112, #137 and #139 and commits 3c75b41, 9e736df, 4483e5e, 89119a6,
    ae5a08a, c0eeb04, ec81279, bd30086, 5c8002e)
  • Renamed the configure method in SimpleXmlWriter (now SimpleXmlEncoder)
    into setNamespaces to reflect what it is actually doing (see issue #99)
  • Renamed org.culturegraph.mf.stream.sink.SimpleXmlWriter to
    org.culturegraph.mf.stream.converter.xml.SimpleXmlEncoder (see issue #100)
  • The receiver interfaces no longer extend LifeCycle directly but extend an
    intermediate Receiver interface (see commit 7065cc0)
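
The changed receiver hierarchy can be pictured roughly as follows. This is a sketch under assumptions: the interface names LifeCycle and Receiver are taken from the note above, while the method names, the ObjectReceiver shape and the ListCollector class are illustrative and may differ from the actual code in commit 7065cc0.

```java
import java.util.ArrayList;
import java.util.List;

// Lifecycle operations shared by all modules (method names are assumptions).
interface LifeCycle {
    void resetStream();
    void closeStream();
}

// Intermediate interface that now sits between LifeCycle and the
// concrete receiver interfaces.
interface Receiver extends LifeCycle {
}

// A receiver interface that extends Receiver instead of LifeCycle directly.
interface ObjectReceiver<T> extends Receiver {
    void process(T obj);
}

// Hypothetical receiver used only to exercise the hierarchy.
class ListCollector<T> implements ObjectReceiver<T> {
    final List<T> items = new ArrayList<>();
    boolean closed;

    @Override
    public void process(T obj) {
        items.add(obj);
    }

    @Override
    public void resetStream() {
        items.clear();
    }

    @Override
    public void closeStream() {
        closed = true;
    }
}

class ReceiverHierarchySketch {

    // Code that only needs "some receiver" can be typed against Receiver.
    static void shutDown(Receiver receiver) {
        receiver.closeStream();
    }

    public static void main(String[] args) {
        ListCollector<String> collector = new ListCollector<>();
        collector.process("record");
        shutDown(collector);
        System.out.println(collector.items + " closed=" + collector.closed);
    }
}
```

Code that implements one of the receiver interfaces keeps compiling in such a layout; only code that relied on a receiver interface extending LifeCycle directly needs attention.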

New features & improvements

  • Updated dependencies to latest version (see commit 3ab7331)
  • Modified IdChangePipe to accept nested literals as ids (see commit e81b230)
  • Modified the Counter module to allow pipelining (see commit 43c52c3)
  • Added pretty printing and configurable character escapes to the JsonEncoder
    (see commit 8cb7a08)
  • Added a dateformat function to Metamorph for converting various date formats
    (see commit 6b9b7e1)
  • Added a Metamorph function for generating timestamps (see commit 82be110)
  • Added triple-to-stream module which converts triples into a stream (without
    collecting them into records as collect-triples does; see commit 55fc144)
  • Improvements to LineSplitter: Added flux-annotations to LineSplitter and
    added it to flux commands. (see commit 34aed80)
  • Added StreamExceptionCatcher module which is the stream counterpart of
    ObjectExceptionCatcher (see commit 59ff596)

Bug fixes

  • Generate Flux parser and lexer as part of the build cycle (see commit a7b4d78)
  • The flux lexer was failing on files which had an empty comment not followed by
    a new line as their last line (see issue #147)
  • Place OreAggregationAdder and its test in same package (see issue #60)
  • Adds the ability to escape the @-character in Metamorph names (see commit
    0b470e5)
  • Replaced binary or with boolean or in StreamLiteralFormatter (see commit
    bbe340c)
  • Added fallbacks to flux.sh in case realpath is not available (see commit
    b1e1172)
  • Moved the logic for creating a buffer to allow direct access to the characters
    in a string into the StringUtil class. The code was fixed to always create a
    buffer that is large enough (see issue #161)
  • AbstractTripleSort threw NullPointerExceptions if it received a
    "memoryLow" message before the first record was processed (see issue #160)
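
For readers unfamiliar with the difference behind the "binary or" fix: in Java, | applied to booleans always evaluates both operands, whereas || stops as soon as the result is known. The sketch below shows only this language behaviour. The class OrOperatorSketch is hypothetical, and the snippet does not reproduce the StreamLiteralFormatter code from commit bbe340c.

```java
// Illustrates the difference between | and || on boolean operands.
class OrOperatorSketch {

    // Counts how many operands were evaluated.
    static int calls;

    static boolean check(boolean result) {
        calls++;
        return result;
    }

    // Non-short-circuit or: both operands are always evaluated.
    static boolean eager(boolean a, boolean b) {
        return check(a) | check(b);
    }

    // Short-circuit or: the right operand is skipped if the left one is true.
    static boolean lazy(boolean a, boolean b) {
        return check(a) || check(b);
    }

    public static void main(String[] args) {
        calls = 0;
        eager(true, false);
        System.out.println("operands evaluated with |: " + calls);

        calls = 0;
        lazy(true, false);
        System.out.println("operands evaluated with ||: " + calls);
    }
}
```

Both variants return the same value; they differ only when evaluating the right-hand operand is expensive or has side effects.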

Metafacture Core Distribution 1.2.0

04 Dec 18:59

The new release should be fully compatible with release 1.1.0. It contains the following new features and bug fixes:

New features

  • Added header, footer and separator settings to ObjectWriter (see issue #154)
  • New Collector EqualsFilter (see issue #149)
  • Set namespace for rdf output via remote configuration in the same way as Metamorph scripts are set (see issue #145)
  • Added RecordReader for reading records from a reader. This module is available as as-records in Flux (see issues #140, #142)
  • Add a wrapper for WildcardTrie enabling simple character classes in source statements in Metamorph (see issues #135, #143)
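
The header, footer and separator settings for ObjectWriter can be understood through a small model. The sketch below is not the ObjectWriter implementation: the class DelimitedWriterSketch, its constructor and its methods are hypothetical, and it assumes that the header is written once before the first object, the separator between consecutive objects, and the footer once when the stream is closed.

```java
// Hypothetical model of a writer with header, separator and footer settings.
// Names and exact semantics are assumptions, not the ObjectWriter API.
class DelimitedWriterSketch {

    private final StringBuilder out = new StringBuilder();
    private final String separator;
    private final String footer;
    private boolean firstObject = true;

    DelimitedWriterSketch(String header, String separator, String footer) {
        this.separator = separator;
        this.footer = footer;
        out.append(header);          // header is written once, up front
    }

    void process(String obj) {
        if (!firstObject) {
            out.append(separator);   // separator only between objects
        }
        out.append(obj);
        firstObject = false;
    }

    // Appends the footer and returns everything written so far.
    String close() {
        out.append(footer);
        return out.toString();
    }

    public static void main(String[] args) {
        DelimitedWriterSketch writer = new DelimitedWriterSketch("[", ",", "]");
        writer.process("1");
        writer.process("2");
        System.out.println(writer.close());
    }
}
```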

Experimental feature

  • Added support for conditional activation of collectors. Additionally, a set of quantifier collectors makes it possible to express Boolean conditions in a concise and easily comprehensible way (see issues #151, #154)

Bug fixes

  • Changed JsonEncoder to not prefix output with spaces (see issue #152)
  • Partially reverted commit 23c05cf: This commit broke the plugin loading mechanism. This should be fixed again (see issue #148)
  • Bugfix for RdfMacroPipe: An empty name parameter in the literal(..) method resulted in a StringIndexOutOfBoundsException. The parameter is now checked with org.apache.commons.lang.StringUtils, so an empty name parameter can be used again (see issues #146, #150)
  • Fix wrong namespace (dcterm->dcterms) (see issue #136)

Metafacture Core Distribution 1.1.0

27 Sep 12:29
metafacture-core-1.1.0

[maven-release-plugin]  copy for tag metafacture-core-1.1.0

Metafacture Core Distribution 1.0.1

03 Jul 09:46
metafacture-core-1.0.1

[maven-release-plugin]  copy for tag metafacture-core-1.0.1