Releases: gwu-libraries/sfm-ui
Version 3.0.0
Bug/security fixes
- Django upgraded to 3.2.18 (supported until 2024)
Support for Twitter API v.2
- Added support for v.2 API credentials, including the bearer token (recommended) and the combination of consumer key/secret and access token/secret
- Added support (with twarc2) for harvesting and exporting from v.2 endpoints
- Due to changes in the Twitter API access model, only the v.2 search_recent and user_timeline endpoints (accessible on the new Basic Access tier) are available in production. A new environment variable, TWITTER_COLLECTION_TYPES, specifies which of the supported Twitter API endpoints are available in the app.
- Twitter v.1.1 endpoints have been disabled, but collections previously created via these endpoints remain available for export.
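As a rough illustration of how an app might read TWITTER_COLLECTION_TYPES, here is a minimal sketch. It assumes the variable holds a comma-separated list; the actual value format, the endpoint names used as values, and the function name enabled_collection_types are all assumptions for illustration, not SFM's implementation.

```python
import os


def enabled_collection_types(env=None, default=("search_recent", "user_timeline")):
    """Return the Twitter collection types enabled for the app.

    Assumption: the variable is a comma-separated list. The default mirrors
    the two endpoints the release notes describe as available in production;
    the exact value names are hypothetical.
    """
    env = os.environ if env is None else env
    raw = env.get("TWITTER_COLLECTION_TYPES", "")
    # Split on commas and drop empty entries and surrounding whitespace.
    types = [item.strip() for item in raw.split(",") if item.strip()]
    return types or list(default)
```

Passing a plain dict instead of relying on os.environ keeps the function easy to exercise in isolation.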
Outstanding issues
Streaming API
- Streaming rules are handled as seeds; because the Streaming API supports multiple rules per request, an SFM stream collection can have multiple seeds. However, the functionality to limit exports to a subset of active/deleted seeds does not work for these collections. (The logic in SFM for seed-based export applies only to user-timeline collections.)
- During testing, a long-running stream harvest encountered a "Read timed out" error from the Twitter API. As a result, no further Tweets could be collected until the harvest was voided in the UI and restarted. We consulted with the twarc developers; the cause of the error remains unclear, but it may be related to the following:
  - Streaming harvests involve a periodic restart of the twarc.stream() process (every 30 minutes). This logic is designed to prevent excessively large WARC files (since a new WARC is created only at the start of the twarc.stream() process).
  - The twarc developers posit that this regular interruption of the twarc stream could cause problems, since the stream is designed to be run continuously. Apparently, the v.2 API is less responsive than the v.1 API, so the API may return a timeout error if the previous connection hasn't fully closed by the time twarc tries to open a new one.
  - If that is the problem, and it's hard to know for sure, then introducing a sleep before restarting could be effective; however, that could result in missed Tweets (a risk already posed by restarting the stream every 30 minutes).
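The sleep-before-restart idea can be sketched as follows. This is only an illustration under stated assumptions: open_stream and handle are hypothetical callables (not twarc or SFM functions), the stream object is assumed to expose close(), and the pause length is an arbitrary placeholder.

```python
import time
from contextlib import closing


def harvest_in_segments(open_stream, handle, segment_seconds=30 * 60,
                        pause_seconds=5.0, segments=1,
                        clock=time.monotonic, sleep=time.sleep):
    """Consume a stream in fixed-length segments, pausing between them.

    open_stream: hypothetical callable returning an iterable with close().
    handle: hypothetical callable invoked for every item received.
    """
    for index in range(segments):
        deadline = clock() + segment_seconds
        # closing() calls stream.close() on exit, so the old connection is
        # released before the pause and the next open_stream() call.
        with closing(open_stream()) as stream:
            for item in stream:
                handle(item)
                # The deadline is only checked when an item arrives.
                if clock() >= deadline:
                    break
        if index < segments - 1:
            # Give the previous connection time to shut down fully.
            # Anything sent during this pause is not collected.
            sleep(pause_seconds)
```

Injecting clock and sleep keeps the loop testable without waiting in real time. A longer pause lowers the chance of overlapping connections but widens the window in which Tweets can be missed.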
Processing container
- The processing container needs to be upgraded. The image fails to build because of dependency conflicts with the new versions of certain libraries in sfm-utils. We didn't tackle this work during this release because it will probably also involve upgrading the Python and Ubuntu versions used in the image. Since the processing container doesn't directly interact with other components, it should be fine for now to use the 2.5.0 image for legacy collections, etc. Using it with collections harvested from the v.2 API, however, will require an upgrade.
Version 2.5.0
Changes in this release:
- Upgrades Python version from 3.6 to 3.8 (#1071)
- Completes configurability of RabbitMQ port (#1086)
- Fixes display error with harvest stats (thanks, @sebastian-nagel!) (#1089)
Documentation updates:
- Updates directory ownership options in installation docs (thanks, @sebastian-nagel!) (#1091)
- Fixes Readthedocs configuration (#1092)
Version 2.4.0
This release contains required configuration updates for existing SFM instances. It is important to review the sfm-docker release notes carefully before upgrading from a version earlier than 2.4.
Changes in this release:
- This release introduces support for hosting data volumes on different filesystems, rather than as subdirectories in a single sfm-data directory (#1051). This allows RabbitMQ, Postgres, and SFM data for exports, containers, and collection sets to be separately configured. Thank you, @SvenLieber, for code contributions to add this feature! For existing SFM instances, please read the sfm-docker release notes carefully for required configuration changes.
- Allows seeds to be deleted or undeleted while the collection is not active. Thanks for reporting this bug, @SvenLieber! (#1052)
- Upgrades Django to 2.2.24 and updates djangorestframework. (#1043, #1049)
- Upgrades Twarc version to fix bug with retweet text in CSV exports. (#1042)
Documentation updates:
Version 2.3.0
Changes in this release:
- Upgrades to Django 2.2 and related dependencies (#993). See sfm-docker release notes for steps for mandatory upgrade of Postgres to version 9.6.
- Improves accessibility of the SFM user interface (#1022, #1023, #1024, #1025, #1034, #1037)
- Upgrades to Bootstrap 4 (#987)
- New datetime widget for calendar fields (#1011)
- Optional configurable cookie consent popup (#1009). See sfm-docker release notes for instructions on enabling and customizing.
- Optional configurable GW footer (#1003). See sfm-docker release notes for instructions on enabling.
Documentation changes include:
Version 2.2.0
Version 2.1.0
Changes in this release:
- Upgraded Python libraries (#955 and #959)
- Improved AWS deployment support:
  - Support for deployment with AWS Elastic Load Balancer through refinements to ALLOWED_HOSTS (#960) (Contributed by @justinlittman)
  - Support for AWS Simple Email Service by allowing a separate mail-from address (SFM_MAIL_FROM) in docker-compose.yml (#967). This is backwards compatible, so it will still work using EMAIL_HOST_USER, even if no SFM_MAIL_FROM is configured. (Contributed by @justinlittman)
- Added queue length threshold configurations for the SFM UI component and for Twitter REST harvesters (#950)
- Improved privacy for monitor view of harvester status visibility (#956)
- Bugfixes:
  - Fixed Change Log "Fields" column (#952)
  - Fixed credential view erroneously showing as deleted (#949)
  - Fixed serializecollectionset management command (#945)
  - Fixed import of harvest warnings, errors, and info messages from serialized collection (#947)
  - Fixed export to export correct size when requesting 1,000,000; fixed export page message for 100,000 size exports (#957)
  - Fixed unit test for notifications (#948)
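The backwards-compatible mail-from behaviour described under the AWS Simple Email Service item for this version can be pictured with a minimal sketch. It assumes both values are looked up in a simple mapping of settings; the function name resolve_mail_from and the lookup mechanism are hypothetical and do not reflect how SFM actually wires these values.

```python
import os


def resolve_mail_from(env=None):
    """Pick the sender address for outgoing mail.

    Prefers SFM_MAIL_FROM when it is set and non-empty; otherwise falls
    back to EMAIL_HOST_USER, so existing configurations keep working.
    """
    env = os.environ if env is None else env
    return env.get("SFM_MAIL_FROM") or env.get("EMAIL_HOST_USER")
```

Using `or` rather than a default argument to get() means an empty SFM_MAIL_FROM also falls back to EMAIL_HOST_USER.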
Version 2.0.2
Various minor tweaks:
- Fixed serialization / deserialization and other management commands.
- Fixed display issue with credentials on collection detail page.
- Made SFM UI queue length configurable.
Version 2.0.1
No changes.
Version 2.0.0
- Upgraded to Python 3, Django 2, and assorted other libraries.
- Removed finalware and replaced with management commands.
- Added management commands for deleting web harvests.
- Fixed favicon link.
Version 1.12.0
Significant changes in this release:
- Support for deactivating credentials.
- Added paging to REST API.
- Fixed links to Twitter docs.
- Added public links field to collection to help with properly citing datasets.
- Added handling / configuration of automatic seed deletion.
- Removed handling / configuration of web harvester.
- Added support for filtering by WARC created date to REST API.
- Fixed defect in export segment size parameter.
- Fixed defect in downloading exports on Safari.
- Removed pinning of transitive dependencies.
Documentation changes include:
- Fixed links to Twitter docs.
- Added citation guidance page.
- Updated processing container docs to reflect changes / additions.
- Corrected smoke test instructions.
- Deprecated web harvester and ELK.
- Updated Twitter data dictionary to reflect change in Twitter export.
- Updated export documentation to add detail about time zones.