Releases: pydio/cells

Release Candidate for Cells v4

19 Jul 09:48
Pre-release

Cells v4 is a major leap forward for clustered deployments. It features a brand new microservices engine for unparalleled performance, and a new dependency management approach that makes it much easier to delegate core services (such as configs, registry, caches, etc.) to cloud standards.

Moving to Go modules at last

The V4 branch was a long-term development project that started with our desire to adopt Go modules. Cells was written at a time when modules were not yet part of the language (we used the vendor folder...). This made the migration to modules complex: Go's auto-migration tool was not usable for us (it simply crashed), and we ended up recreating the code base from scratch using modules, re-adding our libraries and dependencies one by one.

As expected, this redesign led to a big reduction in our dependencies, a huge simplification of the architecture and, as a result, a lot of interesting features!

To make a long story short:

  • We got rid of the microservices framework we were using (Micro), regaining control of the gRPC layer.
  • We updated the main dependencies to their latest versions, including Caddy, Hydra and Minio.
  • We make better use of resources within a process by sharing "servers" (http|gRPC) between "services". This greatly reduces the number of network ports used by Cells and has a huge impact on performance, especially for single-node deployments.
  • These "servers" and "services" are much better managed and can be more easily started/stopped on different nodes, making cluster deployments easier than ever (see below).
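To illustrate the idea of several "services" sharing one "server", here is a minimal Python sketch. It is not Cells code (Cells is written in Go): the SharedServer class, the text_service helper and the /search and /activity prefixes are hypothetical names chosen for the example. It only shows how one listening endpoint can dispatch to many services instead of opening one port per service.

```python
class SharedServer:
    """One WSGI application (hence one listening port) hosting many services."""

    def __init__(self):
        self._services = {}  # URL prefix -> WSGI callable

    def register(self, prefix, app):
        """Mount a service under a URL prefix on the shared server."""
        self._services[prefix.rstrip("/")] = app

    def __call__(self, environ, start_response):
        path = environ.get("PATH_INFO", "/")
        # Longest prefix first, so "/a/b" is preferred over "/a".
        for prefix in sorted(self._services, key=len, reverse=True):
            if path == prefix or path.startswith(prefix + "/"):
                return self._services[prefix](environ, start_response)
        start_response("404 Not Found", [("Content-Type", "text/plain")])
        return [b"no such service"]


def text_service(message):
    """Build a trivial service that always answers with the given text."""
    def app(environ, start_response):
        start_response("200 OK", [("Content-Type", "text/plain")])
        return [message.encode("utf-8")]
    return app


# Two hypothetical services registered on the same server object.
server = SharedServer()
server.register("/search", text_service("search service"))
server.register("/activity", text_service("activity service"))
```

The server object is a plain WSGI application, so it could be handed to wsgiref.simple_server.make_server from the standard library to expose both services on a single port.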

MongoDB as a drop-in replacement for all services based on on-file storages

Historically, we have been using BoltDB and BleveSearch, and we like them. They are pure Go key/value stores and indexers, and they allow Cells to provide full indexing functionality without external dependencies. By default, the search engine, activity stream and logs use these JSON document stores to provide rich, out-of-the-box functionality. But these stores are disk and memory intensive, and while they are suitable for small and medium-sized deployments, they create bottlenecks for large deployments.

We therefore looked at alternatives for implementing new 'drivers' for the data abstraction layer of each of these services, and chose MongoDB as a feature-rich, scalable and indexed JSON document store. All services using BoltDB/Bleve as storage now offer an alternative MongoDB implementation, a migration path from one to the other, and the ability to scale horizontally.

File-based storage is still a very good option for small/medium instances, avoiding the need to manage another dependency, but the Cells installation steps will now offer to configure a MongoDB connection to avoid the need to migrate data in the long term. Note that Mongo does not replace MySQL storage: that DB is still required for Cells.

Cluster Me Please!

Cells was developed from day one as a set of microservices, but we had to face the fact that deploying Cells in a multi-node, highly available environment was extremely complex and almost nobody could really make it work... v4 was the perfect time to tackle this problem!

We took a step back, learned our lesson from v1 to v3, and looked closely at cloud-native DevOps best practices (yes K8s, we're looking at you). The main objective was: how to create a fully stateless instance of Cells (image, container, you name it...) that can be easily distributed and replicated.

Similar to the move from BoltDB to Mongo, we implemented DAOs to decouple and externalize many layers, making Cells V4 finally cloud-ready. To achieve that without re-inventing the wheel, Cells V4 stands on the shoulders of giants:

  • ETCD for configs and services registry
  • NATS for message broadcasting (pub/sub)
  • REDIS for shared cache
  • MONGO for JSON documents
  • HASHICORP VAULT for secrets and certificates management

Again, all of these are optional, and Cells can still be deployed as a standalone, dependency-free binary on a Raspberry Pi (even the older 32-bit versions)!
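The DAO approach described above can be sketched as follows. This is an illustrative Python example, not the actual Cells implementation: CacheDAO, MemoryCache and SessionService are hypothetical names. The point is that a service depends only on an abstract storage interface, so the backing store can be an in-process structure on a standalone node or an external system shared by several stateless nodes.

```python
from abc import ABC, abstractmethod


class CacheDAO(ABC):
    """Abstract cache access: services only ever see this interface."""

    @abstractmethod
    def get(self, key):
        ...

    @abstractmethod
    def set(self, key, value):
        ...


class MemoryCache(CacheDAO):
    """In-process implementation, standing in for a dependency-free setup.

    An implementation backed by an external shared cache could expose the
    same two methods and be swapped in without touching the services.
    """

    def __init__(self):
        self._data = {}

    def get(self, key):
        return self._data.get(key)

    def set(self, key, value):
        self._data[key] = value


class SessionService:
    """A hypothetical service that keeps no state of its own."""

    def __init__(self, cache: CacheDAO):
        self._cache = cache

    def login(self, user):
        self._cache.set("session:" + user, "active")

    def is_logged_in(self, user):
        return self._cache.get("session:" + user) == "active"
```

Because SessionService holds no state itself, two instances built on the same CacheDAO see the same sessions, which is the property that makes an instance easy to replicate.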

Migration and testing

Single-node deployments

The upgrade process is standard and should be straightforward (and we would really love to hear from you on that).
There are a couple of important points to note for this upgrade:

  • Hydra JWKs will be regenerated in the DB, which invalidates all existing authentication tokens. You will be logged out after the upgrade, and if you are using Personal Access Tokens, you will have to generate new ones.
  • The Cells Sites Bind URL should not use a domain name but should declare a binding PORT, optionally including the Network Interface IP. If you have connection issues after the upgrade, make sure to edit sites (cells configure sites) to bind to e.g. 0.0.0.0:8080 instead of domain.name:8080
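As a quick way to spot bind values that still use a domain name, the following Python sketch checks whether a string has the IP:PORT form shown in the example above. The is_ip_binding function is a hypothetical helper, not part of Cells, and it deliberately covers only the IP:PORT form; whether Cells accepts other spellings (such as a port without an IP) is not verified here.

```python
import ipaddress


def is_ip_binding(bind):
    """Return True if bind looks like IP:PORT (e.g. 0.0.0.0:8080)."""
    host, sep, port = bind.rpartition(":")
    if not sep:
        return False  # no port separator at all
    if not (port.isascii() and port.isdigit()):
        return False  # port must be a plain decimal number
    if not 0 < int(port) < 65536:
        return False  # outside the valid TCP port range
    try:
        # Brackets are stripped so that "[::1]:8080" is accepted for IPv6.
        ipaddress.ip_address(host.strip("[]"))
    except ValueError:
        return False  # host part is not an IP address (e.g. a domain name)
    return True
```

For instance, is_ip_binding("0.0.0.0:8080") is true while is_ip_binding("domain.name:8080") is false.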

Migrating Bolt/Bleve storages to Mongo

Migrate from existing on-file storage to MongoDB using the following steps:

  • Install MongoDB (currently tested against version 5.0.X) and prepare a Mongo database for Cells data
  • Stop Cells, as Bolt/Bleve files must not be opened by the application during the migration process
  • Use the cells admin config db add command to configure a connection:
    • Set up the connection using a Mongo connection string like mongodb://user:pass@ip:port/dbname
    • Accept the prompt to use this connection as the default document DSN
    • Accept the prompt to perform data migration from existing Bolt/Bleve files to Mongo. This can take some time.
  • Now restart Cells. You should see "Successfully pinged and connected to MongoDB" in the logs.
  • As search engine data is not migrated, you have to relaunch indexing on the pydio.grpc.search service using cells admin resync --service=pydio.grpc.search --path=/

Now you should be good to go. Try searching for * in the Cells search engine: you should get blazing-fast results.
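Before pasting a connection string into the prompt, it can help to check that it has the mongodb://user:pass@ip:port/dbname shape given in the steps above. The sketch below uses only the Python standard library; parse_mongo_dsn is a hypothetical helper, not a Cells or MongoDB API, and it assumes a single host with a database name, as in the example string.

```python
from urllib.parse import unquote, urlsplit


def parse_mongo_dsn(dsn):
    """Split a mongodb:// connection string into its main parts.

    Raises ValueError if the scheme, host or database name is missing,
    or if the port is not a number.
    """
    parts = urlsplit(dsn)
    if parts.scheme != "mongodb":
        raise ValueError("expected a mongodb:// connection string")
    if not parts.hostname:
        raise ValueError("missing host")
    database = parts.path.lstrip("/")
    if not database:
        raise ValueError("missing database name")
    return {
        "user": unquote(parts.username) if parts.username else None,
        "host": parts.hostname,
        "port": parts.port,  # None when omitted; ValueError when not numeric
        "database": database,
    }
```

Calling it on a string where the placeholders were left in (for example with the literal word "port") raises ValueError, which catches a common copy-paste mistake.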

Cluster deployments

We will provide dedicated blog posts on this topic very soon. You can already have a look at this sample Docker Compose file that shows the required dependencies and how to specify their endpoints to Cells using environment variables.

Change log

You can find a summary of the change log here.

Bugfix for v3

11 Jul 08:22

This is a bugfix release for v3 branch.

  • Fix possible UX error when sharing a folder with users
  • Fix edge-cases on move from flat to structured datasource
  • Fix incorrect root node handling in specific cases (deny on personal files)
  • Fix regression for users with too many roles attached
  • Add CELLS_BROKER_TRYPUB environment variable for high-load systems
  • Backport SQL Meta Filtering from v4 branch for better performance

Change log

You can find a summary of the change log here.

Bugfixes for v3

04 May 16:01

This release fixes small issues for the v3 branch.

Bugs fixed

  • CLI full-S3 configuration panic
  • Goroutine leak in GRPC gateway (sync)
  • Config wrongly saved when updating smtp password
  • Typo in environment variable (tls_key_file)
  • Versions files not properly removed from storage

Cells Enterprise

  • New 2FA plugin configurations to force users to set up 2FA; until they do, they cannot access the platform.

Change log

You can find a summary of the change log here.

Hotfix for 3.0

22 Mar 09:18

This release brings minor improvements: fixes for WebDAV download bandwidth, a share dialog scrolling issue, and search engine re-indexing for high volumes.

Change log

You can find a summary of the change log here.

Bugfix for v3

28 Feb 10:39

Cells 3.0.5 fixes an issue with loginCI parameter for pydio.grpc.user micro-service (allowing case-insensitive login).

Cells Enterprise Distribution 3.0.5 provides experimental support for SPNEGO/Kerberos authentication, new security policy condition types and Cells Flows anko script tools.

Change log

You can find a summary of the change log here.

Bugfixes

31 Jan 10:31

This release ships minor fixes for the following issues:

  • Wrong check when setting user metadata on readonly files
  • Adapt CollaboraOnline editor to support CODE new endpoints (version 21)
  • Datasource format migration for huge datasources: raise timeout limit, add a resume flag if copy is interrupted.
  • [ED] Fix role creation for users dynamically created via External connectors (OAuth, OIDC, etc.).

Change log

You can find a summary of the change log here.

Hotfix for 3.0

07 Dec 16:52

Fix specific env HTTP_PROXY support

Change log

You can find a summary of the change log here.

Bugfixes for 3.0.2

29 Nov 14:02

This release fixes a startup edge case and improves OIDC attribute mapping.

Change log

You can find a summary of the change log here.

First Bugfix Release for V3 Branch

09 Nov 10:08

First bugfix release for V3 branch.

  • Issue with CopyObject on Encrypted + Flat Datasource (data is ok but key has a wrong ref)
  • Fix auto SSL redirect for 0.0.0.0 binding case, support this for web installer
  • Fix admin-assigned roles manual ordering
  • 100% Italian Support (thanks to Italo Vignoli)
  • [Ent] LDAP/AD new configs (sync users only, anonymous bind)
  • [Ent] Fix status code for multipart PutObjectPart when quota reached (422)

Change log

You can find a summary of the change log here.

Major Version

26 Oct 08:07

Cells V3 provides a new data source format that dramatically improves access speed. We’ve rethought our microservices deep in the core of the code to help streamline security implementation, supercharge searchability via new metadata, and make compliance monitoring and reporting easier.

New Datasource Format

A frequent request from Pydio users was to "keep the tree structure" of files visible on the storage (and to modify these directly without going through Pydio). To achieve this, a Cells datasource relies on a unidirectional synchronization between the storage and the internal index. This brings its share of issues and performance limitations when the number of files grows, and sometimes a "resync" of the datasource is required to fix index issues.

V3 introduces a new datasource format that keeps the tree structure only in Cells internal indexes, and stores files as a flat structure on the storage. This is more in line with the "object storage" design, and brings huge performance gains as no more "sync" is required to maintain the indexes. This is now the preferred format, unless direct access to the files on the storage is necessary. The ability to recover the tree structure is guaranteed by tools providing import/export of the structure on-file.
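The principle of a flat datasource can be sketched in a few lines of Python. This is a toy model under stated assumptions, not the Cells implementation: the FlatDatasource class is hypothetical, and using random UUIDs as object keys is an assumption made for the example. It shows why moving a folder needs no work on the storage side: only the index entries change.

```python
import uuid


class FlatDatasource:
    """Toy model: the tree lives in the index, the storage is flat."""

    def __init__(self):
        self.index = {}    # tree path -> object key (only place the tree exists)
        self.storage = {}  # object key -> file content (no folders at all)

    def put(self, path, data):
        """Store content under a path, reusing the key if the path exists."""
        key = self.index.get(path) or uuid.uuid4().hex
        self.index[path] = key
        self.storage[key] = data

    def get(self, path):
        """Resolve the path through the index, then read the flat object."""
        return self.storage[self.index[path]]

    def move(self, src, dst):
        """Move a file or a folder: index entries change, storage does not."""
        src = src.rstrip("/")
        dst = dst.rstrip("/")
        for path in list(self.index):
            if path == src or path.startswith(src + "/"):
                self.index[dst + path[len(src):]] = self.index.pop(path)

    def export_tree(self):
        """Return a copy of the path -> key mapping, e.g. for a backup."""
        return dict(self.index)
```

In this model, moving /docs to /archive rewrites a few index entries and leaves every stored object untouched, and export_tree plays the role of the structure export mentioned above.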

This new format is now applied to "internal" buckets previously used for storing thumbnails and versions, thus providing more flexibility, as they now appear as standard datasources (see Versioning below). Finally, S3 (and compatible) support is improved and Cells S3 gateway (exposing Cells files as an S3-compatible storage) was made more vanilla with the support of listBuckets request and better Content-Type extraction.

UX

Tons of bugfixes and small improvements were made in all areas of the GUI (JavaScript is now built on top of React 17). File presentation now offers a new "Gallery" view, perfect for displaying folders of photos ("masonry layout").

Many new metadata types are available, with more meaningful fields for boolean, dates, integers, progress bars, etc... They are indexed and searchable and can be used in Cells Flows as well. The search interface has been fully redesigned for improved searchability, filling the whole screen and giving the ability to preview files from any workspace without leaving the search results view.

Cells Flows new features

Cells Flows jobs can now listen to many events, have dedicated conditional branches to handle them, and group similar jobs into one. For example, the previous "thumbnails creation" and "thumbnails deletion" jobs are now grouped into one job. In conjunction with the new metadata types, Cells Flows now allows the creation of advanced jobs such as Validation/Approval scenarios, including reminders, alerts and file tagging.

Cells Convert Tools is a brand new Docker image that can be spawned on a dedicated server and used in conjunction with Cells Flows to generate thumbnails and PDF previews of most file formats. Check it out!

Security & ACLs

[ED] Authentication can now be hardened with DUO Security multi-factor service.
[ED] New security policy permissions can be applied to disable specific actions (DELETE, DOWNLOAD, UPLOAD) and fine-tune permissions.
Security fixes from the 2.2.X branch have been carried over.

Redesigned versioning

Now that the "versions" datasource uses the new flat storage standard, it is easier to shard versions across multiple locations, enable encryption, and more. Versioning policies now provide a specific behaviour for handling revisions after the original file is permanently deleted. By default, they are moved inside a dedicated folder in the versions datasource, allowing administrators to easily find the last revision and restore it on behalf of a user.

Scalability / Provisioning

Performance is improved in many respects, from the new datasource format to a redesign of the message bus, which no longer requires NATS on a single-node deployment. Focus has been put on in-life maintenance of the system: housekeeping of logs and activities, better log formats, caching, and lower CPU and RAM consumption.

Installation can be more easily automated with support for new fields in YAML/JSON configs to set multiple sites and custom configuration keys at startup.

Contributions

This release comes with a fully translated Português do Brasil version thanks to Claudio Ferreira (@filhocf).

If you want to help us by adding a translation for your language, it is really easy: just navigate to the Pydio Cells project in Crowdin, create an account and get started!

Change log

You can find a summary of the change log here.