Hierarchy migration: creation of image folders for toplevel-processes #4269

Open
andre-hohmann opened this issue Mar 12, 2021 · 8 comments
Labels
documentation improvement migration migration from previous Kitodo versions

Comments

@andre-hohmann
Collaborator

andre-hohmann commented Mar 12, 2021

Problem

After the migration of serial publications (hierarchy migration), image folders are created for the toplevel-processes, for example: periodical, multivolume, ...

It seems that the folders are also created for toplevel-processes that are created directly in Kitodo.Production.

Question

The toplevel-processes should not contain images. Is it necessary to create the image folders? Could it cause any problems?

See also: #4268, #4267, #4270, #4271

@matthias-ronge
Collaborator

I don't think the empty folders can cause a problem.

@henning-gerhardt
Collaborator

I don't think the empty folders can cause a problem.

It is not a problem but a resource issue. A few empty directories per process may not add up to much, but they consume resources in the underlying file system (e.g. ext3/4 inode entries) and, if you have a backup system, entries there as well, since they must be monitored and included in the backup and restore process. But if you have thousands or tens of thousands of hierarchy processes with many more empty directories, then you must reckon with a different amount of "wasted" resources in many places.
The other "thing" is that these empty directories can cause confusion during disaster recovery or when investigating other error scenarios. You will check many times whether your disaster recovery went wrong or whether these directories were already empty before the incident.
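
To put a number on the resource concern, the empty directories could be counted directly on the file system. The following is a minimal sketch, not part of Kitodo.Production; the function name `count_empty_dirs` is hypothetical, and the root path to scan is left to the caller.

```python
import os


def count_empty_dirs(root):
    """Count directories under root (including root) that have no entries."""
    empty = 0
    for _dirpath, dirnames, filenames in os.walk(root):
        # A directory is empty if it holds neither subdirectories nor files.
        if not dirnames and not filenames:
            empty += 1
    return empty
```

Each directory counted this way occupies at least one inode, so the result gives a lower bound on the inodes taken up by empty folders below the scanned root.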

@matthias-ronge
Collaborator

Perhaps you can modify your backup so that only existing files go into it to take up space, and empty folders are not backed up. I don't know anything about ext3/4 inodes in particular. The computation of wasted resources should however be possible, if it comes to that.

Nor do I want the empty directories to cause confusion during disaster recovery or when investigating other error scenarios. Yes, one might check repeatedly whether the disaster recovery went wrong or whether these directories were already empty beforehand. Maybe @Kathrin-Huber should decide that.

@henning-gerhardt
Collaborator

Perhaps you can modify your backup so that only existing files go into it to take up space, and empty folders are not backed up.

No, that is not possible as the consequences of this change are dramatic.

I don't know anything about ext3/4 inodes in particular. The computation of wasted resources should however be possible, if it comes to that.

The amount of wasted resources is hard to determine before a migration, as I know neither how many new processes are created during migration nor how many processes will be created in the future. I only get notified when the inodes are running out and I must raise their number. But then it could be too late to react. So I try not to waste any resources.

It should not be so hard to decide whether creating these directories is really necessary when you create or migrate a hierarchy process.

@matthias-ronge
Collaborator

I can't say that either. But I would estimate an empty directory to be roughly 16 bytes. If you have eight empty directories per parent process, for 200,000 processes, that would be roughly 25.6 MB. Yes, it's unnecessary, but it's not the biggest problem.

@henning-gerhardt
Collaborator

I can't say that either. But I would estimate an empty directory to be roughly 16 bytes. If you have eight empty directories per parent process, for 200,000 processes, that would be roughly 25.6 MB. Yes, it's unnecessary, but it's not the biggest problem.

As already mentioned in the other issue: the size and the way information about directories and files is stored depend strongly on the file system used. Maybe the 16 bytes are correct for your file system, but they may be totally wrong for others.
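
Because the per-directory cost depends on the file system, the estimate can be kept as a parameter. The following is a rough sketch; the function name `empty_dir_overhead` is hypothetical, and the 4096-byte figure is only an illustrative assumption (a common block size), not a measured value for any particular setup.

```python
def empty_dir_overhead(processes, dirs_per_process, bytes_per_dir):
    """Return (inode_count, total_bytes) under the given assumptions."""
    # Every directory needs one inode, regardless of its size on disk.
    inode_count = processes * dirs_per_process
    return inode_count, inode_count * bytes_per_dir


# Figures quoted in the discussion: 200,000 processes, 8 empty
# directories each, 16 bytes per directory.
print(empty_dir_overhead(200_000, 8, 16))    # (1600000, 25600000)

# Illustrative assumption: each directory occupies one 4096-byte block.
print(empty_dir_overhead(200_000, 8, 4096))  # (1600000, 6553600000)
```

The inode count is the same in both cases, which is the quantity that matters when a file system has a fixed inode limit.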

For me, empty and unused directories are unnecessary and a waste of resources in different places, and I will stop discussing this with you. You do not agree with my opinion; that is fine, as I do not agree with yours either.

@andre-hohmann
Collaborator Author

It seems that a consensus is not possible.
From the user's perspective it works; therefore I will change the label from "question" to "documentation".

@matthias-ronge
Collaborator

I think it is both: For the question, the answer is that the folders are technically not causing any problems at the moment. Nevertheless, it would be an improvement if no folders were created for processes without a workflow, because they are not used.

@solth solth removed the 3.x label Jul 7, 2022
@andre-hohmann andre-hohmann added the migration migration from previous Kitodo versions label Feb 24, 2023