Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Fix slow editor load on large projects (v2) #95678

Merged

Conversation

Hilderin
Copy link
Contributor

This PR includes the modifications of #95672 and the modifications in core/io/file_access.cpp from #93064.
This PR includes more modifications then #95672 and more possibilities of side effects. That's why I created a different PR.

The main problem while loading a project is the parsing and reading of all the .import and .md5 files. On a project of 60000 files, it means 120000 files. Worst was that each .import file was read 3 times in EditorFileSystem::_test_for_reimport because ResourceFormatImporter::_get_path_and_type was called internally 2 times at least. Once in ResourceFormatImporter::are_import_settings_valid and another time in ResourceImporterLayeredTexture::are_import_settings_valid while calling ResourceFormatImporter::get_singleton()->get_resource_metadata(p_path)

I did these modifications in EditorFileSystem::_test_for_reimport:

  • Removed the if (!FileAccess::exists(p_path + ".import")), usually, the file should exists and there's a FileAccess::open on the next line that returns an error if the file does not exists.
  • The metadata is read directly instead of needing to call ResourceFormatImporter::get_singleton()->get_resource_metadata(p_path) which re-read the .import file
  • are_import_settings_valid is called directly on the importer with the metadata preventing a call to ResourceFormatImporter::get_singleton()->are_import_settings_valid which re-read the .import file

I tested a project with all the images from https://github.com/PMDCollab/SpriteCollab/tree/master/sprite which contains 61332 png.

Before this PR:

  • Loading the project in debug: 1m50sec
  • Loading the project in 4.3 stable: 1m40sec

After this PR:

  • Loading the project in debug: 50sec
  • Loading the project in production build and optimize=speed lto=full: 30sec

Note: I got some variations when testing on the loading times. If I restart the editor quickly after another, the loading time is a bit faster. My guess is that some files are in the disk cache the second time around.

@Saul2022
Copy link

Saul2022 commented Aug 17, 2024

15 seconds to open the whole tps demo forge ( that icludes the time to enable the cronometer and then going to godot to open the project (ON ANDROID s23+) , though i noticed that while super fast now scan after reloading got stuck at 89 sometimes or numbers like that before reopening the scene. Also this is an artifact build so prob the end result could be faster ? ( not sure if artifacts are optimized..)

Meanwhile on 4.3 stable it 25 seconds upon reloading or opening project.

Edit: Considering this is now GAS GAS GAS level of speed, it might be possible to speedup the transition from the godot logo to the editor as that's also gets kind of affected on bigger projects.

Also tried the whole sprite collab project (master, because why not)

1 minute for the godot logo transition to the editor

8:30 minutes for the whole scan of the project

And then got frozen minutes and closed it , because it took too long( though even the file manage was like this taking over 15 minutes just to extract it.

@smix8
Copy link
Contributor

smix8 commented Aug 17, 2024

Can confirm that this makes the rescan from e.g. renaming a file in a large project (100.000+ files) significantly faster, like 10000% faster.

Maybe unrelated to this PR but what is still super slow after this is the very first time editor start every day. For some reason it loads all the resource files again instead of just checking the meta data or md5. E.g. all the mesh assets in the project get reloaded from disk. Since the project has TB of mesh data this takes many minutes. Basically my old issue #47053

@Hilderin
Copy link
Contributor Author

though i noticed that while super fast now scan after reloading got stuck at 89 sometimes or numbers like that before reopening the scene

That should be fixed by #93064, the issue is caused by the fact that the editor freezes after the file scan and the progress bar got stuck at the position where is was at the last frame.

@Hilderin
Copy link
Contributor Author

Maybe unrelated to this PR but what is still super slow after this is the very first time editor start every day. For some reason it loads all the resource files again instead of just checking the meta data or md5. E.g. all the mesh assets in the project get reloaded from disk. Since the project has TB of mesh data this takes many minutes. Basically my old issue #47053

Unfortunately, I was never able to reproduce this problem on my side even if I tried on different OS and project :(

@Saul2022
Copy link

Saul2022 commented Aug 17, 2024

, the issue is caused by the fact that the editor freezes after the file scan.

Alright i remembered when i tested it for other issue and that was fixed so it’s fine.

Aside from it and maybe a bit off topic, you have any idea on how to improving further the loading times in eg project with such massive data like the whole project ( instead of just the sprites) linked that had to wait 8 minutes just to scan every file and then it keeps not responding( and there wasn’t any scene that had to be reopened, so that’s because of the files.

@Hilderin
Copy link
Contributor Author

I'm not sure what linked project you are referring? I was not able to access the linked project from the original issue. As for the editor freeze, the PR #93064 should fix that. The first time a project is opened, it has to scan for files and after that it needs to analyze what to do with them. Currently, without #93064, there's no progress on this analysis process and with a lot of files and specially on Android, it could take a while. This PR also has some improvements on performance during this analysis phase but I would not expect to reduce dramatically the first loading time. I'm not really sure to understand correctly the problem without more details. Maybe create another issue with more details on your machine, your project, etc....?

@Saul2022
Copy link

Saul2022 commented Aug 17, 2024

I'm not sure what linked project you are referring? I was not able to access the linked project from the original issue.

I mean with the https://github.com/PMDCollab/SpriteCollab/tree/master Whole project downloading the zip file,

As for the editor freeze, the PR #93064 should fix that.

Sorry for confusion though i kind of confirmed it did with the popup dialog

Maybe create another issue with more details on your machine, your project, etc....?

Might do it when this and the dock v2 prs are merged to have the window, because it still kind of weird that after scanning completely all files it takes so long to respond after the full scan when there’s no scene to do anything, just image files on the file system , maybe that just related to the import system making adjustments for every single file.

@Hilderin
Copy link
Contributor Author

Might do it when this and the dock v2 prs are merged to have the window, because it still kind of weird that after scanning completely all files it takes so long to respond after the full scan when there’s no scene to do anything, just image files on the file system , maybe that just related to the import system making adjustments for every single file.

Perfect. I also think it's because it's processing every file. It should be easier to pinpoint the problem once these PRs are merged. Thanks.

@Saul2022
Copy link

Perfect. I also think it's because it's processing every file. It should be easier to pinpoint the problem once these PRs are merged. Thanks.

No problem i have seen many people complain for a long time on godot bad losdong times on big projects with lots of gbs, either seen people experience freezes, editor quitting itself or crash after huge load times. As for processing every file , maybe as all this images uses the same sethings( because they all are pngs or jpegs) just have one check to change all the image properties at once without having to call it individually for them, not sure how it might work.

@reduz
Copy link
Member

reduz commented Aug 18, 2024

This is really outstanding work, but there is something that is worrying me in general.
Technically, EditorFileSystem is designed so that those files (.import or anything else) are only parsed on first editor load and then never again, since the file date and the information contained should be cached.

Basically, the expected behavior is that the second time you open the editor, the filesystem cache file is read, all files are simply checked for date if modified and if not nothing is done.

Still, I've seen reports or comments from users that even renaming a file is causing massive amount of work, so I have the feeling that this caching system has broken at some point and contributors kind of assume that "slow" is the way it should be expected to work.

I would really suggest investigating that the caching system is working to ensure we are not just band-aiding around a more fundamental bug (even if, of course, this optimization is still super welcome).

@Saul2022
Copy link

Saul2022 commented Aug 18, 2024

, EditorFileSystem is designed so that those files (.import or anything else) are only parsed on first editor load and then never again, since the file date and the information contained should be cached.

It kind of does no? Since on second load scan and opening stuff is faster than on the first. For example first time on trying to import sponza ivy glrf ( over270 mb ) it took 18:3 seconds to load the whole editor while on second time it took 12-11 seconds after reloading the editor on the stable version, being the thing that holds like 6 seconds on both times the godot logo transition to the editor

@Hilderin
Copy link
Contributor Author

Basically, the expected behavior is that the second time you open the editor, the filesystem cache file is read, all files are simply checked for date if modified and if not nothing is done.

That makes a lot more sense! Thanks for your feedback. I'll check that, I have a couple of ideas! Probably in another PR just to not mess with this one and to be careful.

@Hilderin
Copy link
Contributor Author

After some investigation, the main issue comes from calling _test_for_reimport on each asset file, which reads the .import and md5 files, etc. This method was added in commit bb83c7d to fix issue #13135.

If I understand the original issue correctly, the logic should be:

  • If the modification date changed:
    • If the hash changed:
      • Reimport

Currently, the logic is:

  • If the modification date changed AND the hash has not changed:
    • Reimport

From what I understand, the original issue seems to point out that sometimes the last modification date could be different even if the file was not modified, which is probably the case in issue #47053, and we have confirmation of this on some Android devices (#94416). In that case, comparing the hashes could prevent a useless asset reimportation. The only missing part is a way to check if the content of the .import file is different to prevent issue #60730.

Is my understanding correct?

@reduz
Copy link
Member

reduz commented Aug 19, 2024

@Hilderin Sorry I can't type much as I am on the phone. I think the idea is that the import file is only reparsed if the date changed. Then for imported assets, first date is checked and if it fails, then md5 is checked. Ultimately if both fail a reimport happens.

@Hilderin
Copy link
Contributor Author

Hilderin commented Aug 21, 2024

I finally decided to implement all the modifications in this PR. Following @reduz suggestion, I investigated more deeply why we needed to read all these files on startup.

Issues that I found:

  • All the .import files were always read, even if the last modified date did not change, when the project setting editor/import/reimport_missing_imported_files is true, which is its default value.
  • If editor/import/reimport_missing_imported_files is true, it was necessary to read the .import file to obtain the destination files of the import to validate if they still exist.
  • In 2017, when the original MD5 was added, the method _test_for_reimport did not check the importer, importation settings, or MD5 while scanning. Only when the last modification date of the asset file or the .import file were different.
  • Over the years, with each commit, additional steps were added in the scan process, such as reading the MD5 file, reading the importer from the resource type to check the importer version, and validating the import settings during the first scan, which gradually slowed down the scan.

What I suggest in this PR:

The algorithm now works like this:

  • Load the last filesystem_cache8 if present.
  • Scan process (first scan and scan for changes):
    • For each file, add the asset for an ACTION_FILE_TEST_REIMPORT:
      • If the last modification date of the asset file has changed
      • Or if the .import file is missing
      • Or if the last modification date of the .import file has changed
      • Or if editor/import/reimport_missing_imported_files is true and at least one of the destination files is missing
  • Process each file action ACTION_FILE_TEST_REIMPORT (in _update_scan_actions):
    • Add the file for reimportation:
      • If the .import file is missing
      • Or if the MD5 for the .import file is different
      • Or if the importer name in the .import file does not exist
      • Or if the importer version is greater
      • Or if the importer settings are invalid
      • Or if the MD5 of the source asset file is different
      • Or if the MD5 of the destination files is different
  • Reimport each file marked for reimportation.
  • Save the filesystem_cache8.

Time of the scans:

  • Before this PR: 1 min+
  • After these modifications: 8-9 sec.

@Saul2022
Copy link

I finally decided to implement all the modifications in this PR. Following @reduz suggestion, I investigated more deeply why we needed to read all these files on startup.

Wouldn't it be better to separate it on another pr so it easier to merge this one first as is safer and then take the turbo speedup one for another.

Time of the scans:

  • Before this PR: 1 min+
  • After these modifications: 8-9 sec.

Can confirm , literal on a project with 5k files total ( https://github.com/user-attachments/files/16075644/test_base_mesh.zip ) the file scan is instant getting to 100 % just after the logo transition ( which is already pretty fast now) the scene load the last what holding back, but that's for a future pr. Great job!

@Hilderin Hilderin force-pushed the fix-slow-load-on-large-project-v2 branch from c1e39e9 to 8ff22ad Compare September 20, 2024 15:53
@Hilderin
Copy link
Contributor Author

Rebased to fix merge conflicts.

@H2xDev
Copy link

H2xDev commented Sep 23, 2024

I confirm as well. The fix is working good on real project with 4.5gb of assets (about 15k files). Thank you!

Copy link
Member

@Calinou Calinou left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Tested locally, it works as expected. Code looks good to me.

Benchmark on the TPS demo:

Benchmark 1: bin/godot.linuxbsd.editor.x86_64 ~/Documents/Godot/tps-demo/project.godot --quit      <-------- This PR
  Time (mean ± σ):      3.464 s ±  0.534 s    [User: 2.220 s, System: 0.723 s]
  Range (min … max):    2.971 s …  4.253 s    10 runs
 
Benchmark 2: bin/godot.linuxbsd.editor.x86_64.master ~/Documents/Godot/tps-demo/project.godot --quit
  Time (mean ± σ):      3.943 s ±  0.224 s    [User: 2.198 s, System: 0.731 s]
  Range (min … max):    3.310 s …  4.045 s    10 runs

Summary
  bin/godot.linuxbsd.editor.x86_64 ~/Documents/Godot/tps-demo/project.godot --quit ran
    1.14 ± 0.19 times faster than bin/godot.linuxbsd.editor.x86_64.master ~/Documents/Godot/tps-demo/project.godot --quit

@Hilderin Hilderin force-pushed the fix-slow-load-on-large-project-v2 branch 2 times, most recently from 42e3d10 to fb3817f Compare September 24, 2024 23:29
@H2xDev
Copy link

H2xDev commented Sep 25, 2024

Could it be cherrypicked for 4.3.x?

@Hilderin Hilderin force-pushed the fix-slow-load-on-large-project-v2 branch from fb3817f to d17fa79 Compare September 25, 2024 21:22
Copy link
Member

@KoBeWi KoBeWi left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I get 4s speedup for my project load, but most of the startup time is outside scanning the filesystem. The scanning itself is noticeably faster though.

Did some testing and didn't find bugs.

@Hilderin Hilderin force-pushed the fix-slow-load-on-large-project-v2 branch from d17fa79 to 21f7c8a Compare September 25, 2024 21:36
@akien-mga akien-mga merged commit ef75473 into godotengine:master Sep 26, 2024
19 checks passed
@akien-mga
Copy link
Member

Thanks! Awesome work 🎉

@fgheorghe
Copy link

Lurker, but my project has 100k assets and this will greatly improve my workflow. Thank you for this! Awesome work indeed!!

@Saul2022

This comment was marked as off-topic.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

Editor takes a long time to load project with large amount of media files