Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Ungarbled binary resources #1131

Merged
merged 4 commits into from
Sep 14, 2024
Merged

Ungarbled binary resources #1131

merged 4 commits into from
Sep 14, 2024

Conversation

veloman-yunkan
Copy link
Collaborator

Fixes #1129

Added a test that checks that the static resources returned by the
server have content that matches their cacheid. This test currently
fails because some binary resources (e.g. png images) are garbled by the
dos2unix conversion.
Copy link

codecov bot commented Sep 11, 2024

Codecov Report

Attention: Patch coverage is 0% with 1 line in your changes missing coverage. Please review.

Project coverage is 41.43%. Comparing base (f5c91cc) to head (c8524b9).
Report is 5 commits behind head on main.

Files with missing lines Patch % Lines
src/server/internalServer.cpp 0.00% 0 Missing and 1 partial ⚠️
Additional details and impacted files
@@           Coverage Diff           @@
##             main    #1131   +/-   ##
=======================================
  Coverage   41.43%   41.43%           
=======================================
  Files          59       59           
  Lines        4245     4245           
  Branches     2323     2323           
=======================================
  Hits         1759     1759           
  Misses        990      990           
  Partials     1496     1496           

☔ View full report in Codecov by Sentry.
📢 Have feedback on the report? Share it here.

@veloman-yunkan veloman-yunkan marked this pull request as ready for review September 11, 2024 14:08
@veloman-yunkan
Copy link
Collaborator Author

This PR fixes the bug introduced by #1113. More importantly it provides protection (in the form of a unit test) against similar future problems.

There are a few things that I don't like about this PR:

  • It works by invoking an external command (a small python script) to perform a HTTP request and compute the hash of the response. I chose that approach in order to avoid having to introduce a dependency on a library providing support for SHA1 computation.
  • The list of static resources checked by the new test (as well a few old ones) is hardcoded and must be manually updated when new resources are added.
  • The list of file extensions considered to be binary (and thus to be excluded from dos2unix conversion) is hardcoded and must be updated when new types of binary resources are introduced.

@veloman-yunkan
Copy link
Collaborator Author

@kelson42 How shall we proceed with this PR without @mgautierfr? I am most interested in feedback on the proposed method of protection against bugs similar to the one being fixed.

@kelson42
Copy link
Collaborator

kelson42 commented Sep 11, 2024

@veloman-yunkan We have to deal without his feedback. But I have two remarks:

... so that all resources have extensions and can be automatically
categorized as binary or text based on extension (coming next).
Now if a static resource of a new type is added the build will
fail unless the list of known file type extensions is updated.
@veloman-yunkan
Copy link
Collaborator Author

  • The list of file extensions considered to be binary (and thus to be excluded from dos2unix conversion) is hardcoded and must be updated when new types of binary resources are introduced.

Now when new types of resources are added that won't go unnoticed - the build will fail unless the list of known extensions is updated.

@veloman-yunkan
Copy link
Collaborator Author

@veloman-yunkan We have to deal without his feedback. But I have two remarks:

  • I'm concerned by the growing difficulty of updating these static files

With the newly committed change no extra difficulty regarding maintenance of static resource files is being introduced (as long as we are concerned only with things that have to be memorized). The second point in my earlier comment is just a reiteration of a (slight) problem that existed before this PR. There is a way to add a test that would mostly eliminate that issue but IMHO we shouldn't spend any effort on it until it proves to be a real rather than a hypothetical problem.

I reread the README but don't think that it needs to be updated. If the pressure to do so mounts, we better enhance our test suite as mentioned in the previous paragraph.

That may solve some of the addressed concerns (though at the cost of a significantly larger effort) but should not be considered as a substitute for what can be checked on the unit-test level.

Our mechanism for embedding resources has been significantly enhanced since when the original issue was filed. That on the one hand makes that ticket more justified since the system has become more complex and perhaps can be optimized. On the other hand the task has become more difficult too and maybe we shouldn't touch what works reasonably well until we notice that we have to spend excessive hours on solving issues rooted in the current implementation of how we deal with static resources.

@kelson42 kelson42 merged commit 327fec1 into main Sep 14, 2024
14 of 15 checks passed
@kelson42 kelson42 deleted the ungarbled_binary_resources branch September 14, 2024 12:43
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

Binary resources in libkiwix have been broken by dos2unix conversion
2 participants