Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

util: memcpy of whole structures should have a macro wrapper #9328

Closed
jmarantz opened this issue Dec 12, 2019 · 16 comments · Fixed by #13770
Closed

util: memcpy of whole structures should have a macro wrapper #9328

jmarantz opened this issue Dec 12, 2019 · 16 comments · Fixed by #13770
Assignees
Labels
beginner Good starter issues! help wanted Needs help! tech debt

Comments

@jmarantz
Copy link
Contributor

jmarantz commented Dec 12, 2019

Per discussion in #9306 we should have a check_format check for memcpy, with carve-outs for the buffer-appending case in that PR, and for a macro definition we add for copying blocks in a way that obviously cannot have memory bounds violations, e.g.:

#define MEMCPY(dst, src)  memcpy(dst, src, sizeof(*(src)))

or maybe

#define MEMCPY(dst, src)  memcpy(dst, src, std::min(sizeof(*(src), sizeof(*(dst))))

or maybe

#define MEMCPY(dst, src)  \
 do { \
    static_assert(sizeof(*(src)) == sizeof(*(dst))); \
    memcpy(dst, src, sizeof(*(src))); \
  } while (false)
@jmarantz
Copy link
Contributor Author

The macro is easy enough but there are 56 current violation of this outside of tests. One option is to check in the macro and one or two uses, but whitelist all the existing uses and incrementally remove them.

@jmarantz
Copy link
Contributor Author

diff --git a/tools/check_format.py b/tools/check_format.py
index a7bb5daf74..989493f944 100755
--- a/tools/check_format.py
+++ b/tools/check_format.py
@@ -43,6 +43,9 @@ REAL_TIME_WHITELIST = ("./source/common/common/utility.h",
                        "./test/test_common/utility.cc", "./test/test_common/utility.h",
                        "./test/integration/integration.h")
 
+MEMCPY_WHITELIST = ("./source/common/common/mem_block_builder.h",
+                    "./source/common/common/safe_memcpy.h")
+
 # Files in these paths can use MessageLite::SerializeAsString
 SERIALIZE_AS_STRING_WHITELIST = (
     "./source/common/config/version_converter.cc",
@@ -287,6 +290,10 @@ def whitelistedForRealTime(file_path):
   return file_path in REAL_TIME_WHITELIST
 
 
+def whitelistedForMemcpy(file_path):
+  return file_path in MEMCPY_WHITELIST
+
+
 def whitelistedForSerializeAsString(file_path):
   return file_path in SERIALIZE_AS_STRING_WHITELIST
 
@@ -585,6 +592,8 @@ def checkSourceLine(line, file_path, reportError):
       if comment == -1 or comment > grpc_init_or_shutdown:
         reportError("Don't call grpc_init() or grpc_shutdown() directly, instantiate " +
                     "Grpc::GoogleGrpcContext. See #8282")
+  if not whitelistedForMemcpy(file_path) and not "test/" in file_path and "memcpy(" in line:
+    reportError("Don't call memcpy() directly; use SAFE_MEMCPY or MemBlockBuilder.")
 
 
 def checkBuildLine(line, file_path, reportError):

@twghu
Copy link
Contributor

twghu commented Jan 22, 2020

Looking at this

  • review existing use of memcpy
  • modify to use macro
  • partial buffer copies are a candidate for using MemBlockBuilder.

@htuch
Copy link
Member

htuch commented Jun 4, 2020

@adisuissa also pointed out that there is an Abseil span-based pattern that works nicely here at https://abseil.io/tips/93

@jmarantz
Copy link
Contributor Author

jmarantz commented Jun 5, 2020

MemBlockBuilder uses absl::Span fwiw

@rialg
Copy link
Contributor

rialg commented Oct 25, 2020

Hi 👋 @jmarantz, I would like to work on this issue.

@jmarantz
Copy link
Contributor Author

Sounds good @gria1 -- please give it a go! Happy to review.

@rialg
Copy link
Contributor

rialg commented Oct 25, 2020

Great, thanks! So as to clarify ideas, it is expected the SAFE_MEMCPY macro to be added and changes to the use of memcpy(...) for either this or MemBlockBuilder in the source. Then, a set of tests (in the check_format.py script) must be included for a couple of this changes. Did I missed something? Does the macro ought to be written in its own header file or is it prefered to be under source/common/common/mem_block_builder.h

@jmarantz
Copy link
Contributor Author

I think I'd put that new macro in source/common/common/macros.h.

@rialg
Copy link
Contributor

rialg commented Oct 26, 2020

Hi, I am facing some difficulties when trying to compile the project locally using the docker container (with envoyproxy/envoy-build-ubuntu:b480535e8423b5fd7c102fd30c92f4785519e33a as its image). It seems there is an issue with some external dependency (mainly, zlib), so I was wondering whether you have seen this before.

$./ci/run_envoy_docker.sh './ci/do_ci.sh bazel.dev' 

(env. output omitted)

INFO: Analyzed target //source/exe:envoy-static (722 packages loaded, 38899 targets configured).
INFO: Found 1 target...
INFO: From CcCmakeMakeRule external/envoy/bazel/foreign_cc/zlib/include [for host]:

ERROR: /build/tmp/_bazel_envoybuild/b570b5ccd0454dc9af9f65ab1833764d/external/envoy/bazel/foreign_cc/BUILD:343:21: Error while validating output TreeArtifact File:[[<execution_root>]bazel-out/host/bin]external/envoy/bazel/foreign_cc/zlib/include : Failed to resolve relative path include/zconf.h inside TreeArtifact /build/tmp/_bazel_envoybuild/b570b5ccd0454dc9af9f65ab1833764d/execroot/envoy/bazel-out/host/bin/external/envoy/bazel/foreign_cc/zlib/include. The associated file is either missing or is an invalid symlink.
ERROR: /build/tmp/_bazel_envoybuild/b570b5ccd0454dc9af9f65ab1833764d/external/envoy/bazel/foreign_cc/BUILD:343:21: TreeArtifact external/envoy/bazel/foreign_cc/copy_zlib/zlib was not created
ERROR: /build/tmp/_bazel_envoybuild/b570b5ccd0454dc9af9f65ab1833764d/external/envoy/bazel/foreign_cc/BUILD:343:21: not all outputs were created or valid
Target //source/exe:envoy-static failed to build
ERROR: /build/tmp/_bazel_envoybuild/b570b5ccd0454dc9af9f65ab1833764d/external/com_github_cncf_udpa/udpa/annotations/BUILD:5:19 not all outputs were created or valid
INFO: Elapsed time: 640.918s, Critical Path: 6.80s
INFO: 90 processes: 82 internal, 8 processwrapper-sandbox.
FAILED: Build did NOT complete successfully

@jmarantz
Copy link
Contributor Author

I haven't but I confess I do not use docker. Not sure if that's related.

@rialg
Copy link
Contributor

rialg commented Oct 26, 2020

Anyway, another more relevant question hehe 😅 , personally, I am struggling a bit when deciding whether to choose SAFE_MEMCPY or MemBlockBuilder. For instance, would you (@jmarantz) consider the following a valid usage of either of those? And in case it is not, could you provide a simple example to clear our some ideas?
(Thxs in advanced 👍 )

diff --git a/source/extensions/filters/udp/dns_filter/dns_parser.cc b/source/extensions/filters/udp/dns_filter/dns_parser.cc
index 763424a63..6d6509164 100644
--- a/source/extensions/filters/udp/dns_filter/dns_parser.cc
+++ b/source/extensions/filters/udp/dns_filter/dns_parser.cc
@@ -5,6 +5,8 @@
 #include "common/network/address_impl.h"
 #include "common/network/utility.h"
 
+#include "common/common/mem_block_builder.h"
+
 #include "extensions/filters/udp/dns_filter/dns_filter_utils.h"
 
 #include "ares.h"
@@ -171,7 +173,9 @@ bool DnsMessageParser::parseDnsObject(DnsQueryContextPtr& context,
       state = DnsQueryParseState::Flags;
       break;
     case DnsQueryParseState::Flags:
-      ::memcpy(static_cast<void*>(&header_.flags), &data, field_size);
+      MemBlockBuilder<void*> mem_builder(field_size);
+      mem_builder.appendOne(static_cast<void*>(&data));
+      SAFE_MEMCPY(&header_.flags, mem_builder.release());
       state = DnsQueryParseState::Questions;
       break;
     case DnsQueryParseState::Questions:
@@ -741,7 +745,7 @@ void DnsMessageParser::buildResponseBuffer(DnsQueryContextPtr& query_context,
   buffer.writeBEInt<uint16_t>(response_header_.id);
 
   uint16_t flags;
-  ::memcpy(&flags, static_cast<void*>(&response_header_.flags), sizeof(uint16_t));
+  SAFE_MEMCPY(&flags, static_cast<void*>(&response_header_.flags));
   buffer.writeBEInt<uint16_t>(flags);
 
   buffer.writeBEInt<uint16_t>(response_header_.questions);

@jmarantz
Copy link
Contributor Author

I think SAFE_MEMCPY seems appropriate, if the types match. And in that case we should be able to drop the void* cast. I'm skeptical of the safety of anything that requires the void* cast.

If we need to we should be able to put the void* cast into the SAFE_MEMCPY macro definition, after first having validated that the source/dest sizes are the same.

rialg pushed a commit to rialg/envoy that referenced this issue Nov 7, 2020
@antoniovicente
Copy link
Contributor

Is this something that can be accomplished by a template function instead of a macro?

@jmarantz
Copy link
Contributor Author

Agreed: I don't know why I didn't suggest that in the first place. We should use a template function rather than a macro for this. I think (other than renames for style) it will otherwise be a drop-in replacement.

@rialg
Copy link
Contributor

rialg commented Nov 11, 2020

Ok, I will try to make the replacement in the PR.

jmarantz pushed a commit that referenced this issue Mar 16, 2021
Initial migration from memcpy to SAFE_MEMCPY Macro and MemBlockBuilder.

Fixes #9328

Signed-off-by: grial <[email protected]>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
beginner Good starter issues! help wanted Needs help! tech debt
Projects
None yet
Development

Successfully merging a pull request may close this issue.

5 participants