This issue was moved to a discussion.

Receiving back EBMS:0009 when sending via Peppol AS4 to oxalis (potentially just new version). #236

Closed
Praedo4 opened this issue Feb 7, 2024 · 19 comments
Labels
Under review Issues currently being reviewed
@Praedo4

Praedo4 commented Feb 7, 2024

Hi Arun and the Oxalis team,

I'm posting an issue here because I'm not aware of a better way to ask a question.

Our Peppol AP has recently started failing to communicate in production with other APs that run Oxalis (at least two different APs so far). Nothing has changed on our side, and we still have no problems with the rest of the APs. We do not use Oxalis ourselves; the error is reported by the receiving party, which does run Oxalis. The error we get back is 'EBMS:0009'. Error details are below.

This seems to be related to digital signature verification. Could it be related to #200? Maybe we can try to reproduce it in a test environment. I think our support department has already reached out to the other APs, but it would be good to know whether you can help them/us resolve this issue asap.

I also mention in the title that this potentially affects only the latest version, because we know that at least one of the APs we have this issue with recently upgraded to the most recent version of Oxalis (6.4.0 to our knowledge). We also know that plenty of APs run Oxalis without this issue being observed.

<eb:ErrorDetail>javax.xml.crypto.dsig.TransformException: java.lang.RuntimeException: java.io.IOException: javax.crypto.AEADBadTagException: Tag mismatch!
cause: javax.xml.crypto.dsig.TransformException: java.lang.RuntimeException: java.io.IOException: javax.crypto.AEADBadTagException: Tag mismatch!
cause: javax.xml.crypto.dsig.TransformException: java.lang.RuntimeException: java.io.IOException: javax.crypto.AEADBadTagException: Tag mismatch!
cause: java.lang.RuntimeException: java.io.IOException: javax.crypto.AEADBadTagException: Tag mismatch!
cause: java.io.IOException: javax.crypto.AEADBadTagException: Tag mismatch!
cause: javax.crypto.AEADBadTagException: Tag mismatch!
cause: Tag mismatch!</eb:ErrorDetail>

@artjomsk

artjomsk commented Feb 8, 2024

Hi!
I don't think this is related to issue #200, because when the ciphers are not set up with Java 17, the access point fails to send every document to any receiver. In your case, as I understand it, those other APs have communication issues with your access point only.
I can't really help regarding the version; we're using 6.2.0. But you can typically check the Oxalis version used by an AP by taking its endpoint address (e.g. from a lookup result) and manually changing /as4 to /status in the browser address bar. The status servlet is included in Oxalis and shows the Oxalis version, the Java version and other useful info.
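
The /as4 to /status substitution described above can be sketched in a few lines of Python. This is only an illustration: the function name `status_url` and the example host are hypothetical, and it assumes the endpoint path ends in `/as4`, as in the comment above.

```python
from urllib.parse import urlsplit, urlunsplit


def status_url(endpoint_url: str) -> str:
    """Return the /status URL for an AS4 endpoint URL whose path ends in /as4.

    Hypothetical helper: it only rewrites the URL, it does not contact the server.
    """
    parts = urlsplit(endpoint_url)
    path = parts.path.rstrip("/")
    if not path.endswith("/as4"):
        raise ValueError(f"expected an endpoint path ending in /as4, got {parts.path!r}")
    new_path = path[: -len("/as4")] + "/status"
    # Drop any query string or fragment; only scheme, host and path are kept.
    return urlunsplit((parts.scheme, parts.netloc, new_path, "", ""))


# Example with a made-up host name:
print(status_url("https://ap.example.com/oxalis/as4"))
# prints "https://ap.example.com/oxalis/status"
```

The resulting URL can then be opened in a browser or fetched with any HTTP client to read the version information.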

@Praedo4
Author

Praedo4 commented Feb 12, 2024

Hi @artjomsk,

Thank you for finding the time to comment on this. I thought the same about #200, but I am not sure whether Java configuration could come into play in some way such that communication still works between APs running Oxalis (and maybe phase4) but fails with us, since our AP is not Java based.

Checking the two APs we have trouble with, I got:

version.oxalis: 6.4.0
version.java: 17.0.2
mode: PRODUCTION
certificate.subject: CN=xxx,OU=PEPPOL PRODUCTION AP,O=yyy,C=NL
certificate.issuer: CN=PEPPOL ACCESS POINT CA - G2,O=OpenPEPPOL AISBL,C=BE
certificate.expired: false
build.id: c5ccfb7db2d07e44c43e5331acd5cd7dee8ec59b
build.tstamp: 09.12.2023 @ 07:36:49 UTC

and

version.oxalis: 4.1.1
version.java: 1.8.0_382
mode: PRODUCTION
lookup.locator.hostname: edelivery.tech.ec.europa.eu
certificate.subject: CN=xxx,OU=PEPPOL PRODUCTION AP,O=yyy,C=NL
certificate.issuer: CN=PEPPOL ACCESS POINT CA - G2,O=OpenPEPPOL AISBL,C=BE
certificate.expired: false
build.id: 2c4b36ab050ff68971e719004ce454195e5c3da1
build.tstamp: 06.02.2020 @ 10:35:50 UTC

So there goes my theory about this issue having something to do only with the latest oxalis release...

@dladlk

dladlk commented Feb 21, 2024

Did you manage to find the cause of the issue? It could be that a wrong receiver certificate was used when sending (e.g. cached SMP lookup results, or a certificate updated at the AP but not in the SMP).

@Praedo4
Author

Praedo4 commented Feb 21, 2024

Hi @dladlk, no clarity on this issue yet. If this is about signing and signature verification, then no lookup is involved: the sending party signs with its own Peppol certificate and embeds that certificate in the XML for use in verification. The receiver only verifies that the XML signature is intact and that the signing certificate is a valid (non-revoked) Peppol certificate.

It must be something about a different XML signature verification algorithm being used, or something along those lines.

Are you experiencing a similar issue?

@dladlk

dladlk commented Feb 21, 2024

No, we do not experience it; I'm just curious and trying to help.
AEADBadTagException is thrown by Cipher, so it relates to encryption/decryption. As I understand it, you posted the response from the Oxalis receiver, which means there definitely was a lookup on your side to identify this Oxalis server. The lookup result gives you a certificate and the URL of the server. This certificate can differ from what the server at that URL expects.
Did you post the full stack trace you got in the response? It looks slightly cut... From the full stack trace we can tell in which phase it happens. Also, I do not see a big security issue in posting here the identifier of the receiver you have issues with, so that we can see which SMP setup is there... Even hiding the SPID in NL like CN=xxx,OU=PEPPOL PRODUCTION AP,O=yyy,C=NL seems a little overkill to me :)

@aaron-kumar
Member

aaron-kumar commented Mar 7, 2024

> Maybe we can try to reproduce it on a test environment. I think our support department already reached out to the other APs, but it would be good to know if you can assist them/us to resolve this issue asap.

Are you able to reproduce the error?

> I also mention in the title that this is potentially only the case for the latest version, because at least one of the APs we have this issue, we know recently upgraded to the most recent version of oxalis (6.4.0 to our knowledge)

So you are saying that things were working fine while the receiving AP was using an older Oxalis version (which version?). Could you check with the receiving AP what else changed on their end (a change in Java version, enabling of TLSv1.3, addition or overriding of some conflicting library, etc.) when they upgraded from the previous version of Oxalis to 6.4.0?

@aaron-kumar
Member

aaron-kumar commented Mar 7, 2024

> So there goes my theory about this issue having something to do only with the latest oxalis release...

Unfortunately, your theory is not consistent with the examples you gave. Under "Checking the two APs we get trouble with I got:" you listed "oxalis: 4.1.1", which was released on "06.02.2020" and runs on "1.8.0_382", and "oxalis: 6.4.0", which was released on "09.12.2023" and runs on "17.0.2".

It should also be noted that no other AP using Oxalis 6.4.0 has reported this issue. So I recommend that you investigate the issue jointly and find the cause, or share more details with us, ideally a complete stack trace, so that we can reproduce it.

If during your investigation you find any issue with the interoperability of the two libraries, please report back with details.

@Praedo4
Author

Praedo4 commented Mar 7, 2024

Hi @dladlk and @aaron-kumar ,

First of all, thank you both for stopping by to comment on this. Let me reply to everything one by one.

> No, we do not experience it - just curious and try to help. AEADBadTagException happens at Cipher, so relates to encryption/decryption. As I understand, you posted the response from Oxalis receiver - and it means that there was a lookup on your side 100% to identifier this Oxalis server. In lookup results you get a certificate and a URL of the server. This certificate can be different to what is expected by the server under this URL. Did you post a full stacktrace you get in response? It looks slightly cut... Looking at full stack trace we can detect, on which phase it happens. Also I do not see a big security issue posting here the identifier of receiver you have issues with - so we can see which SMP setup is there... Even hiding SPID in NL like CN=xxx,OU=PEPPOL PRODUCTION AP,O=yyy,C=NL is a little overhead for me :)

I've checked internally, and it was decided that it makes no sense for us to share any details of the other parties without their permission. We've also notified both service providers (SPs) that we opened this GitHub issue, so they are free to join the conversation if they wish.

We've been in contact with the SP that recently upgraded to 6.4.0, one of the parties we have the issue with, to investigate further. I've shared the details of the certificate used for encrypting the message (based on the data from the SMP lookup) and a dump of the signed & encrypted message we sent them. We also double-checked all the signing & encryption algorithms used on our side; they match the specification one for one, and we provided these details as well.

The last thing we heard from them is that they will try to reproduce the issue and come back to us. We are also open to reproducing the issue in the test environment with them, and we have communicated that.

During these conversations they also shared the full error log they are able to see. Previously I shared only the error details that were sent back in the AS4 error response; this is the full log:

2024-02-07 14:06:47,505  WARN [org.apache.cxf.phase.PhaseInterceptorChain] Interceptor for {http://inbound.as4.oxalis.network/}As4ProviderService has thrown exception, unwinding now
org.apache.cxf.binding.soap.SoapFault: A security error was encountered when verifying the message
     at org.apache.cxf.ws.security.wss4j.WSS4JUtils.createSoapFault(WSS4JUtils.java:240)
     at org.apache.cxf.ws.security.wss4j.WSS4JInInterceptor.handleMessageInternal(WSS4JInInterceptor.java:382)
     at org.apache.cxf.ws.security.wss4j.WSS4JInInterceptor.handleMessage(WSS4JInInterceptor.java:213)
     at org.apache.cxf.ws.security.wss4j.PolicyBasedWSS4JInInterceptor.handleMessage(PolicyBasedWSS4JInInterceptor.java:123)
     at org.apache.cxf.ws.security.wss4j.PolicyBasedWSS4JInInterceptor.handleMessage(PolicyBasedWSS4JInInterceptor.java:76)
     at org.apache.cxf.phase.PhaseInterceptorChain.doIntercept(PhaseInterceptorChain.java:307)
     at org.apache.cxf.transport.MultipleEndpointObserver.onMessage(MultipleEndpointObserver.java:98)
     at org.apache.cxf.transport.http.AbstractHTTPDestination.invoke(AbstractHTTPDestination.java:265)
     at org.apache.cxf.transport.servlet.ServletController.invokeDestination(ServletController.java:234)
     at org.apache.cxf.transport.servlet.ServletController.invoke(ServletController.java:208)
     at org.apache.cxf.transport.servlet.ServletController.invoke(ServletController.java:160)
     at org.apache.cxf.transport.servlet.CXFNonSpringServlet.invoke(CXFNonSpringServlet.java:225)
     at org.apache.cxf.transport.servlet.AbstractHTTPServlet.handleRequest(AbstractHTTPServlet.java:304)
     at org.apache.cxf.transport.servlet.AbstractHTTPServlet.doPost(AbstractHTTPServlet.java:217)
     at javax.servlet.http.HttpServlet.service(HttpServlet.java:665)
     at org.apache.cxf.transport.servlet.AbstractHTTPServlet.service(AbstractHTTPServlet.java:279)
     at com.google.inject.servlet.ServletDefinition.doServiceImpl(ServletDefinition.java:290)
     at com.google.inject.servlet.ServletDefinition.doService(ServletDefinition.java:280)
     at com.google.inject.servlet.ServletDefinition.service(ServletDefinition.java:184)
     at com.google.inject.servlet.ManagedServletPipeline.service(ManagedServletPipeline.java:89)
     at com.google.inject.servlet.FilterChainInvocation.doFilter(FilterChainInvocation.java:85)
     at io.opentracing.contrib.web.servlet.filter.TracingFilter.doFilter(TracingFilter.java:189)
     at com.google.inject.servlet.FilterChainInvocation.doFilter(FilterChainInvocation.java:82)
     at com.google.inject.servlet.ManagedFilterPipeline.dispatch(ManagedFilterPipeline.java:121)
     at com.google.inject.servlet.GuiceFilter.doFilter(GuiceFilter.java:133)
     at org.eclipse.jetty.servlet.FilterHolder.doFilter(FilterHolder.java:193)
     at org.eclipse.jetty.servlet.ServletHandler$Chain.doFilter(ServletHandler.java:1626)
     at org.eclipse.jetty.servlet.ServletHandler.doHandle(ServletHandler.java:552)
     at org.eclipse.jetty.server.handler.ScopedHandler.nextHandle(ScopedHandler.java:233)
     at org.eclipse.jetty.server.handler.ContextHandler.doHandle(ContextHandler.java:1440)
     at org.eclipse.jetty.server.handler.ScopedHandler.nextScope(ScopedHandler.java:188)
     at org.eclipse.jetty.servlet.ServletHandler.doScope(ServletHandler.java:505)
     at org.eclipse.jetty.server.handler.ScopedHandler.nextScope(ScopedHandler.java:186)
     at org.eclipse.jetty.server.handler.ContextHandler.doScope(ContextHandler.java:1355)
     at org.eclipse.jetty.server.handler.ScopedHandler.handle(ScopedHandler.java:141)
     at org.eclipse.jetty.server.handler.HandlerWrapper.handle(HandlerWrapper.java:127)
     at org.eclipse.jetty.server.Server.handle(Server.java:516)
     at org.eclipse.jetty.server.HttpChannel.lambda$handle$1(HttpChannel.java:487)
     at org.eclipse.jetty.server.HttpChannel.dispatch(HttpChannel.java:732)
     at org.eclipse.jetty.server.HttpChannel.handle(HttpChannel.java:479)
     at org.eclipse.jetty.server.HttpConnection.onFillable(HttpConnection.java:277)
     at org.eclipse.jetty.io.AbstractConnection$ReadCallback.succeeded(AbstractConnection.java:311)
     at org.eclipse.jetty.io.FillInterest.fillable(FillInterest.java:105)
     at org.eclipse.jetty.io.ChannelEndPoint$1.run(ChannelEndPoint.java:104)
     at org.eclipse.jetty.util.thread.strategy.EatWhatYouKill.runTask(EatWhatYouKill.java:338)
     at org.eclipse.jetty.util.thread.strategy.EatWhatYouKill.doProduce(EatWhatYouKill.java:315)
     at org.eclipse.jetty.util.thread.strategy.EatWhatYouKill.tryProduce(EatWhatYouKill.java:173)
     at org.eclipse.jetty.util.thread.strategy.EatWhatYouKill.run(EatWhatYouKill.java:131)
     at org.eclipse.jetty.util.thread.ReservedThreadExecutor$ReservedThread.run(ReservedThreadExecutor.java:409)
     at org.eclipse.jetty.util.thread.QueuedThreadPool.runJob(QueuedThreadPool.java:883)
     at org.eclipse.jetty.util.thread.QueuedThreadPool$Runner.run(QueuedThreadPool.java:1034)
     at java.base/java.lang.Thread.run(Thread.java:833)
Caused by: org.apache.wss4j.common.ext.WSSecurityException: javax.xml.crypto.dsig.TransformException: java.io.IOException: javax.crypto.AEADBadTagException: Tag mismatch!
     at org.apache.wss4j.dom.processor.SignatureProcessor.verifyXMLSignature(SignatureProcessor.java:408)
     at org.apache.wss4j.dom.processor.SignatureProcessor.handleToken(SignatureProcessor.java:230)
     at org.apache.wss4j.dom.engine.WSSecurityEngine.processSecurityHeader(WSSecurityEngine.java:340)
     at org.apache.cxf.ws.security.wss4j.WSS4JInInterceptor.handleMessageInternal(WSS4JInInterceptor.java:326)
     ... 50 common frames omitted
Caused by: javax.xml.crypto.dsig.XMLSignatureException: javax.xml.crypto.dsig.TransformException: java.io.IOException: javax.crypto.AEADBadTagException: Tag mismatch!
     at org.apache.jcp.xml.dsig.internal.dom.DOMReference.transform(DOMReference.java:541)
     at org.apache.jcp.xml.dsig.internal.dom.DOMReference.validate(DOMReference.java:380)
     at org.apache.jcp.xml.dsig.internal.dom.DOMXMLSignature.validate(DOMXMLSignature.java:274)
     at org.apache.wss4j.dom.processor.SignatureProcessor.verifyXMLSignature(SignatureProcessor.java:381)
     ... 53 common frames omitted
Caused by: javax.xml.crypto.dsig.TransformException: java.io.IOException: javax.crypto.AEADBadTagException: Tag mismatch!
     at org.apache.wss4j.dom.transform.AttachmentContentSignatureTransform.processAttachment(AttachmentContentSignatureTransform.java:242)
     at org.apache.wss4j.dom.transform.AttachmentContentSignatureTransform.transform(AttachmentContentSignatureTransform.java:121)
     at org.apache.jcp.xml.dsig.internal.dom.DOMTransform.transform(DOMTransform.java:166)
     at org.apache.jcp.xml.dsig.internal.dom.DOMReference.transform(DOMReference.java:451)
     ... 56 common frames omitted
Caused by: java.io.IOException: javax.crypto.AEADBadTagException: Tag mismatch!
     at java.base/javax.crypto.CipherInputStream.getMoreData(CipherInputStream.java:148)
     at java.base/javax.crypto.CipherInputStream.read(CipherInputStream.java:261)
     at org.apache.wss4j.common.util.AttachmentUtils$1.read(AttachmentUtils.java:537)
     at org.apache.cxf.attachment.DelegatingInputStream.read(DelegatingInputStream.java:90)
     at java.base/java.io.BufferedInputStream.fill(BufferedInputStream.java:244)
     at java.base/java.io.BufferedInputStream.read1(BufferedInputStream.java:284)
     at java.base/java.io.BufferedInputStream.read(BufferedInputStream.java:343)
     at java.base/java.io.FilterInputStream.read(FilterInputStream.java:132)
     at java.base/java.io.FilterInputStream.read(FilterInputStream.java:106)
     at org.apache.wss4j.dom.transform.AttachmentContentSignatureTransform.processAttachment(AttachmentContentSignatureTransform.java:216)
     ... 59 common frames omitted
Caused by: javax.crypto.AEADBadTagException: Tag mismatch!
     at java.base/com.sun.crypto.provider.GaloisCounterMode$GCMDecrypt.doFinal(GaloisCounterMode.java:1395)
     at java.base/com.sun.crypto.provider.GaloisCounterMode.engineDoFinal(GaloisCounterMode.java:432)
     at java.base/javax.crypto.Cipher.doFinal(Cipher.java:2152)
     at java.base/javax.crypto.CipherInputStream.getMoreData(CipherInputStream.java:145)
     ... 68 common frames omitted

@dladlk

dladlk commented Mar 7, 2024

From the stack trace I see that the error happens while decrypting the attachment payload in order to calculate its digest and compare it with the one included in the signature:

at org.apache.wss4j.dom.transform.AttachmentContentSignatureTransform.processAttachment(AttachmentContentSignatureTransform.java:216)

It happens when the input stream has been read to the end (the attempt to read more data returns -1), so the Cipher tries to finalize decryption:

at java.base/javax.crypto.Cipher.doFinal(Cipher.java:2152)

I would suggest that the receiver check the size of the transferred payload (e.g. via the POST size in the ingress logs), since it may exceed some limit on their ingress, or check the CXF temporary folder where "big" payloads (more than 127KB of zipped/encrypted data) are cached to the file system, since something may be wrong with the file. You can also check whether small payloads are sent successfully while big ones fail, etc.
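
To illustrate why the failure only shows up once the stream is exhausted, here is a small Python analogy. It is a sketch under loud assumptions: it uses an HMAC tag from the standard library instead of AES-GCM, and `read_and_verify` and `TagMismatch` are hypothetical names that are not part of Oxalis, WSS4J or the JDK. The point it shares with the stack trace is that an integrity tag covers the whole payload, so it can only be checked after the last chunk has been read.

```python
import hashlib
import hmac
import io


class TagMismatch(Exception):
    """Raised when the integrity tag does not match the bytes that were read."""


def read_and_verify(stream, key: bytes, expected_tag: bytes, chunk_size: int = 16) -> bytes:
    """Read the stream chunk by chunk and check the tag only at end of stream."""
    mac = hmac.new(key, digestmod=hashlib.sha256)
    chunks = []
    while True:
        chunk = stream.read(chunk_size)
        if not chunk:
            # End of stream: only now has the tag seen every byte.
            break
        mac.update(chunk)
        chunks.append(chunk)
    if not hmac.compare_digest(mac.digest(), expected_tag):
        raise TagMismatch("Tag mismatch!")
    return b"".join(chunks)


key = b"demo-key"  # illustrative key, not a real secret
payload = b"<Invoice>example payload</Invoice>" * 4
tag = hmac.new(key, payload, hashlib.sha256).digest()

# Untouched bytes verify fine.
assert read_and_verify(io.BytesIO(payload), key, tag) == payload

# A single changed byte anywhere in transit is detected, but only at the end.
tampered = payload[:10] + b"?" + payload[11:]
try:
    read_and_verify(io.BytesIO(tampered), key, tag)
except TagMismatch as exc:
    print("verification failed after reading everything:", exc)
```

In this analogy, any modification of the bytes between sender and receiver produces the same end-of-stream failure regardless of payload size.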

@Praedo4
Author

Praedo4 commented Mar 7, 2024

> Are you able to reproduce error?

So far we consistently have this issue with these two parties in production, and neither of them has responded to our offer to reproduce it in the test environment, but we remain open to it.

> So you are saying that things were working fine until receiving AP was using older Oxalis version (which version?). Is it possible for you to check with receiving AP as what else changed (change in java version, enabling of TLSv1.3 or addition/overriding by some conflicting library, etc ...) at their end when receiving AP upgraded from previous version of Oxalis to 6.4.0?

So, I've looked up our conversations with this AP dating from Nov 2023, when they still ran Oxalis 4.1.0 and Java 8. We were getting a 500 response back without any proper AS4 error message. When we reached out, what they found on their side were some warnings (not errors) logged with javax.crypto.AEADBadTagException: Tag mismatch. This was never fixed, and since they were planning to upgrade in 2024 to a more recent Oxalis release that supports reporting, it was put on hold. After the system upgrade on their side, the issue changed to the one that I reported.

I should also mention that, as a temporary workaround at the end of 2023, we were able to use an old Oxalis release (4.1.1) on our side, and it could send documents to this party. We compared everything in the SOAP envelope (both Messaging and Security) between the outgoing messages from Oxalis 4.1.1 and from our current solution and could not find any meaningful difference. It was also configured to use the same Peppol certificate and the same SML, of course. With our current solution we have not had such issues with other APs before or since (except for one more party that was discovered later and that I mention in this GitHub issue).

> Unfortunately your theory do not justifying whatever examples you gave . You mentioned "Checking the two APs we get trouble with I got:" , where you mentioned "oxalis: 4.1.1" which was released on "06.02.2020" & running on "1.8.0_382" and "oxalis: 6.4.0" which was released on "09.12.2023" & running on "17.0.2".
>
> It is also to be noted that no other AP using Oxalis 6.4.0 reported this issue. So I recommend, please jointly investigate the issue and find out issue or share with us more details possibly complete stacktrace to reproduce this issue.
>
> If during your investigation, you found any issue with interoperability of two libraries then please report back with details.

Indeed, once I could look up the Oxalis version of the other AP, which turned out to be 4.1.1, it was clear that this is not specific to the latest Oxalis release. I suppose it has something to do with configuration or network setup one way or another, as it does not correlate with just the Oxalis / Java version by the looks of it.

I will keep you updated if we have any findings.

@Praedo4
Author

Praedo4 commented Mar 7, 2024

> I would suggest for receiver to check the size of the transferred payload (e.g. via ingress logs of POST size), maybe it exceeds some limits on their ingress, or check CXF temporary folder where "big" payloads (more than 127KB of zipped/encrypted data) are cached to file system - something wrong can be with the file. Also you can try to see if small payloads are sent successfully but big failing etc...

Hmm, I will pass this on to the other party; thanks for the pointer, or at least something to investigate. However, I've just checked, and the last invoice used to reproduce the issue was about 71KB with SBDH, and the full AS4 message that was actually sent was around 50KB after encryption and compression. That doesn't sound that large.

Also, not long ago we again verified our latest version against the Peppol Testbed v2 and passed all the tests, including the large file scenario with an attachment of almost 10MB :) So I don't expect any problems there on our side.

@dladlk

dladlk commented Mar 7, 2024

Then the size is not the cause, I agree :) The only thing we can conclude is that decryption fails after reading the full payload.

@aaron-kumar aaron-kumar added this to the 7.x.x milestone Mar 18, 2024
@karelkryda

karelkryda commented Apr 26, 2024

Hello,
it looks like we've run into the same problem.

Our infrastructure:

  • Amazon Linux 2023.4.20240401
  • Java version 11.0.23
  • Tomcat version 9.0.87
  • Oxalis server version 6.5.0

This is all hidden behind the AWS API Gateway (it listens on our domain and provides the HTTPS connection); requests are then forwarded to the AWS ELB listening on the HTTP port. This makes the Oxalis AS4 plugin accessible directly on our domain without having to add /oxalis/as4. Everything seemed to be working, so we tried to go through Peppol AP onboarding to get a production certificate. Unfortunately, right at the second functionality test, TC2A.2B: AS4 message reception, we ran into the above-mentioned error after receiving the message, i.e.:

org.apache.cxf.phase.PhaseInterceptorChain.doDefaultLogging Interceptor for {http://inbound.as4.oxalis.network/}As4ProviderService has thrown exception, unwinding now

org.apache.cxf.binding.soap.SoapFault: A security error was encountered when verifying the message

Caused by: org.apache.wss4j.common.ext.WSSecurityException: javax.xml.crypto.dsig.TransformException: java.io.IOException: javax.crypto.AEADBadTagException: Tag mismatch!

We already tried the Java Temurin distribution instead of Corretto, with no change. This issue is blocking for us, as we are unable to get a production Peppol certificate, let alone operate as an Access Point for our customers.

@aaron-kumar, would it be possible for you to investigate this issue earlier than milestone 7.0.0?

Thank you in advance for your feedback.

@Praedo4
Author

Praedo4 commented Apr 26, 2024

I've been planning to give an update for a while, and here it finally is.

TLDR: for us the issue is still there when sending to one other Peppol access point.

Following the discussions here in February/March, we've completed an internal analysis of this issue on our Access Point (AP). We found that there is indeed one other AP to which we consistently have trouble sending documents, getting an error back from them. We've reached out to them to try to look into it.

There were also two other APs for which a few of our documents failed with similar errors, but that issue resolved itself after some time. There were no updates on our side in the meantime, so something must have changed on the receiving end. We've reached out to both APs and unfortunately didn't get any helpful details on what the solution could have been.

For the AP we still have trouble with, we have a temporary workaround: using an old Oxalis 4.1.1, which can still deliver messages to them. We were also able to reproduce the same issue in a test environment (using SMK) but haven't reached any conclusion on what the issue is or how we can solve it.

Also, when they deployed Oxalis 6.5.0 in the test environment, the reported error changed to 'Code: EBMS:0004, Message: EBMS:0004 Other PEPPOL:NOT_SERVICED' instead of the 'javax.xml.crypto.dsig.TransformException: java.io.IOException: javax.crypto.AEADBadTagException: Tag mismatch!' we saw previously. Checking the difference between 6.4.0 and 6.5.0, it's clear that only extra error handling was added, which doesn't really help to track down the issue.

@aaron-kumar aaron-kumar added the Under review Issues currently being reviewed label May 3, 2024
@aaron-kumar aaron-kumar modified the milestones: 7.x.x, 6.x.x May 3, 2024
@aaron-kumar aaron-kumar moved this to Open Issues- Review Required in Oxalis Public Roadmap May 3, 2024
@karelkryda

karelkryda commented May 3, 2024

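Hi,
our problem was caused by the AWS API Gateway, which by default converts the binary payload into UTF-8 encoded JSON strings. The solution was to set */* under Binary media types in the API settings of the API Gateway. This solved our problem and we were able to pass Peppol onboarding.

I am attaching the source of the above finding:

* [StackOverflow thread](https://stackoverflow.com/a/59383951)

* [AWS documentation](https://docs.aws.amazon.com/apigateway/latest/developerguide/api-gateway-payload-encodings-configure-with-console.html)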

@aaron-kumar
Member

> Also, when they deployed Oxalis 6.5.0 in the test environment, the reported error changed to 'Code: EBMS:0004, Message: EBMS:0004 Other PEPPOL:NOT_SERVICED' instead of 'javax.xml.crypto.dsig.TransformException: java.io.IOException: javax.crypto.AEADBadTagException: Tag mismatch!' were saw previously. When checking the difference between 6.4.0 and 6.5.0 and clear that it's just the extra error handling that was added, but it doesn't really help to track down the issue.

@Praedo4: If the reported error changed to 'Code: EBMS:0004, Message: EBMS:0004 Other PEPPOL:NOT_SERVICED', then it is certain that they are not using the right certificate. There is some problem with certificate validation.

In general, EBMS_0009 is a generic type of error, thrown with a possible detail message.

@aaron-kumar
Member

aaron-kumar commented May 5, 2024

> hi, our problem was caused by the AWS API gateway that modifies the binary data payload into UTF-8 encoded JSON strings by default. The solution was to set */* as Binary media types in the API settings of the API gateway. This solved our problem and we were able to pass Peppol onboarding.
>
> I am attaching the source of the above finding:
>
> * [StackOverflow thread](https://stackoverflow.com/a/59383951)
>
> * [AWS documentation](https://docs.aws.amazon.com/apigateway/latest/developerguide/api-gateway-payload-encodings-configure-with-console.html)

Thanks @karelkryda for sharing; this will be helpful for others, especially those using AWS.
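
For readers wondering why treating a binary payload as UTF-8 text breaks decryption, here is a minimal Python sketch. It does not reproduce what the API Gateway does internally; the function `through_text_layer` is hypothetical and the use of `errors="replace"` is an assumption chosen only to show that a bytes → text → bytes round trip is lossy for arbitrary binary data.

```python
def through_text_layer(payload: bytes) -> bytes:
    """Simulate an intermediary that handles the body as UTF-8 text.

    Assumption for illustration: invalid byte sequences are replaced
    rather than rejected, so the request still reaches the backend.
    """
    text = payload.decode("utf-8", errors="replace")
    return text.encode("utf-8")


ascii_payload = b"<Invoice>plain ASCII survives</Invoice>"
binary_payload = bytes(range(256))  # stand-in for compressed/encrypted bytes

print(through_text_layer(ascii_payload) == ascii_payload)    # prints True
print(through_text_layer(binary_payload) == binary_payload)  # prints False
```

Encrypted attachment bytes look like the second case, so a receiver behind such a layer sees ciphertext that no longer matches its authentication tag, which would be consistent with the "Tag mismatch!" error reported in this thread.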

With this, we also want to highlight that Oxalis users should not just "fire & forget"/"dump" an issue on GitHub; they must keep investigating it from different perspectives. It is difficult, if not impossible, to simulate every kind of external/environmental factor in order to reproduce an issue.

@aaron-kumar
Member

Based on the discussion so far, the reported issue seems to be linked either to external entities or to certificate usage. This is not an Oxalis issue. Do report back if you can prove otherwise.

Note: We strongly recommend that access points using an Oxalis version prior to 6.x.x upgrade to the latest version; otherwise you are non-compliant with the OpenPeppol specifications.

@aaron-kumar
Member

As per the discussion so far, the issue seems to be due to other environmental factors.
Changing it to a discussion for now, but we can change it back to an issue if evidence is provided that it is an issue in Oxalis.

@OxalisCommunity OxalisCommunity locked and limited conversation to collaborators May 12, 2024
@aaron-kumar aaron-kumar converted this issue into discussion #246 May 12, 2024
@aaron-kumar aaron-kumar moved this from Open Issues- Review Required to Completed in Oxalis Public Roadmap Jun 11, 2024
