Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

ndpi_detection_process_packet is not using ndpi_id_struct anymore. #1452

Closed
vitalivanov opened this issue Feb 22, 2022 · 12 comments
Closed

ndpi_detection_process_packet is not using ndpi_id_struct anymore. #1452

vitalivanov opened this issue Feb 22, 2022 · 12 comments
Labels

Comments

@vitalivanov
Copy link
Contributor

In recent ndpi library ndpi_id_struct is removed. Previously we used:

struct ndpi_id_struct *ndpi_src = NULL;
struct ndpi_id_struct *ndpi_dst = NULL;
...
ndpi_detection_process_packet(ndpi_struct, &item->ndpi_flow, ip, iplen, ms, ndpi_src, ndpi_dst);
to detect packet in flow and combine flows from and to client and server.

Now it's changed to:
ndpi_detection_process_packet(ndpi_struct, &item->ndpi_flow, ip, iplen, ms);

How to combine flows now?

@vitalivanov
Copy link
Contributor Author

@IvanNardi @lucaderi

@IvanNardi
Copy link
Collaborator

I am not sure if I am getting your question right. Anyway:

  1. struct ndpi_id_struct was used (not by ndpiReader, by the way) to correlate different flows to/from the same endpoints
  2. If your code really set ndpi_src and ndpi_dst to NULL, this feature was disabled anyway
  3. This feature requires a lot of memory, it was used only by some old protocols in plain text and it has never been update for years (see Purpose of struct ndpi_id_struct #1279)
  4. Current nDPI versions correlate flows using LRU caches internally

I hope this clarifies your question

@vitalivanov
Copy link
Contributor Author

  1. I tried to say that we do use these structures. And there is a code in between stated with ...
    We did used this to and from approach to combine flows from different ports.

  2. We did use it. Null is just init code.

  3. So all these old protocols are still going to work with new LRU caches based mechanism?
    We have internal automated tests and old code works perfectly. New library fails in some cases. We send pcaps for tests from ndpi codebase and fail scenarios are youtube traffic randomly.

So, my question is. If we remove ndpi_id_struct completely we do not need to do anything to preserve old behavior? It will be automatic based on new LRU cache approach?

@IvanNardi
Copy link
Collaborator

@vitalivanov, ok for points 1 and 2; sorry but I missed the fact that you really used them.

3. So all these old protocols are still going to work with new LRU caches based mechanism?

No. The idea was that these old protocols (if still used) are probably changed in the last years (i.e. using encryption, for example) and therefore the struct ndpi_id_struct stuff, even if present, it is not triggered anymore.
Bottom line: we removed one of the detection logic of these protocols.

Could this change trigger some regressions? Possibly. Unfortunately nDPI doesn't have any traces of these protocols really triggering this feature, so I can't say for sure.

As said in #1279, the only protocols affected by this change should be: DirectConnect, IRC, Gnutella and Jabber.
Youtube should definitely not be impacted.

If you can share the pcaps failing with current code, I can take a look at them.

@IvanNardi
Copy link
Collaborator

@vitalivanov, any news?

@vitalivanov
Copy link
Contributor Author

vitalivanov commented Mar 10, 2022

I added .gif extention just to be able to load it here. Just substitute .gif to .pcap:
youtube pcap

This file on versions 3.4-stable nDPI produced different results with ndpiReader in terms of detected packets as compare to latest 4.2-stable.

@vitalivanov
Copy link
Contributor Author

@IvanNardi, any news?

@IvanNardi
Copy link
Collaborator

@vitalivanov :sorry for the delay.
I checked your trace and I didn't find any particular difference in Youtube classification between 3.4-stable and current dev: in 3-4 there are 8 flows classified as Youtube; in dev there are 14 flows (the original 8 + 6 over QUIC; there is no QUIC version 1 support in 3.4).

Could you point to any specific session with unexpected/wrong/changed classification, please?
1452-3.4.txt
1452-dev.txt

@vitalivanov
Copy link
Contributor Author

Thanks @IvanNardi
for our automation testing and dev purposes we use ndpiReader to compare results of traffic processing inside our application and ndpiReader both feed with pcaps.
What happened here i did not update ndpiReader for new library.

When I migrate to ndpi 4.2 I see now that ndpiReader can not provide output in json format as it did it before:

./test/bin/ndpiReader -h
Welcome to nDPI 2.7.0-1418-97bdfe2

ndpiReader -i <file|device> [-f <filter>][-s <duration>][-m <duration>]
          [-p <protos>][-l <loops> [-q][-d][-h][-t][-v <level>]
          [-n <threads>][-w <file>][-c <file>][-j <file>][-x <file>]

Usage:
...
  -j <file.json>            | Specify a file to write the content of packets in .json format

new library and ndpiReader can not do that because of:

commit 51cfdfb0d80a7bbcc11bc3b95d1696d8dae900c2
Author: Luca Deri <[email protected]>
Date:   Sun Nov 17 17:51:45 2019 +0100

    Removed unused JSON-C code

@lucaderi how do you guys decide that it is unused???
It's used and we depend on it...

Can we have that back?

@vitalivanov
Copy link
Contributor Author

@lucaderi i see some serialization code... is there anything that can be reused to get similar stats as before?

@vitalivanov
Copy link
Contributor Author

vitalivanov commented Mar 31, 2022

@lucaderi @IvanNardi
I created two pr's for 4.2-stable and dev branches:

#1508
#1509

All the details inside...
It's tested and working ok for me

@vitalivanov
Copy link
Contributor Author

thanks.
this one can be closed

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
Projects
None yet
Development

No branches or pull requests

2 participants