Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

to_regex method returns different results for the same input #231

Open
HilaZiv opened this issue Jul 7, 2024 · 2 comments · May be fixed by #235
Open

to_regex method returns different results for the same input #231

HilaZiv opened this issue Jul 7, 2024 · 2 comments · May be fixed by #235
Labels

Comments

@HilaZiv
Copy link

HilaZiv commented Jul 7, 2024

Hello, I'm using version 8.3.0 and would like to compare different DFAs based on their regular expressions. When I convert a DFA to GNFA and use to_regex() method, I sometimes get different results for the same DFA when executing the code more then once. Is there a way to make this process deterministic, so I would get the same regular expression every time?
Thanks in advance,
Hila.

I'm using the following code when I sometimes receive
(012(012|12)*(03|3)|03)*(012(012|12)*(01?|1?)|(01?)?)
and sometimes
(0(12(12)*(30|0)|30)*(12(12)*(3|1?)|(3|1?)))?

gt_symbols = {'Connect': '0', 'Send_msg': '1', 'Ack': '2', 'Close': '3'}
gt_dfa = DFA.from_nfa(NFA(
    states={'q0', 'q1', 'q2'},
    input_symbols={gt_symbols['Connect'], gt_symbols['Send_msg'], gt_symbols['Close'], gt_symbols['Ack']},
    transitions={
        'q0': {gt_symbols['Connect']: {'q1'}},
        'q1': {gt_symbols['Close']: {'q0'}, gt_symbols['Send_msg']: {'q2'}},
        'q2': {gt_symbols['Ack']: {'q1', 'q0'}}
    },
    initial_state='q0',
    final_states={'q0', 'q1', 'q2'}
))
regex = GNFA.from_dfa(gt_dfa).to_regex()
print(regex)
@HilaZiv HilaZiv changed the title to_regex method gives different results for the same input to_regex method returns different results for the same input Jul 7, 2024
@caleb531
Copy link
Owner

caleb531 commented Jul 7, 2024

Interesting. @eliotwrobson I wonder if this has to do with some assumption in the GNFA logic about the order of set elements, when the order of set elements in Python can change between interpreter runs. Thoughts?

@caleb531 caleb531 added the bug label Jul 7, 2024
@eliotwrobson
Copy link
Collaborator

@caleb531 yes, I think this has something to do with the fact that the algorithm removes items from a set in a non-deterministic order in case of ties: https://github.com/caleb531/automata/blob/main/automata/fa/gnfa.py#L343

I'd have to play around with the code a little bit to get a better feel for what's going on. However, I don't think this is technically a bug, when I checked the different regexs output by the algorithm, they were equivalent.

@eliotwrobson eliotwrobson linked a pull request Jul 27, 2024 that will close this issue
@eliotwrobson eliotwrobson linked a pull request Jul 27, 2024 that will close this issue
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
Projects
None yet
Development

Successfully merging a pull request may close this issue.

3 participants