Skip to content

A dk.brics FSM to regular-expression-string converter

License

Notifications You must be signed in to change notification settings

julianthome/autorex

Folders and files

NameName
Last commit message
Last commit date

Latest commit

 

History

56 Commits
 
 
 
 
 
 
 
 
 
 

Repository files navigation

autorex

With autorex, you can convert an automaton back into its regular expression string representation. The translation is based on the state-elimination algorithm which is neatly described in Sipser, Michael. Introduction to the Theory of Computation. Vol. 2. Boston: Thomson Course Technology, 2006, Chapter 1 (Starting from page 70). The implementation is based on the dk.brics automaton package.

With dk.brics, you can apply automata operations such as concatenation, union, intersection, etc. on input automata to derive an output automaton. Once a regular expression string has been converted into the dk.brics automaton representation, dk.brics does not offer the functionality to convert it back to a regular expression String. This package provides tooling that converts any dk.brics output automaton back to its regular expression string representation.

Status

License: MIT Language Maven Linux Build Status Test Coverage Code Climate

Maven Integration

autorex is available on maven central. You can integrate it by using the following dependency in the pom.xml file. Note, that the maven releases do not necessarily contain the newest changes that are available in the repository. The maven releases are kept in sync with the tagged releases. The API documentation for every release is available. here. However, the content of this documentation, in particular the code examples and usage scenarios, is always aligned with the master branch of this repository. Hence, it might be that the latest autorex features are not yet available through the maven package.

<dependency>
    <groupId>com.github.julianthome</groupId>
    <artifactId>autorex</artifactId>
    <version>1.0</version>
</dependency>

Usage

State Elimination

autorex has a very simple API for state elimination provided by the class autorex. The following example gives an intuition how autorex can be used. After creating some dk.brics automata, and after performing a couple of operations on them, the result is stored in the d automaton. Afterwards, we can simply invoke the getRegexFromAutomaton method on d in order to get the corresponding regular expression string.

Automaton a = new RegExp("(abc)+[0-9]{1,3}[dg]*").toAutomaton();
Automaton b = new RegExp("12345678").toAutomaton();
Automaton c = new RegExp(".{0,5}").toAutomaton();
Automaton d = a.union(b).intersection(c); // output automaton

String s0 = Autorex.getRegexFromAutomaton(d); // obtain regular expression String
System.out.println("Regex String: " + s0);

autorex will give you the following regular expression string which is equivalent to the result of the computation (the automaton d) in the example above.

(((abc[0-9])([0-9])))|((abc[0-9])((g|d)|.{0}))

Automaton Transformations

autorex can also be used in order to transform a given automaton. At the moment, we support the following transformation from a source automaton to a target automaton:

  • camel-case automatons that are case insensitive
  • substring automatons that accept all the substring of the source automaton
  • suffix automatons that accepts all the suffixes of a given source automaton

The following code snippet shows how the transformation API can be used:

String s = "hello my name is Alice";
Automaton a = new RegExp(s).toAutomaton();
Automaton ccas = Autorex.getCamelCaseAutomaton(a);
Automaton substr = Autorex.getSubstringAutomaton(a);
Automaton sfx = Autorex.getSuffixAutomaton(a);

For more examples, please have a look at the provided test cases or at the javadoc documentation of the class autorex.

Licence

The MIT License (MIT)

Copyright (c) 2016 Julian Thome [email protected]

Permission is hereby granted, free of charge, to any person obtaining a copy of this software and associated documentation files (the "Software"), to deal in the Software without restriction, including without limitation the rights to use, copy, modify, merge, publish, distribute, sublicense, and/or sell copies of the Software, and to permit persons to whom the Software is furnished to do so, subject to the following conditions:

The above copyright notice and this permission notice shall be included in all copies or substantial portions of the Software.

THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY, FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM, OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE SOFTWARE.