Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Workcraft Parsing #36

Open
wants to merge 6 commits into
base: master
Choose a base branch
from
Open
Show file tree
Hide file tree
Changes from all commits
Commits
File filter

Filter by extension

Filter by extension

Conversations
Failed to load comments.
Loading
Jump to
Jump to file
Failed to load files.
Loading
Diff view
Diff view
2 changes: 1 addition & 1 deletion README.md
Original file line number Diff line number Diff line change
Expand Up @@ -53,7 +53,7 @@ GraphML files are currently:

### [Workcraft](https://www.workcraft.org/)
Workcraft files are currently:
- Parsing: **Unimplemented**
- Parsing: Directed Graphs Only.
- Writing: **Unimplemented**

## Graph Library support
Expand Down
1 change: 1 addition & 0 deletions changelog.md
Original file line number Diff line number Diff line change
Expand Up @@ -2,6 +2,7 @@

## pangraph-0.2.1
* Bump Algebraic Graphs from 0.1.* to 0.2.*
* Add Worcraft parsing via Pangraph.Workcraft.Parser
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Typo "Worcraft"


## pangraph-0.2.0
* Addition of conversion and revert for FGL.
Expand Down
2 changes: 2 additions & 0 deletions pangraph.cabal
Original file line number Diff line number Diff line change
Expand Up @@ -38,6 +38,8 @@ library
, Pangraph.Gml.Ast
, Pangraph.Gml.Parser
, Pangraph.Gml.Writer
, Pangraph.Workcraft.Parser
, Pangraph.Workcraft.Internal
build-depends: base >= 4.8 && < 5
, algebraic-graphs == 0.2.*
, attoparsec == 0.13.*
Expand Down
9 changes: 8 additions & 1 deletion src/Pangraph/Example.hs
Original file line number Diff line number Diff line change
@@ -1,5 +1,5 @@
module Pangraph.Example where

import Pangraph.Examples.SampleGraph(smallGraph)

import qualified Pangraph.GraphML.Parser as GraphML
Expand All @@ -10,8 +10,15 @@ import qualified Pangraph.Gml.Writer as Gml

import qualified Pangraph.Containers as Containers

import qualified Pangraph.Workcraft.Parser as Workcraft

import qualified Data.ByteString as BS

main :: IO ()
main = do
workcraftXML <- BS.readFile "./examples/graphs/model.xml"
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Can we also support parsing directly from work files?

print $ Workcraft.parse workcraftXML

-- Serialise and re-parse the graph in to GraphML.
let Just gGraphML = (GraphML.parse . GraphML.write) smallGraph
-- Test equality
Expand Down
47 changes: 37 additions & 10 deletions src/Pangraph/Internal/HexmlExtra.hs
Original file line number Diff line number Diff line change
Expand Up @@ -5,24 +5,51 @@ import Data.ByteString (ByteString)
import Text.XML.Hexml
import Pangraph

-- * A module containing some exclusivly Hexml helper functions and some which have common interfacing functions.

-- | Find the ['Node'] with the final in the ['ByteString'] after following the
-- 'Node' names recursively.
-- * A module containing containing extra Hexml functions as well as functions to ease conversions between the libraries.

type HexmlAttribute = Text.XML.Hexml.Attribute
type PangraphAttribute = Pangraph.Attribute

-- | Find the ['Node'] with the final name in the ['ByteString'] after following the 'Node' names recursively, including the root node's name.
-- Given `[]` returns '[]'.
followChildren :: Node -> [ByteString] -> [Node]
followChildren h [] = [h]
followChildren h bs = (concatMap recurse . childrenBy h) (head bs)
followChildren _ [] = []
followChildren h bs = if head bs == name h
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It may be easier to read this if you use guard expressions, e.g. | head bs == .... Even better, avoid using partial functions like head by pattern-matching (otherwise a future refactoring could introduce a bug).

then concatMap (`recurse` tail bs) $ childrenBy h (bs !! 1)
else error $ "Followchildren failed (Did you include the root name?):\n" ++ show h ++ "\n Bytestring: \n" ++ show bs
where
recurse :: Node -> [Node]
recurse n = followChildren n (tail bs)
recurse :: Node -> [ByteString] -> [Node]
recurse _ [] = error "Recurse called on empty Bytestring"
recurse h' [_] = [h']
recurse h' bs' = concatMap (`recurse` tail bs') $ childrenBy h' (bs' !! 1)

-- An unsafe version of the 'Text.XML.Hexml.parse' upon failure throws error to stderr.
-- An unsafe version of the 'Text.XML.Hexml.parse' upon failure throws error to stderr, showing the resulting error message.
hexmlParse :: ByteString -> Node
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think we use unsafe prefix for unsafe functions -- let's keep doing this for consistency.

hexmlParse file = case parse file of
Right t -> t
Left l -> error $ "HexML parser failed:\n" ++ show l

-- | Converts a between the two libaries Attribute types.
convertAtt :: Text.XML.Hexml.Attribute -> Pangraph.Attribute
convertAtt :: HexmlAttribute -> PangraphAttribute
convertAtt a = (attributeName a, attributeValue a)


-- | Extend `H.attributeBy` to lists.
attributesBy :: [ByteString] -> Node -> [Maybe HexmlAttribute]
attributesBy bs h = map (attributeBy h) bs

-- | Traverse the Hexml node tree using `followChildren` apply a filter to its results and apply a function to the remaining nodes.
-- The resulting `Node`s are then filtered and have a function applied.
-- The function `(Node -> a)` may then return the data from this node or its children.
followChildrenFilterThenApplyFunction
:: Node
-> [ByteString]
-> (Node -> Bool)
-> (Node -> a)
-> [a]
followChildrenFilterThenApplyFunction
root nodePath filterPredicate callerFunc =
( map callerFunc
. filter filterPredicate
. followChildren root
) nodePath
70 changes: 70 additions & 0 deletions src/Pangraph/Workcraft/Internal.hs
Original file line number Diff line number Diff line change
@@ -0,0 +1,70 @@
{-# LANGUAGE OverloadedStrings #-}

module Pangraph.Workcraft.Internal where

import Data.ByteString (ByteString)

import qualified Text.XML.Hexml as H
import qualified Pangraph.Internal.HexmlExtra as H

import Pangraph
import Pangraph.Internal.ProtoGraph

import Data.Maybe (fromJust)

newtype WorkcraftAST = WorkcraftAST H.Node
deriving (Show)

unwrap :: WorkcraftAST -> H.Node
unwrap (WorkcraftAST n) = n

render :: WorkcraftAST -> ByteString
render (WorkcraftAST h) = H.render h

instance BuildPangraph WorkcraftAST where
-- <node class="org.workcraft.plugins.graph.Vertex" ref="v0">
-- <Vertex symbol=""/>
-- </node>
getProtoVertex (WorkcraftAST h) = let
-- Check the class attribute of the node.
isVertex :: H.Node -> Bool
isVertex h'= (H.attributeValue <$> H.attributeBy h' "class") == Just "org.workcraft.plugins.graph.Vertex"
-- Extract the fields from the node returning the attributes for a pangraph.
extractVertexData :: H.Node -> [Attribute]
extractVertexData = map (H.convertAtt . fromJust) . H.attributesBy ["ref"]
-- Traverse the node tree filtering other nodes and extracting data from them.
traverseNodeTree :: [[Attribute]]
traverseNodeTree = H.followChildrenFilterThenApplyFunction h xmlNodePath isVertex extractVertexData
-- Build the protoVertices
in map makeProtoVertex traverseNodeTree

-- <node class="org.workcraft.dom.math.MathConnection" ref="con0">
-- <MathConnection first="v1" second="v0"/>
-- </node>
getProtoEdge (WorkcraftAST h) = let
-- Check the class property.
isEdge :: H.Node -> Bool
isEdge h'= (H.attributeValue <$> H.attributeBy h' "class") == Just "org.workcraft.dom.math.MathConnection"
-- Extract values from the attributes and also the relevant values from children.
extractEdgeData :: H.Node -> [Attribute]
extractEdgeData h' = let
-- Find the child node for this Edge
child = (head $ H.followChildren h' ["node", "MathConnection"])
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Again, can we get rid of partial functions like head? If the list is guaranteed to be non-empty, use Data.List.NonEmpty.

-- Concat the lists and ready them for Pangraph constructors.
in map (H.convertAtt . fromJust)
(H.attributesBy ["ref"] h'
++ H.attributesBy ["first", "second"] child)
-- Construct the ProtoEdges.
in map makeProtoEdge $ H.followChildrenFilterThenApplyFunction h xmlNodePath isEdge extractEdgeData

-- A wrapper for hexmlParse to get around the dummy root returned.
hexmlParse :: ByteString -> WorkcraftAST
hexmlParse = WorkcraftAST . getRealRoot . H.hexmlParse

-- Get the correct root.
getRealRoot :: H.Node -> H.Node
getRealRoot h = H.children h !! 1

-- The nodes to follow for the payload.
xmlNodePath :: [ByteString]
xmlNodePath = ["model", "root", "node"]
26 changes: 26 additions & 0 deletions src/Pangraph/Workcraft/Parser.hs
Original file line number Diff line number Diff line change
@@ -0,0 +1,26 @@
{-# LANGUAGE OverloadedStrings #-}

module Pangraph.Workcraft.Parser (
parse
) where

import Pangraph
import Pangraph.Internal.ProtoGraph
import Pangraph.Workcraft.Internal

import Data.Maybe (fromJust)

import Data.ByteString(ByteString)

-- | Provided with the 'model.xml' from a Workcraft archive generates the Pangraph of a directed Graph.
parse :: ByteString -> Maybe Pangraph
parse bs = let
vf :: ProtoVertex -> VertexID
vf v = f "ref" (protoVertexAttributes v)
ef :: ProtoEdge -> (VertexID, VertexID)
ef e = let
att = protoEdgeAttributes e
in (f "first" att, f "second" att)
f :: ByteString -> [Attribute] -> VertexID
f a = fromJust . lookup a
in buildPangraph (hexmlParse bs) vf ef