pangraph is not on Hackage! #7

Closed
deepfire opened this issue Jan 17, 2017 · 23 comments

@deepfire

pangraph is sufficiently amazing that its absence from Hackage is... noticeable! :-)

Are there plans for such a release?

@snowleopard
Member

snowleopard commented Jan 18, 2017

Oh, wow, this is unexpected :-)

The library is in a very early draft state at the moment -- I'm not sure it's ready to be used by others. We do use it in our graph analysis project, though, and so far it has been fit for our purpose.

@deepfire How do you intend to use it? I presume, to parse GraphML files, as I don't think pangraph can do anything else at this point.

@thisiswhereitype Do you plan to release the library on Hackage soon? If yes, we could discuss a release plan here. As a starting point, I'd advise switching to an existing XML parser to ease future maintenance. (The current implementation was more of a Haskell-learning exercise!)

@deepfire
Author

deepfire commented Jan 18, 2017

@snowleopard, GraphML parsing, indeed. Although at the moment I'm discovering that things aren't quite that simple in the GraphML land.. lack of a true standard being one problem.

@snowleopard
Member

@deepfire Yes, GraphML is a bit messy. Fortunately, in our project we only need to extract nodes and edges without any labels such as weight, etc. What are your requirements?

@deepfire
Author

@snowleopard, ideally, I'd love to be able to use some generic, widely accepted graph format for knowledge representation -- but initially, "just" reading and writing yEd files would be exactly enough.

@snowleopard
Member

I see. Could you link to a precise definition of the yEd file format?

@snowleopard
Member

Thanks! That looks like quite a lot of stuff...

The idea behind pangraph is to convert graphs between various formats, but since the overlap between all formats is very small -- just the connectivity info -- most of these conversions will be lossy. So I'm not sure it makes sense to make a fully-featured yEd parser a part of pangraph. I think it makes more sense to have it as a separately maintained Hackage library, which pangraph would use to extract the transferable connectivity data.

@deepfire What do you think?

@deepfire
Author

@snowleopard, personally, the interesting parts (outside the overlap you mention) are:

  1. node coordinates & dimensions
  2. plaintext node labels -- there are options here, since yEd provides for multiple labels per node

Also, if I understand it correctly, the minimal transferable connectivity data is available as generic GraphML tags -- no yEd-specific nodes need to be descended into.

@snowleopard
Member

@deepfire Thanks! This sounds reasonable. Coordinates are useful for many graph representations, and so are node labels.

Let's wait for a response from @thisiswhereitype, who I believe is currently in full exam mode :)

@thisiswhereitype
Collaborator

@deepfire: @snowleopard and I have discussed this and agreed that the library is not ready for release on Hackage, as it still needs some polishing, as mentioned briefly above. The yEd idea looks interesting, but I must finish my exams before I can take time to code. Please let me know if you have any questions.

@snowleopard
Member

snowleopard commented Jan 20, 2017

Just to document here the list of things to be done before the release:

  • Switch to an existing XML library.
  • Switch the API from Strings to ByteStrings.
  • Make pangraph datatypes abstract, isolating them from users, and using nodes :: Pangraph -> [Node], and nodeLabel :: Node -> ByteString etc. to access the graph.
  • Implement yEd parsing as a useful test case.
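A minimal sketch of what the abstract API in the list above might look like. Only `nodes :: Pangraph -> [Node]` and `nodeLabel :: Node -> ByteString` come from the list; the `Edge` type, the `make*` constructors, `edges`, `edgeEndpoints`, and the internal representation are assumptions made for illustration, not pangraph's actual code.

```haskell
{-# LANGUAGE OverloadedStrings #-}
-- Hypothetical sketch. In a library, this file would start with an export
-- list that names the types but not their constructors, for example:
--   module Pangraph (Pangraph, Node, Edge, makePangraph, makeNode, makeEdge,
--                    nodes, edges, nodeLabel, edgeEndpoints) where
-- Users could then build and query graphs only through the functions below.

import Data.ByteString (ByteString)

-- | A node carrying a single label (assumed representation).
newtype Node = Node ByteString
  deriving (Eq, Show)

-- | A directed edge between two node labels (assumed representation).
data Edge = Edge ByteString ByteString
  deriving (Eq, Show)

-- | The graph itself; callers never pattern-match on this constructor.
data Pangraph = Pangraph [Node] [Edge]
  deriving (Eq, Show)

-- Smart constructors: the only way for users to create values.
makeNode :: ByteString -> Node
makeNode = Node

makeEdge :: ByteString -> ByteString -> Edge
makeEdge = Edge

makePangraph :: [Node] -> [Edge] -> Pangraph
makePangraph = Pangraph

-- Accessors: the only way for users to inspect values.
nodes :: Pangraph -> [Node]
nodes (Pangraph ns _) = ns

edges :: Pangraph -> [Edge]
edges (Pangraph _ es) = es

nodeLabel :: Node -> ByteString
nodeLabel (Node l) = l

edgeEndpoints :: Edge -> (ByteString, ByteString)
edgeEndpoints (Edge s t) = (s, t)
```

Because the constructors would stay private, the internal representation could later change (for example, from lists to a map) without breaking code that uses only `nodes` and `nodeLabel`.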

@thisiswhereitype
Collaborator

thisiswhereitype commented Feb 5, 2017

I have been investigating how to implement the successor to the Graph types in my rewrite, taking the above comment into account. I was thinking of data Pangraph = Graph [Node] [Arc], where Node and Arc are each just a list of ByteString attributes. But is Data.Array a viable option? It would avoid O(n) list indexing, which would allow the library to operate on large graphs.

Also I encountered a few negative opinions about ByteString :
https://mail.haskell.org/pipermail/haskell-cafe/2010-August/082033.html
http://stackoverflow.com/questions/7357775/text-or-bytestring
Considering this, is it worth providing an API for multiple string types, while still working with ByteString underneath for its speed?

@deepfire
Author

deepfire commented Feb 5, 2017

The string landscape will change really, really drastically, once Backpack arrives -- due to a whole new dimension of modularity, so a personal guess is that it's not worth the added complexity for now.

@deepfire
Author

deepfire commented Feb 5, 2017

Another point is that arrays are more-or-less awful for arbitrary mutation, whereas maps are better -- and Data.IntMap is particularly good -- IIRC it's very optimized.

@thisiswhereitype
Collaborator

thisiswhereitype commented Feb 5, 2017

I will just use a ByteString interface for now; I can always add more if needed. From reading about Data.IntMap, it seems there will be speed gains there. My concern was how this will affect the API, but looking through it, Data.IntMap has toList functions. IntMaps are the way to go, it seems.
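A rough sketch of the IntMap idea, assuming nodes are keyed by an integer id. The names `Graph`, `NodeId`, `insertNode`, `insertArc`, `lookupLabel`, and `nodeList` are hypothetical and not taken from the pangraph source. It shows that a list-returning API can sit on top of the map via `IntMap.toList`.

```haskell
{-# LANGUAGE OverloadedStrings #-}
import Data.ByteString (ByteString)
import Data.IntMap.Strict (IntMap)
import qualified Data.IntMap.Strict as IntMap

-- | Hypothetical integer node identifier.
type NodeId = Int

-- | Nodes live in an IntMap keyed by id; arcs are kept as a plain list here
-- to keep the sketch short.
data Graph = Graph
  { graphNodes :: IntMap ByteString   -- node id -> label
  , graphArcs  :: [(NodeId, NodeId)]  -- (source, target)
  } deriving (Eq, Show)

emptyGraph :: Graph
emptyGraph = Graph IntMap.empty []

-- | Insert or overwrite a node without copying the whole structure.
insertNode :: NodeId -> ByteString -> Graph -> Graph
insertNode i label g = g { graphNodes = IntMap.insert i label (graphNodes g) }

insertArc :: NodeId -> NodeId -> Graph -> Graph
insertArc s t g = g { graphArcs = (s, t) : graphArcs g }

-- | Look a node up by id without walking a list.
lookupLabel :: NodeId -> Graph -> Maybe ByteString
lookupLabel i = IntMap.lookup i . graphNodes

-- | The list view for the public API; IntMap.toList yields ascending key order.
nodeList :: Graph -> [(NodeId, ByteString)]
nodeList = IntMap.toList . graphNodes
```

With this layout, the public getters can keep returning lists while the internals avoid linear-time indexing.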

@snowleopard
Member

I will just use a ByteString interface for now, can always add more if needed

@thisiswhereitype Sounds like a good plan.

My concern was how will this effect the API

As a first step, I suggest sticking to the API I described here: #7 (comment)

Do the simplest useful thing first. You can expand later if need be, plus we'll iterate during the code review.

@thisiswhereitype
Collaborator

I have been working on the agreed plan, and after trying out some ideas, I have today committed the core types and getters API. https://github.com/thisiswhereitype/pangraph/blob/rewrite/src/Pangraph.hs
I have had a good go at using Hexml and ByteString, both of which I am still working to integrate.

@thisiswhereitype
Collaborator

thisiswhereitype commented Mar 7, 2017

I have another commit up on my development branch. The Pangraph API has been updated following the discussion on the previous commit. Because git is hard, I also committed some other parts I was working on. My plan is to abstract the concept of parsing an XML-based file into an internal API for extracting a Pangraph type from it. This will take a little longer than writing a Hexml -> Pangraph parser for GraphML alone, but it should save time in the long run when implementing other XML graph formats. You can find the beginnings of the code in src/Pangraph/XMLTemplate.hs
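One way such a template abstraction could look, as a sketch only: this is not the contents of src/Pangraph/XMLTemplate.hs. The `Element` type is a stand-in for whatever tree the XML library produces, and `Template`, `graphMLTemplate`, `findAll`, `extractNodes`, and `extractEdges` are hypothetical names. The `node`/`id` and `edge`/`source`/`target` names follow generic GraphML.

```haskell
{-# LANGUAGE OverloadedStrings #-}
import Data.ByteString (ByteString)
import Data.Maybe (mapMaybe)

-- | Stand-in for a parsed XML element (hypothetical; a real implementation
-- would use the tree type of the chosen XML library).
data Element = Element
  { elemName     :: ByteString
  , elemAttrs    :: [(ByteString, ByteString)]
  , elemChildren :: [Element]
  }

-- | Describes where a given XML graph format keeps its connectivity data.
data Template = Template
  { nodeTag        :: ByteString  -- element name for nodes
  , nodeIdAttr     :: ByteString  -- attribute holding the node id
  , edgeTag        :: ByteString  -- element name for edges
  , edgeSourceAttr :: ByteString  -- attribute holding the source node id
  , edgeTargetAttr :: ByteString  -- attribute holding the target node id
  }

-- | Template for the generic GraphML node and edge elements.
graphMLTemplate :: Template
graphMLTemplate = Template "node" "id" "edge" "source" "target"

-- | The element itself (if it matches) plus all matching descendants.
findAll :: ByteString -> Element -> [Element]
findAll tag e =
  [e | elemName e == tag] ++ concatMap (findAll tag) (elemChildren e)

-- | Node ids found under the root; nodes lacking the id attribute are skipped.
extractNodes :: Template -> Element -> [ByteString]
extractNodes t =
  mapMaybe (lookup (nodeIdAttr t) . elemAttrs) . findAll (nodeTag t)

-- | (source, target) pairs; edges missing either attribute are skipped.
extractEdges :: Template -> Element -> [(ByteString, ByteString)]
extractEdges t root = mapMaybe endpoints (findAll (edgeTag t) root)
  where
    endpoints e = (,) <$> lookup (edgeSourceAttr t) (elemAttrs e)
                      <*> lookup (edgeTargetAttr t) (elemAttrs e)
```

Supporting another XML-based format would then mean writing a new `Template` value rather than a new tree traversal.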

@thisiswhereitype
Collaborator

thisiswhereitype commented Apr 23, 2017

I have just pushed a commit to my branch which meets the following agreed goals:

  • [x] Switch to an existing XML library.
  • [x] Switch the API from Strings to ByteStrings.
  • [x] Make pangraph datatypes abstract, isolating them from users, and using nodes :: Pangraph -> [Node], and nodeLabel :: Node -> ByteString etc. to access the graph.
  • [ ] Implement yEd parsing as a useful test case.

It should be quick to add a parse template for Workcraft, and I will investigate yEd. I also need to rewrite most of the readme, and Haddock documentation seems to be necessary for a Hackage release. @deepfire @snowleopard
https://github.com/thisiswhereitype/pangraph/tree/rewrite

@snowleopard
Member

@thisiswhereitype This is a good start, but I've added a bunch of comments to your commit here:

thisiswhereitype@30f9f45

@thisiswhereitype
Collaborator

thisiswhereitype commented Jun 22, 2017

I have addressed all the comments on the commit except the abstraction of Attributes, Keys and Values, which I will do next. Also related is the ongoing discussion in #11 regarding algebraic graphs.
My fork's master branch

@thisiswhereitype thisiswhereitype self-assigned this Jun 22, 2017
@thisiswhereitype thisiswhereitype modified the milestone: pangraph-0.1 Jun 26, 2017
@thisiswhereitype
Collaborator

https://hackage.haskell.org/package/pangraph 🎉

@deepfire
Author

deepfire commented Feb 3, 2018

Congratulations!
