The RDF parser makes multiple passes through both the bnode and the simple triples, and each time it does so, the code is repeated -- something like this:
```rust
for (k, v) in std::mem::take(&mut self.bnode) {
    match v.as_slice() {
        [
            [_, Term::RDF(VRDF::First), val],
            [_, Term::RDF(VRDF::Rest), Term::BNode(bnode_id)],
            // Some sequences have a Type List, some do not
            ..
        ] => {
            let some_seq = self.bnode_seq.remove(bnode_id);
            if let Some(mut seq) = some_seq {
                seq.push(val.clone());
                self.bnode_seq.insert(k.clone(), seq);
                extended = true;
            } else {
                self.bnode.insert(k, v);
            }
        }
        _ => {
            self.bnode.insert(k, v);
        }
    };
}
```
I think we can replace this whole thing with a `partition_map` (or a `map` followed by a `partition`, which would avoid bringing in itertools as a dependency).

`L` would be the thing (`Atom`, `Axiom`, whatever); `R` would be `Result<A>`. `L` would then be added to whatever, and `R` would be added back to wherever in `self` it came from in the first place.

We would then have:
```rust
let (axiom, triples) = triples.partition_map(|t| match &t {
    [blah, blah, blah] => {
        // some_ok! would need to return Left or Right(Err), which should be okay
        some_ok!(Left(Axiom(blah, blah, blah)))
    }
    _ => {
        // Can we take the t here?
        Ok(Right(&t))
    }
});

self.axioms.extend(axiom);
// We would have to clone the triple here because we only have a reference
self.triples.extend(triples.map(|e| e?.clone()));

// Or, if we want to capture errors rather than crap out, which is
// something we can't do at the moment:
let (triples, errors) = self.triples.partition(is_okay);
self.triples.extend(triples);
self.errors.extend(errors);
```
which is rather neater than what we have at the moment
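For concreteness, here is a minimal self-contained sketch of the `partition_map` technique from itertools: one traversal splits the triples into axioms (`Left`) and leftovers (`Right`). The `Triple` and `Axiom` types below are hypothetical stand-ins for the real ones.

```rust
use itertools::{Either, Itertools};

#[derive(Debug, Clone)]
struct Triple(String, String, String);

#[derive(Debug)]
struct Axiom(String);

fn main() {
    let triples = vec![
        Triple("a".into(), "rdf:type".into(), "owl:Class".into()),
        Triple("b".into(), "rdfs:label".into(), "\"B\"".into()),
    ];

    // One pass over the input: each triple either becomes an axiom (Left)
    // or is kept back for a later pass (Right).
    let (axioms, leftover): (Vec<Axiom>, Vec<Triple>) =
        triples.into_iter().partition_map(|t| {
            if t.1 == "rdf:type" {
                Either::Left(Axiom(t.0.clone()))
            } else {
                Either::Right(t)
            }
        });

    println!("axioms: {:?}", axioms);
    println!("leftover: {:?}", leftover);
}
```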
To get myself acquainted with the `rdf::reader` module, I am currently refactoring a few bits that could be replaced by, e.g., deriving traits or using methods already implemented in `vocab` (see #97 and #99).
I agree with your observation.
I am also wondering if the code could be refactored to leverage the fact that the triple parser is streaming, by keeping a `HashMap` of triple sets indexed by subject and deserializing whenever a set contains, e.g., all the triples necessary to instantiate a declaration.
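A rough sketch of that idea, with hypothetical `Triple` and `Declaration` types and a deliberately naive completeness check (`try_declaration` is invented for illustration):

```rust
use std::collections::HashMap;

#[derive(Debug, Clone)]
struct Triple {
    subject: String,
    predicate: String,
    object: String,
}

#[derive(Debug)]
struct Declaration(String);

// Hypothetical check: does this subject's triple set already contain
// everything needed to instantiate a declaration?
fn try_declaration(triples: &[Triple]) -> Option<Declaration> {
    triples
        .iter()
        .find(|t| t.predicate == "rdf:type")
        .map(|t| Declaration(t.subject.clone()))
}

fn stream_parse(stream: impl Iterator<Item = Triple>) -> Vec<Declaration> {
    let mut pending: HashMap<String, Vec<Triple>> = HashMap::new();
    let mut declarations = Vec::new();
    for triple in stream {
        let subject = triple.subject.clone();
        let set = pending.entry(subject.clone()).or_default();
        set.push(triple);
        // Deserialize eagerly as soon as the set is sufficient, and drop
        // the set so it is not deserialized twice.
        if let Some(d) = try_declaration(set) {
            declarations.push(d);
            pending.remove(&subject);
        }
    }
    declarations
}

fn main() {
    let triples = vec![
        Triple { subject: "a".into(), predicate: "rdfs:label".into(), object: "\"A\"".into() },
        Triple { subject: "a".into(), predicate: "rdf:type".into(), object: "owl:Class".into() },
    ];
    println!("{:?}", stream_parse(triples.into_iter()));
}
```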
I've started looking at this. There are definitely some cleanups that I can do. The idea of `partition_map` may or may not work; so far it makes some things simpler, but a lot of the loops modify other data structures by side effect. I think I will rewrite a few more functions so I'm clear whether it makes sense or not.
The problem with streaming is that there are many places where you have to parse the whole stream first in order to interpret later triples. This is, for example, why I parse all declarations (in the entire import closure) first: many of the axiom patterns depend on knowing, for example, the kind of a property. So I suspect that streaming RDF is not going to bring a lot of benefits.