??- returns only the last tuple of a sequence #294

Open
ghost opened this issue Nov 1, 2015 · 19 comments

Comments

@ghost

ghost commented Nov 1, 2015

The following input on cascalog.playground:

```clojure
(??-
   (<- [?p ?age]
       (age ?p ?age)))
```

returns

```clojure
[["luanne" 36] ["luanne" 36] ["luanne" 36] ["luanne" 36] ["luanne" 36] ["luanne" 36] ["luanne" 36] ["luanne" 36] ["luanne" 36] ["luanne" 36]]
```

However, running

```clojure
(?- (stdout)
   (<- [?p ?age]
       (p/age ?p ?age)))
```

gives the correct result (10 unique names and ages).

@sritchie
Collaborator

sritchie commented Nov 1, 2015

Which Cascading and Cascalog versions are you using? This reminds me of an iterator bug we fixed a while ago.



@ghost
Author

ghost commented Nov 1, 2015

I'm using cascalog 2.1.1.

I haven't explicitly declared anything wrt cascading; I've just been following the project's readme to get started. Relevant portion of project.clj below:

  :dependencies [[org.clojure/clojure "1.7.0"]
                 [cascalog "2.1.1"]]
  :profiles { :dev {:dependencies [[org.apache.hadoop/hadoop-core "1.2.1"]]}}
  :jvm-opts ["-Xms768m" "-Xmx768m"])

@sritchie
Collaborator

sritchie commented Nov 1, 2015

Yeah, this is fixed in 3.0.0-SNAPSHOT, which I think is the latest version off of master. Want to give that a shot? We're due for a new release for sure.



@ghost
Author

ghost commented Nov 1, 2015

Same issue occurs with these dependencies:

  :dependencies [[org.clojure/clojure "1.7.0"]
                 [cascalog/cascalog-core "3.0.0-SNAPSHOT"]]

Is there a working project.clj I could take a look at? Once this gets resolved I'm guessing it will come down to a documentation issue, and I'm happy to submit a pull request for that. I also had some initial frustrations because the documentation doesn't mention needing to run (bootstrap-emacs) in cider, so that should probably be fixed as well.

@sritchie
Collaborator

sritchie commented Nov 1, 2015

For some reason my internet connection's preventing me from launching a repl (by blocking dependency downloads in leiningen), but I THINK, based on a different bug, I have a guess about what's causing this. Can you give this branch a try?

#295

Check out the discussion here: #251

Along with this fix: #280

for some more background on the issue. Also, any documentation updates you want to send over would be huge.

@ghost
Author

ghost commented Nov 1, 2015

Trying that branch now, trying to build it and put in the local repo, but running into the issue that the sub-modules (cascalog-checkpoint, midje, etc) depend on cascalog-core, so I'm not able to compile them initially. I don't usually structure projects like this--how do you compile this structure?

@sritchie
Collaborator

sritchie commented Nov 1, 2015

Ah, sorry- first, run "lein sub install" in the base directory. Thanks for trying this out!



@ghost
Author

ghost commented Nov 1, 2015

OK, that works for compilation/local repo installation. Unfortunately the bug still persists. If it's helpful, the log output in the repl says that Cascading 2.5.3 is being used currently.

Thanks for the help so far! Have a project I'm transitioning over to hadoop as it's grown a lot, and I'd really like to go with cascalog on it, so hopefully can sort this out.

@sritchie
Collaborator

sritchie commented Nov 6, 2015

This looks very related to #292. The folks over at that ticket figured out that this issue only shows up with Clojure 1.7.0.

@ghost
Author

ghost commented Nov 6, 2015

OK, I'll try going back to 1.6, thanks!

@ghost
Author

ghost commented Nov 7, 2015

That fixed it. I'm going to submit a pull request for docs that are a bit more current in a bit.
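
For anyone applying the same workaround, a project.clj along these lines should be enough. This is only a sketch: the project name is hypothetical, and "1.6.0" is an assumed concrete release for the "1.6" mentioned above. The other entries are the ones posted earlier in this thread.

```clojure
(defproject my-cascalog-project "0.1.0-SNAPSHOT"   ; hypothetical project name
  ;; Clojure pinned below 1.7.0, where the duplicated tuples show up
  :dependencies [[org.clojure/clojure "1.6.0"]     ; assumed 1.6.x release
                 [cascalog "2.1.1"]]
  :profiles {:dev {:dependencies [[org.apache.hadoop/hadoop-core "1.2.1"]]}}
  :jvm-opts ["-Xms768m" "-Xmx768m"])
```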

@metasoarous

This just bit me as well. I can confirm that switching to 1.6 fixes the issue, but it would be nice to have a 1.7-compatible fix.

@sritchie
Collaborator

@metasoarous totally hear you. I'm happy to review any pull requests from folks who want to take this on! I'm not using Cascalog for my work these days, so I don't have time to fix bugs like this myself, but I am available on a consulting basis to fix bugs or add features.

@metasoarous

Hi @sritchie: I appreciate the offer. Right now, 1.7 isn't critical for us, but if it becomes necessary we'll keep that in mind. I mostly just wanted to add a second data point for posterity's sake :-)

@jiyouyou125
Contributor

http://dev.clojure.org/jira/browse/CLJ-1738

The 1.7 compatibility notes mention an iterator-seq change; could that help?

Direction of this ticket changed at Rich's request.

Prior description capture here:

Clojure code that uses iterator-seq to wrap Java iterators that return the same mutable object on every call is broken by the chunked iterator-seq changes from CLJ-1669.

Some examples where this occurs:

- Hadoop ReduceContextImpl$ValueIterator
- Mahout DenseVector$AllIterator/NonDefaultIterator
- LensKit FastIterators

Cause: In 1.6, the iterator-seq wrapper could be used with these to consume a sequence over these iterators element-by-element. In 1.7 RC1, iterator-seq produces a chunked sequence. Because next() is called 32 times on the iterator before the first value can be retrieved from the seq, and the same mutable object is returned every time, code doing this now receives different (incorrect) results.

Approach: Switch iterator-seq back to non-chunked and change eduction to use the chunking iterator-seq strategy as that was the original target. Retain the use of the chunked iterator seq in sequence over the TransformerIterator.
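
The failure mode described in the ticket can be illustrated without Hadoop. The sketch below is not code from Cascalog or from the JIRA ticket: `reusing-iterator`, `drain-ahead`, and `copy-each` are hypothetical helpers, and the rows other than "luanne" are made up. It avoids `iterator-seq` itself so that the result does not depend on the Clojure version, and instead imitates the read-ahead that a chunked seq performs.

```clojure
(defn reusing-iterator
  "Hypothetical stand-in for an iterator such as Hadoop's value iterator:
  every call to .next overwrites and returns the same ArrayList."
  [rows]
  (let [remaining (atom (seq rows))
        holder    (java.util.ArrayList.)]
    (reify java.util.Iterator
      (hasNext [_] (boolean (seq @remaining)))
      (next [_]
        (let [row (first @remaining)]
          (swap! remaining rest)
          (.clear holder)
          (.addAll holder row)
          holder))
      (remove [_] (throw (UnsupportedOperationException.))))))

(defn drain-ahead
  "Calls .next until the iterator is exhausted before looking at any element,
  similar to how a chunked seq buffers elements. Returns the buffered references."
  [^java.util.Iterator iter]
  (loop [acc []]
    (if (.hasNext iter)
      (recur (conj acc (.next iter)))
      acc)))

(defn copy-each
  "Copies every element into an immutable vector before calling .next again."
  [^java.util.Iterator iter]
  (loop [acc []]
    (if (.hasNext iter)
      (recur (conj acc (vec (.next iter))))
      acc)))

;; Illustrative rows; only "luanne" appears in the report above.
(def rows [["alice" 28] ["bob" 33] ["luanne" 36]])

;; All buffered references point at the same ArrayList, which by now holds the last row.
(def buffered (mapv vec (drain-ahead (reusing-iterator rows))))

;; Copying per element preserves each row.
(def copied (copy-each (reusing-iterator rows)))
```

Here `buffered` contains the last row repeated once per input row, which matches the symptom in the original report, while `copied` equals `rows`.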

@jiyouyou125
Contributor

Only ??- and ??<- use iterator-seq.

@sritchie
Collaborator

sritchie commented Dec 2, 2015

@Nightlord this is really interesting, and probably the reason for the bug. Looks like a change like this may work:

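;; Unchunked alternative to iterator-seq: .next is called once per realized
;; element, and f is applied to that element before .next is called again.
(defn iter-seq [iter f]
  (when (.hasNext iter)
    (lazy-seq
      (cons (f (.next iter))
            (iter-seq iter f)))))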
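
To see why the `f` argument matters, here is a small self-contained sketch. `iter-seq` is reproduced from the suggestion above (with a type hint added); `counting-iterator` and `read-cell` are hypothetical helpers written only for this illustration, not part of Cascalog.

```clojure
(defn iter-seq
  "Lazy, unchunked seq over iter; f is applied to each element as it is read."
  [^java.util.Iterator iter f]
  (when (.hasNext iter)
    (lazy-seq
      (cons (f (.next iter))
            (iter-seq iter f)))))

(defn counting-iterator
  "Hypothetical test iterator: yields 1..n, always through the same
  one-element long array, which it overwrites on every .next call."
  [n]
  (let [cell (long-array 1)
        i    (atom 0)]
    (reify java.util.Iterator
      (hasNext [_] (< @i n))
      (next [_]
        (aset cell 0 (long (swap! i inc)))
        cell))))

(defn read-cell
  "Extracts the current value from the shared array."
  [^longs cell]
  (aget cell 0))

;; f reads the value out of the shared array before the iterator advances.
(def copied-values (vec (iter-seq (counting-iterator 3) read-cell)))

;; With identity, every element is the same array, read only after the
;; whole seq has been realized.
(def stale-values (mapv read-cell (doall (iter-seq (counting-iterator 3) identity))))
```

`copied-values` is `[1 2 3]`, while `stale-values` is `[3 3 3]`: being unchunked is not enough on its own, the element also has to be copied or converted by `f` before the next call to `.next`.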

@jiyouyou125
Contributor

@sritchie this fixes ??-. It may not be the best solution, but this is definitely the problem.
#296

@jiyouyou125
Contributor

@sritchie fixed ??- and the CI build problem, and added profiles for 1.6 and 1.7.

build success.
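
As a rough idea of what per-version profiles can look like in Leiningen, here is a generic sketch. It is not taken from the pull request: the project name, profile names, and exact Clojure releases are assumptions chosen for illustration.

```clojure
(defproject my-cascalog-project "0.1.0-SNAPSHOT"   ; hypothetical project name
  :dependencies []
  ;; One profile per Clojure version to test against (names are assumptions).
  :profiles {:1.6 {:dependencies [[org.clojure/clojure "1.6.0"]]}
             :1.7 {:dependencies [[org.clojure/clojure "1.7.0"]]}})
```

With profiles like these, `lein with-profile +1.6 test` and `lein with-profile +1.7 test` run the test suite against each version in turn.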
