Promise please #249
Comments
Will add it in the next version.
Hi! Is there an estimated release date?
@cernadasjuan No, not yet. It is hard to support a promise API, because it is quite different from the global-function callback style.
If I may make a suggestion: why not just add an event emitter for the response?
@camrymps That is another option, but it has nothing to do with promises, am I right? Events are not as straightforward as callbacks. For instance, if we have two or more different parsers, each one can be specified by its own callback and handled easily; but with a single event, how would you dispatch to the right parser?
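For context, a minimal sketch of the per-request callback style described here, using only the queue/callback API already shown elsewhere in this thread (the URLs are placeholders):

// Each queued request carries its own parser as a callback,
// so no string-based dispatch is needed.
const Crawler = require('crawler');
const crawler = new Crawler();

crawler.queue({
  uri: 'https://google.com',
  callback: (err, res, done) => { // parser #1
    if (!err) console.log('google title:', res.$('title').text());
    done();
  }
});

crawler.queue({
  uri: 'https://yahoo.com',
  callback: (err, res, done) => { // parser #2
    if (!err) console.log('yahoo title:', res.$('title').text());
    done();
  }
});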
@mike442144 You are correct, callbacks are generally easier to deal with. I'm only offering this as a suggestion, since events may be a better solution than promises for some people, even though promises are the best way to go. Anyway, an event scenario could look something like this:

var crawler = new Crawler();

function doSomething(res) {
  console.log(res);
}

function doSomethingElse(res) {
  console.log(res);
}

// Look up the parser named in the request options.
const parsers = { doSomething, doSomethingElse };

// 'response' would be the newly added event for each completed request.
crawler.on('response', (err, res) => {
  parsers[res.options.parser](res);
});

crawler.queue({ uri: 'https://google.com', parser: 'doSomething' });
crawler.queue({ uri: 'https://yahoo.com', parser: 'doSomethingElse' });
That's great! I think the only problem is the calling scope. It is similar to how parsers are defined in Scrapy, which requires a framework.
If not, say, we could require developers to define a class with member methods, and the crawler would call them in callbacks. That's a great idea, let's see if we can do more.
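A rough sketch of that class-based idea, under plenty of assumptions; ParserSet and the way it is bound to the crawler are hypothetical, not an existing API:

// Hypothetical: developers define parsers as methods of a class,
// and a shared callback dispatches to the method named in the request options.
const Crawler = require('crawler');

class ParserSet {
  home(res) { console.log('home page length:', res.body.length); }
  search(res) { console.log('search page title:', res.$('title').text()); }
}

const parsers = new ParserSet();
const crawler = new Crawler({
  callback: (err, res, done) => {
    if (!err && typeof parsers[res.options.parser] === 'function') {
      parsers[res.options.parser](res); // dispatch by method name
    }
    done();
  }
});

crawler.queue({ uri: 'https://example.com/', parser: 'home' });
crawler.queue({ uri: 'https://example.com/search', parser: 'search' });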
// 'c', 'sources', 'userAgent' and 'referer' come from elsewhere in the project.
exports.fix = new Promise((resolve, reject) => {
  // One promise per queued URL, settled from inside the crawler callback.
  const loop = sources['stuff'].urls.map((url) => new Promise((resolveOne, rejectOne) => {
    c.queue([{
      uri: url,
      userAgent: userAgent,
      referer: referer,
      callback: function exec(err, res, done) {
        if (err || res.statusCode !== 200) {
          rejectOne(err || new Error('HTTP ' + res.statusCode));
          return done();
        }
        resolveOne(res.$);
        done();
      }
    }]);
  }));
  c.once('error', (error) => reject(error));
  c.once('drain', () => resolve(Promise.all(loop)));
});

This is a snippet from my own project. It doesn't work as-is; it's just to give you an idea of how to create the wrapper.
Thanks @robjane, but the problem is not how to create the wrapper; that should be clear if you read through the conversation above.
const Crawler = require('crawler');
const c = new Crawler();

function getPageAsync(urls) {
  return new Promise((resolve, reject) => {
    // One promise per URL, resolved with the cheerio handle ($) for that page.
    const loop = urls.map((url) => {
      return new Promise((resolveOne, rejectOne) => {
        c.queue([{
          uri: url,
          /* userAgent: userAgent,
             referer: referer, */
          callback: function (err, res, done) {
            if (err || res.statusCode !== 200) {
              rejectOne(err || new Error('HTTP ' + res.statusCode));
              return done();
            }
            resolveOne(res.$);
            done();
          }
        }]);
      });
    });
    c.once('error', (error) => reject(error));
    // When the queue drains, settle with all of the per-URL results.
    c.once('drain', () => {
      Promise.all(loop).then(resolve).catch(reject);
    });
  });
}
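A short usage sketch of the wrapper above (the URLs are placeholders):

// Crawl two pages and print their titles once everything has drained.
getPageAsync(['https://example.com/', 'https://example.org/'])
  .then((pages) => {
    pages.forEach(($) => console.log($('title').text()));
  })
  .catch((err) => console.error('crawl failed:', err));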
If I may offer my opinion - I don't believe Promises should be added to this library. Far too often, libraries try to do too much and become feature-bloated. There's an endless supply of libraries that can wrap callbacks and be used with this beautiful library. I much prefer to use minimal and simple libraries. Instead, you could focus more on plugin support and extending this library to meet users' needs that way.
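As an illustration of wrapping outside the library (a sketch only; queueOne is a hypothetical helper, not part of crawler's API):

// Hypothetical helper: promisify a single queued request without touching the library.
function queueOne(crawler, options) {
  return new Promise((resolve, reject) => {
    crawler.queue({
      ...options,
      callback: (err, res, done) => {
        if (err) reject(err);
        else resolve(res);
        done();
      }
    });
  });
}

// queueOne(c, { uri: 'https://example.com/' }).then((res) => console.log(res.statusCode));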
@bugs181 Couldn't agree more! What's important in this case is that
Does anyone know how I can use the "done" in the callback to execute other functions once the crawler finishes its own work? Is it like done(otherFunc()) or so? Thanks.
I recommend starting a new GitHub issue. You're asking an unrelated question, which is generally frowned upon. We refer to this as "thread hijacking".