Add Wiki retrieval service #324
See the inline comment. Otherwise LGTM.
I'm confused about the functions provided in this PR, as follows:
- Why do we provide wiki_get_infobox, wiki_get_all_wikipedia_tables, wiki_get_page_images_with_captions and wiki_get_page_content_by_paragraph separately, rather than combining them into one function that provides Wikipedia search functionality and returns all available information? For example:
{
"infobox": ...
"content": ... # the tables should be in the content
"image": ...
}
  If I were a user who wanted to search Wikipedia, a simple wikipedia_search function would be enough; too much categorization can create a barrier to understanding. For example:
  - What do infobox and tables stand for?
  - How should users decide which function to use?
- The functionality partly overlaps with that of the service function digest_webpage. Try to reuse it rather than reimplementing it.
- I wonder how we handle fuzzy search in Wikipedia. For example, Wikipedia sometimes returns a candidate list rather than jumping directly to the corresponding page.
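To make the two main points above concrete (a single entry point, and explicit handling of ambiguous queries), here is a rough sketch of what a combined function might look like. It is only an illustration, not code from this PR: the wikipedia_search signature, the resolve_titles callback, the fetchers mapping and the status values are all hypothetical, and the stub fetchers merely stand in for the real wiki_get_* functions, whose actual signatures may differ.

```python
from typing import Any, Callable, Dict, List


def wikipedia_search(
    query: str,
    resolve_titles: Callable[[str], List[str]],
    fetchers: Dict[str, Callable[[str], Any]],
) -> Dict[str, Any]:
    """Hypothetical single entry point that merges per-aspect results.

    resolve_titles maps a free-text query to candidate page titles;
    fetchers maps an output key (e.g. "infobox") to a function that
    takes a page title and returns that aspect of the page.
    """
    candidates = resolve_titles(query)
    if not candidates:
        return {"status": "not_found", "query": query}
    if len(candidates) > 1:
        # Fuzzy match: hand the candidate list back to the caller
        # instead of silently picking one page.
        return {"status": "ambiguous", "query": query, "candidates": candidates}

    title = candidates[0]
    result: Dict[str, Any] = {"status": "ok", "title": title}
    for key, fetch in fetchers.items():
        result[key] = fetch(title)
    return result


# Stubs for illustration only; a real implementation would call the
# Wikipedia API or the existing service functions here.
def stub_resolve(query: str) -> List[str]:
    known = {
        "Alan Turing": ["Alan Turing"],
        "Mercury": ["Mercury (planet)", "Mercury (element)"],
    }
    return known.get(query, [])


STUB_FETCHERS: Dict[str, Callable[[str], Any]] = {
    "infobox": lambda title: {"name": title},
    "content": lambda title: [f"First paragraph about {title}."],
    "image": lambda title: [],
}

exact = wikipedia_search("Alan Turing", stub_resolve, STUB_FETCHERS)
fuzzy = wikipedia_search("Mercury", stub_resolve, STUB_FETCHERS)
missing = wikipedia_search("no such page", stub_resolve, STUB_FETCHERS)
```

With this shape, the caller only needs to check the status field: an exact match carries the infobox, content and image keys from the example above, while an ambiguous query carries the candidate titles so the user or agent can retry with a specific one.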
# Conflicts:
#	src/agentscope/service/__init__.py
lgtm
Description
wiki.py implements Wikipedia retrieval, including text, category lists, infoboxes, images and tables.
wiki_test.py implements the unit tests.
Checklist
Please check the following items before code is ready to be reviewed.