Added non-git source puller functionality #194

Open
wants to merge 45 commits into
base: main
Changes from 1 commit
Commits
ea87f2b
Command-line argument repo_dir is changed
sean-morris Jun 24, 2021
10385bb
Added non-git source puller functionality
sean-morris Jun 24, 2021
ab80daf
Added async functionality to non-git archives
sean-morris Aug 11, 2021
71ca2f4
Update nbgitpuller/plugin_helper.py
sean-morris Nov 3, 2021
ae66e53
Update nbgitpuller/hookspecs.py
sean-morris Nov 3, 2021
8934f5f
renamed and simplified the test_files
sean-morris Nov 4, 2021
ac2072c
added README to plugins
sean-morris Nov 4, 2021
a84096d
added docstring to progress_loop function
sean-morris Nov 4, 2021
86fd7bf
Update tests/test_download_puller.py
sean-morris Nov 4, 2021
c686651
Update tests/test_download_puller.py
sean-morris Nov 4, 2021
f8e04f1
Removed Downloader Plugins from Repo
sean-morris Nov 6, 2021
958b0b1
Added Custom Exception for Bad Provider
sean-morris Nov 6, 2021
2048e8d
Merge branch 'main' of https://github.com/jupyterhub/nbgitpuller
sean-morris Nov 8, 2021
398a03f
merged from master and fixed conflicts
sean-morris Nov 8, 2021
9a8fcab
Removed unused import from test file
sean-morris Nov 8, 2021
78e31c3
Added packages to dev-requirements.txt
sean-morris Nov 8, 2021
a131b93
Moved the two constants and REPO_PARENT_DIR out of __init__.py
sean-morris Nov 10, 2021
55da5e1
Revert some trivial formatting changes
consideRatio Nov 17, 2021
0ca6cf9
Apply suggestions from code review
sean-morris Nov 17, 2021
9e808e5
Changes from code review
sean-morris Nov 17, 2021
8d63ee4
Apply suggestions from code review
sean-morris Nov 19, 2021
deecc7b
Removed setTerminalVisibility from automatically opening in UI
sean-morris Nov 23, 2021
a9e08c4
Reverted a mistaken change to command-line args
sean-morris Nov 23, 2021
09c9249
Hookspecs renamed and documented
sean-morris Nov 23, 2021
0085fab
Hookspecs name and seperate helper_args
sean-morris Nov 23, 2021
88ec806
Renamed for clarity
sean-morris Nov 24, 2021
8592d1f
Seperated actual query_line_args from helper_args
sean-morris Nov 24, 2021
21d8f0f
fixed conflicts
sean-morris Nov 24, 2021
ab5dd10
Fixed tests
sean-morris Nov 24, 2021
e8ae5ca
Removed changes not meant to merged
sean-morris Nov 26, 2021
56ad1ee
Apply suggestions from code review
sean-morris Nov 29, 2021
af567ca
Refactored docstrings
sean-morris Nov 29, 2021
782a35b
Refactored docstrings
sean-morris Nov 29, 2021
d034d37
Merge branch 'non-git' of https://github.com/sean-morris/nbgitpuller …
sean-morris Nov 29, 2021
9729464
Fix temp download dir to use the package tempfile
sean-morris Nov 30, 2021
602ef01
provider is now contentProvider in the html/js/query parameters
sean-morris Nov 30, 2021
3ebdc7e
The download_func and download_func_params brought in separately
sean-morris Nov 30, 2021
e22d076
Moved the handle_files_helper in Class
sean-morris Dec 1, 2021
3b14405
Moved downloader-plugin util to own repo
sean-morris Dec 20, 2021
613f863
Moved downloader-plugin util to own repo
sean-morris Dec 20, 2021
5f39c68
Merge branch 'non-git' of https://github.com/sean-morris/nbgitpuller …
sean-morris Dec 20, 2021
f618560
Removed nested_asyncio from init.py
sean-morris Jan 11, 2022
367f3c7
Moved downloader-plugin handling to puller thread
sean-morris Jan 15, 2022
8893970
Moved downloader plugins handling to pull.py
sean-morris Jan 19, 2022
7590c38
Access downloader-plugin results from plugin instance variable
sean-morris Jan 19, 2022
37 changes: 23 additions & 14 deletions nbgitpuller/plugin_helper.py
@@ -121,18 +121,18 @@ async def execute_unarchive(ext, temp_download_file, temp_download_repo):
         yield e


-async def download_archive(repo_path, temp_download_file):
+async def download_archive(repo=None, temp_download_file=None):
     """
     This requests the file from the repo(url) given and saves it to the disk

-    :param str repo_path: the git repo path
+    :param str repo: the git repo path
     :param str temp_download_file: the path to save the requested file to
     """
     yield "Downloading archive ...\n"
     try:
         CHUNK_SIZE = 1024
         async with aiohttp.ClientSession() as session:
-            async with session.get(repo_path) as response:
+            async with session.get(repo) as response:
                 with open(temp_download_file, 'ab') as fd:
                     count_chunks = 1
                     while True:
@@ -184,8 +184,8 @@ async def handle_files_helper(helper_args, query_line_args):
     back to the origin

     :param dict helper_args: key-value pairs including the:
-        - download function
-        - download parameters in the case
+        - download_func download function
+        - download_func_params download parameters in the case
         that the source needs to handle the download in a specific way(e.g. google
         requires a confirmation of the download)
         - extension (e.g. zip, tar) ] [OPTIONAL] this may or may not be included. If the repo name contains
@@ -203,7 +203,7 @@ async def handle_files_helper(helper_args, query_line_args):
     provider = query_line_args["contentProvider"]
     repo_parent_dir = helper_args["repo_parent_dir"]
     origin_repo = f"{repo_parent_dir}{CACHED_ORIGIN_NON_GIT_REPO}{provider}/{url}/"
-    temp_download_dir = tempfile.TemporaryDirectory(dir="/tmp")
+    temp_download_dir = tempfile.TemporaryDirectory()
     # you can optionally pass the extension of your archive(e.g zip) if it is not identifiable from the URL file name
     # otherwise the extract_file_extension function will pull it off the repo name
     if "extension" not in helper_args:
@@ -223,12 +223,20 @@ async def gener():
                 yield c

             download_func = download_archive
-            download_args = query_line_args["repo"], temp_download_file
-            if "dowload_func" in helper_args:
-                download_func = helper_args["dowload_func"]
-                download_args = helper_args["dowload_func_params"]
-
-            async for d in download_func(*download_args):
+            download_args = {
+                "repo": query_line_args["repo"],
+                "temp_download_file": temp_download_file
+            }
+            # you can pass your own download function as well as download function parameters
+            # if they are different from the standard download function and parameters. Notice I add
+            # the temp_download_file to the parameters
+            if "download_func" in helper_args:
+                download_func = helper_args["download_func"]
+            if "download_func_params" in helper_args:
+                helper_args["download_func_params"]["temp_download_file"] = temp_download_file
+                download_args = helper_args["download_func_params"]
+
+            async for d in download_func(**download_args):
                 yield d

             async for e in execute_unarchive(ext, temp_download_file, temp_download_dir.name):
@@ -245,11 +253,12 @@ async def gener():
             yield "\n\n"
             yield "Process Complete: Archive is finished importing into hub\n"
             yield f"The directory of your download is: {dir_names[0]}\n"
-            temp_download_dir.cleanup()  # remove temporary download space

         except Exception as e:
             logging.exception(e)
             raise ValueError(e)

+        finally:
+            temp_download_dir.cleanup()  # remove temporary download space
     try:
         async for line in gener():
             helper_args["download_q"].put_nowait(line)
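The keyword-argument dispatch and the `finally` cleanup introduced in this diff can be tried in isolation. The following is a minimal sketch, not the code from `plugin_helper.py`. `run_download`, `confirm_then_download` and the `token` parameter are hypothetical names invented for illustration, and the `download_archive` stand-in writes placeholder bytes instead of making an HTTP request. Unlike the diff, the sketch copies `download_func_params` rather than mutating the caller's dict.

```python
import asyncio
import os
import tempfile


async def download_archive(repo=None, temp_download_file=None):
    """Stand-in for the default downloader: writes placeholder bytes, no network."""
    yield "Downloading archive ...\n"
    with open(temp_download_file, "wb") as fd:
        fd.write(b"default:" + repo.encode())


async def confirm_then_download(repo=None, temp_download_file=None, token=None):
    """Hypothetical provider-specific downloader that needs one extra parameter."""
    yield f"Downloading with token {token} ...\n"
    with open(temp_download_file, "wb") as fd:
        fd.write(b"custom:" + repo.encode())


async def run_download(helper_args, query_line_args):
    """Hypothetical driver: pick the download function, call it with keywords, clean up."""
    temp_download_dir = tempfile.TemporaryDirectory()
    messages = []
    try:
        temp_download_file = os.path.join(temp_download_dir.name, "download.zip")
        download_func = download_archive
        download_args = {
            "repo": query_line_args["repo"],
            "temp_download_file": temp_download_file,
        }
        if "download_func" in helper_args:
            download_func = helper_args["download_func"]
        if "download_func_params" in helper_args:
            # Copy the caller's params (an assumption of this sketch) and add the temp path.
            download_args = dict(
                helper_args["download_func_params"],
                temp_download_file=temp_download_file,
            )
        async for line in download_func(**download_args):
            messages.append(line)
        with open(temp_download_file, "rb") as fd:
            payload = fd.read()
    finally:
        # Runs on success and on error, so the temporary space is always removed.
        temp_download_dir.cleanup()
    return messages, payload, temp_download_dir.name


if __name__ == "__main__":
    params = {"repo": "https://example.org/b.zip", "token": "abc"}
    helper = {"download_func": confirm_then_download, "download_func_params": params}
    print(asyncio.run(run_download(helper, {"repo": "unused"})))
```

Because the arguments are passed with `**`, a custom download function can accept extra parameters such as `token` as long as it also accepts `temp_download_file`.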