Updated module for handling large datasets #79
Conversation
I tried chunking the dataset, but it was getting complex, so I used a Dask DataFrame instead for better memory efficiency when loading and handling large datasets.
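A minimal sketch of the Dask-based loading described here, not the module's actual code; the file path is hypothetical:

```python
import dask.dataframe as dd

# dd.read_csv lazily splits the file into partitions instead of
# loading everything into memory at once
df = dd.read_csv("large_dataset.csv")

# Operations build a lazy task graph; data is only read when
# .compute() materializes the result
summary = df.describe().compute()
print(summary)
```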
Please give steps for testing.
Actually, I added this function to the original module, but significant changes have been made since then, such as methods like .analyse that got included when I pulled origin before committing. After those changes were made to the repo, I think the module itself stopped working. If you want, I can send you test.py to test my method, because the module itself is currently not working for me.
Yes @DarshAgrawal14, I can see merge conflicts. Please do a git pull and bring your code changes in there.
@ombhojane I have resolved the issue; the PR is ready to merge.
@ombhojane, if you require any changes, please let me know.
Thanks for contributing!
Fixed issue: #2
I initially started with a chunking approach that divided the dataset into chunks, but managing the chunks made the class complex and slow.
I therefore used Dask DataFrames instead of pandas, which improves memory efficiency and can handle datasets of any size without interfering with the other functions. A rough sketch contrasting the two approaches follows.
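For contrast, a minimal sketch of the chunked-pandas pattern the description says was dropped; the file path and the "value" column are hypothetical. Every partial result has to be accumulated and merged by hand, which is the bookkeeping that made the class complex:

```python
import pandas as pd

# Chunked reading keeps memory bounded, but each statistic must be
# accumulated manually across chunks ("value" is a hypothetical column)
total, count = 0.0, 0
for chunk in pd.read_csv("large_dataset.csv", chunksize=100_000):
    total += chunk["value"].sum()
    count += len(chunk)

print("mean:", total / count)
```

With Dask, the same mean is a single lazy expression, `df["value"].mean().compute()`, with the partitioning handled internally.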
Darsh Agrawal, GSSoC '24 Extended contributor