perf(datasets): lazily load datasets in init files #277

Merged
merged 29 commits into from
Jul 31, 2023

deepyaman
Member

@deepyaman deepyaman commented Jul 22, 2023

Description

  1. Make imports faster, especially when heavy dependencies of unused datasets in the same category are installed. For example, if pandas.ExcelDataSet has installed dependencies that are very expensive to load but you only need pandas.CSVDataSet, you no longer spend time importing the pandas.ExcelDataSet dependencies.
  2. Provide more meaningful error messages. If the dependencies for pandas.ExcelDataSet are not installed, you should get an error that names the missing modules, not a cryptic message saying that the ExcelDataSet attribute doesn't exist.
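
Both goals can be illustrated with a minimal, stdlib-only sketch of lazy attribute loading via a module-level `__getattr__` (PEP 562). This is not the code from this PR. The names `make_lazy_getattr`, `demo_datasets`, `ExcelLikeDataSet`, and `hypothetical_missing_dependency` are invented for illustration, and `json.decoder` merely stands in for a dataset submodule.

```python
import importlib
import types


def make_lazy_getattr(attr_to_module, loaded):
    """Build a PEP 562 module-level __getattr__ that imports on first access.

    attr_to_module maps a public attribute name to the module that defines it.
    loaded is a list that records which modules were actually imported.
    """

    def __getattr__(name):
        try:
            module_name = attr_to_module[name]
        except KeyError:
            # Unknown names still behave like a normal missing attribute.
            raise AttributeError(f"module has no attribute {name!r}") from None
        # The import happens here, on attribute access, not at package import.
        # If a dependency is missing, ModuleNotFoundError propagates and
        # names the module that could not be found.
        module = importlib.import_module(module_name)
        loaded.append(module_name)
        return getattr(module, name)

    return __getattr__


loaded = []
# Hypothetical stand-in for a dataset package's __init__ module.
demo = types.ModuleType("demo_datasets")
demo.__getattr__ = make_lazy_getattr(
    {
        "JSONDecoder": "json.decoder",  # stand-in for a cheap dataset
        "ExcelLikeDataSet": "hypothetical_missing_dependency",  # not installed
    },
    loaded,
)

# Only the module backing the requested attribute is imported.
decoder_cls = demo.JSONDecoder

# Accessing an attribute whose module is unavailable reports the real cause.
try:
    demo.ExcelLikeDataSet
except ModuleNotFoundError as exc:
    missing_error = str(exc)
```

In this sketch, `loaded` contains only `json.decoder` after the first access, and `missing_error` mentions the missing module by name instead of reporting a missing attribute.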

Development notes

Valid import behavior

On perf/datasets/lazy-loader branch:

Output of python -X importtime -c'import kedro_datasets.pandas;kedro_datasets.pandas.CSVDataSet':

import time: self [us] | cumulative | imported package
import time:       229 |        229 |   _io
import time:        32 |         32 |   marshal
import time:       330 |        330 |   posix
import time:       815 |       1405 | _frozen_importlib_external
import time:       574 |        574 |   time
import time:       186 |        759 | zipimport
import time:       163 |        163 |     _codecs
import time:      1011 |       1173 |   codecs
import time:      1268 |       1268 |   encodings.aliases
import time:      2957 |       5398 | encodings
import time:       841 |        841 | encodings.utf_8
import time:        95 |         95 | _signal
import time:        53 |         53 |     _abc
import time:       659 |        711 |   abc
import time:       822 |       1533 | io
import time:        38 |         38 |       _stat
import time:       429 |        467 |     stat
import time:      1190 |       1190 |     _collections_abc
import time:       422 |        422 |       genericpath
import time:       686 |       1107 |     posixpath
import time:       967 |       3730 |   os
import time:       729 |        729 |   _sitebuiltins
import time:      1464 |       1464 |   _distutils_hack
import time:       148 |        148 |   sitecustomize
import time:        34 |         34 |   usercustomize
import time:      4779 |      10881 | site
import time:      1426 |       1426 |   kedro_datasets
import time:        63 |         63 |       itertools
import time:       399 |        399 |       keyword
import time:        97 |         97 |         _operator
import time:       463 |        560 |       operator
import time:       431 |        431 |       reprlib
import time:        40 |         40 |       _collections
import time:      1491 |       2982 |     collections
import time:       491 |        491 |     collections.abc
import time:       642 |        642 |         types
import time:        34 |         34 |         _functools
import time:       680 |       1355 |       functools
import time:       594 |       1949 |     contextlib
import time:       917 |        917 |       enum
import time:        35 |         35 |         _sre
import time:       507 |        507 |           sre_constants
import time:       485 |        991 |         sre_parse
import time:       491 |       1516 |       sre_compile
import time:        42 |         42 |       _locale
import time:       391 |        391 |       copyreg
import time:       670 |       3534 |     re
import time:      1533 |      10487 |   typing
import time:        41 |         41 |       _ast
import time:       950 |        990 |     ast
import time:       761 |        761 |       warnings
import time:       515 |       1276 |     importlib
import time:       292 |        292 |       importlib._abc
import time:       489 |        780 |     importlib.util
import time:       772 |        772 |           _opcode
import time:       463 |       1235 |         opcode
import time:       559 |       1794 |       dis
import time:       384 |        384 |       importlib.machinery
import time:       525 |        525 |           token
import time:       748 |       1273 |         tokenize
import time:       487 |       1760 |       linecache
import time:      1043 |       4979 |     inspect
import time:       465 |       8489 |   lazy_loader
import time:      3573 |      23973 | kedro_datasets.pandas
import time:       376 |        376 |   traceback
import time:       401 |        401 |     _weakrefset
import time:       454 |        854 |   weakref
import time:        19 |         19 |     _string
import time:       677 |        696 |   string
import time:       669 |        669 |   threading
import time:        22 |         22 |   atexit
import time:      2026 |       4640 | logging
import time:        63 |         63 |       org
import time:        15 |         77 |     org.python
import time:        13 |         90 |   org.python.core
import time:       493 |        583 | copy
import time:       497 |        497 |   fnmatch
import time:        53 |         53 |     _winapi
import time:        41 |         41 |     nt
import time:        39 |         39 |     nt
import time:        35 |         35 |     nt
import time:        35 |         35 |     nt
import time:       652 |        852 |   ntpath
import time:        34 |         34 |   errno
import time:       496 |        496 |     urllib
import time:       796 |       1291 |   urllib.parse
import time:      1213 |       3886 | pathlib
import time:       908 |        908 |       _csv
import time:       918 |       1825 |     csv
import time:       547 |        547 |     email
import time:      2038 |       2038 |       binascii
import time:       441 |        441 |         zlib
import time:       372 |        372 |           _compression
import time:      1541 |       1541 |           _bz2
import time:       474 |       2386 |         bz2
import time:      1684 |       1684 |           _lzma
import time:       414 |       2097 |         lzma
import time:       536 |       5459 |       shutil
import time:       434 |        434 |         _struct
import time:       333 |        767 |       struct
import time:       728 |       8990 |     zipfile
import time:       871 |        871 |     textwrap
import time:       222 |        222 |         uu
import time:       182 |        182 |         quopri
import time:       848 |        848 |             math
import time:       582 |        582 |               _bisect
import time:       473 |       1055 |             bisect
import time:       853 |        853 |             _random
import time:       781 |        781 |             _sha512
import time:       469 |       4004 |           random
import time:      1221 |       1221 |             _socket
import time:       710 |        710 |               select
import time:       593 |       1302 |             selectors
import time:       918 |        918 |             array
import time:       766 |       4206 |           socket
import time:       754 |        754 |             _datetime
import time:       808 |       1562 |           datetime
import time:       672 |        672 |               locale
import time:       654 |       1325 |             calendar
import time:       371 |       1696 |           email._parseaddr
import time:       282 |        282 |               base64
import time:       304 |        585 |             email.base64mime
import time:       328 |        328 |             email.quoprimime
import time:       424 |        424 |             email.errors
import time:       314 |        314 |             email.encoders
import time:       413 |       2062 |           email.charset
import time:       340 |      13868 |         email.utils
import time:       600 |        600 |           email.header
import time:       336 |        936 |         email._policybase
import time:       327 |        327 |         email._encoded_words
import time:       169 |        169 |         email.iterators
import time:       419 |      16119 |       email.message
import time:       269 |        269 |         importlib.metadata._functools
import time:       294 |        563 |       importlib.metadata._text
import time:       390 |      17071 |     importlib.metadata._adapters
import time:       397 |        397 |     importlib.metadata._meta
import time:       448 |        448 |     importlib.metadata._collections
import time:       184 |        184 |     importlib.metadata._itertools
import time:       580 |        580 |     importlib.abc
import time:       942 |      31851 |   importlib.metadata
import time:      1018 |       1018 |           _json
import time:       469 |       1486 |         json.scanner
import time:       421 |       1906 |       json.decoder
import time:       385 |        385 |       json.encoder
import time:      1323 |       3613 |     json
import time:       802 |       4414 |   fsspec._version
import time:       355 |        355 |       concurrent
import time:       402 |        402 |       concurrent.futures._base
import time:       384 |       1140 |     concurrent.futures
import time:       719 |        719 |           _heapq
import time:       293 |       1012 |         heapq
import time:       389 |        389 |         _queue
import time:       252 |       1652 |       queue
import time:       412 |       2064 |     concurrent.futures.thread
import time:       405 |       3608 |   fsspec.caching
import time:       270 |        270 |   fsspec.callbacks
import time:       328 |        328 |       __future__
import time:     17757 |      17757 |         _hashlib
import time:       420 |        420 |         _blake2
import time:       282 |      18459 |       hashlib
import time:       358 |      19144 |     fsspec.utils
import time:       465 |        465 |       glob
import time:      1026 |       1026 |         configparser
import time:       133 |       1158 |       fsspec.config
import time:       122 |        122 |       fsspec.dircache
import time:       207 |        207 |       fsspec.transaction
import time:       522 |       2472 |     fsspec.spec
import time:        40 |         40 |     isal
import time:       466 |        466 |     gzip
import time:        38 |         38 |     lzmaffi
import time:        34 |         34 |     snappy
import time:        32 |         32 |       lz4
import time:         7 |         38 |     lz4.frame
import time:        29 |         29 |     zstandard
import time:       202 |      22460 |   fsspec.compression
import time:       230 |        230 |     fsspec.registry
import time:       170 |        400 |   fsspec.core
import time:       569 |        569 |           signal
import time:      1105 |       1105 |           fcntl
import time:        35 |         35 |           msvcrt
import time:       579 |        579 |           _posixsubprocess
import time:       412 |       2698 |         subprocess
import time:      2814 |       2814 |           _ssl
import time:      1418 |       4231 |         ssl
import time:       423 |        423 |         asyncio.constants
import time:       202 |        202 |             asyncio.format_helpers
import time:       200 |        401 |           asyncio.base_futures
import time:       153 |        153 |           asyncio.log
import time:       325 |        878 |         asyncio.coroutines
import time:       748 |        748 |             _contextvars
import time:       376 |       1124 |           contextvars
import time:       332 |        332 |             asyncio.exceptions
import time:       305 |        305 |             asyncio.base_tasks
import time:       496 |       1132 |           _asyncio
import time:       362 |       2617 |         asyncio.events
import time:       253 |        253 |         asyncio.futures
import time:       319 |        319 |         asyncio.protocols
import time:       381 |        381 |           asyncio.transports
import time:       303 |        684 |         asyncio.sslproto
import time:       335 |        335 |             asyncio.mixins
import time:       408 |        408 |             asyncio.tasks
import time:       281 |       1023 |           asyncio.locks
import time:       674 |       1697 |         asyncio.staggered
import time:       341 |        341 |         asyncio.trsock
import time:       538 |      14674 |       asyncio.base_events
import time:       236 |        236 |       asyncio.runners
import time:       357 |        357 |       asyncio.queues
import time:       319 |        319 |       asyncio.streams
import time:       234 |        234 |       asyncio.subprocess
import time:       175 |        175 |       asyncio.threads
import time:       229 |        229 |         asyncio.base_subprocess
import time:       366 |        366 |         asyncio.selector_events
import time:       521 |       1116 |       asyncio.unix_events
import time:       361 |      17469 |     asyncio
import time:       101 |      17569 |   fsspec.exceptions
import time:       267 |        267 |   fsspec.mapping
import time:     18680 |      99515 | fsspec
import time:       228 |        228 |       numpy._utils
import time:       831 |       1058 |     numpy._globals
import time:       331 |        331 |     numpy.exceptions
import time:       108 |        108 |     numpy._distributor_init
import time:       167 |        167 |     numpy.__config__
import time:       239 |        239 |         numpy._version
import time:        14 |         14 |         numpy._version_meson
import time:       238 |        490 |       numpy.version
import time:       369 |        369 |           numpy._utils._inspect
import time:       608 |        608 |             numpy.core._exceptions
import time:       287 |        287 |             numpy.dtypes
import time:     15953 |      16847 |           numpy.core._multiarray_umath
import time:       572 |      17787 |         numpy.core.overrides
import time:       959 |      18746 |       numpy.core.multiarray
import time:       278 |        278 |       numpy.core.umath
import time:       347 |        347 |         numbers
import time:       206 |        206 |         numpy.core._string_helpers
import time:        41 |         41 |               pickle5
import time:       614 |        614 |                 _compat_pickle
import time:      1435 |       1435 |                 _pickle
import time:       124 |        124 |                     org
import time:        32 |        156 |                   org.python
import time:        30 |        185 |                 org.python.core
import time:       834 |       3067 |               pickle
import time:       223 |       3331 |             numpy.compat.py3k
import time:       292 |       3622 |           numpy.compat
import time:       311 |        311 |           numpy.core._dtype
import time:       690 |       4622 |         numpy.core._type_aliases
import time:       508 |       5682 |       numpy.core.numerictypes
import time:       524 |        524 |               numpy.core._ufunc_config
import time:       554 |       1078 |             numpy.core._methods
import time:      1038 |       2115 |           numpy.core.fromnumeric
import time:       484 |       2599 |         numpy.core.shape_base
import time:       425 |        425 |         numpy.core.arrayprint
import time:       221 |        221 |         numpy.core._asarray
import time:      2430 |       5673 |       numpy.core.numeric
import time:       594 |        594 |       numpy.core.defchararray
import time:       333 |        333 |       numpy.core.records
import time:       233 |        233 |       numpy.core.memmap
import time:       247 |        247 |       numpy.core.function_base
import time:       109 |        109 |       numpy.core._machar
import time:       320 |        320 |       numpy.core.getlimits
import time:       280 |        280 |       numpy.core.einsumfunc
import time:      1196 |       1196 |         numpy.core._multiarray_tests
import time:      1086 |       2281 |       numpy.core._add_newdocs
import time:       466 |        466 |       numpy.core._add_newdocs_scalars
import time:       212 |        212 |       numpy.core._dtype_ctypes
import time:      2284 |       2284 |           _ctypes
import time:       544 |        544 |           ctypes._endian
import time:      1096 |       3923 |         ctypes
import time:       422 |       4345 |       numpy.core._internal
import time:       203 |        203 |       numpy._pytesttester
import time:       557 |      41042 |     numpy.core
import time:       680 |        680 |       numpy.lib.mixins
import time:       189 |        189 |           numpy.lib.ufunclike
import time:       342 |        531 |         numpy.lib.type_check
import time:      1682 |       2212 |       numpy.lib.scimath
import time:       321 |        321 |                   numpy.lib.stride_tricks
import time:       182 |        502 |                 numpy.lib.twodim_base
import time:      2124 |       2124 |                 numpy.linalg._umath_linalg
import time:       498 |        498 |                   numpy._typing._nested_sequence
import time:       138 |        138 |                   numpy._typing._nbit
import time:       498 |        498 |                   numpy._typing._char_codes
import time:       187 |        187 |                   numpy._typing._scalars
import time:        94 |         94 |                   numpy._typing._shape
import time:       788 |        788 |                   numpy._typing._dtype_like
import time:       926 |        926 |                   numpy._typing._array_like
import time:       573 |       3698 |                 numpy._typing
import time:       903 |       7226 |               numpy.linalg.linalg
import time:       172 |       7397 |             numpy.linalg
import time:       574 |       7970 |           numpy.matrixlib.defmatrix
import time:       248 |       8218 |         numpy.matrixlib
import time:       302 |        302 |           numpy.lib.histograms
import time:       753 |       1055 |         numpy.lib.function_base
import time:       470 |       9742 |       numpy.lib.index_tricks
import time:       463 |        463 |       numpy.lib.nanfunctions
import time:       662 |        662 |       numpy.lib.shape_base
import time:       442 |        442 |       numpy.lib.polynomial
import time:      1293 |       1293 |         platform
import time:       413 |       1706 |       numpy.lib.utils
import time:       319 |        319 |       numpy.lib.arraysetops
import time:       263 |        263 |         numpy.lib.format
import time:       231 |        231 |         numpy.lib._datasource
import time:       389 |        389 |         numpy.lib._iotools
import time:       410 |       1292 |       numpy.lib.npyio
import time:       172 |        172 |       numpy.lib.arrayterator
import time:       126 |        126 |       numpy.lib.arraypad
import time:        94 |         94 |       numpy.lib._version
import time:       437 |      18340 |     numpy.lib
import time:       931 |        931 |         numpy.fft._pocketfft_internal
import time:       667 |       1598 |       numpy.fft._pocketfft
import time:       855 |        855 |       numpy.fft.helper
import time:       302 |       2754 |     numpy.fft
import time:       339 |        339 |         numpy.polynomial.polyutils
import time:       434 |        434 |         numpy.polynomial._polybase
import time:      3363 |       4136 |       numpy.polynomial.polynomial
import time:       343 |        343 |       numpy.polynomial.chebyshev
import time:       273 |        273 |       numpy.polynomial.legendre
import time:       277 |        277 |       numpy.polynomial.hermite
import time:       324 |        324 |       numpy.polynomial.hermite_e
import time:       262 |        262 |       numpy.polynomial.laguerre
import time:       792 |       6404 |     numpy.polynomial
import time:       126 |        126 |               backports_abc
import time:      2286 |       2411 |             numpy.random._common
import time:       470 |        470 |               hmac
import time:       345 |        814 |             secrets
import time:      1353 |       4578 |           numpy.random.bit_generator
import time:      1374 |       1374 |           numpy.random._bounded_integers
import time:       606 |        606 |           numpy.random._mt19937
import time:      5343 |      11899 |         numpy.random.mtrand
import time:       805 |        805 |         numpy.random._philox
import time:       914 |        914 |         numpy.random._pcg64
import time:      1037 |       1037 |         numpy.random._sfc64
import time:      2792 |       2792 |         numpy.random._generator
import time:       574 |      18019 |       numpy.random._pickle
import time:       309 |      18328 |     numpy.random
import time:       425 |        425 |     numpy.ctypeslib
import time:      2833 |       2833 |       numpy.ma.core
import time:       864 |        864 |       numpy.ma.extras
import time:       370 |       4066 |     numpy.ma
import time:     19279 |     112295 |   numpy
import time:       534 |        534 |     pytz.exceptions
import time:       238 |        238 |     pytz.lazy
import time:      5800 |       5800 |     pytz.tzinfo
import time:       122 |        122 |     pytz.tzfile
import time:       757 |       7450 |   pytz
import time:       338 |        338 |     dateutil._version
import time:       469 |        807 |   dateutil
import time:      1545 |       1545 |     pandas._typing
import time:       387 |        387 |     pandas.compat._constants
import time:       382 |        382 |     pandas.compat.compressors
import time:      3998 |       3998 |                             pandas._libs.tslibs.np_datetime
import time:      3379 |       7376 |                           pandas._libs.tslibs.dtypes
import time:      5791 |       5791 |                             pandas._libs.tslibs.base
import time:      1784 |       1784 |                                 pandas._libs.tslibs.nattype
import time:       228 |        228 |                                       pandas.util._exceptions
import time:      1515 |       1515 |                                       pandas.util.version
import time:       360 |       2103 |                                     pandas.compat._optional
import time:       785 |        785 |                                         sysconfig
import time:       485 |        485 |                                         _sysconfigdata__darwin_darwin
import time:       299 |        299 |                                         _osx_support
import time:       618 |       2186 |                                       zoneinfo._tzpath
import time:       285 |        285 |                                       zoneinfo._common
import time:      1084 |       1084 |                                       _zoneinfo
import time:       548 |       4102 |                                     zoneinfo
import time:       736 |        736 |                                         six
import time:        19 |         19 |                                         six.moves
import time:       272 |        272 |                                         dateutil.tz._common
import time:       213 |        213 |                                         dateutil.tz._factories
import time:        11 |         11 |                                           six.moves.winreg
import time:       135 |        145 |                                         dateutil.tz.win
import time:       709 |       2092 |                                       dateutil.tz.tz
import time:       298 |       2390 |                                     dateutil.tz
import time:      1779 |      10373 |                                   pandas._libs.tslibs.timezones
import time:       998 |        998 |                                     pandas._libs.tslibs.ccalendar
import time:       715 |        715 |                                     _strptime
import time:       605 |        605 |                                         pandas._config.config
import time:       281 |        281 |                                         pandas._config.dates
import time:       103 |        103 |                                         pandas._config.display
import time:       225 |       1214 |                                       pandas._config
import time:       100 |       1313 |                                     pandas._config.localization
import time:      1778 |       4804 |                                   pandas._libs.tslibs.fields
import time:      3205 |      18380 |                                 pandas._libs.tslibs.timedeltas
import time:      2091 |       2091 |                                 pandas._libs.tslibs.tzconversion
import time:        48 |         48 |                                 backports_abc
import time:      3030 |      25332 |                               pandas._libs.tslibs.timestamps
import time:        41 |         41 |                               backports_abc
import time:       218 |        218 |                               dateutil.easter
import time:       100 |        100 |                                 dateutil._common
import time:       231 |        331 |                               dateutil.relativedelta
import time:       729 |        729 |                               pandas._libs.properties
import time:     11839 |      38487 |                             pandas._libs.tslibs.offsets
import time:       114 |        114 |                               backports_abc
import time:      2200 |       2200 |                                 _decimal
import time:       465 |       2664 |                               decimal
import time:       416 |        416 |                                 dateutil.parser._parser
import time:       192 |        192 |                                 dateutil.parser.isoparser
import time:       153 |        760 |                               dateutil.parser
import time:      5975 |       5975 |                               pandas._libs.tslibs.strptime
import time:      2775 |      12286 |                             pandas._libs.tslibs.parsing
import time:      3934 |      60497 |                           pandas._libs.tslibs.conversion
import time:      2032 |       2032 |                           pandas._libs.tslibs.period
import time:      1955 |       1955 |                           pandas._libs.tslibs.vectorized
import time:       512 |      72370 |                         pandas._libs.tslibs
import time:        40 |      72409 |                       pandas._libs.tslibs.nattype
import time:       865 |        865 |                       pandas._libs.ops_dispatch
import time:     11310 |      84583 |                     pandas._libs.missing
import time:      6383 |      90966 |                   pandas._libs.hashtable
import time:      2398 |       2398 |                   pandas._libs.algos
import time:     10003 |     103366 |                 pandas._libs.interval
import time:       536 |     103902 |               pandas._libs
import time:        17 |     103918 |             pandas._libs.properties
import time:       994 |     104912 |           pandas.util._decorators
import time:       322 |        322 |               pandas.core
import time:       450 |        772 |             pandas.core.util
import time:      2517 |       2517 |             pandas._libs.lib
import time:      1132 |       1132 |             pandas._libs.hashing
import time:       433 |        433 |               pandas.core.dtypes
import time:       511 |        511 |                 pandas.errors
import time:       350 |        350 |                 pandas.core.dtypes.generic
import time:       299 |       1159 |               pandas.core.dtypes.base
import time:       198 |        198 |                 pandas.core.dtypes.inference
import time:       779 |        977 |               pandas.core.dtypes.dtypes
import time:       631 |       3199 |             pandas.core.dtypes.common
import time:       465 |       8083 |           pandas.core.util.hashing
import time:       325 |     113319 |         pandas.util
import time:        16 |     113334 |       pandas.util.version
import time:       238 |     113572 |     pandas.compat.numpy
import time:        26 |         26 |         gc
import time:       854 |        854 |         pyarrow._generated_version
import time:        74 |         74 |           backports_abc
import time:        41 |         41 |           pickle5
import time:        37 |         37 |           cloudpickle
import time:       320 |        320 |           pyarrow.util
import time:        28 |         28 |           pyarrow.collections
import time:        17 |         17 |           pyarrow.enum
import time:     60419 |      60933 |         pyarrow.lib
import time:      1278 |       1278 |         pyarrow._hdfsio
import time:       324 |        324 |           pyarrow.filesystem
import time:       292 |        616 |         pyarrow.hdfs
import time:       328 |        328 |         pyarrow.ipc
import time:       225 |        225 |         pyarrow.types
import time:       715 |      64971 |       pyarrow
import time:       266 |      65236 |     pandas.compat.pyarrow
import time:       631 |     181751 |   pandas.compat
import time:      1486 |       1486 |   pandas._libs.tslib
import time:       565 |        565 |   pandas.core.config_init
import time:       419 |        419 |     pandas.core.dtypes.missing
import time:       294 |        294 |           pandas.io
import time:       550 |        844 |         pandas.io._util
import time:       797 |       1640 |       pandas.core.dtypes.cast
import time:       470 |        470 |         pandas.core.dtypes.astype
import time:       557 |       1026 |       pandas.core.dtypes.concat
import time:       346 |        346 |         pandas.core.array_algos
import time:       360 |        360 |           pandas.core.common
import time:       256 |        616 |         pandas.core.construction
import time:       499 |       1460 |       pandas.core.array_algos.take
import time:       457 |        457 |         pandas.core.indexers.utils
import time:       284 |        741 |       pandas.core.indexers
import time:      2194 |       7060 |     pandas.core.algorithms
import time:      1092 |       1092 |           unicodedata
import time:       233 |        233 |           pandas.util._validators
import time:       223 |        223 |           pandas.core.roperator
import time:       936 |        936 |                   pandas._libs.ops
import time:       196 |        196 |                   pandas.core.computation
import time:       220 |        220 |                     pandas.core.computation.check
import time:       451 |        670 |                   pandas.core.computation.expressions
import time:       238 |        238 |                   pandas.core.ops.missing
import time:       171 |        171 |                   pandas.core.ops.dispatch
import time:       199 |        199 |                   pandas.core.ops.invalid
import time:       439 |       2847 |                 pandas.core.ops.array_ops
import time:        83 |         83 |                 pandas.core.ops.common
import time:       101 |        101 |                 pandas.core.ops.docstrings
import time:        77 |         77 |                 pandas.core.ops.mask_ops
import time:        83 |         83 |                 pandas.core.ops.methods
import time:       294 |       3483 |               pandas.core.ops
import time:         9 |       3491 |             pandas.core.ops.common
import time:       278 |       3769 |           pandas.core.arraylike
import time:       319 |        319 |             pandas.compat.numpy.function
import time:       277 |        277 |             pandas.core.missing
import time:       167 |        167 |             pandas.core.array_algos.quantile
import time:       257 |        257 |             pandas.core.sorting
import time:      1015 |       2032 |           pandas.core.arrays.base
import time:       163 |        163 |             pandas.core.strings
import time:       471 |        634 |           pandas.core.strings.base
import time:       239 |        239 |             pandas.tseries
import time:       508 |        746 |           pandas.tseries.frequencies
import time:     20871 |      20871 |             pyarrow._compute
import time:       340 |        340 |             pyarrow._compute_docstrings
import time:       184 |        184 |             pyarrow.vendored
import time:       676 |        676 |                 pkgutil
import time:      1048 |       1723 |               pydoc
import time:      1125 |       2847 |             pyarrow.vendored.docscrape
import time:     12600 |      36841 |           pyarrow.compute
import time:       260 |        260 |           pandas.core.arrays.arrow._arrow_utils
import time:       274 |        274 |           pandas.core.arrays.arrow.dtype
import time:      1026 |      47125 |         pandas.core.arrays.arrow.array
import time:       494 |      47619 |       pandas.core.arrays.arrow
import time:       184 |        184 |         pandas.core.array_algos.masked_accumulations
import time:       595 |        595 |           pandas.core.nanops
import time:       203 |        203 |           pandas.core.array_algos.masked_reductions
import time:       615 |       1412 |         pandas.core.arrays.masked
import time:       705 |       2300 |       pandas.core.arrays.boolean
import time:       925 |        925 |         pandas._libs.arrays
import time:       285 |        285 |         pandas.core.accessor
import time:       188 |        188 |           pandas.core.array_algos.transforms
import time:       551 |        739 |         pandas.core.arrays._mixins
import time:       431 |        431 |         pandas.core.base
import time:       351 |        351 |         pandas.core.strings.object_array
import time:       271 |        271 |         pandas.io.formats
import time:       386 |        386 |         pandas.io.formats.console
import time:       606 |       3991 |       pandas.core.arrays.categorical
import time:       194 |        194 |           pandas.core.array_algos.datetimelike_accumulations
import time:       323 |        323 |             pandas.core.arrays.numeric
import time:       296 |        619 |           pandas.core.arrays.integer
import time:       804 |       1616 |         pandas.core.arrays.datetimelike
import time:       102 |        102 |         pandas.core.arrays._ranges
import time:       206 |        206 |         pandas.tseries.offsets
import time:       490 |       2413 |       pandas.core.arrays.datetimes
import time:       121 |        121 |       pandas.core.arrays.floating
import time:       346 |        346 |         pandas.core.arrays.timedeltas
import time:       712 |       1058 |       pandas.core.arrays.interval
import time:       284 |        284 |       pandas.core.arrays.numpy_
import time:       417 |        417 |       pandas.core.arrays.period
import time:      1546 |       1546 |             pandas._libs.sparse
import time:       129 |        129 |             pandas.core.arrays.sparse.dtype
import time:       333 |        333 |             pandas.io.formats.printing
import time:       332 |       2338 |           pandas.core.arrays.sparse.array
import time:       176 |       2514 |         pandas.core.arrays.sparse.accessor
import time:        86 |       2599 |       pandas.core.arrays.sparse
import time:       377 |        377 |       pandas.core.arrays.string_
import time:       252 |        252 |       pandas.core.arrays.string_arrow
import time:       273 |      61697 |     pandas.core.arrays
import time:       164 |        164 |     pandas.core.flags
import time:       869 |        869 |         pandas._libs.reduction
import time:       456 |        456 |         pandas.core.apply
import time:       910 |        910 |               pandas._libs.indexing
import time:       184 |        184 |                 pandas.core.indexes
import time:      1883 |       1883 |                   pandas._libs.index
import time:        59 |         59 |                     backports_abc
import time:      1318 |       1376 |                   pandas._libs.internals
import time:      1774 |       1774 |                   pandas._libs.join
import time:       195 |        195 |                   pandas.core.array_algos.putmask
import time:       211 |        211 |                   pandas.core.indexes.frozen
import time:      1522 |       1522 |                   pandas.core.strings.accessor
import time:      7325 |      14284 |                 pandas.core.indexes.base
import time:       167 |        167 |                   pandas.core.indexes.extension
import time:       376 |        542 |                 pandas.core.indexes.category
import time:       368 |        368 |                     pandas.core.indexes.range
import time:       195 |        195 |                       pandas.core.tools
import time:       264 |        459 |                     pandas.core.tools.timedeltas
import time:       467 |       1293 |                   pandas.core.indexes.datetimelike
import time:        97 |         97 |                   pandas.core.tools.times
import time:       608 |       1996 |                 pandas.core.indexes.datetimes
import time:       835 |        835 |                   pandas.core.indexes.multi
import time:       271 |        271 |                   pandas.core.indexes.timedeltas
import time:       683 |       1788 |                 pandas.core.indexes.interval
import time:       379 |        379 |                 pandas.core.indexes.period
import time:       450 |      19621 |               pandas.core.indexes.api
import time:      1336 |      21866 |             pandas.core.indexing
import time:       198 |        198 |             pandas.core.sample
import time:       155 |        155 |             pandas.core.array_algos.replace
import time:       938 |        938 |                   pandas._libs.writers
import time:       668 |       1605 |                 pandas.core.internals.blocks
import time:       428 |       2032 |               pandas.core.internals.api
import time:       154 |        154 |                 pandas.core.internals.base
import time:       350 |        503 |               pandas.core.internals.array_manager
import time:       334 |        334 |                   pandas.core.internals.ops
import time:       783 |       1116 |                 pandas.core.internals.managers
import time:       327 |       1443 |               pandas.core.internals.concat
import time:       294 |       4271 |             pandas.core.internals
import time:       551 |        551 |             pandas.core.internals.construction
import time:        96 |         96 |               pandas.core.methods
import time:       197 |        197 |                 pandas.core.reshape
import time:       376 |        572 |               pandas.core.reshape.concat
import time:       667 |        667 |                   dataclasses
import time:       524 |        524 |                   mmap
import time:        27 |         27 |                     pwd
import time:       454 |        454 |                     grp
import time:       744 |       1224 |                   tarfile
import time:       277 |        277 |                   pandas.core.shared_docs
import time:      1115 |       3805 |                 pandas.io.common
import time:       616 |       4421 |               pandas.io.formats.format
import time:       204 |       5291 |             pandas.core.methods.describe
import time:       153 |        153 |                   pandas._libs.window
import time:      1403 |       1555 |                 pandas._libs.window.aggregations
import time:       865 |        865 |                   pandas._libs.window.indexers
import time:       398 |       1263 |                 pandas.core.indexers.objects
import time:       103 |        103 |                 pandas.core.util.numba_
import time:        99 |         99 |                 pandas.core.window.common
import time:       141 |        141 |                 pandas.core.window.doc
import time:       224 |        224 |                 pandas.core.window.numba_
import time:       124 |        124 |                 pandas.core.window.online
import time:       205 |        205 |                   pandas.core._numba
import time:       191 |        191 |                   pandas.core._numba.executor
import time:      1188 |       1583 |                 pandas.core.window.rolling
import time:       536 |       5623 |               pandas.core.window.ewm
import time:       566 |        566 |               pandas.core.window.expanding
import time:       114 |       6302 |             pandas.core.window
import time:      3417 |      42048 |           pandas.core.generic
import time:       213 |        213 |           pandas.core.methods.selectn
import time:       170 |        170 |             pandas.core.reshape.util
import time:       132 |        132 |             pandas.core.tools.numeric
import time:       349 |        650 |           pandas.core.reshape.melt
import time:      1347 |       1347 |             pandas._libs.reshape
import time:       485 |        485 |             pandas.core.indexes.accessors
import time:       201 |        201 |               pandas.arrays
import time:       402 |        603 |             pandas.core.tools.datetimes
import time:       657 |        657 |             pandas.io.formats.info
import time:       950 |        950 |               pandas.plotting._core
import time:       335 |        335 |               pandas.plotting._misc
import time:       310 |       1594 |             pandas.plotting
import time:      2155 |       6839 |           pandas.core.series
import time:      4115 |      53865 |         pandas.core.frame
import time:       662 |        662 |         pandas.core.groupby.base
import time:        67 |         67 |             backports_abc
import time:      2960 |       3026 |           pandas._libs.groupby
import time:       212 |        212 |           pandas.core.groupby.numba_
import time:       176 |        176 |               pandas.core.groupby.categorical
import time:       356 |        532 |             pandas.core.groupby.grouper
import time:       430 |        961 |           pandas.core.groupby.ops
import time:       232 |        232 |           pandas.core.groupby.indexing
import time:      1290 |       5720 |         pandas.core.groupby.groupby
import time:      1194 |      62763 |       pandas.core.groupby.generic
import time:       170 |      62932 |     pandas.core.groupby
import time:       329 |     132598 |   pandas.core.api
import time:       221 |        221 |   pandas.tseries.api
import time:        76 |         76 |           pandas.core.computation.common
import time:       236 |        311 |         pandas.core.computation.align
import time:       308 |        308 |             pprint
import time:       159 |        466 |           pandas.core.computation.scope
import time:       203 |        669 |         pandas.core.computation.ops
import time:       146 |       1125 |       pandas.core.computation.engines
import time:        90 |         90 |         pandas.core.computation.parsing
import time:       575 |        664 |       pandas.core.computation.expr
import time:       219 |       2007 |     pandas.core.computation.eval
import time:       150 |       2157 |   pandas.core.computation.api
import time:       236 |        236 |     pandas.core.reshape.encoding
import time:      1046 |       1046 |         _uuid
import time:       421 |       1466 |       uuid
import time:       602 |       2068 |     pandas.core.reshape.merge
import time:       437 |        437 |     pandas.core.reshape.pivot
import time:       242 |        242 |     pandas.core.reshape.tile
import time:        99 |       3080 |   pandas.core.reshape.api
import time:       299 |        299 |     pandas.api.extensions
import time:       157 |        157 |     pandas.api.indexers
import time:       139 |        139 |         pandas.core.interchange
import time:       689 |        828 |       pandas.core.interchange.dataframe_protocol
import time:       189 |        189 |             pandas.core.dtypes.api
import time:        89 |        277 |           pandas.api.types
import time:       127 |        403 |         pandas.core.interchange.utils
import time:       272 |        675 |       pandas.core.interchange.from_dataframe
import time:       220 |       1722 |     pandas.api.interchange
import time:       219 |       2395 |   pandas.api
import time:        90 |         90 |         pandas._testing._random
import time:       385 |        385 |           tempfile
import time:       108 |        492 |         pandas._testing.contexts
import time:       257 |        839 |       pandas._testing._io
import time:        97 |         97 |       pandas._testing._warnings
import time:      1028 |       1028 |           cmath
import time:       793 |       1820 |         pandas._libs.testing
import time:       167 |       1987 |       pandas._testing.asserters
import time:        78 |         78 |       pandas._testing.compat
import time:       719 |       3718 |     pandas._testing
import time:       183 |       3901 |   pandas.testing
import time:       199 |        199 |   pandas.util._print_versions
import time:       290 |        290 |     pandas.io.clipboards
import time:        42 |         42 |           backports_abc
import time:      1837 |       1879 |         pandas._libs.parsers
import time:       192 |        192 |         pandas.io.excel._util
import time:       331 |        331 |               pandas.io.parsers.base_parser
import time:       227 |        557 |             pandas.io.parsers.arrow_parser_wrapper
import time:       128 |        128 |             pandas.io.parsers.c_parser_wrapper
import time:       192 |        192 |             pandas.io.parsers.python_parser
import time:      1228 |       2104 |           pandas.io.parsers.readers
import time:       267 |       2371 |         pandas.io.parsers
import time:       137 |        137 |         pandas.io.excel._odfreader
import time:       259 |        259 |         pandas.io.excel._openpyxl
import time:       145 |        145 |         pandas.io.excel._pyxlsb
import time:       224 |        224 |         pandas.io.excel._xlrd
import time:      1096 |       6298 |       pandas.io.excel._base
import time:       994 |        994 |         pandas._libs.json
import time:       126 |       1120 |       pandas.io.excel._odswriter
import time:       130 |        130 |       pandas.io.excel._xlsxwriter
import time:       212 |       7760 |     pandas.io.excel
import time:       136 |        136 |     pandas.io.feather_format
import time:       185 |        185 |     pandas.io.gbq
import time:       516 |        516 |     pandas.io.html
import time:       109 |        109 |         pandas.io.json._normalize
import time:       106 |        106 |         pandas.io.json._table_schema
import time:      1077 |       1291 |       pandas.io.json._json
import time:       107 |       1397 |     pandas.io.json
import time:       103 |        103 |     pandas.io.orc
import time:       243 |        243 |     pandas.io.parquet
import time:       242 |        242 |       pandas.compat.pickle_compat
import time:       274 |        515 |     pandas.io.pickle
import time:       290 |        290 |       pandas.core.computation.pytables
import time:       874 |       1164 |     pandas.io.pytables
import time:       319 |        319 |       pandas.io.sas.sasreader
import time:       150 |        468 |     pandas.io.sas
import time:        83 |         83 |     pandas.io.spss
import time:       482 |        482 |     pandas.io.sql
import time:       986 |        986 |     pandas.io.stata
import time:       684 |        684 |     pandas.io.xml
import time:       245 |      15249 |   pandas.io.api
import time:       215 |        215 |   pandas.util._tester
import time:        78 |         78 |   pandas._version
import time:       493 |     464932 | pandas
import time:       296 |        296 |     kedro
import time:       318 |        318 |           cachetools.keys
import time:       596 |        913 |         cachetools
import time:       305 |        305 |         kedro.utils
import time:       486 |       1703 |       kedro.io.core
import time:       207 |        207 |       kedro.io.memory_dataset
import time:       424 |       2334 |     kedro.io.cached_dataset
import time:       414 |        414 |       difflib
import time:       304 |        718 |     kedro.io.data_catalog
import time:       150 |        150 |     kedro.io.lambda_dataset
import time:       159 |        159 |     kedro.io.partitioned_dataset
import time:       462 |       4117 |   kedro.io
import time:         8 |       4124 | kedro.io.core
import time:       170 |        170 | kedro_datasets._io
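
To make the two dumps easier to compare, the output can be parsed and ranked by cumulative time. The snippet below is only a rough sketch and not part of this PR: `parse_importtime` and `slowest` are hypothetical helper names, and the sample string reuses a few lines copied from the dump above. It assumes the standard `-X importtime` line layout, where nesting is shown with two spaces per level.

```python
import re

# Matches lines such as "import time:       493 |     464932 | pandas".
# The header line ("self [us] | cumulative | imported package") is skipped
# because its first column is not an integer.
LINE_RE = re.compile(r"^import time:\s+(\d+) \|\s+(\d+) \| (\s*)(\S+)\s*$")


def parse_importtime(text):
    """Return (self_us, cumulative_us, depth, module) tuples (hypothetical helper)."""
    rows = []
    for line in text.splitlines():
        match = LINE_RE.match(line)
        if match:
            self_us, cumulative_us, indent, module = match.groups()
            # Assumption: two spaces of indentation per nesting level.
            rows.append((int(self_us), int(cumulative_us), len(indent) // 2, module))
    return rows


def slowest(rows, n=5):
    """Return the n entries with the largest cumulative time, largest first."""
    return sorted(rows, key=lambda row: row[1], reverse=True)[:n]


# A few lines copied from the dump above, used here as sample input.
sample = """\
import time: self [us] | cumulative | imported package
import time:     12600 |      36841 |           pyarrow.compute
import time:       273 |      61697 |     pandas.core.arrays
import time:       329 |     132598 |   pandas.core.api
import time:       493 |     464932 | pandas
"""

rows = parse_importtime(sample)
for self_us, cumulative_us, depth, module in slowest(rows, n=3):
    print(f"{cumulative_us:>8} us  {module}")
```

`-X importtime` writes to stderr, so a full dump can be captured with `2> some_file.log` and passed to `parse_importtime` as a string.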

On main branch:

Output of python -X importtime -c'import kedro_datasets.pandas;kedro_datasets.pandas.CSVDataSet':

import time: self [us] | cumulative | imported package
import time:       329 |        329 |   _io
import time:        58 |         58 |   marshal
import time:       342 |        342 |   posix
import time:       890 |       1617 | _frozen_importlib_external
import time:       284 |        284 |   time
import time:       182 |        465 | zipimport
import time:       190 |        190 |     _codecs
import time:       840 |       1029 |   codecs
import time:       727 |        727 |   encodings.aliases
import time:       715 |       2470 | encodings
import time:       267 |        267 | encodings.utf_8
import time:        56 |         56 | _signal
import time:        49 |         49 |     _abc
import time:       522 |        571 |   abc
import time:       468 |       1038 | io
import time:        50 |         50 |       _stat
import time:       369 |        418 |     stat
import time:       894 |        894 |     _collections_abc
import time:       324 |        324 |       genericpath
import time:       319 |        643 |     posixpath
import time:       888 |       2841 |   os
import time:       372 |        372 |   _sitebuiltins
import time:       710 |        710 |   _distutils_hack
import time:       201 |        201 |   sitecustomize
import time:        51 |         51 |   usercustomize
import time:      1922 |       6096 | site
import time:       376 |        376 |   kedro_datasets
import time:       688 |        688 |       itertools
import time:       344 |        344 |       keyword
import time:        59 |         59 |         _operator
import time:       407 |        466 |       operator
import time:       364 |        364 |       reprlib
import time:        57 |         57 |       _collections
import time:       865 |       2781 |     collections
import time:      1616 |       1616 |       types
import time:        51 |         51 |       _functools
import time:      2799 |       4464 |     functools
import time:       638 |       7882 |   contextlib
import time:       770 |        770 |         enum
import time:        53 |         53 |           _sre
import time:       423 |        423 |             sre_constants
import time:       452 |        875 |           sre_parse
import time:       415 |       1341 |         sre_compile
import time:        54 |         54 |         _locale
import time:       301 |        301 |         copyreg
import time:       602 |       3067 |       re
import time:       214 |        214 |             token
import time:       800 |       1014 |           tokenize
import time:       275 |       1288 |         linecache
import time:       400 |       1687 |       traceback
import time:       441 |        441 |       warnings
import time:       309 |        309 |         _weakrefset
import time:       524 |        833 |       weakref
import time:       275 |        275 |       collections.abc
import time:        22 |         22 |         _string
import time:       661 |        683 |       string
import time:       586 |        586 |       threading
import time:        25 |         25 |       atexit
import time:      1601 |       9195 |     logging
import time:        56 |         56 |           org
import time:        11 |         67 |         org.python
import time:        11 |         77 |       org.python.core
import time:       343 |        420 |     copy
import time:       298 |        298 |       fnmatch
import time:        56 |         56 |         _winapi
import time:       123 |        123 |         nt
import time:        47 |         47 |         nt
import time:        44 |         44 |         nt
import time:        43 |         43 |         nt
import time:       330 |        641 |       ntpath
import time:        37 |         37 |       errno
import time:       240 |        240 |         urllib
import time:       806 |       1046 |       urllib.parse
import time:       774 |       2794 |     pathlib
import time:      1546 |       1546 |     typing
import time:       272 |        272 |         importlib
import time:       817 |        817 |           _csv
import time:       417 |       1233 |         csv
import time:       211 |        211 |         email
import time:      1563 |       1563 |           binascii
import time:       195 |        195 |             importlib._abc
import time:       293 |        487 |           importlib.util
import time:       886 |        886 |             zlib
import time:       258 |        258 |               _compression
import time:      1502 |       1502 |               _bz2
import time:       334 |       2094 |             bz2
import time:      1761 |       1761 |               _lzma
import time:       315 |       2076 |             lzma
import time:       498 |       5553 |           shutil
import time:       776 |        776 |             _struct
import time:       234 |       1010 |           struct
import time:      1187 |       9798 |         zipfile
import time:       867 |        867 |         textwrap
import time:       246 |        246 |             uu
import time:       203 |        203 |             quopri
import time:       778 |        778 |                 math
import time:       591 |        591 |                   _bisect
import time:       197 |        788 |                 bisect
import time:       807 |        807 |                 _random
import time:       791 |        791 |                 _sha512
import time:       459 |       3621 |               random
import time:      1201 |       1201 |                 _socket
import time:       652 |        652 |                   select
import time:       480 |       1132 |                 selectors
import time:       843 |        843 |                 array
import time:       822 |       3997 |               socket
import time:       757 |        757 |                 _datetime
import time:       641 |       1398 |               datetime
import time:       617 |        617 |                   locale
import time:       499 |       1115 |                 calendar
import time:       282 |       1396 |               email._parseaddr
import time:       284 |        284 |                   base64
import time:       203 |        486 |                 email.base64mime
import time:       267 |        267 |                 email.quoprimime
import time:       378 |        378 |                 email.errors
import time:       221 |        221 |                 email.encoders
import time:       319 |       1670 |               email.charset
import time:       376 |      12455 |             email.utils
import time:       526 |        526 |               email.header
import time:       289 |        815 |             email._policybase
import time:       332 |        332 |             email._encoded_words
import time:       173 |        173 |             email.iterators
import time:       461 |      14682 |           email.message
import time:       161 |        161 |             importlib.metadata._functools
import time:       227 |        387 |           importlib.metadata._text
import time:       329 |      15397 |         importlib.metadata._adapters
import time:       295 |        295 |         importlib.metadata._meta
import time:       259 |        259 |         importlib.metadata._collections
import time:       160 |        160 |         importlib.metadata._itertools
import time:       200 |        200 |           importlib.machinery
import time:       449 |        648 |         importlib.abc
import time:       940 |      30075 |       importlib.metadata
import time:       430 |        430 |               _json
import time:       309 |        739 |             json.scanner
import time:       399 |       1137 |           json.decoder
import time:       361 |        361 |           json.encoder
import time:       248 |       1745 |         json
import time:       223 |       1968 |       fsspec._version
import time:       205 |        205 |           concurrent
import time:       378 |        378 |           concurrent.futures._base
import time:       231 |        813 |         concurrent.futures
import time:       722 |        722 |               _heapq
import time:       243 |        965 |             heapq
import time:       396 |        396 |             _queue
import time:       308 |       1669 |           queue
import time:       269 |       1938 |         concurrent.futures.thread
import time:       348 |       3098 |       fsspec.caching
import time:       266 |        266 |       fsspec.callbacks
import time:       268 |        268 |           __future__
import time:     13167 |      13167 |             _hashlib
import time:       823 |        823 |             _blake2
import time:       860 |      14848 |           hashlib
import time:       277 |      15392 |         fsspec.utils
import time:       308 |        308 |           glob
import time:      1338 |       1338 |             configparser
import time:       108 |       1446 |           fsspec.config
import time:       108 |        108 |           fsspec.dircache
import time:       198 |        198 |           fsspec.transaction
import time:       441 |       2499 |         fsspec.spec
import time:        38 |         38 |         isal
import time:       264 |        264 |         gzip
import time:        35 |         35 |         lzmaffi
import time:        33 |         33 |         snappy
import time:        32 |         32 |           lz4
import time:         7 |         38 |         lz4.frame
import time:        30 |         30 |         zstandard
import time:       203 |      18529 |       fsspec.compression
import time:       218 |        218 |         fsspec.registry
import time:       160 |        377 |       fsspec.core
import time:       482 |        482 |               signal
import time:      1015 |       1015 |               fcntl
import time:        37 |         37 |               msvcrt
import time:       791 |        791 |               _posixsubprocess
import time:       368 |       2691 |             subprocess
import time:      3130 |       3130 |               _ssl
import time:      1384 |       4514 |             ssl
import time:       292 |        292 |             asyncio.constants
import time:       113 |        113 |                   _ast
import time:       694 |        807 |                 ast
import time:       687 |        687 |                     _opcode
import time:       233 |        919 |                   opcode
import time:       378 |       1297 |                 dis
import time:       880 |       2984 |               inspect
import time:       197 |        197 |                 asyncio.format_helpers
import time:       250 |        446 |               asyncio.base_futures
import time:       154 |        154 |               asyncio.log
import time:       227 |       3809 |             asyncio.coroutines
import time:       645 |        645 |                 _contextvars
import time:       202 |        847 |               contextvars
import time:       226 |        226 |                 asyncio.exceptions
import time:       194 |        194 |                 asyncio.base_tasks
import time:       937 |       1357 |               _asyncio
import time:       350 |       2552 |             asyncio.events
import time:       288 |        288 |             asyncio.futures
import time:       226 |        226 |             asyncio.protocols
import time:      1968 |       1968 |               asyncio.transports
import time:       330 |       2297 |             asyncio.sslproto
import time:       247 |        247 |                 asyncio.mixins
import time:       460 |        460 |                 asyncio.tasks
import time:       600 |       1306 |               asyncio.locks
import time:       852 |       2158 |             asyncio.staggered
import time:       244 |        244 |             asyncio.trsock
import time:       566 |      19632 |           asyncio.base_events
import time:       195 |        195 |           asyncio.runners
import time:      1277 |       1277 |           asyncio.queues
import time:       345 |        345 |           asyncio.streams
import time:       250 |        250 |           asyncio.subprocess
import time:       194 |        194 |           asyncio.threads
import time:       214 |        214 |             asyncio.base_subprocess
import time:       364 |        364 |             asyncio.selector_events
import time:       558 |       1135 |           asyncio.unix_events
import time:       256 |      23280 |         asyncio
import time:        89 |      23368 |       fsspec.exceptions
import time:       278 |        278 |       fsspec.mapping
import time:     12227 |      90183 |     fsspec
import time:       172 |        172 |           numpy._utils
import time:       337 |        509 |         numpy._globals
import time:       243 |        243 |         numpy.exceptions
import time:        73 |         73 |         numpy._distributor_init
import time:        86 |         86 |         numpy.__config__
import time:       136 |        136 |             numpy._version
import time:        14 |         14 |             numpy._version_meson
import time:       159 |        308 |           numpy.version
import time:       186 |        186 |               numpy._utils._inspect
import time:       564 |        564 |                 numpy.core._exceptions
import time:       214 |        214 |                 numpy.dtypes
import time:     20004 |      20780 |               numpy.core._multiarray_umath
import time:       552 |      21517 |             numpy.core.overrides
import time:       622 |      22138 |           numpy.core.multiarray
import time:       342 |        342 |           numpy.core.umath
import time:       399 |        399 |             numbers
import time:       321 |        321 |             numpy.core._string_helpers
import time:        50 |         50 |                   pickle5
import time:       251 |        251 |                     _compat_pickle
import time:      2699 |       2699 |                     _pickle
import time:       266 |        266 |                         org
import time:        27 |        292 |                       org.python
import time:        59 |        351 |                     org.python.core
import time:       858 |       4157 |                   pickle
import time:       218 |       4424 |                 numpy.compat.py3k
import time:       279 |       4702 |               numpy.compat
import time:       352 |        352 |               numpy.core._dtype
import time:       348 |       5401 |             numpy.core._type_aliases
import time:      1834 |       7953 |           numpy.core.numerictypes
import time:       159 |        159 |                   numpy.core._ufunc_config
import time:       155 |        313 |                 numpy.core._methods
import time:       647 |        960 |               numpy.core.fromnumeric
import time:      2714 |       3674 |             numpy.core.shape_base
import time:      5729 |       5729 |             numpy.core.arrayprint
import time:       368 |        368 |             numpy.core._asarray
import time:       921 |      10691 |           numpy.core.numeric
import time:       690 |        690 |           numpy.core.defchararray
import time:       396 |        396 |           numpy.core.records
import time:       237 |        237 |           numpy.core.memmap
import time:       263 |        263 |           numpy.core.function_base
import time:       117 |        117 |           numpy.core._machar
import time:       326 |        326 |           numpy.core.getlimits
import time:       281 |        281 |           numpy.core.einsumfunc
import time:     12287 |      12287 |             numpy.core._multiarray_tests
import time:      1100 |      13387 |           numpy.core._add_newdocs
import time:       658 |        658 |           numpy.core._add_newdocs_scalars
import time:       244 |        244 |           numpy.core._dtype_ctypes
import time:      1975 |       1975 |               _ctypes
import time:       549 |        549 |               ctypes._endian
import time:       971 |       3495 |             ctypes
import time:       480 |       3974 |           numpy.core._internal
import time:       119 |        119 |           numpy._pytesttester
import time:       546 |      62659 |         numpy.core
import time:       256 |        256 |           numpy.lib.mixins
import time:       190 |        190 |               numpy.lib.ufunclike
import time:       283 |        472 |             numpy.lib.type_check
import time:      1068 |       1539 |           numpy.lib.scimath
import time:       256 |        256 |                       numpy.lib.stride_tricks
import time:       209 |        464 |                     numpy.lib.twodim_base
import time:      1846 |       1846 |                     numpy.linalg._umath_linalg
import time:       308 |        308 |                       numpy._typing._nested_sequence
import time:       149 |        149 |                       numpy._typing._nbit
import time:       476 |        476 |                       numpy._typing._char_codes
import time:       200 |        200 |                       numpy._typing._scalars
import time:       107 |        107 |                       numpy._typing._shape
import time:       693 |        693 |                       numpy._typing._dtype_like
import time:       813 |        813 |                       numpy._typing._array_like
import time:      1949 |       4690 |                     numpy._typing
import time:       768 |       7767 |                   numpy.linalg.linalg
import time:       168 |       7934 |                 numpy.linalg
import time:       348 |       8282 |               numpy.matrixlib.defmatrix
import time:       169 |       8450 |             numpy.matrixlib
import time:       314 |        314 |               numpy.lib.histograms
import time:       774 |       1088 |             numpy.lib.function_base
import time:       464 |      10001 |           numpy.lib.index_tricks
import time:       367 |        367 |           numpy.lib.nanfunctions
import time:       729 |        729 |           numpy.lib.shape_base
import time:       457 |        457 |           numpy.lib.polynomial
import time:      1202 |       1202 |             platform
import time:       423 |       1624 |           numpy.lib.utils
import time:       314 |        314 |           numpy.lib.arraysetops
import time:       272 |        272 |             numpy.lib.format
import time:       229 |        229 |             numpy.lib._datasource
import time:       446 |        446 |             numpy.lib._iotools
import time:       452 |       1398 |           numpy.lib.npyio
import time:       166 |        166 |           numpy.lib.arrayterator
import time:       124 |        124 |           numpy.lib.arraypad
import time:        99 |         99 |           numpy.lib._version
import time:       434 |      17501 |         numpy.lib
import time:       907 |        907 |             numpy.fft._pocketfft_internal
import time:       592 |       1499 |           numpy.fft._pocketfft
import time:       203 |        203 |           numpy.fft.helper
import time:       244 |       1945 |         numpy.fft
import time:       295 |        295 |             numpy.polynomial.polyutils
import time:       298 |        298 |             numpy.polynomial._polybase
import time:       550 |       1141 |           numpy.polynomial.polynomial
import time:       343 |        343 |           numpy.polynomial.chebyshev
import time:       279 |        279 |           numpy.polynomial.legendre
import time:       288 |        288 |           numpy.polynomial.hermite
import time:       292 |        292 |           numpy.polynomial.hermite_e
import time:       254 |        254 |           numpy.polynomial.laguerre
import time:       271 |       2864 |         numpy.polynomial
import time:       125 |        125 |                   backports_abc
import time:      1839 |       1964 |                 numpy.random._common
import time:       292 |        292 |                   hmac
import time:       371 |        663 |                 secrets
import time:      1564 |       4189 |               numpy.random.bit_generator
import time:      1421 |       1421 |               numpy.random._bounded_integers
import time:      1126 |       1126 |               numpy.random._mt19937
import time:      3440 |      10176 |             numpy.random.mtrand
import time:       860 |        860 |             numpy.random._philox
import time:       874 |        874 |             numpy.random._pcg64
import time:      1497 |       1497 |             numpy.random._sfc64
import time:      2859 |       2859 |             numpy.random._generator
import time:       311 |      16574 |           numpy.random._pickle
import time:       234 |      16808 |         numpy.random
import time:       404 |        404 |         numpy.ctypeslib
import time:      1538 |       1538 |           numpy.ma.core
import time:       719 |        719 |           numpy.ma.extras
import time:       241 |       2497 |         numpy.ma
import time:      9066 |     114651 |       numpy
import time:       222 |        222 |         pytz.exceptions
import time:       148 |        148 |         pytz.lazy
import time:       289 |        289 |         pytz.tzinfo
import time:       197 |        197 |         pytz.tzfile
import time:       772 |       1627 |       pytz
import time:       239 |        239 |         dateutil._version
import time:       255 |        494 |       dateutil
import time:      1422 |       1422 |         pandas._typing
import time:       246 |        246 |         pandas.compat._constants
import time:       156 |        156 |         pandas.compat.compressors
import time:      1221 |       1221 |                                 pandas._libs.tslibs.np_datetime
import time:      1770 |       2991 |                               pandas._libs.tslibs.dtypes
import time:      1763 |       1763 |                                 pandas._libs.tslibs.base
import time:      1542 |       1542 |                                     pandas._libs.tslibs.nattype
import time:       206 |        206 |                                           pandas.util._exceptions
import time:      1462 |       1462 |                                           pandas.util.version
import time:       237 |       1904 |                                         pandas.compat._optional
import time:       450 |        450 |                                             sysconfig
import time:       400 |        400 |                                             _sysconfigdata__darwin_darwin
import time:       276 |        276 |                                             _osx_support
import time:       574 |       1698 |                                           zoneinfo._tzpath
import time:       262 |        262 |                                           zoneinfo._common
import time:      1245 |       1245 |                                           _zoneinfo
import time:       328 |       3531 |                                         zoneinfo
import time:       615 |        615 |                                             six
import time:        18 |         18 |                                             six.moves
import time:       130 |        130 |                                             dateutil.tz._common
import time:       112 |        112 |                                             dateutil.tz._factories
import time:        11 |         11 |                                               six.moves.winreg
import time:       125 |        135 |                                             dateutil.tz.win
import time:       631 |       1638 |                                           dateutil.tz.tz
import time:       343 |       1981 |                                         dateutil.tz
import time:      1646 |       9060 |                                       pandas._libs.tslibs.timezones
import time:       877 |        877 |                                         pandas._libs.tslibs.ccalendar
import time:       519 |        519 |                                         _strptime
import time:       534 |        534 |                                             pandas._config.config
import time:       134 |        134 |                                             pandas._config.dates
import time:        93 |         93 |                                             pandas._config.display
import time:       211 |        971 |                                           pandas._config
import time:       112 |       1083 |                                         pandas._config.localization
import time:      2094 |       4572 |                                       pandas._libs.tslibs.fields
import time:      3069 |      16699 |                                     pandas._libs.tslibs.timedeltas
import time:      1859 |       1859 |                                     pandas._libs.tslibs.tzconversion
import time:        43 |         43 |                                     backports_abc
import time:      3097 |      23238 |                                   pandas._libs.tslibs.timestamps
import time:        39 |         39 |                                   backports_abc
import time:       200 |        200 |                                   dateutil.easter
import time:       109 |        109 |                                     dateutil._common
import time:       229 |        338 |                                   dateutil.relativedelta
import time:       750 |        750 |                                   pandas._libs.properties
import time:      3753 |      28314 |                                 pandas._libs.tslibs.offsets
import time:        44 |         44 |                                   backports_abc
import time:      2228 |       2228 |                                     _decimal
import time:       228 |       2456 |                                   decimal
import time:       436 |        436 |                                     dateutil.parser._parser
import time:       170 |        170 |                                     dateutil.parser.isoparser
import time:       251 |        856 |                                   dateutil.parser
import time:      2308 |       2308 |                                   pandas._libs.tslibs.strptime
import time:      2621 |       8283 |                                 pandas._libs.tslibs.parsing
import time:      1864 |      40223 |                               pandas._libs.tslibs.conversion
import time:      2435 |       2435 |                               pandas._libs.tslibs.period
import time:      1997 |       1997 |                               pandas._libs.tslibs.vectorized
import time:       265 |      47908 |                             pandas._libs.tslibs
import time:        18 |      47925 |                           pandas._libs.tslibs.nattype
import time:       824 |        824 |                           pandas._libs.ops_dispatch
import time:      1342 |      50090 |                         pandas._libs.missing
import time:      2612 |      52702 |                       pandas._libs.hashtable
import time:      3095 |       3095 |                       pandas._libs.algos
import time:      2530 |      58325 |                     pandas._libs.interval
import time:       199 |      58524 |                   pandas._libs
import time:        18 |      58542 |                 pandas._libs.properties
import time:       703 |      59245 |               pandas.util._decorators
import time:       292 |        292 |                   pandas.core
import time:       213 |        504 |                 pandas.core.util
import time:      7953 |       7953 |                 pandas._libs.lib
import time:      1271 |       1271 |                 pandas._libs.hashing
import time:       305 |        305 |                   pandas.core.dtypes
import time:       446 |        446 |                     pandas.errors
import time:       338 |        338 |                     pandas.core.dtypes.generic
import time:       166 |        949 |                   pandas.core.dtypes.base
import time:        99 |         99 |                     pandas.core.dtypes.inference
import time:       753 |        851 |                   pandas.core.dtypes.dtypes
import time:       488 |       2591 |                 pandas.core.dtypes.common
import time:       278 |      12595 |               pandas.core.util.hashing
import time:       191 |      72030 |             pandas.util
import time:        10 |      72039 |           pandas.util.version
import time:       185 |      72223 |         pandas.compat.numpy
import time:        27 |         27 |             gc
import time:       261 |        261 |             pyarrow._generated_version
import time:        54 |         54 |               backports_abc
import time:      6225 |       6225 |               pickle5
import time:       110 |        110 |               cloudpickle
import time:       372 |        372 |               pyarrow.util
import time:        66 |         66 |               pyarrow.collections
import time:        37 |         37 |               pyarrow.enum
import time:     71226 |      78087 |             pyarrow.lib
import time:      1598 |       1598 |             pyarrow._hdfsio
import time:       338 |        338 |               pyarrow.filesystem
import time:       322 |        659 |             pyarrow.hdfs
import time:       240 |        240 |             pyarrow.ipc
import time:       227 |        227 |             pyarrow.types
import time:       745 |      81840 |           pyarrow
import time:       258 |      82097 |         pandas.compat.pyarrow
import time:       298 |     156440 |       pandas.compat
import time:      1340 |       1340 |       pandas._libs.tslib
import time:       557 |        557 |       pandas.core.config_init
import time:       241 |        241 |         pandas.core.dtypes.missing
import time:       179 |        179 |               pandas.io
import time:       214 |        392 |             pandas.io._util
import time:       429 |        820 |           pandas.core.dtypes.cast
import time:       209 |        209 |             pandas.core.dtypes.astype
import time:       191 |        399 |           pandas.core.dtypes.concat
import time:       141 |        141 |             pandas.core.array_algos
import time:       279 |        279 |               pandas.core.common
import time:       241 |        519 |             pandas.core.construction
import time:       275 |        935 |           pandas.core.array_algos.take
import time:       261 |        261 |             pandas.core.indexers.utils
import time:       188 |        449 |           pandas.core.indexers
import time:       650 |       3250 |         pandas.core.algorithms
import time:      1194 |       1194 |               unicodedata
import time:       232 |        232 |               pandas.util._validators
import time:       151 |        151 |               pandas.core.roperator
import time:      1237 |       1237 |                       pandas._libs.ops
import time:       186 |        186 |                       pandas.core.computation
import time:       245 |        245 |                         pandas.core.computation.check
import time:       280 |        525 |                       pandas.core.computation.expressions
import time:       198 |        198 |                       pandas.core.ops.missing
import time:       172 |        172 |                       pandas.core.ops.dispatch
import time:       199 |        199 |                       pandas.core.ops.invalid
import time:       338 |       2851 |                     pandas.core.ops.array_ops
import time:        80 |         80 |                     pandas.core.ops.common
import time:       102 |        102 |                     pandas.core.ops.docstrings
import time:        77 |         77 |                     pandas.core.ops.mask_ops
import time:        83 |         83 |                     pandas.core.ops.methods
import time:       225 |       3415 |                   pandas.core.ops
import time:         9 |       3424 |                 pandas.core.ops.common
import time:       176 |       3600 |               pandas.core.arraylike
import time:       311 |        311 |                 pandas.compat.numpy.function
import time:       168 |        168 |                 pandas.core.missing
import time:       167 |        167 |                 pandas.core.array_algos.quantile
import time:       261 |        261 |                 pandas.core.sorting
import time:      1124 |       2029 |               pandas.core.arrays.base
import time:       154 |        154 |                 pandas.core.strings
import time:       363 |        516 |               pandas.core.strings.base
import time:       183 |        183 |                 pandas.tseries
import time:       338 |        521 |               pandas.tseries.frequencies
import time:     26405 |      26405 |                 pyarrow._compute
import time:       222 |        222 |                 pyarrow._compute_docstrings
import time:       204 |        204 |                 pyarrow.vendored
import time:       441 |        441 |                     pkgutil
import time:       975 |       1415 |                   pydoc
import time:       904 |       2319 |                 pyarrow.vendored.docscrape
import time:     12419 |      41567 |               pyarrow.compute
import time:       280 |        280 |               pandas.core.arrays.arrow._arrow_utils
import time:       371 |        371 |               pandas.core.arrays.arrow.dtype
import time:       825 |      51281 |             pandas.core.arrays.arrow.array
import time:       245 |      51526 |           pandas.core.arrays.arrow
import time:       183 |        183 |             pandas.core.array_algos.masked_accumulations
import time:       679 |        679 |               pandas.core.nanops
import time:       208 |        208 |               pandas.core.array_algos.masked_reductions
import time:       586 |       1471 |             pandas.core.arrays.masked
import time:       730 |       2384 |           pandas.core.arrays.boolean
import time:      1011 |       1011 |             pandas._libs.arrays
import time:       277 |        277 |             pandas.core.accessor
import time:       181 |        181 |               pandas.core.array_algos.transforms
import time:       430 |        610 |             pandas.core.arrays._mixins
import time:       404 |        404 |             pandas.core.base
import time:       354 |        354 |             pandas.core.strings.object_array
import time:       163 |        163 |             pandas.io.formats
import time:       228 |        228 |             pandas.io.formats.console
import time:       612 |       3657 |           pandas.core.arrays.categorical
import time:       182 |        182 |               pandas.core.array_algos.datetimelike_accumulations
import time:       314 |        314 |                 pandas.core.arrays.numeric
import time:       273 |        586 |               pandas.core.arrays.integer
import time:       779 |       1546 |             pandas.core.arrays.datetimelike
import time:       102 |        102 |             pandas.core.arrays._ranges
import time:       197 |        197 |             pandas.tseries.offsets
import time:       451 |       2296 |           pandas.core.arrays.datetimes
import time:       114 |        114 |           pandas.core.arrays.floating
import time:       329 |        329 |             pandas.core.arrays.timedeltas
import time:       692 |       1021 |           pandas.core.arrays.interval
import time:       293 |        293 |           pandas.core.arrays.numpy_
import time:       409 |        409 |           pandas.core.arrays.period
import time:      1662 |       1662 |                 pandas._libs.sparse
import time:       123 |        123 |                 pandas.core.arrays.sparse.dtype
import time:       322 |        322 |                 pandas.io.formats.printing
import time:       313 |       2419 |               pandas.core.arrays.sparse.array
import time:       169 |       2587 |             pandas.core.arrays.sparse.accessor
import time:        81 |       2668 |           pandas.core.arrays.sparse
import time:       362 |        362 |           pandas.core.arrays.string_
import time:       280 |        280 |           pandas.core.arrays.string_arrow
import time:       251 |      65256 |         pandas.core.arrays
import time:       197 |        197 |         pandas.core.flags
import time:       888 |        888 |             pandas._libs.reduction
import time:       421 |        421 |             pandas.core.apply
import time:       936 |        936 |                   pandas._libs.indexing
import time:       177 |        177 |                     pandas.core.indexes
import time:      1733 |       1733 |                       pandas._libs.index
import time:        47 |         47 |                         backports_abc
import time:      1253 |       1300 |                       pandas._libs.internals
import time:      1681 |       1681 |                       pandas._libs.join
import time:       178 |        178 |                       pandas.core.array_algos.putmask
import time:       210 |        210 |                       pandas.core.indexes.frozen
import time:      1596 |       1596 |                       pandas.core.strings.accessor
import time:      6281 |      12977 |                     pandas.core.indexes.base
import time:       169 |        169 |                       pandas.core.indexes.extension
import time:       350 |        518 |                     pandas.core.indexes.category
import time:       343 |        343 |                         pandas.core.indexes.range
import time:       129 |        129 |                           pandas.core.tools
import time:       231 |        360 |                         pandas.core.tools.timedeltas
import time:       422 |       1124 |                       pandas.core.indexes.datetimelike
import time:        90 |         90 |                       pandas.core.tools.times
import time:       575 |       1788 |                     pandas.core.indexes.datetimes
import time:       840 |        840 |                       pandas.core.indexes.multi
import time:       258 |        258 |                       pandas.core.indexes.timedeltas
import time:       681 |       1777 |                     pandas.core.indexes.interval
import time:       369 |        369 |                     pandas.core.indexes.period
import time:       341 |      17945 |                   pandas.core.indexes.api
import time:      1255 |      20135 |                 pandas.core.indexing
import time:       188 |        188 |                 pandas.core.sample
import time:       154 |        154 |                 pandas.core.array_algos.replace
import time:       977 |        977 |                       pandas._libs.writers
import time:       692 |       1668 |                     pandas.core.internals.blocks
import time:       179 |       1847 |                   pandas.core.internals.api
import time:       141 |        141 |                     pandas.core.internals.base
import time:       337 |        477 |                   pandas.core.internals.array_manager
import time:       288 |        288 |                       pandas.core.internals.ops
import time:       724 |       1012 |                     pandas.core.internals.managers
import time:       309 |       1320 |                   pandas.core.internals.concat
import time:       192 |       3834 |                 pandas.core.internals
import time:       472 |        472 |                 pandas.core.internals.construction
import time:        87 |         87 |                   pandas.core.methods
import time:       180 |        180 |                     pandas.core.reshape
import time:       359 |        538 |                   pandas.core.reshape.concat
import time:       603 |        603 |                       dataclasses
import time:       520 |        520 |                       mmap
import time:        31 |         31 |                         pwd
import time:       419 |        419 |                         grp
import time:       750 |       1199 |                       tarfile
import time:       257 |        257 |                       pandas.core.shared_docs
import time:      1064 |       3640 |                     pandas.io.common
import time:       567 |       4207 |                   pandas.io.formats.format
import time:       190 |       5022 |                 pandas.core.methods.describe
import time:       145 |        145 |                       pandas._libs.window
import time:      1384 |       1529 |                     pandas._libs.window.aggregations
import time:       807 |        807 |                       pandas._libs.window.indexers
import time:       257 |       1064 |                     pandas.core.indexers.objects
import time:        98 |         98 |                     pandas.core.util.numba_
import time:        88 |         88 |                     pandas.core.window.common
import time:       137 |        137 |                     pandas.core.window.doc
import time:       217 |        217 |                     pandas.core.window.numba_
import time:       122 |        122 |                     pandas.core.window.online
import time:       166 |        166 |                       pandas.core._numba
import time:       180 |        180 |                       pandas.core._numba.executor
import time:      1115 |       1460 |                     pandas.core.window.rolling
import time:       382 |       5093 |                   pandas.core.window.ewm
import time:       551 |        551 |                   pandas.core.window.expanding
import time:        95 |       5738 |                 pandas.core.window
import time:      7612 |      43152 |               pandas.core.generic
import time:       130 |        130 |               pandas.core.methods.selectn
import time:       152 |        152 |                 pandas.core.reshape.util
import time:       109 |        109 |                 pandas.core.tools.numeric
import time:       295 |        555 |               pandas.core.reshape.melt
import time:      1249 |       1249 |                 pandas._libs.reshape
import time:       350 |        350 |                 pandas.core.indexes.accessors
import time:       184 |        184 |                   pandas.arrays
import time:       364 |        547 |                 pandas.core.tools.datetimes
import time:       656 |        656 |                 pandas.io.formats.info
import time:       855 |        855 |                   pandas.plotting._core
import time:       273 |        273 |                   pandas.plotting._misc
import time:       196 |       1322 |                 pandas.plotting
import time:      1977 |       6100 |               pandas.core.series
import time:      3937 |      53871 |             pandas.core.frame
import time:       651 |        651 |             pandas.core.groupby.base
import time:        47 |         47 |                 backports_abc
import time:      2257 |       2304 |               pandas._libs.groupby
import time:       197 |        197 |               pandas.core.groupby.numba_
import time:       150 |        150 |                   pandas.core.groupby.categorical
import time:       331 |        480 |                 pandas.core.groupby.grouper
import time:       445 |        924 |               pandas.core.groupby.ops
import time:       221 |        221 |               pandas.core.groupby.indexing
import time:      1279 |       4924 |             pandas.core.groupby.groupby
import time:      1189 |      61941 |           pandas.core.groupby.generic
import time:       158 |      62098 |         pandas.core.groupby
import time:       228 |     131268 |       pandas.core.api
import time:       218 |        218 |       pandas.tseries.api
import time:        75 |         75 |               pandas.core.computation.common
import time:       244 |        318 |             pandas.core.computation.align
import time:       296 |        296 |                 pprint
import time:       146 |        442 |               pandas.core.computation.scope
import time:       201 |        642 |             pandas.core.computation.ops
import time:       149 |       1108 |           pandas.core.computation.engines
import time:        86 |         86 |             pandas.core.computation.parsing
import time:       570 |        656 |           pandas.core.computation.expr
import time:       211 |       1974 |         pandas.core.computation.eval
import time:       142 |       2115 |       pandas.core.computation.api
import time:       204 |        204 |         pandas.core.reshape.encoding
import time:      1045 |       1045 |             _uuid
import time:       419 |       1463 |           uuid
import time:       590 |       2052 |         pandas.core.reshape.merge
import time:       436 |        436 |         pandas.core.reshape.pivot
import time:       239 |        239 |         pandas.core.reshape.tile
import time:        94 |       3023 |       pandas.core.reshape.api
import time:       183 |        183 |         pandas.api.extensions
import time:       222 |        222 |         pandas.api.indexers
import time:       146 |        146 |             pandas.core.interchange
import time:       557 |        702 |           pandas.core.interchange.dataframe_protocol
import time:       186 |        186 |                 pandas.core.dtypes.api
import time:        83 |        268 |               pandas.api.types
import time:       117 |        384 |             pandas.core.interchange.utils
import time:       283 |        667 |           pandas.core.interchange.from_dataframe
import time:        77 |       1445 |         pandas.api.interchange
import time:       203 |       2052 |       pandas.api
import time:        84 |         84 |             pandas._testing._random
import time:       373 |        373 |               tempfile
import time:        97 |        469 |             pandas._testing.contexts
import time:       137 |        689 |           pandas._testing._io
import time:        96 |         96 |           pandas._testing._warnings
import time:       426 |        426 |               cmath
import time:       795 |       1221 |             pandas._libs.testing
import time:       175 |       1395 |           pandas._testing.asserters
import time:        76 |         76 |           pandas._testing.compat
import time:       795 |       3050 |         pandas._testing
import time:       170 |       3219 |       pandas.testing
import time:       201 |        201 |       pandas.util._print_versions
import time:       185 |        185 |         pandas.io.clipboards
import time:        41 |         41 |               backports_abc
import time:      1729 |       1770 |             pandas._libs.parsers
import time:       230 |        230 |             pandas.io.excel._util
import time:       295 |        295 |                   pandas.io.parsers.base_parser
import time:       147 |        441 |                 pandas.io.parsers.arrow_parser_wrapper
import time:       125 |        125 |                 pandas.io.parsers.c_parser_wrapper
import time:       190 |        190 |                 pandas.io.parsers.python_parser
import time:      1070 |       1825 |               pandas.io.parsers.readers
import time:       186 |       2010 |             pandas.io.parsers
import time:       124 |        124 |             pandas.io.excel._odfreader
import time:       276 |        276 |             pandas.io.excel._openpyxl
import time:       155 |        155 |             pandas.io.excel._pyxlsb
import time:       221 |        221 |             pandas.io.excel._xlrd
import time:       858 |       5641 |           pandas.io.excel._base
import time:       442 |        442 |             pandas._libs.json
import time:       121 |        562 |           pandas.io.excel._odswriter
import time:       115 |        115 |           pandas.io.excel._xlsxwriter
import time:       101 |       6418 |         pandas.io.excel
import time:       127 |        127 |         pandas.io.feather_format
import time:       181 |        181 |         pandas.io.gbq
import time:       439 |        439 |         pandas.io.html
import time:       101 |        101 |             pandas.io.json._normalize
import time:        93 |         93 |             pandas.io.json._table_schema
import time:       865 |       1059 |           pandas.io.json._json
import time:        92 |       1150 |         pandas.io.json
import time:       110 |        110 |         pandas.io.orc
import time:       238 |        238 |         pandas.io.parquet
import time:       227 |        227 |           pandas.compat.pickle_compat
import time:       266 |        493 |         pandas.io.pickle
import time:       283 |        283 |           pandas.core.computation.pytables
import time:       896 |       1179 |         pandas.io.pytables
import time:       332 |        332 |           pandas.io.sas.sasreader
import time:       161 |        492 |         pandas.io.sas
import time:        81 |         81 |         pandas.io.spss
import time:       485 |        485 |         pandas.io.sql
import time:       883 |        883 |         pandas.io.stata
import time:       547 |        547 |         pandas.io.xml
import time:       233 |      13232 |       pandas.io.api
import time:       213 |        213 |       pandas.util._tester
import time:        75 |         75 |       pandas._version
import time:       890 |     431606 |     pandas
import time:       213 |        213 |         kedro
import time:       179 |        179 |               cachetools.keys
import time:       497 |        675 |             cachetools
import time:       166 |        166 |             kedro.utils
import time:       466 |       1307 |           kedro.io.core
import time:       205 |        205 |           kedro.io.memory_dataset
import time:       264 |       1775 |         kedro.io.cached_dataset
import time:       407 |        407 |           difflib
import time:       288 |        694 |         kedro.io.data_catalog
import time:       147 |        147 |         kedro.io.lambda_dataset
import time:       153 |        153 |         kedro.io.partitioned_dataset
import time:       245 |       3225 |       kedro.io
import time:         8 |       3232 |     kedro.io.core
import time:       171 |        171 |     kedro_datasets._io
import time:       421 |     539564 |   kedro_datasets.pandas.csv_dataset
import time:        40 |         40 |     deltalake
import time:       199 |        238 |   kedro_datasets.pandas.deltatable_dataset
import time:       151 |        151 |   kedro_datasets.pandas.excel_dataset
import time:       111 |        111 |   kedro_datasets.pandas.feather_dataset
import time:        37 |         37 |       google
import time:        10 |         47 |     google.cloud
import time:       237 |        283 |   kedro_datasets.pandas.gbq_dataset
import time:       203 |        203 |   kedro_datasets.pandas.hdf_dataset
import time:       242 |        242 |   kedro_datasets.pandas.json_dataset
import time:       134 |        134 |   kedro_datasets.pandas.parquet_dataset
import time:       402 |        402 |           sqlalchemy.util.compat
import time:       559 |        559 |           sqlalchemy.cimmutabledict
import time:       687 |       1647 |         sqlalchemy.util._collections
import time:       195 |        195 |         sqlalchemy.util._preloaded
import time:        36 |         36 |           greenlet
import time:        80 |         80 |           sqlalchemy.util._compat_py3k
import time:       199 |        315 |         sqlalchemy.util.concurrency
import time:       712 |        712 |             sqlalchemy.exc
import time:       493 |       1205 |           sqlalchemy.util.langhelpers
import time:       122 |       1327 |         sqlalchemy.util.deprecations
import time:      1024 |       4505 |       sqlalchemy.util
import time:       552 |        552 |                     sqlalchemy.sql.roles
import time:       409 |        409 |                     sqlalchemy.sql.visitors
import time:       374 |        374 |                       sqlalchemy.sql.operators
import time:       287 |        287 |                       sqlalchemy.inspection
import time:       508 |       1168 |                     sqlalchemy.sql.traversals
import time:      1409 |       3536 |                   sqlalchemy.sql.base
import time:       773 |        773 |                     sqlalchemy.sql.coercions
import time:       548 |        548 |                               sqlalchemy.sql.type_api
import time:       249 |        249 |                               sqlalchemy.sql.annotation
import time:      1812 |       2608 |                             sqlalchemy.sql.elements
import time:        95 |         95 |                                     sqlalchemy.event.legacy
import time:       104 |        104 |                                     sqlalchemy.event.registry
import time:       324 |        521 |                                   sqlalchemy.event.attr
import time:       335 |        856 |                                 sqlalchemy.event.base
import time:       374 |       1229 |                               sqlalchemy.event.api
import time:       202 |       1430 |                             sqlalchemy.event
import time:       560 |        560 |                               sqlalchemy.cprocessors
import time:       327 |        886 |                             sqlalchemy.processors
import time:      1479 |       6402 |                           sqlalchemy.sql.sqltypes
import time:       252 |       6653 |                         sqlalchemy.types
import time:        88 |         88 |                             sqlalchemy.util.topological
import time:      1113 |       1200 |                           sqlalchemy.sql.ddl
import time:      3683 |       3683 |                             sqlalchemy.sql.selectable
import time:      1553 |       5235 |                           sqlalchemy.sql.schema
import time:       443 |       6878 |                         sqlalchemy.sql.util
import time:      1107 |      14637 |                       sqlalchemy.sql.dml
import time:       334 |      14971 |                     sqlalchemy.sql.crud
import time:      1652 |       1652 |                     sqlalchemy.sql.functions
import time:      1631 |      19026 |                   sqlalchemy.sql.compiler
import time:       377 |        377 |                     sqlalchemy.sql.lambdas
import time:      1330 |       1706 |                   sqlalchemy.sql.expression
import time:       616 |        616 |                   sqlalchemy.sql.events
import time:       198 |        198 |                   sqlalchemy.sql.naming
import time:       253 |        253 |                   sqlalchemy.sql.default_comparator
import time:      3060 |      28393 |                 sqlalchemy.sql
import time:         8 |      28401 |               sqlalchemy.sql.compiler
import time:       393 |      28794 |             sqlalchemy.engine.interfaces
import time:       103 |        103 |             sqlalchemy.engine.util
import time:       231 |        231 |             sqlalchemy.log
import time:      1402 |      30528 |           sqlalchemy.engine.base
import time:      1426 |      31954 |         sqlalchemy.engine.events
import time:       192 |        192 |             sqlalchemy.dialects
import time:       266 |        458 |           sqlalchemy.engine.url
import time:       208 |        208 |           sqlalchemy.engine.mock
import time:       371 |        371 |               sqlalchemy.pool.base
import time:       832 |       1202 |             sqlalchemy.pool.events
import time:       142 |        142 |                 sqlalchemy.util.queue
import time:       185 |        327 |               sqlalchemy.pool.impl
import time:       155 |        481 |             sqlalchemy.pool.dbapi_proxy
import time:       198 |       1880 |           sqlalchemy.pool
import time:       492 |       3036 |         sqlalchemy.engine.create
import time:       311 |        311 |               sqlalchemy.cresultproxy
import time:       477 |        788 |             sqlalchemy.engine.row
import time:       766 |       1554 |           sqlalchemy.engine.result
import time:       539 |       2093 |         sqlalchemy.engine.cursor
import time:       549 |        549 |         sqlalchemy.engine.reflection
import time:       379 |      38009 |       sqlalchemy.engine
import time:       117 |        117 |       sqlalchemy.schema
import time:        78 |         78 |       sqlalchemy.events
import time:       202 |        202 |         sqlalchemy.engine.characteristics
import time:       601 |        803 |       sqlalchemy.engine.default
import time:       452 |      43961 |     sqlalchemy
import time:       354 |      44315 |   kedro_datasets.pandas.sql_dataset
import time:       137 |        137 |   kedro_datasets.pandas.xml_dataset
import time:       121 |        121 |   kedro_datasets.pandas.generic_dataset
import time:      1944 |     595694 | kedro_datasets.pandas

Note that every dataset module in kedro_datasets.pandas gets imported, even though only CSVDataSet is used. I happen to have SQLAlchemy installed in this test env (for testing #281), so kedro_datasets.pandas.sql_dataset pulls in all of sqlalchemy as well (about 44 ms cumulative). The import gets slower still if your env contains more optional dependencies, because each one lets another unneeded dataset module import successfully.
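For illustration, here is a minimal sketch of the mechanism that avoids this: a module-level `__getattr__` (PEP 562) that imports a submodule only when one of its names is first accessed. It is not the code in this PR (which relies on the lazy_loader package); the package name `demo_lazy_pkg`, the classes `CSVThing`/`SQLThing`, and the helper functions are all hypothetical.

```python
import importlib
import sys
import tempfile
from pathlib import Path

# Hypothetical package __init__ used only for this demonstration.
INIT = '''
import importlib

# Public name -> submodule that defines it.
_LAZY = {"CSVThing": "csv_thing", "SQLThing": "sql_thing"}

def __getattr__(name):
    # PEP 562: called only when normal attribute lookup fails.
    if name in _LAZY:
        module = importlib.import_module(f"{__name__}.{_LAZY[name]}")
        return getattr(module, name)
    raise AttributeError(f"module {__name__!r} has no attribute {name!r}")
'''


def build_demo_package(root: Path) -> None:
    """Write a throwaway package with one cheap and one unimportable submodule."""
    pkg = root / "demo_lazy_pkg"
    pkg.mkdir()
    (pkg / "__init__.py").write_text(INIT)
    (pkg / "csv_thing.py").write_text("class CSVThing: pass\n")
    # This submodule depends on something that is not installed.
    (pkg / "sql_thing.py").write_text(
        "import not_installed_dependency_xyz\nclass SQLThing: pass\n"
    )


def run_demo():
    """Import the package, touch CSVThing, and report what was loaded when."""
    with tempfile.TemporaryDirectory() as tmp:
        build_demo_package(Path(tmp))
        sys.path.insert(0, tmp)
        importlib.invalidate_caches()
        try:
            pkg = importlib.import_module("demo_lazy_pkg")
            loaded_before = "demo_lazy_pkg.csv_thing" in sys.modules
            cls = pkg.CSVThing  # triggers the import of csv_thing only
            loaded_after = "demo_lazy_pkg.csv_thing" in sys.modules
            sql_loaded = "demo_lazy_pkg.sql_thing" in sys.modules
            return loaded_before, loaded_after, sql_loaded, cls.__name__
        finally:
            # Clean up so the demo leaves no trace in the interpreter.
            sys.path.remove(tmp)
            for name in [n for n in sys.modules if n.startswith("demo_lazy_pkg")]:
                del sys.modules[name]


if __name__ == "__main__":
    print(run_demo())
```

Importing the package alone loads neither submodule; accessing `CSVThing` loads `csv_thing` and leaves `sql_thing` (and its dependency) untouched.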

Invalid import behavior

On perf/datasets/lazy-loader branch:

Output of python -X importtime -c'import kedro_datasets.spark;kedro_datasets.spark.SparkDataSet':

import time: self [us] | cumulative | imported package
import time:       203 |        203 |   _io
import time:        30 |         30 |   marshal
import time:       289 |        289 |   posix
import time:       716 |       1236 | _frozen_importlib_external
import time:       621 |        621 |   time
import time:       175 |        795 | zipimport
import time:       150 |        150 |     _codecs
import time:       972 |       1121 |   codecs
import time:      1198 |       1198 |   encodings.aliases
import time:      2652 |       4970 | encodings
import time:       768 |        768 | encodings.utf_8
import time:        76 |         76 | _signal
import time:        52 |         52 |     _abc
import time:       627 |        678 |   abc
import time:       690 |       1368 | io
import time:        36 |         36 |       _stat
import time:       393 |        428 |     stat
import time:      1164 |       1164 |     _collections_abc
import time:       395 |        395 |       genericpath
import time:       708 |       1103 |     posixpath
import time:       814 |       3508 |   os
import time:       700 |        700 |   _sitebuiltins
import time:      1434 |       1434 |   _distutils_hack
import time:       141 |        141 |   sitecustomize
import time:        34 |         34 |   usercustomize
import time:      4645 |      10459 | site
import time:       485 |        485 |   kedro_datasets
import time:        62 |         62 |       itertools
import time:       432 |        432 |       keyword
import time:        95 |         95 |         _operator
import time:       462 |        556 |       operator
import time:       407 |        407 |       reprlib
import time:        34 |         34 |       _collections
import time:      1504 |       2994 |     collections
import time:       501 |        501 |     collections.abc
import time:       636 |        636 |         types
import time:        32 |         32 |         _functools
import time:       690 |       1357 |       functools
import time:       592 |       1949 |     contextlib
import time:       982 |        982 |       enum
import time:        35 |         35 |         _sre
import time:       497 |        497 |           sre_constants
import time:       503 |       1000 |         sre_parse
import time:       495 |       1529 |       sre_compile
import time:        39 |         39 |       _locale
import time:       383 |        383 |       copyreg
import time:       665 |       3597 |     re
import time:      1554 |      10592 |   typing
import time:        39 |         39 |       _ast
import time:       915 |        953 |     ast
import time:       708 |        708 |       warnings
import time:       520 |       1227 |     importlib
import time:       327 |        327 |       importlib._abc
import time:       445 |        772 |     importlib.util
import time:       750 |        750 |           _opcode
import time:       436 |       1186 |         opcode
import time:       554 |       1739 |       dis
import time:       389 |        389 |       importlib.machinery
import time:       516 |        516 |           token
import time:       668 |       1183 |         tokenize
import time:       469 |       1652 |       linecache
import time:       997 |       4776 |     inspect
import time:       472 |       8198 |   lazy_loader
import time:      2236 |      21510 | kedro_datasets.spark
import time:       637 |        637 |       _json
import time:       453 |       1089 |     json.scanner
import time:       379 |       1468 |   json.decoder
import time:       537 |        537 |   json.encoder
import time:       397 |       2402 | json
import time:       323 |        323 |   traceback
import time:       369 |        369 |     _weakrefset
import time:       424 |        793 |   weakref
import time:        17 |         17 |     _string
import time:       573 |        590 |   string
import time:       977 |        977 |   threading
import time:        19 |         19 |   atexit
import time:      1471 |       4170 | logging
import time:        39 |         39 |       org
import time:         9 |         48 |     org.python
import time:        16 |         64 |   org.python.core
import time:       414 |        477 | copy
import time:       242 |        242 | fnmatch
import time:        39 |         39 |     _winapi
import time:        36 |         36 |     nt
import time:        35 |         35 |     nt
import time:        33 |         33 |     nt
import time:        32 |         32 |     nt
import time:       432 |        605 |   ntpath
import time:        27 |         27 |   errno
import time:       488 |        488 |     urllib
import time:       675 |       1163 |   urllib.parse
import time:      1084 |       2877 | pathlib
import time:       709 |        709 |       _csv
import time:       800 |       1509 |     csv
import time:       468 |        468 |     email
import time:      1959 |       1959 |       binascii
import time:       752 |        752 |         zlib
import time:       352 |        352 |           _compression
import time:      1753 |       1753 |           _bz2
import time:       417 |       2520 |         bz2
import time:      2105 |       2105 |           _lzma
import time:       272 |       2377 |         lzma
import time:       552 |       6199 |       shutil
import time:       691 |        691 |         _struct
import time:       296 |        987 |       struct
import time:       694 |       9838 |     zipfile
import time:       792 |        792 |     textwrap
import time:       265 |        265 |         uu
import time:       192 |        192 |         quopri
import time:       800 |        800 |             math
import time:       560 |        560 |               _bisect
import time:       457 |       1016 |             bisect
import time:       827 |        827 |             _random
import time:       773 |        773 |             _sha512
import time:       449 |       3863 |           random
import time:      1218 |       1218 |             _socket
import time:       711 |        711 |               select
import time:       539 |       1250 |             selectors
import time:      1034 |       1034 |             array
import time:       735 |       4236 |           socket
import time:       782 |        782 |             _datetime
import time:       814 |       1596 |           datetime
import time:       586 |        586 |               locale
import time:       594 |       1179 |             calendar
import time:       374 |       1553 |           email._parseaddr
import time:       306 |        306 |               base64
import time:       284 |        589 |             email.base64mime
import time:       304 |        304 |             email.quoprimime
import time:       412 |        412 |             email.errors
import time:       303 |        303 |             email.encoders
import time:       393 |       1999 |           email.charset
import time:       342 |      13585 |         email.utils
import time:       741 |        741 |           email.header
import time:      1393 |       2134 |         email._policybase
import time:       382 |        382 |         email._encoded_words
import time:       201 |        201 |         email.iterators
import time:       406 |      17161 |       email.message
import time:       261 |        261 |         importlib.metadata._functools
import time:       297 |        557 |       importlib.metadata._text
import time:       440 |      18157 |     importlib.metadata._adapters
import time:       360 |        360 |     importlib.metadata._meta
import time:       374 |        374 |     importlib.metadata._collections
import time:       158 |        158 |     importlib.metadata._itertools
import time:       780 |        780 |     importlib.abc
import time:       858 |      33288 |   importlib.metadata
import time:       728 |        728 |   fsspec._version
import time:       295 |        295 |       concurrent
import time:       364 |        364 |       concurrent.futures._base
import time:       317 |        975 |     concurrent.futures
import time:       754 |        754 |           _heapq
import time:       405 |       1158 |         heapq
import time:       405 |        405 |         _queue
import time:       297 |       1859 |       queue
import time:       398 |       2257 |     concurrent.futures.thread
import time:       356 |       3587 |   fsspec.caching
import time:       276 |        276 |   fsspec.callbacks
import time:       328 |        328 |       __future__
import time:      9614 |       9614 |         _hashlib
import time:       380 |        380 |         _blake2
import time:       453 |      10446 |       hashlib
import time:       359 |      11131 |     fsspec.utils
import time:       504 |        504 |       glob
import time:      1002 |       1002 |         configparser
import time:       113 |       1114 |       fsspec.config
import time:       106 |        106 |       fsspec.dircache
import time:       346 |        346 |       fsspec.transaction
import time:       546 |       2614 |     fsspec.spec
import time:        50 |         50 |     isal
import time:       500 |        500 |     gzip
import time:        43 |         43 |     lzmaffi
import time:        36 |         36 |     snappy
import time:        32 |         32 |       lz4
import time:         7 |         38 |     lz4.frame
import time:        30 |         30 |     zstandard
import time:       216 |      14655 |   fsspec.compression
import time:       257 |        257 |     fsspec.registry
import time:      3177 |       3433 |   fsspec.core
import time:      1033 |       1033 |           signal
import time:      1208 |       1208 |           fcntl
import time:        42 |         42 |           msvcrt
import time:       637 |        637 |           _posixsubprocess
import time:       473 |       3390 |         subprocess
import time:      2770 |       2770 |           _ssl
import time:      1453 |       4222 |         ssl
import time:       414 |        414 |         asyncio.constants
import time:       195 |        195 |             asyncio.format_helpers
import time:       208 |        403 |           asyncio.base_futures
import time:       152 |        152 |           asyncio.log
import time:       324 |        879 |         asyncio.coroutines
import time:       804 |        804 |             _contextvars
import time:       382 |       1186 |           contextvars
import time:       348 |        348 |             asyncio.exceptions
import time:       258 |        258 |             asyncio.base_tasks
import time:       491 |       1096 |           _asyncio
import time:       368 |       2648 |         asyncio.events
import time:       232 |        232 |         asyncio.futures
import time:       329 |        329 |         asyncio.protocols
import time:       402 |        402 |           asyncio.transports
import time:       319 |        721 |         asyncio.sslproto
import time:       329 |        329 |             asyncio.mixins
import time:       397 |        397 |             asyncio.tasks
import time:       599 |       1324 |           asyncio.locks
import time:       362 |       1685 |         asyncio.staggered
import time:       351 |        351 |         asyncio.trsock
import time:       807 |      15675 |       asyncio.base_events
import time:       226 |        226 |       asyncio.runners
import time:       321 |        321 |       asyncio.queues
import time:       299 |        299 |       asyncio.streams
import time:       241 |        241 |       asyncio.subprocess
import time:       177 |        177 |       asyncio.threads
import time:       226 |        226 |         asyncio.base_subprocess
import time:       394 |        394 |         asyncio.selector_events
import time:       477 |       1096 |       asyncio.unix_events
import time:      1806 |      19838 |     asyncio
import time:       100 |      19937 |   fsspec.exceptions
import time:       250 |        250 |   fsspec.mapping
import time:     18385 |      94535 | fsspec
import time:        47 |         47 | hdfs
Traceback (most recent call last):
  File "<string>", line 1, in <module>
  File "/opt/miniconda3/envs/kedro-test/lib/python3.10/site-packages/lazy_loader/__init__.py", line 77, in __getattr__
    submod = importlib.import_module(submod_path)
  File "/opt/miniconda3/envs/kedro-test/lib/python3.10/importlib/__init__.py", line 126, in import_module
    return _bootstrap._gcd_import(name[level:], package, level)
  File "<frozen importlib._bootstrap>", line 1050, in _gcd_import
  File "<frozen importlib._bootstrap>", line 1027, in _find_and_load
  File "<frozen importlib._bootstrap>", line 1006, in _find_and_load_unlocked
  File "<frozen importlib._bootstrap>", line 688, in _load_unlocked
  File "<frozen importlib._bootstrap_external>", line 883, in exec_module
  File "<frozen importlib._bootstrap>", line 241, in _call_with_frames_removed
  File "/Users/deepyaman/github/kedro-org/kedro-plugins/kedro-datasets/kedro_datasets/spark/spark_dataset.py", line 15, in <module>
    from hdfs import HdfsError, InsecureClient
ModuleNotFoundError: No module named 'hdfs'
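
The difference in error reporting can be reproduced with a small sketch. It assumes that the eager `__init__` wraps each dataset import in `contextlib.suppress(ImportError)` (an assumption made for illustration), and compares it with a lazy `__getattr__`. The package names, `Thing`, and `missing_dep_for_demo` are hypothetical.

```python
import importlib
import sys
import tempfile
from pathlib import Path

# Assumed eager style: a failed import is silently swallowed.
EAGER_INIT = '''
from contextlib import suppress

with suppress(ImportError):
    from .thing import Thing
'''

# Lazy style: the submodule is imported on first attribute access.
LAZY_INIT = '''
import importlib

def __getattr__(name):
    if name == "Thing":
        return getattr(importlib.import_module(f"{__name__}.thing"), name)
    raise AttributeError(f"module {__name__!r} has no attribute {name!r}")
'''

# The submodule needs a dependency that is not installed.
THING = "import missing_dep_for_demo\nclass Thing: pass\n"


def access_thing(pkg_name: str, init_source: str):
    """Build a throwaway package, access pkg.Thing, return the exception name."""
    with tempfile.TemporaryDirectory() as tmp:
        pkg = Path(tmp) / pkg_name
        pkg.mkdir()
        (pkg / "__init__.py").write_text(init_source)
        (pkg / "thing.py").write_text(THING)
        sys.path.insert(0, tmp)
        importlib.invalidate_caches()
        try:
            module = importlib.import_module(pkg_name)
            module.Thing
        except Exception as exc:
            return type(exc).__name__
        finally:
            sys.path.remove(tmp)
            for name in [n for n in sys.modules if n.split(".")[0] == pkg_name]:
                del sys.modules[name]
    return None


if __name__ == "__main__":
    print("eager:", access_thing("demo_eager_pkg", EAGER_INIT))
    print("lazy: ", access_thing("demo_lazy_pkg2", LAZY_INIT))
```

Under these assumptions the eager variant reports an `AttributeError` that hides the real cause, while the lazy variant raises a `ModuleNotFoundError` naming the missing module, as in the traceback above.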

On main branch:

Output of python -X importtime -c'import kedro_datasets.spark;kedro_datasets.spark.SparkDataSet':

import time: self [us] | cumulative | imported package
import time:       383 |        383 |   _io
import time:        32 |         32 |   marshal
import time:       576 |        576 |   posix
import time:      1034 |       2024 | _frozen_importlib_external
import time:       524 |        524 |   time
import time:       140 |        664 | zipimport
import time:       252 |        252 |     _codecs
import time:       863 |       1115 |   codecs
import time:       992 |        992 |   encodings.aliases
import time:      3075 |       5181 | encodings
import time:       727 |        727 | encodings.utf_8
import time:       213 |        213 | _signal
import time:       166 |        166 |     _abc
import time:       570 |        736 |   abc
import time:       835 |       1570 | io
import time:        30 |         30 |       _stat
import time:       391 |        421 |     stat
import time:      1302 |       1302 |     _collections_abc
import time:       385 |        385 |       genericpath
import time:       662 |       1047 |     posixpath
import time:       774 |       3541 |   os
import time:       778 |        778 |   _sitebuiltins
import time:      1376 |       1376 |   _distutils_hack
import time:       121 |        121 |   sitecustomize
import time:        31 |         31 |   usercustomize
import time:      5750 |      11594 | site
import time:       406 |        406 |   kedro_datasets
import time:       476 |        476 |       itertools
import time:       428 |        428 |       keyword
import time:        62 |         62 |         _operator
import time:       438 |        499 |       operator
import time:       401 |        401 |       reprlib
import time:        37 |         37 |       _collections
import time:      1041 |       2879 |     collections
import time:       622 |        622 |       types
import time:        33 |         33 |       _functools
import time:       628 |       1282 |     functools
import time:       619 |       4779 |   contextlib
import time:       880 |        880 |           enum
import time:        87 |         87 |             _sre
import time:       436 |        436 |               sre_constants
import time:       472 |        908 |             sre_parse
import time:       490 |       1484 |           sre_compile
import time:        39 |         39 |           _locale
import time:       358 |        358 |           copyreg
import time:       747 |       3507 |         re
import time:      1108 |       1108 |           _json
import time:       518 |       1625 |         json.scanner
import time:      2321 |       7452 |       json.decoder
import time:       496 |        496 |       json.encoder
import time:      4259 |      12205 |     json
import time:       507 |        507 |             token
import time:       677 |       1184 |           tokenize
import time:       520 |       1703 |         linecache
import time:       428 |       2130 |       traceback
import time:       783 |        783 |       warnings
import time:       393 |        393 |         _weakrefset
import time:       379 |        772 |       weakref
import time:       439 |        439 |       collections.abc
import time:        19 |         19 |         _string
import time:       582 |        600 |       string
import time:       669 |        669 |       threading
import time:        20 |         20 |       atexit
import time:      1596 |       7005 |     logging
import time:        41 |         41 |           org
import time:        10 |         51 |         org.python
import time:         9 |         60 |       org.python.core
import time:       424 |        483 |     copy
import time:       254 |        254 |     fnmatch
import time:        42 |         42 |         _winapi
import time:        37 |         37 |         nt
import time:        34 |         34 |         nt
import time:        34 |         34 |         nt
import time:        33 |         33 |         nt
import time:       513 |        692 |       ntpath
import time:        30 |         30 |       errno
import time:       513 |        513 |         urllib
import time:       677 |       1189 |       urllib.parse
import time:      1120 |       3029 |     pathlib
import time:      1384 |       1384 |     typing
import time:       333 |        333 |         importlib
import time:      1061 |       1061 |           _csv
import time:       757 |       1818 |         csv
import time:       499 |        499 |         email
import time:      1912 |       1912 |           binascii
import time:       313 |        313 |             importlib._abc
import time:       392 |        705 |           importlib.util
import time:       465 |        465 |             zlib
import time:       329 |        329 |               _compression
import time:      1539 |       1539 |               _bz2
import time:       445 |       2311 |             bz2
import time:      1512 |       1512 |               _lzma
import time:       296 |       1807 |             lzma
import time:       620 |       5202 |           shutil
import time:       434 |        434 |             _struct
import time:       313 |        747 |           struct
import time:       646 |       9209 |         zipfile
import time:       786 |        786 |         textwrap
import time:       226 |        226 |             uu
import time:      1282 |       1282 |             quopri
import time:       616 |        616 |                 math
import time:       413 |        413 |                   _bisect
import time:       489 |        902 |                 bisect
import time:       820 |        820 |                 _random
import time:       442 |        442 |                 _sha512
import time:       458 |       3235 |               random
import time:       937 |        937 |                 _socket
import time:       718 |        718 |                   select
import time:       519 |       1236 |                 selectors
import time:       569 |        569 |                 array
import time:       761 |       3502 |               socket
import time:       622 |        622 |                 _datetime
import time:       796 |       1417 |               datetime
import time:       586 |        586 |                   locale
import time:       723 |       1308 |                 calendar
import time:       427 |       1735 |               email._parseaddr
import time:       267 |        267 |                   base64
import time:       293 |        559 |                 email.base64mime
import time:       373 |        373 |                 email.quoprimime
import time:       523 |        523 |                 email.errors
import time:       381 |        381 |                 email.encoders
import time:       418 |       2251 |               email.charset
import time:       360 |      12497 |             email.utils
import time:       631 |        631 |               email.header
import time:       428 |       1058 |             email._policybase
import time:       324 |        324 |             email._encoded_words
import time:       177 |        177 |             email.iterators
import time:       454 |      16016 |           email.message
import time:       290 |        290 |             importlib.metadata._functools
import time:       280 |        570 |           importlib.metadata._text
import time:       409 |      16994 |         importlib.metadata._adapters
import time:       392 |        392 |         importlib.metadata._meta
import time:       381 |        381 |         importlib.metadata._collections
import time:       159 |        159 |         importlib.metadata._itertools
import time:       305 |        305 |           importlib.machinery
import time:       440 |        745 |         importlib.abc
import time:       883 |      32194 |       importlib.metadata
import time:       734 |        734 |       fsspec._version
import time:       335 |        335 |           concurrent
import time:       423 |        423 |           concurrent.futures._base
import time:       330 |       1087 |         concurrent.futures
import time:       704 |        704 |               _heapq
import time:       234 |        938 |             heapq
import time:       680 |        680 |             _queue
import time:       252 |       1869 |           queue
import time:       355 |       2223 |         concurrent.futures.thread
import time:       369 |       3678 |       fsspec.caching
import time:       262 |        262 |       fsspec.callbacks
import time:       282 |        282 |           __future__
import time:     25141 |      25141 |             _hashlib
import time:       402 |        402 |             _blake2
import time:       879 |      26422 |           hashlib
import time:       394 |      27097 |         fsspec.utils
import time:       482 |        482 |           glob
import time:       956 |        956 |             configparser
import time:       114 |       1070 |           fsspec.config
import time:       115 |        115 |           fsspec.dircache
import time:       215 |        215 |           fsspec.transaction
import time:      1158 |       3038 |         fsspec.spec
import time:        40 |         40 |         isal
import time:       449 |        449 |         gzip
import time:        39 |         39 |         lzmaffi
import time:        34 |         34 |         snappy
import time:        32 |         32 |           lz4
import time:         6 |         38 |         lz4.frame
import time:        31 |         31 |         zstandard
import time:       204 |      30967 |       fsspec.compression
import time:       231 |        231 |         fsspec.registry
import time:       166 |        396 |       fsspec.core
import time:       542 |        542 |               signal
import time:      1130 |       1130 |               fcntl
import time:        36 |         36 |               msvcrt
import time:       781 |        781 |               _posixsubprocess
import time:       427 |       2914 |             subprocess
import time:      2869 |       2869 |               _ssl
import time:      1358 |       4227 |             ssl
import time:       413 |        413 |             asyncio.constants
import time:       125 |        125 |                   _ast
import time:       698 |        823 |                 ast
import time:       371 |        371 |                     _opcode
import time:       244 |        614 |                   opcode
import time:       525 |       1139 |                 dis
import time:       886 |       2847 |               inspect
import time:       200 |        200 |                 asyncio.format_helpers
import time:       247 |        447 |               asyncio.base_futures
import time:       156 |        156 |               asyncio.log
import time:       294 |       3742 |             asyncio.coroutines
import time:       807 |        807 |                 _contextvars
import time:       383 |       1189 |               contextvars
import time:       374 |        374 |                 asyncio.exceptions
import time:       269 |        269 |                 asyncio.base_tasks
import time:       453 |       1095 |               _asyncio
import time:       331 |       2614 |             asyncio.events
import time:       243 |        243 |             asyncio.futures
import time:       332 |        332 |             asyncio.protocols
import time:       435 |        435 |               asyncio.transports
import time:       271 |        706 |             asyncio.sslproto
import time:       326 |        326 |                 asyncio.mixins
import time:       393 |        393 |                 asyncio.tasks
import time:       272 |        990 |               asyncio.locks
import time:       393 |       1382 |             asyncio.staggered
import time:       615 |        615 |             asyncio.trsock
import time:       675 |      17857 |           asyncio.base_events
import time:       287 |        287 |           asyncio.runners
import time:       326 |        326 |           asyncio.queues
import time:       302 |        302 |           asyncio.streams
import time:       239 |        239 |           asyncio.subprocess
import time:       183 |        183 |           asyncio.threads
import time:       230 |        230 |             asyncio.base_subprocess
import time:       354 |        354 |             asyncio.selector_events
import time:       522 |       1105 |           asyncio.unix_events
import time:       350 |      20645 |         asyncio
import time:        92 |      20736 |       fsspec.exceptions
import time:       255 |        255 |       fsspec.mapping
import time:     13816 |     103035 |     fsspec
import time:        47 |         47 |     hdfs
import time:       518 |     127957 |   kedro_datasets.spark.spark_dataset
import time:       352 |        352 |       _compat_pickle
import time:      1128 |       1128 |       _pickle
import time:        37 |         37 |           org
import time:         7 |         43 |         org.python
import time:         7 |         49 |       org.python.core
import time:       566 |       2094 |     pickle
import time:        35 |         35 |       pyspark
import time:         7 |         42 |     pyspark.sql
import time:       214 |       2348 |   kedro_datasets.spark.spark_hive_dataset
import time:        35 |         35 |       pyspark
import time:         7 |         41 |     pyspark.sql
import time:       197 |        238 |   kedro_datasets.spark.spark_jdbc_dataset
import time:        33 |         33 |       delta
import time:         6 |         39 |     delta.tables
import time:       195 |        234 |   kedro_datasets.spark.deltatable_dataset
import time:        33 |         33 |       pyspark
import time:         7 |         39 |     pyspark.sql
import time:        90 |        129 |   kedro_datasets.spark.spark_streaming_dataset
import time:      2911 |     138998 | kedro_datasets.spark
Traceback (most recent call last):
  File "<string>", line 1, in <module>
AttributeError: module 'kedro_datasets.spark' has no attribute 'SparkDataSet'

Note that Python attempted to import every submodule: kedro_datasets.spark.spark_dataset, kedro_datasets.spark.spark_hive_dataset, kedro_datasets.spark.spark_jdbc_dataset, kedro_datasets.spark.deltatable_dataset, and kedro_datasets.spark.spark_streaming_dataset. Also note that the resulting error (AttributeError: module 'kedro_datasets.spark' has no attribute 'SparkDataSet') is less helpful, and arguably inaccurate, because the underlying cause is a missing dependency rather than a missing attribute.
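For contrast, here is a minimal standard-library sketch of the lazy pattern, using a module-level `__getattr__` (PEP 562). It is not the code from this PR, nor the implementation of lazy_loader; the package name `demo_datasets`, its submodules, and `missing_excel_dependency` are hypothetical names made up for the demo. It only illustrates the two behaviours described above: unused submodules are never imported, and a missing dependency surfaces as a `ModuleNotFoundError` naming the missing module.

```python
import sys
import tempfile
import textwrap
from pathlib import Path

# Hypothetical package, written to a temporary directory for this demo.
root = Path(tempfile.mkdtemp())
pkg = root / "demo_datasets"
pkg.mkdir()

# The __init__.py has no eager imports; it resolves names on first access.
(pkg / "__init__.py").write_text(textwrap.dedent('''
    import importlib

    _ATTR_TO_SUBMODULE = {
        "CSVDataSet": "csv_dataset",
        "ExcelDataSet": "excel_dataset",
    }

    def __getattr__(name):
        if name not in _ATTR_TO_SUBMODULE:
            raise AttributeError(f"module {__name__!r} has no attribute {name!r}")
        submodule = importlib.import_module(f"{__name__}.{_ATTR_TO_SUBMODULE[name]}")
        return getattr(submodule, name)
'''))
(pkg / "csv_dataset.py").write_text("class CSVDataSet:\n    pass\n")
# This submodule needs a dependency that is deliberately not installed.
(pkg / "excel_dataset.py").write_text(
    "import missing_excel_dependency\n\nclass ExcelDataSet:\n    pass\n"
)

sys.path.insert(0, str(root))
import demo_datasets

# Accessing CSVDataSet imports only csv_dataset.
csv_cls = demo_datasets.CSVDataSet
excel_loaded = "demo_datasets.excel_dataset" in sys.modules

# Accessing ExcelDataSet triggers its import, which reports the real cause.
try:
    demo_datasets.ExcelDataSet
except ModuleNotFoundError as exc:
    error_message = str(exc)

print(csv_cls.__name__, excel_loaded, error_message)
```

In this sketch the error for `ExcelDataSet` mentions `missing_excel_dependency` directly, instead of claiming that the attribute does not exist.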

Checklist

  • Opened this PR as a 'Draft Pull Request' if it is work-in-progress
  • Updated the documentation to reflect the code changes
  • Added a description of this change in the relevant RELEASE.md file
  • Added tests to cover my changes

@deepyaman deepyaman marked this pull request as ready for review July 29, 2023 16:44
@deepyaman deepyaman self-assigned this Jul 29, 2023
@deepyaman deepyaman requested a review from merelcht July 29, 2023 16:45
@noklam
Contributor

noklam commented Jul 30, 2023

Trying to link issues properly.

I wonder how different this approach is compared to kedro-org/kedro#2702. Does this also affect "Conceal tracebacks for managed exceptions" #2401?

In addition, Kedro's CLI is quite slow, especially with plugins installed. Could this potentially help with that too? kedro-org/kedro#1476

I also wonder whether it could be useful for lazy loading pipelines.

Sorry for all the questions; I haven't had time to play with the library yet, so these are just some quick thoughts off the top of my head.

Member

@merelcht merelcht left a comment

LGTM! This library is a great find. I think the error message has also massively improved 👍

@noklam
Contributor

noklam commented Jul 31, 2023

kedro-org/kedro-viz#1159
Do you think this may solve the problem here?

@deepyaman
Member Author

Trying to link issues properly.

I wonder how different this approach is compared to kedro-org/kedro#2702.

It's quite similar. The code for lazy_loader is actually quite small; you can see how it basically provides an out-of-the-box way of doing what you describe: https://github.com/scientific-python/lazy_loader/blob/main/lazy_loader/__init__.py

Does this also affect "Conceal tracebacks for managed exceptions" #2401?

Not as far as I can understand from that writeup, but I could be missing something.

In addition, Kedro's CLI is quite slow, especially with plugins installed. Could this potentially help with that too? kedro-org/kedro#1476

Potentially yes, I think so.

I also wonder whether it could be useful for lazy loading pipelines.

I think we need to be careful around how exactly we implement this, since we may not want to delay flagging potential problems by lazy importing.
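One possible way to keep early failure detection alongside lazy imports, sketched here purely as an assumption and not as anything this PR implements, is a check (for example in a test suite) that resolves every lazily exposed name up front. The helper `resolve_all` below is hypothetical and is demonstrated on the standard-library `os` module rather than on Kedro code.

```python
import importlib


def resolve_all(module_name, names):
    """Access each name on the module and collect the ones that fail to resolve."""
    module = importlib.import_module(module_name)
    failures = {}
    for name in names:
        try:
            getattr(module, name)  # triggers the lazy import, if the module has one
        except (ImportError, AttributeError) as exc:
            failures[name] = repr(exc)
    return failures


# "path" resolves; "not_a_real_attribute" is reported as a failure.
failures = resolve_all("os", ["path", "not_a_real_attribute"])
print(failures)
```

Run against a lazily loaded package, a check like this would flag missing dependencies at test time rather than at first use.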

Sorry for all the questions; I haven't had time to play with the library yet, so these are just some quick thoughts off the top of my head.

@deepyaman deepyaman merged commit 3aad425 into main Jul 31, 2023
13 checks passed
@deepyaman deepyaman deleted the perf/datasets/lazy-loader branch July 31, 2023 13:00
@deepyaman
Member Author

kedro-org/kedro-viz#1159
Do you think this may solve the problem here?

Quite possibly it can help, yes, without having thought too much about it. :)

PtrBld pushed a commit to PtrBld/kedro-plugins that referenced this pull request Aug 27, 2023
* perf(datasets): lazily load datasets in init files (api)

Signed-off-by: Deepyaman Datta <[email protected]>

* perf(datasets): lazily load datasets in init files (pandas)

Signed-off-by: Deepyaman Datta <[email protected]>

* fix(datasets): fix no name in module in api/pandas

Signed-off-by: Deepyaman Datta <[email protected]>

* perf(datasets): lazily load datasets in init files (biosequence)

Signed-off-by: Deepyaman Datta <[email protected]>

* perf(datasets): lazily load datasets in init files (dask)

Signed-off-by: Deepyaman Datta <[email protected]>

* perf(datasets): lazily load datasets in init files (databricks)

Signed-off-by: Deepyaman Datta <[email protected]>

* perf(datasets): lazily load datasets in init files (email)

Signed-off-by: Deepyaman Datta <[email protected]>

* perf(datasets): lazily load datasets in init files (geopandas)

Signed-off-by: Deepyaman Datta <[email protected]>

* perf(datasets): lazily load datasets in init files (holoviews)

Signed-off-by: Deepyaman Datta <[email protected]>

* perf(datasets): lazily load datasets in init files (json)

Signed-off-by: Deepyaman Datta <[email protected]>

* fix(datasets): resolve "too few public attributes"

Signed-off-by: Deepyaman Datta <[email protected]>

* perf(datasets): lazily load datasets in init files (matplotlib)

Signed-off-by: Deepyaman Datta <[email protected]>

* perf(datasets): lazily load datasets in init files (networkx)

Signed-off-by: Deepyaman Datta <[email protected]>

* perf(datasets): lazily load datasets in init files (pickle)

Signed-off-by: Deepyaman Datta <[email protected]>

* perf(datasets): lazily load datasets in init files (pillow)

Signed-off-by: Deepyaman Datta <[email protected]>

* perf(datasets): lazily load datasets in init files (plotly)

Signed-off-by: Deepyaman Datta <[email protected]>

* perf(datasets): lazily load datasets in init files (polars)

Signed-off-by: Deepyaman Datta <[email protected]>

* perf(datasets): lazily load datasets in init files (redis)

Signed-off-by: Deepyaman Datta <[email protected]>

* perf(datasets): lazily load datasets in init files (snowflake)

Signed-off-by: Deepyaman Datta <[email protected]>

* perf(datasets): lazily load datasets in init files (spark)

Signed-off-by: Deepyaman Datta <[email protected]>

* perf(datasets): lazily load datasets in init files (svmlight)

Signed-off-by: Deepyaman Datta <[email protected]>

* perf(datasets): lazily load datasets in init files (tensorflow)

Signed-off-by: Deepyaman Datta <[email protected]>

* perf(datasets): lazily load datasets in init files (text)

Signed-off-by: Deepyaman Datta <[email protected]>

* perf(datasets): lazily load datasets in init files (tracking)

Signed-off-by: Deepyaman Datta <[email protected]>

* perf(datasets): lazily load datasets in init files (video)

Signed-off-by: Deepyaman Datta <[email protected]>

* perf(datasets): lazily load datasets in init files (yaml)

Signed-off-by: Deepyaman Datta <[email protected]>

* Update RELEASE.md

---------

Signed-off-by: Deepyaman Datta <[email protected]>